## SAT Scores

The SAT is a standardized college entrance exam taken by many high school students across the United States. In some regions the SAT is the dominant exam for college admissions, while in others the ACT or other metrics are used instead, so the percentage of students taking the SAT varies greatly by state and region. The file `stateSATscores.csv` contains average SAT scores for each of the 50 U.S. states for the year 1997, along with several other variables:

- State name
- Expenditure: expenditure per pupil in average daily attendance in public elementary and secondary schools
- PT.Ratio: the average pupil-to-teacher ratio in public schools
- Salary: the estimated average salary of public school teachers in the state
- PercentSAT: the percentage of students electing to take the SAT exam
- Verbal: the average Verbal composite score
- Math: the average Mathematics composite score
- SAT: the average composite SAT score
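As a starting point, the file can be read into plain Python lists, one per variable. The following is a minimal sketch using only the standard library. It assumes the CSV has a header row whose names match the variables described above, and that the state-name column is headed `State`; that header is a guess, so the hypothetical `state_column` parameter lets you change it.

```python
import csv

# Numeric columns as named in the description of the file.
NUMERIC_COLUMNS = ["Expenditure", "PT.Ratio", "Salary", "PercentSAT",
                   "Verbal", "Math", "SAT"]


def load_sat_data(path, state_column="State"):
    """Read the CSV into a dict mapping column name -> list of values.

    `state_column` is an assumed header name; adjust it to the actual file.
    """
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    if not rows:
        raise ValueError(f"no data rows found in {path}")
    expected = [state_column] + NUMERIC_COLUMNS
    missing = [name for name in expected if name not in rows[0]]
    if missing:
        raise KeyError("missing expected columns: " + ", ".join(missing))
    # Keep state names as text and convert every other column to float.
    data = {state_column: [row[state_column] for row in rows]}
    for name in NUMERIC_COLUMNS:
        data[name] = [float(row[name]) for row in rows]
    return data
```

Calling `load_sat_data("stateSATscores.csv")` would then give, for example, the composite scores as `data["SAT"]` and the participation percentages as `data["PercentSAT"]`.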

Perform the following with this data:

1. Construct a scatterplot where the x-axis is the percent taking the SAT exam and the y-axis is the average composite SAT score. Describe the relationship you see.

2. Fit a simple linear regression model where the response is the composite SAT score and the predictor is the percent of students taking the SAT. Construct residual diagnostic plots of that fit. What do you notice in the Residuals vs Fitted plot?

3. Add a new variable to the dataset that is the square root of the percent of students taking the SAT exam.

4. Fit a multiple regression model where the response is the composite SAT score with two predictors: the percent of students taking the exam and the square root of that percentage (the variable you created in part 3). Construct residual diagnostic plots of that fit. What do you notice in the Residuals vs Fitted plot?

5. Determine whether the model fitted in part 4 is statistically significant for predicting composite SAT scores. Reference specific output.

6. What percentage of the variability in composite SAT scores is explained by the model you fit in part 4? Reference specific output.

7. We are interested in determining whether student expenditures, pupil-teacher ratios, and teacher salary influence SAT scores when accounting for the percentage who take the SAT. Fit a multiple regression model where the predictor variables include the percent taking the SAT, the square root of that percentage (created in part 3), the expenditures, pupil-teacher ratio, and teacher salary. Compare and contrast this fitted model with that in part 4. Do you feel that expenditures, pupil-teacher ratios, or teacher salary predict SAT scores once the percentage of students taking the SAT exam is accounted for? Reference specific output from the fitted model to address this question.
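The quantities the regression questions above ask about can be computed by hand, which may help when reading software output. The sketch below is one possible approach, not the assignment's prescribed method: it uses only the Python standard library, solves the normal equations directly, and assumes the data are available as a dict of lists keyed by the variable names given in the file description. The helper names (`solve_linear_system`, `fit_ols`, `partial_f`, `fit_assignment_models`) are invented for illustration. It returns fitted values and residuals (for a Residuals vs Fitted plot), R-squared, and the overall F statistic, but not p-values; those require comparing each F statistic against an F distribution with the matching degrees of freedom.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular or nearly so")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_ols(predictors, y):
    """Least-squares fit of y on the predictor columns plus an intercept."""
    n, p = len(y), len(predictors)
    if n - p - 1 <= 0:
        raise ValueError("need more observations than coefficients")
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    size = p + 1
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(size)]
           for j in range(size)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(size)]
    beta = solve_linear_system(xtx, xty)  # beta[0] is the intercept
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    sse = sum(e * e for e in residuals)          # unexplained variation
    mean_y = sum(y) / n
    sst = sum((yi - mean_y) ** 2 for yi in y)    # total variation
    return {
        "beta": beta, "fitted": fitted, "residuals": residuals,
        "sse": sse, "df_resid": n - p - 1,
        "r_squared": 1.0 - sse / sst,
        # Overall F statistic on (p, n - p - 1) degrees of freedom.
        "f_stat": ((sst - sse) / p) / (sse / (n - p - 1)),
    }


def partial_f(reduced, full):
    """F statistic for the extra predictors in `full` over nested `reduced`."""
    extra = reduced["df_resid"] - full["df_resid"]
    return ((reduced["sse"] - full["sse"]) / extra) / (
        full["sse"] / full["df_resid"])


def fit_assignment_models(data):
    """Fit the simple, square-root, and five-predictor models in turn."""
    pct = data["PercentSAT"]
    root = [math.sqrt(v) for v in pct]  # the new square-root variable
    simple = fit_ols([pct], data["SAT"])
    curved = fit_ols([pct, root], data["SAT"])
    full = fit_ols([pct, root, data["Expenditure"], data["PT.Ratio"],
                    data["Salary"]], data["SAT"])
    return simple, curved, full
```

Plotting `fitted` against `residuals` for each model gives the Residuals vs Fitted view. `partial_f(curved, full)` is the statistic for testing whether expenditure, pupil-teacher ratio, and salary jointly add anything beyond the two participation terms; with 50 states it would be compared against an F distribution with 3 and 44 degrees of freedom.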