1. The following three scenarios are each based on the IQ scores and the SAT Math (SATM) scores for 26 12th-graders. For each scenario I have regressed SATM scores on IQ and have provided the relevant scatterplot, the studentized deleted residuals, the leverages, and the Cook’s distances. In each of the three scenarios, there is one observation that stands apart from the others in some way.
For each of the three scenarios:
(a) Identify from the scatterplot, in terms of IQ and SATM scores, the way in which the observation differs from the others.
(b) Indicate which 12th-grader you identified in part (a), and explain how the values of the three diagnostic measures for that observation correspond to the description you gave there.
(c) Predict how the slope of the least-squares regression line would change (increase in value, stay roughly the same, or decrease in value) if the observation you identified in part (a) were not included in the data set.
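As a reference for the three diagnostic measures named above, the following is a minimal sketch of how they can be computed by hand for a one-predictor regression. It uses the standard textbook formulas and made-up data, not the IQ/SATM scores; the function name `regression_diagnostics` and all variable names are assumptions introduced for illustration, and SPSS output may differ in rounding.

```python
import numpy as np

def regression_diagnostics(x, y):
    """Return (slope, leverages, studentized deleted residuals, Cook's distances)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    X = np.column_stack([np.ones(n), x])          # design matrix with intercept
    p = X.shape[1]                                # number of coefficients (2 here)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares fit
    resid = y - X @ beta
    # Leverages: diagonal of the hat matrix H = X (X'X)^-1 X'
    h = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
    sse = resid @ resid
    mse = sse / (n - p)
    # Studentized deleted residuals, computed without refitting n times
    t = resid * np.sqrt((n - p - 1) / (sse * (1 - h) - resid**2))
    # Cook's distances
    d = resid**2 * h / (p * mse * (1 - h) ** 2)
    return beta[1], h, t, d

# Made-up illustration data (NOT the IQ/SATM data): ten points near a line,
# plus one point far to the right that falls well below the trend.
x = np.append(np.arange(1.0, 11.0), 25.0)
noise = np.tile([0.5, -0.5], 5)
y = np.append(2.0 + 3.0 * np.arange(1.0, 11.0) + noise, 30.0)

slope_all, leverage, t_deleted, cooks_d = regression_diagnostics(x, y)
slope_without, *_ = regression_diagnostics(x[:-1], y[:-1])

print("slope with all points:   ", round(slope_all, 3))
print("slope without last point:", round(slope_without, 3))
print("leverage of last point:  ", round(leverage[-1], 3))
print("Cook's D of last point:  ", round(cooks_d[-1], 3))
```

Refitting with and without a suspect observation, as in the last few lines, is a direct way to check a prediction about how the slope would change.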
2. A zoologist recorded the average body weight (in kilograms) and the average brain weight (in grams) for 62 species of animals. The data are included in the data set Species.sav.
(a) Obtain a scatterplot of brain weight against body weight. How would you characterize the plot? There are three ‘unusual’ observations. Identify the corresponding species.
(b) Obtain the regression line relating brain weight to body weight. Save the leverages, the studentized deleted residuals, and the Cook’s distances. Comment on the problems caused by the three ‘unusual’ species.
3. Refer again to the Species data.
(a) Create two new variables: LnBody, the natural log of BodyWeight, and LnBrain, the natural log of BrainWeight. Obtain a scatterplot of LnBrain against LnBody. Comment on the relationship between these two new variables. How does this graph compare with the one you obtained in the last question?
(b) Obtain the regression line relating LnBrain to LnBody, saving the leverages, the studentized deleted residuals, and the Cook’s distances. Comment on the values of these three measures for the three ‘unusual’ species.
(c) Use the regression line you obtained in part (b) to predict the brain weight of an adult horse. The average weight of an adult horse is 450 kilograms.
(d) Check the validity of the three conditions for inference in your model in part (b).
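Because the model in this question is fitted on the log scale, a prediction has to be transformed back to grams before it can be reported as a brain weight. The sketch below shows the mechanics on made-up values that follow an exact power law; the numbers are not taken from Species.sav, and the names `body_kg`, `brain_g`, and `predict_brain_g` are hypothetical.

```python
import numpy as np

# Made-up values (NOT from Species.sav): an exact power law, brain = 10 * body^0.75
body_kg = np.array([0.5, 2.0, 10.0, 60.0, 500.0, 2500.0])
brain_g = 10.0 * body_kg ** 0.75

ln_body = np.log(body_kg)
ln_brain = np.log(brain_g)

# np.polyfit returns coefficients from highest degree down: [slope, intercept]
b1, b0 = np.polyfit(ln_body, ln_brain, deg=1)

def predict_brain_g(body_weight_kg):
    """Predict on the log scale, then exponentiate to return to grams."""
    ln_pred = b0 + b1 * np.log(body_weight_kg)
    return float(np.exp(ln_pred))

pred_450 = predict_brain_g(450.0)
print("slope on log scale:", round(b1, 3))
print("predicted brain weight at 450 kg (made-up model):", round(pred_450, 1), "g")
```

Note that the body weight must be logged before it is plugged into the fitted line, and that exponentiating a log-scale prediction is a simple back-transform that ignores the scatter about the line.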
4. Open the TVTimes.sav data set.
(a) For the linear model relating TVTimes to Education and Companion, check the normality, linearity, and equal standard deviation assumptions.
(b) For this model, are there any X outliers in the data set? If so, identify them. Are there any Y outliers? If so, identify them.
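With two predictors, X outliers are usually screened with the leverages and Y outliers with the studentized deleted residuals. The sketch below illustrates one way to flag both on made-up data; it is not the TVTimes.sav data, the 0/1 coding of the companion variable is an assumption, and the cutoffs (leverage above 2p/n, absolute studentized deleted residual above 3) are common rules of thumb that vary by textbook. A formal test would compare the residuals with a t distribution on n - p - 1 degrees of freedom.

```python
import numpy as np

# Made-up data (NOT TVTimes.sav). Index 19 has an unusual education value;
# index 4 has an unusual response planted on purpose.
education = np.array([12, 12, 13, 13, 14, 14, 15, 15, 16, 16,
                      12, 12, 13, 13, 14, 14, 15, 15, 16, 30], dtype=float)
companion = np.array([0] * 10 + [1] * 10, dtype=float)  # assumed 0/1 coding
noise = 0.1 * np.tile([1.0, -1.0], 10)
tv_hours = 5.0 - 0.1 * education + 0.5 * companion + noise
tv_hours[4] += 4.0

n = len(tv_hours)
X = np.column_stack([np.ones(n), education, companion])
p = X.shape[1]                                    # 3 coefficients
beta, *_ = np.linalg.lstsq(X, tv_hours, rcond=None)
resid = tv_hours - X @ beta

# Leverages: diagonal of the hat matrix
h = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
# Studentized deleted residuals
sse = resid @ resid
t = resid * np.sqrt((n - p - 1) / (sse * (1 - h) - resid**2))

x_outliers = np.flatnonzero(h > 2 * p / n)        # rule-of-thumb leverage cutoff
y_outliers = np.flatnonzero(np.abs(t) > 3)        # illustrative cutoff only

print("flagged as X outliers (row index):", x_outliers.tolist())
print("flagged as Y outliers (row index):", y_outliers.tolist())
```

In this made-up example the high-leverage row is not a Y outlier and the Y outlier does not have high leverage, which shows why the two measures are checked separately.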
5. Open the data set Fja.sav.
(a) (i) Regress Y1 on X, (ii) regress Y2 on X, (iii) regress Y3 on X, and (iv) regress Y4 on X4. What do you notice about the values of $b_0$, $b_1$, and $100r^2$ in the four regressions? [There is no need to show any of the output.]
(b) Obtain (i) a scatterplot of Y1 against X, (ii) a scatterplot of Y2 against X, (iii) a scatterplot of Y3 against X, and (iv) a scatterplot of Y4 against X4. In each case, characterize the relation between X and Y.
(c) For each of the four regressions, obtain the studentized deleted residuals, the leverages, and the Cook’s distances. In each of the four cases, comment on how these values reflect your characterization in part (b). In one case, SPSS will not compute these diagnostic measures. Can you explain why?
[This data set was named after Francis Anscombe, who created the data.]
Deliverable: Word Document
