Assignment

Note: All answers must be completed in SAS. All SAS input and output must be pasted into the solutions.

For the following problems, use the computer science data. Please use the data set csdata.dat (attached). The variables are: id, a numerical identifier for each student; GPA, the grade point average after three semesters; HSM; HSS; HSE; SATM; SATV, which were all explained in class; and GENDER, coded as 1 for men and 2 for women.

  1. In this exercise you will illustrate some of the ideas related to the extra sums of squares.
    a. Create a new variable called SAT which equals SATM + SATV and run the following two regressions:
      1. predict GPA using HSM, HSS, and HSE;
      2. predict GPA using SAT, HSM, HSS, and HSE.
      Calculate the extra sum of squares for the comparison of these two analyses. Use it to construct the F-statistic (in other words, the general linear test statistic) for testing the null hypothesis that the coefficient of the SAT variable is zero in the model with all four predictors. What are the degrees of freedom for this test statistic?
    b. Use the test statement in proc reg to obtain the same test statistic. Give the statistic, degrees of freedom, p-value, and conclusion.
    c. Compare the test statistic and p-value from the test statement with the individual t-test for the coefficient of the SAT variable in the full model. Explain the relationship.
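For reference, the general linear test in part (a) compares a reduced model (R) with a full model (F). The statistic can be sketched as follows; the symbols SSE(R), SSE(F), df_R, df_F, and n (the number of students in csdata.dat) are notation assumed here, not taken from the assignment, and both models are assumed to include an intercept:

```latex
\[
F^* \;=\; \frac{SSE(R) - SSE(F)}{df_R - df_F} \;\div\; \frac{SSE(F)}{df_F},
\qquad
\begin{aligned}
SSE(R) &= SSE(\mathrm{HSM}, \mathrm{HSS}, \mathrm{HSE}), & df_R &= n - 4,\\
SSE(F) &= SSE(\mathrm{SAT}, \mathrm{HSM}, \mathrm{HSS}, \mathrm{HSE}), & df_F &= n - 5.
\end{aligned}
\]
```

The numerator difference SSE(R) - SSE(F) is the extra sum of squares attributable to SAT given that HSM, HSS, and HSE are already in the model.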
  2. Run the regression to predict GPA using SATM, SATV, HSM, HSE, and HSS. Put the variables in the order given above in the model statement. Use the SS1 and SS2 options in the model statement.
    a. Add the Type I sums of squares for the five predictor variables. Do the same for the Type II sums of squares. Does either of these sum to the model sum of squares? Are there any predictors for which the two sums of squares (Type I and Type II) are the same? Explain why.
    b. Verify (by running additional regressions and doing some arithmetic with the results) that the Type I sum of squares for the variable SATV is the difference in the model sum of squares (or error sum of squares) for the following two analyses:
      1. predict GPA using SATM, SATV;
      2. predict GPA using SATM.
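A minimal SAS sketch of the runs this problem describes is given below. It assumes that csdata.dat is whitespace-delimited with its columns in the order the variables are listed in the assignment, and that the file sits in the current working directory; the data set name cs is hypothetical. Adjust the infile path and input statement if the file is laid out differently.

```sas
/* Assumption: columns appear in the order id, GPA, HSM, HSS, HSE,
   SATM, SATV, GENDER, separated by blanks. */
data cs;
  infile 'csdata.dat';
  input id gpa hsm hss hse satm satv gender;
run;

/* Full model with predictors in the order the problem specifies;
   ss1 and ss2 request Type I and Type II sums of squares. */
proc reg data=cs;
  model gpa = satm satv hsm hse hss / ss1 ss2;
run;
quit;

/* Nested models for checking the Type I sum of squares of SATV:
   compare the model (or error) sums of squares of the two fits. */
proc reg data=cs;
  model gpa = satm satv;
  model gpa = satm;
run;
quit;
```

Because Type I sums of squares are sequential, the order of the predictors in the model statement matters for the ss1 column but not for the ss2 column.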
  3. Create an additional variable called HS that is the sum of the three high school scores (HSE + HSS + HSM). Run the regression to predict GPA using a variety of variables, including HS and SAT, as described below. Summarize the results by making a table giving the percentage of variation explained by each of the following models:
    a. SATM as the explanatory variable
    b. SATV as the explanatory variable
    c. HSM as the explanatory variable
    d. HSS as the explanatory variable
    e. HSE as the explanatory variable
    f. SATM and SATV as the explanatory variables
    g. SAT (=SATM+SATV) as the explanatory variable
    h. HSM, HSS, and HSE as the explanatory variables
    i. HS (=HSM+HSS+HSE) as the explanatory variable
    j. SATM, SATV, HSM, HSS, and HSE as the explanatory variables
    k. SAT and HS as the explanatory variables

(Please do not include the SAS output for all these models. Only the R-squared value is needed. Note that you can run proc reg with multiple model statements to save typing.)
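The multiple-model approach mentioned in the note can be sketched as below, showing only three of the eleven models. The data set name cs and the model labels (m_sat, m_hs, m_sat_hs) are hypothetical, and the input statement assumes csdata.dat is whitespace-delimited with columns in the order the assignment lists the variables.

```sas
/* Assumption: columns appear in the order id, GPA, HSM, HSS, HSE,
   SATM, SATV, GENDER, separated by blanks. */
data cs;
  infile 'csdata.dat';
  input id gpa hsm hss hse satm satv gender;
  sat = satm + satv;      /* combined SAT score */
  hs  = hsm + hss + hse;  /* combined high school score */
run;

/* One proc reg step can fit several models; the optional label
   before each model statement is printed with its output, which
   makes it easier to match each R-square to the right model. */
proc reg data=cs;
  m_sat:    model gpa = sat;
  m_hs:     model gpa = hs;
  m_sat_hs: model gpa = sat hs;
run;
quit;
```

The remaining models follow the same pattern, one labeled model statement per row of the summary table.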

  4. A data set contains 50 observations. There are 4 explanatory variables: A, B, C, and D. Use the following results:

    a. Obtain an 85% confidence interval for β1 (the coefficient for A).
    b. You wish to test H0: β4 = 0 vs. Ha: β4 ≠ 0. That is, you wish to determine whether variable D provides significant explanatory power for Y when variables A, B, and C are already in the model. Obtain the test statistic for this hypothesis test and determine whether you would reject or fail to reject the null hypothesis (α = 0.05). You should give either a critical value or a p-value to support your conclusion.
    c. Obtain an 85% confidence interval for the mean (expected) response when A = 40, B = 20, C = 50, and D = 30.
    d. Obtain an 85% prediction interval for a single response when A = 40, B = 20, C = 50, and D = 30.
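The general forms of the intervals and test in this problem can be sketched as follows. This assumes the model includes an intercept, so that there are 5 parameters and 50 - 5 = 45 error degrees of freedom. The notation (b_k for an estimated coefficient, s{·} for an estimated standard error, Ŷ_h for the fitted value at the given settings, MSE for the error mean square) is assumed here, and the numerical inputs must come from the results supplied with the problem.

```latex
\begin{align*}
\text{85\% CI for } \beta_1:\quad
  & b_1 \pm t(0.925;\,45)\, s\{b_1\} \\
\text{Test of } H_0\colon \beta_4 = 0:\quad
  & t^* = \frac{b_4}{s\{b_4\}}, \qquad
    \text{reject } H_0 \text{ if } |t^*| > t(0.975;\,45) \\
\text{85\% CI for } E\{Y_h\}:\quad
  & \hat{Y}_h \pm t(0.925;\,45)\, s\{\hat{Y}_h\} \\
\text{85\% PI for } Y_{h(\text{new})}:\quad
  & \hat{Y}_h \pm t(0.925;\,45)\, s\{\text{pred}\}, \qquad
    s^2\{\text{pred}\} = MSE + s^2\{\hat{Y}_h\}
\end{align*}
```

The prediction interval is always wider than the confidence interval for the mean response at the same settings, because it adds the variance of a single new observation (estimated by MSE) to the variance of the fitted value.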


