

Question: [20 marks]

The file GRP.TXT contains the data on the following variables for a particular region in New Zealand:

Name      Description
NoCons    Number of building consents issued for new dwellings
ValCons   Value of building consents issued for new dwellings (million $)
Unemp     Number of registered unemployed
House     Number of dwellings sold
Car       Number of new car registrations (contains missing data)
Exp       Value of exports (million $)
Imp       Value of imports (million $)
GRP       National Bank Index of Gross Regional Product (GRP)
(a) Obtain the correlation matrix for the variables and the matrix plot of the variables. Discuss the uses of these two outputs in the context. [4 marks]
(b) Regress GRP on all the explanatory variables and obtain the full regression output. Discuss the statistical significance of the regression coefficients considering the P values. What conclusion would you draw from the Analysis of Variance part of the regression output? Explain your answers in the context. Perform suitable residual diagnostics and discuss the implications. [6 marks]
(c) Carry out a complete forward stepwise regression of GRP on the explanatory variables. Also perform a complete backward predictor elimination procedure. Compare the outputs. Which step (model) would you recommend for predicting the GRP? Explain your answer. [5 marks]
(d) Explore the appropriateness of polynomial regression in the context. [5 marks]

Solution:

    (b) The following regression results are obtained:

    The multiple regression model is:
    GRP = 101 - 0.117 NoCons + 1.37 ValCons - 0.000144 Unemp + 0.0319 House
    + 0.0048 Car - 0.0118 Exp + 0.0150 Imp
    This model is significant overall, F(7, 15) = 11.17, p < 0.001. The model explains approximately 76.4% of the variation in GRP, which indicates a reasonably good fit. Only NoCons (p = 0.012), ValCons (p = 0.010) and House (p = 0.020) are individually significant at the 5% level; none of the other predictors is individually significant.
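
    As a consistency check on the quoted figures, R-squared can be recovered from the overall F statistic and its degrees of freedom. The sketch below assumes only the quoted F(7, 15) = 11.17; the sample size n = 23 is inferred from 15 residual degrees of freedom with 7 predictors and an intercept, and the function names are illustrative. Under these assumptions the quoted 76.4% agrees with the adjusted R-squared rather than the unadjusted value, although the original regression output would be needed to confirm which one was reported.

```python
def r_squared_from_f(f_stat, df_model, df_resid):
    """Recover R^2 from the overall F statistic of a regression with an intercept."""
    return df_model * f_stat / (df_model * f_stat + df_resid)


def adjusted_r_squared(r2, df_model, df_resid):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    n_minus_1 = df_model + df_resid  # equals n - 1 when the model has an intercept
    return 1.0 - (1.0 - r2) * n_minus_1 / df_resid


# Values quoted in the solution: F(7, 15) = 11.17
r2 = r_squared_from_f(11.17, 7, 15)
adj_r2 = adjusted_r_squared(r2, 7, 15)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```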

    The normal probability plot and the histogram of the residuals show no clear pattern that would indicate a departure from normality. Likewise, the plot of residuals versus predicted values shows no pattern that would indicate serious heteroskedasticity.
    (c) Forward stepwise regression

    Using forward stepwise selection, we find that the best model only includes House as a predictor and the model is
    GRP = 92.02 + 0.0553*House
    This model explains 70.81% of the variation in GRP.
    Backward stepwise regression


    Using backward elimination, we find that the best model includes NoCons, ValCons and House as predictors, and the model is
    GRP = 96.56 - 0.102*NoCons + 1.37*ValCons + 0.0358*House
    This model explains 77.85% of the variation in GRP.
    Based on the standard error and the amount of explained variation, the "best" model is the one obtained by backward elimination:
    GRP = 96.56 - 0.102*NoCons + 1.37*ValCons + 0.0358*House
    (d) Based on the matrix plot, a polynomial regression approach would not be justified, because none of the predictors shows a clear non-linear (quadratic, cubic, etc.) pattern when plotted against GRP.

    Question 2 [30 marks]

    An experiment was conducted to relate Yield in a chemical plant to temperature and pressure.
    The following table gives the experimental data, which were originally published in the text "Introduction to Linear Models and the Design and Analysis of Experiments" by W. Mendenhall (Duxbury Press).
    You need to enter the data manually to perform the analysis.
    Pressure  Temperature  Yield
    50        100          21
    50        200          23
    50        300          26
    80        100          22
    80        200          23
    80        300          28
    50        100          22
    50        200          23
    50        300          27
    80        100          21
    80        200          23
    80        300          27
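
    Since the data have to be entered by hand, one possible way to do so is sketched below in plain Python, together with the mean yield at each level of pressure and temperature as a first look at the data. The names DATA and group_means are illustrative choices, not part of the assignment.

```python
from collections import defaultdict

# (pressure, temperature, yield) rows, copied from the table above
DATA = [
    (50, 100, 21), (50, 200, 23), (50, 300, 26),
    (80, 100, 22), (80, 200, 23), (80, 300, 28),
    (50, 100, 22), (50, 200, 23), (50, 300, 27),
    (80, 100, 21), (80, 200, 23), (80, 300, 27),
]


def group_means(rows, key_index):
    """Mean yield for each level of the factor stored at position key_index."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[2])
    return {level: sum(ys) / len(ys) for level, ys in sorted(groups.items())}


pressure_means = group_means(DATA, 0)
temperature_means = group_means(DATA, 1)
print("Mean yield by pressure:   ", pressure_means)
print("Mean yield by temperature:", temperature_means)
```

    Each pressure and temperature combination appears twice in the table, so the data form a replicated 2 x 3 factorial layout.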
(a) Discuss the basic principles of experimentation in the context. You need to discuss how the experimenter would have applied the basic principles, for example, how the principle of randomisation must have been applied. [6 marks]

(b) Perform one-way ANOVA tests to see whether there is any temperature or pressure effect. Discuss your answer stating the limitations of this test. [4 marks]

(c) Perform two-way ANOVA tests (with and without interactions) to see whether there is any temperature and/or pressure effects. Explore the residuals of the fitted models and suggest whether or not you obtain any clues for improving the model. [8 marks]

(d) Perform a multiple linear regression of Yield on temperature and pressure. Interpret the t and F-tests done in the context. [4 marks]

(e) Compare the regression and ANOVA analyses and comment. Which approach is reliable, and why? Explain your answer in the context. [4 marks]

(f) Build a more appropriate model relating Yield with temperature and pressure. Note that this question is open ended, and you need to provide necessary justifications in your answer. [4 marks]
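
The calculations behind parts (b) and (d) can be sketched in plain Python as follows. This is a minimal illustration, not the worked solution: it computes the one-way ANOVA F statistic for each factor and the least-squares coefficients of Yield on pressure and temperature from the normal equations. All function names are illustrative, and p-values are not computed because they would need an F-distribution routine from a statistics package.

```python
# (pressure, temperature, yield) rows from the Question 2 table
DATA = [
    (50, 100, 21), (50, 200, 23), (50, 300, 26),
    (80, 100, 22), (80, 200, 23), (80, 300, 28),
    (50, 100, 22), (50, 200, 23), (50, 300, 27),
    (80, 100, 21), (80, 200, 23), (80, 300, 27),
]


def one_way_f(rows, key_index):
    """Return (F, df_between, df_within) for a one-way ANOVA of yield on one factor."""
    ys = [r[2] for r in rows]
    grand_mean = sum(ys) / len(ys)
    groups = {}
    for r in rows:
        groups.setdefault(r[key_index], []).append(r[2])
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum((y - sum(g) / len(g)) ** 2
                    for g in groups.values() for y in g)
    df_between = len(groups) - 1
    df_within = len(ys) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(rows):
    """Fit Yield = b0 + b1*Pressure + b2*Temperature via the normal equations."""
    x = [[1.0, r[0], r[1]] for r in rows]
    y = [r[2] for r in rows]
    xtx = [[sum(xi[j] * xi[k] for xi in x) for k in range(3)] for j in range(3)]
    xty = [sum(xi[j] * yi for xi, yi in zip(x, y)) for j in range(3)]
    return solve(xtx, xty)


f_press = one_way_f(DATA, 0)
f_temp = one_way_f(DATA, 1)
coefs = least_squares(DATA)
print("Pressure:    F = %.4f on (%d, %d) df" % f_press)
print("Temperature: F = %.4f on (%d, %d) df" % f_temp)
print("Coefficients (intercept, pressure, temperature):", coefs)
```

With these data the temperature F statistic is very large and the pressure F statistic is close to zero. Note that the one-way test for pressure places all of the temperature variation in the error term, which is one of the limitations part (b) asks about.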

