1. Does defense win football games? There has long been speculation about how to win games in college football. The main question is whether defense wins games or whether a team can win with offense alone. In this problem, we will fit a simple linear regression model to better understand the relationship between defense and winning.
The file fbs2012_defense.jmp contains a defensive summary for each of the 120 schools in the FBS for the 2011-2012 season. Using this file, fit the simple linear regression model relating wins (Wins, \(y\)) to yards allowed per game (Ydspgm, \(x\)). For directions on how to use JMP to fit a simple linear regression model, see the tutorial Least-Squares. Use the resulting JMP output to answer the following questions. Make sure to print the output and turn it in with your assignment.
- Using the scatterplot, describe the relationship between wins and yards allowed per game. Do there appear to be any outliers?
- Report the fitted least squares regression equation.
- Report the coefficient of determination, \(R^{2}\), and interpret this value.
- Report the sample correlation, \(r\).
- Interpret the estimated slope within the context of the problem.
- Interpret the estimated \(y\)-intercept within the context of the problem. Does the estimated value make sense? Explain briefly.
- Use the fitted least squares regression equation to predict the number of games a team allowing 300 yards per game will win.
- Calculate the residual for Iowa State. Does this indicate that Iowa State exceeded expectations or fell short based on their defensive ranking gauged by yards allowed per game?
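The quantities requested above (fitted line, \(R^{2}\), \(r\), a prediction, and a residual) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a substitute for the JMP output: the seven data points are made up and are not taken from fbs2012_defense.jmp, and the helper name `fit_line` is hypothetical.

```python
# Hypothetical data for illustration only; NOT from fbs2012_defense.jmp.
yards = [280.0, 310.0, 345.0, 370.0, 400.0, 430.0, 455.0]  # x: yards allowed per game
wins = [11.0, 10.0, 8.0, 8.0, 6.0, 4.0, 3.0]               # y: wins


def fit_line(x, y):
    """Return (intercept, slope, r) for the least squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / (sxx * syy) ** 0.5  # sample correlation
    return intercept, slope, r


b0, b1, r = fit_line(yards, wins)
r_squared = r ** 2  # in simple linear regression, R^2 is the square of r

# Predicted wins for a team allowing 300 yards per game.
predicted_300 = b0 + b1 * 300.0

# Residual = observed - predicted. A positive residual means the team won
# more games than the line predicts; a negative one means it won fewer.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(yards, wins)]

print(f"fitted line: wins = {b0:.3f} + ({b1:.5f}) * yards")
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
print(f"predicted wins at 300 yards per game: {predicted_300:.2f}")
print(f"residual for the first team: {residuals[0]:.2f}")
```

With real data, the sign of \(r\) matches the sign of the slope, and the residuals from a least squares line with an intercept always sum to zero (up to rounding).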
2. Here is JMP output from a least squares linear fit of Camaro value in dollars (variable price) based on the number of miles driven in thousands (variable mileage). The units for price are "dollars" and the units for mileage are "thousands of miles". Use this JMP output to answer these questions.
- In general, what does a plot of residuals versus \(x\) (mileage in this problem) look like when there isn't a problem with the adequacy of a regression fit model?
- Does the residual vs. mileage plot given in the JMP output suggest that this linear fit regression model is adequate? Make sure to support your answer with reasons given in class.
- In general, what do we look for in either a normal quantile-quantile plot (a normal probability plot) of residuals or a histogram of residuals when diagnosing the adequacy of a regression fit model?
- Does the normal quantile-quantile plot given in the JMP output suggest that this linear fit regression model is adequate? Make sure to support your answer with reasons given in class.
- Suppose we use least squares to fit a quadratic curve instead of a line. How many parameters will we need to estimate in the quadratic model?
- The coefficient of determination, \(R^{2}\), for the quadratic fit model is 0.504. Give an interpretation for this value.
- One observation is a low-mileage Camaro with only 8,000 miles that is nonetheless worth just \$6,846. What do you think happened to this car?
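As a rough illustration of what a residual-versus-\(x\) plot can reveal, the sketch below fits a straight line to made-up prices that actually follow a curve and then prints the sign of each residual in order of mileage. The data are hypothetical (not the Camaro data behind the JMP output), and `fit_line` is an illustrative helper name.

```python
# Hypothetical data for illustration only; NOT the Camaro data from the JMP output.
# Prices are generated from a curve, so a straight line is the wrong shape.
mileage = [5.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]   # thousands of miles
price = [20000.0 - 300.0 * m + 2.0 * m ** 2 for m in mileage]  # dollars


def fit_line(x, y):
    """Return (intercept, slope) of the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


b0, b1 = fit_line(mileage, price)
residuals = [p - (b0 + b1 * m) for m, p in zip(mileage, price)]

# Record the sign of each residual, ordered by mileage, and count sign changes.
signs = ["+" if e > 0 else "-" for e in residuals]
sign_changes = sum(1 for a, b in zip(signs, signs[1:]) if a != b)

for m, e, s in zip(mileage, residuals, signs):
    print(f"mileage = {m:5.1f}  residual = {e:9.2f}  sign = {s}")
print("sign changes:", sign_changes)
```

Here the residuals are positive at both ends of the mileage range and negative in the middle. A systematic pattern like this, rather than a patternless scatter around zero, is the kind of evidence that a straight-line model is missing curvature.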
3. Commercial properties. A commercial real estate company evaluates vacancy rates, square footage, rental rates, and operating expenses for commercial properties in a large metropolitan area in order to provide clients with quantitative information on which to base rental decisions. The data contained in the file commercial_properties.jmp are taken from 81 suburban commercial properties that are the newest, best located, most attractive, and expensive for five specific geographic areas. The variables are:
\(y\) - rental rates (in dollars per square foot per year)
\(x_1\) - the age (in years)
\(x_2\) - operating expenses and taxes (in dollars per square foot per year)
\(x_3\) - vacancy rates (%)
\(x_4\) - total square footage
Using JMP, fit a multiple linear regression model using all four predictors of the rental rate.
- Report the fitted least squares regression equation.
- Report the coefficient of determination, \(R^{2}\), and interpret this value.
- Interpret the estimated slope for age within the context of the problem.
- Interpret the estimated slope for total square footage within the context of the problem.
- Plot the residuals against the predicted values. What does this plot indicate about the appropriateness of the regression model? You must turn in your plot with your answers to receive full credit.
- Make a normal probability plot of the residuals. What does this plot indicate about the appropriateness of the regression model? You must turn in your plot with your answers to receive full credit.
- Fit another multiple regression model without the vacancy rate, \(x_3\), but with the other three predictors. Report the fitted least squares regression equation and the coefficient of determination, \(R^{2}\), for this model.
- Based on the coefficient of determination, which model do you prefer? Explain briefly.
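When comparing the four-predictor and three-predictor models, it helps to remember that \(R^{2}\) cannot decrease when a predictor is added, which is why adjusted \(R^{2}\) is often reported alongside it. The sketch below illustrates this on simulated data. Everything in it is an assumption made for illustration: the data are randomly generated (not from commercial_properties.jmp), square footage is expressed in units of 100,000 square feet to keep the arithmetic well scaled, and `solve` and `fit` are hypothetical helper names.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(rows, y):
    """Least squares fit with an intercept; returns (coefficients, R^2, adjusted R^2)."""
    design = [[1.0] + list(r) for r in rows]
    n, p = len(y), len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    return beta, r2, adj_r2


# Simulated data for illustration only; NOT from commercial_properties.jmp.
random.seed(0)
rows, rent = [], []
for _ in range(30):
    age = random.uniform(0.0, 20.0)       # x1: years
    expenses = random.uniform(3.0, 14.0)  # x2: dollars per square foot per year
    vacancy = random.uniform(0.0, 60.0)   # x3: percent
    sqft = random.uniform(0.3, 4.0)       # x4: units of 100,000 square feet (assumed scaling)
    rows.append((age, expenses, vacancy, sqft))
    # Arbitrary made-up relationship in which vacancy has no true effect.
    rent.append(10.0 - 0.1 * age + 0.3 * expenses + 0.5 * sqft + random.gauss(0.0, 1.0))

beta_full, r2_full, adj_full = fit(rows, rent)
reduced_rows = [(a, e, s) for a, e, _, s in rows]  # drop vacancy (x3)
beta_red, r2_red, adj_red = fit(reduced_rows, rent)

print(f"full model:    R^2 = {r2_full:.4f}, adjusted R^2 = {adj_full:.4f}")
print(f"reduced model: R^2 = {r2_red:.4f}, adjusted R^2 = {adj_red:.4f}")
```

In this simulation the full model's \(R^{2}\) is at least as large as the reduced model's even though vacancy was generated to have no effect, so a small gain in \(R^{2}\) alone is weak evidence for keeping a predictor.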
Deliverable: Word Document
