[Step-by-Step] Refer to the Baseball 2005 data, which report information on the 30 Major League Baseball teams for the 2005 season. Let the number of games won
Question: Refer to the Baseball 2005 data, which report information on the 30 Major League Baseball teams for the 2005 season. Let the number of games won be the dependent variable (Wins) and the following variables be independent variables: team batting average (Batting), number of stolen bases (SB), number of errors committed (Error), team ERA, number of home runs (HR), and whether the team’s home field is natural grass or artificial turf (Surface =1 if artificial, =0 if natural).
- Write out the regression equation and report the p-values and R-square. Discuss the coefficient of each variable and interpret your results.
- Do you see any problems with multicollinearity?
- Rerun the analysis using stepwise regression. Which variable contributes the most to explaining the dependent variable, and by how much?
- Rerun the analysis using the best regression model in the analysis. Which model is the best model? Would you consider deleting any of the variables? If so, identify them.
Deliverable: Word Document 
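
As a starting point for the first two parts of the question, the following is a minimal sketch of how a multiple regression fit, its R-square, and variance inflation factors (VIFs) for a multicollinearity check could be computed. It uses only the Python standard library. The function names (`solve`, `ols`, `vif`) and the toy numbers are hypothetical and are not the Baseball 2005 data; p-values and stepwise selection are not covered here and would normally come from a statistics package.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(X, y):
    """Fit y = b0 + b1*x1 + ... by least squares; return (coefficients, R^2)."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(XtX, Xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - sse / sst


def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    predictor j on all the other predictors."""
    out = []
    for j in range(len(X[0])):
        target = [r[j] for r in X]
        others = [[v for i, v in enumerate(r) if i != j] for r in X]
        _, r2 = ols(others, target)
        out.append(1.0 / (1.0 - r2))
    return out


# Toy illustration only (made-up numbers, NOT the Baseball 2005 data):
# y is constructed as exactly 2 + 3*x1 - 1*x2, so the fit should recover it.
toy_X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8], [6, 4]]
toy_y = [2 + 3 * x1 - x2 for x1, x2 in toy_X]

beta, r2 = ols(toy_X, toy_y)
vifs = vif(toy_X)
print("coefficients (intercept first):", [round(b, 4) for b in beta])
print("R-square:", round(r2, 4))
print("VIFs:", [round(v, 3) for v in vifs])
```

With the real data, each row of `X` would hold one team's Batting, SB, Error, ERA, HR, and Surface values and `y` would hold Wins. A commonly cited rule of thumb treats a VIF above 10 as a sign of problematic multicollinearity, though the threshold is a judgment call.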