Given the wealth of available information on the prices of boats on the Internet, your project is to develop a multivariate regression equation using the price of a used boat as your dependent variable. The independent variables on which to gather data include:
- Age of boat
- Length of boat
- Motor size
- Number of engines
- Bowrider vs. center console
- Brand of boat (Bayliner, Searay, Grady White)
- Motor type (inboard vs. outboard)
Include any other variables you feel are relevant to the analysis.
The objective of this project is to build the best regression equation possible, given the data sampled. You must randomly sample at least 25 boats for each brand. Be sure that your statistical analysis includes at a minimum the following:
- an original correlation matrix
- hypothesis tests of each independent variable included in your final equation.
- any relevant statistical plots.
- an evaluation of your overall model.
- interpretations of the coefficients of the significant independent variables in the model.
- a discussion of any problems encountered in building the model and how they were resolved (if applicable).
- a discussion of how the model could be used and any suggestions for improving the model if the project were repeated in the future. Be sure that your discussion describes the source of your data, the date sampled, etc.
Project Issues
Just a few guidelines so you can avoid some common problems on the miniproject.
- You should include higher order terms such as rate of increase (square) terms and interaction terms. The real world rarely moves in a straight line. Theoretically, you would not expect the effect of age to be linear, for example: as a boat gets older it depreciates, but the depreciation will eventually level off. The same goes for length. As for interaction, the impact of motor size or age on price could depend on length. Most large boats have large motors, so we may not find much of an effect in large boats. We see a lot more variance in motor size with smaller boats, so we would expect to see a bigger impact in these boats. You may not find that any of these terms are significant in your sample; that’s OK as long as you tried them. To code higher order terms, use Transform, Compute on the menu bar. Then name your new variable and multiply the variable by itself (for a square term) or by another variable (for an interaction term). SPSS will place the new variable in the first blank column.
- Don’t use first person in your writeup. This should be like a research paper.
- To determine whether a bigger model is a significant improvement over a smaller one, use the test of reduced vs. complete models that we had on the first test. That’s why we covered it.
- Common dummy variables for this project are brand, bowrider vs. center console, motor type, etc.
- If you have references (and you should, at least for data collection), then cite them. On the first case, not many of you actually cited your references.
- If you have higher order terms, don’t worry about correlations or VIFs between them and their lower order terms. Of course a square term is going to be correlated with its lower order term. The multicollinearity rules don’t apply in that case.
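As a rough illustration of the coding steps in the guidelines above, here is a short Python sketch (Python is only a stand-in here; in SPSS you would do the same thing through Transform, Compute). It builds square, interaction, and dummy terms for one boat and computes the usual F statistic for comparing a reduced model with a complete model. The field names, the sample boat, and the SSE values are all made up for illustration, and the F formula is the standard nested-model version, which is assumed to be the test referred to above.

```python
# Illustrative sketch only: field names, the sample boat, and the SSE values
# are hypothetical and are not data from the assignment.

def code_terms(boat):
    """Return one row of regression predictors for a single boat record."""
    return {
        "age": boat["age"],
        "length": boat["length"],
        "motor_hp": boat["motor_hp"],
        # Square terms: the variable multiplied by itself.
        "age_sq": boat["age"] ** 2,
        "length_sq": boat["length"] ** 2,
        # Interaction terms: one variable multiplied by another.
        "motor_x_length": boat["motor_hp"] * boat["length"],
        "age_x_length": boat["age"] * boat["length"],
        # Dummy variables: 1 if the category applies, 0 otherwise.
        "bowrider": 1 if boat["style"] == "bowrider" else 0,
        "outboard": 1 if boat["motor_type"] == "outboard" else 0,
        # Three brands need only two dummies; Bayliner is the baseline here.
        "brand_searay": 1 if boat["brand"] == "Searay" else 0,
        "brand_grady": 1 if boat["brand"] == "Grady White" else 0,
    }


def nested_f(sse_reduced, sse_complete, k_complete, g_reduced, n):
    """F statistic for a reduced vs. complete model comparison.

    k_complete and g_reduced count the predictors in each model
    (intercept excluded); n is the sample size.
    """
    numerator = (sse_reduced - sse_complete) / (k_complete - g_reduced)
    denominator = sse_complete / (n - (k_complete + 1))
    return numerator / denominator


if __name__ == "__main__":
    sample_boat = {
        "age": 5, "length": 20, "motor_hp": 150,
        "style": "bowrider", "motor_type": "outboard", "brand": "Searay",
    }
    print(code_terms(sample_boat))
    # Hypothetical fit results: 75 boats, 5 predictors vs. 3 predictors.
    print(nested_f(sse_reduced=120.0, sse_complete=100.0,
                   k_complete=5, g_reduced=3, n=75))
```

The resulting F value would be compared with the critical value for (k - g) numerator and n - (k + 1) denominator degrees of freedom. Choosing Bayliner as the baseline brand is arbitrary; any one brand can be left out.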
Deliverable: Word Document