-
Suppose you collect survey data from graduate students at a university. You obtain gender, current grade point average, college grade point average, family income, number of dependents, and parents’ education levels. In addition, you ask and obtain the answer to the following question: "On how many days within the last 30 did you consume the equivalent of a six-pack of beer?" Moreover, you are able to link class attendance data from the administration office to the survey data.
- Write down a population model that would allow you to test two hypotheses at once: (1) heavy drinking does not affect graduate school GPA, assuming all else is equal; (2) skipping classes does not affect graduate school GPA, assuming all else is equal. You believe that one more day of heavy drinking should have a constant effect on GPA and that skipping classes has a diminishing effect on GPA.
- What statistical test do you need to perform to answer the first hypothesis? What statistical test do you need to perform to answer the second hypothesis?
- Augment your model above to allow the effect of heavy drinking on GPA to depend on whether the student is single (i.e., no dependents). State the null hypothesis that there is no difference in the effect of heavy drinking on GPA by single status.
- Use the coefficients from the model you built in (iii) to fill out this difference-in-differences table for grad GPA. For simplicity, ignore the skipping class variable. Show that the difference-in-differences estimate is indeed the coefficient on your interaction term.
- What other factors could influence graduate school GPA? Is it a problem that we are not including them in the model? Why or why not?
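As a hedged sketch of the kind of specification this problem has in mind, one possibility is written out below. The variable names (gradGPA, drink, skip, single) are illustrative assumptions, the remaining survey variables are abbreviated as "other controls", and a quadratic in skip is only one way to let the effect of skipping classes vary with the number of classes skipped.

$$
\begin{aligned}
\text{gradGPA} &= \beta_0 + \beta_1\,\text{drink} + \beta_2\,\text{skip} + \beta_3\,\text{skip}^2 + (\text{other controls}) + u \\
H_0^{(1)} &:\ \beta_1 = 0 \quad \text{(t test)} \\
H_0^{(2)} &:\ \beta_2 = 0,\ \beta_3 = 0 \quad \text{(joint F test)}
\end{aligned}
$$

Under the same assumptions, letting the drinking effect differ by single status adds a dummy and an interaction, and the difference-in-differences logic follows from comparing the two slopes:

$$
\begin{aligned}
\text{gradGPA} &= \beta_0 + \beta_1\,\text{drink} + \delta_0\,\text{single} + \delta_1\,\text{single}\cdot\text{drink} + \dots + u \\
\frac{\partial\,\text{gradGPA}}{\partial\,\text{drink}} &= \beta_1 + \delta_1\,\text{single}, \qquad H_0:\ \delta_1 = 0 \\
(\beta_1 + \delta_1) - \beta_1 &= \delta_1
\end{aligned}
$$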
-
Use the data in 401KSUBS for this exercise.
Estimate the model below and answer the following questions:
$$\text{nettfa} = \beta_0 + \beta_1\,\text{e401k} + \beta_2\,\text{inc} + \beta_3\,\text{inc}^2 + \beta_4\,\text{fsize5} + u$$
where $\text{fsize5} = 1$ if family size $\geq 5$, and $0$ otherwise.
- Based on the model, what is the partial derivative of nettfa with respect to inc (i.e., what is $\partial\,\text{nettfa}/\partial\,\text{inc}$)? What is the effect of inc on nettfa at the sample average value of inc?
- What is the interpretation of the coefficient on fsize5?
- I want to test the hypothesis that the effect of income on nettfa differs between small families (families with fewer than 5 members) and large families (families with 5 or more members). What model should I estimate to test this hypothesis?
- What is your conclusion? Present all of your steps.
- Estimate a linear probability model that explains 401(k) participation (p401K) in terms of income, age, and gender. Your model should assume that there is a quadratic effect of income and age on participation. Interpret the partial effect of income and gender on the 401(k) participation rate. NOTE: Only people who are eligible to enroll in 401(k) can participate in such a plan. Make sure you make the appropriate data exclusion.
- In our data, only 39% of workers are eligible for 401(k). If companies now allow every worker to participate in 401(k), can you apply the estimation results you got from above to predict the participation rate of this newly eligible population? Why or why not?
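The partial-effect question above can be illustrated with a short sketch. With a quadratic in inc, the derivative of nettfa with respect to inc is the coefficient on inc plus twice the coefficient on inc squared times inc. The coefficient values and the income sample below are made-up placeholders, not estimates from 401KSUBS, and the function name is hypothetical.

```python
from statistics import mean


def partial_effect_inc(b2, b3, inc_value):
    """Derivative of b2*inc + b3*inc**2 with respect to inc, evaluated at inc_value."""
    return b2 + 2.0 * b3 * inc_value


# Placeholder coefficients (assumed for illustration only, NOT real estimates).
b2_hat = -0.30   # stands in for the coefficient on inc
b3_hat = 0.01    # stands in for the coefficient on inc^2

# Placeholder income values standing in for the sample column of inc.
inc_sample = [20.0, 35.0, 50.0, 65.0]
inc_bar = mean(inc_sample)

effect_at_mean = partial_effect_inc(b2_hat, b3_hat, inc_bar)
print(f"mean inc = {inc_bar}, partial effect at the mean = {effect_at_mean:.3f}")
```

In practice the two coefficients would come from the fitted regression, and the same function can be evaluated at any income level of interest, not only the sample mean.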
-
Use the data in HPRICE3 for this exercise.
- Estimate the model below and present your findings in equation form:
$$\log(\text{price}) = \beta_0 + \beta_1\,\text{land} + \beta_2\,\text{area} + \beta_3\,\text{rooms} + u \qquad (3.1)$$
where land = square footage of the lot, area = square footage of the house, and rooms = number of rooms in the house.
- Obtain the predicted log(price) for a house of average characteristics (average land, average area, and average rooms).
- Construct a 95% confidence interval around the predicted value of lprice in (ii).
- Now suppose we have a house of average characteristics that does not belong to the initial sample used for the regression estimates. Let its future selling price be $\text{price}^0$. Using the results from model (3.1) above, find a 95% CI for $\log(\text{price}^0)$. Comment on the width of this confidence interval compared to that in (iii).
- Note that the results derived in (ii), (iii), and (iv) are not useful to the general public because they express our findings in terms of predicted log prices rather than prices in dollars. Find the predicted price for a house of average characteristics using the results from the log model (3.1) above. You do not need to construct a confidence interval for this point estimate.
- Which housing characteristic in model (3.1) has the largest effect on housing price? Explain.
- Suppose we add an interaction term between area and rooms to model (3.1) and let $i_1 = \beta_3 + 100\,\beta_4$ (where $\beta_4$ is the coefficient on the interaction term). What does the null hypothesis, $H_0: i_1 = 0$, test? Obtain a 95% confidence interval for $i_1$.
- Suppose you rerun model (3.1) but replace the dependent variable with the actual price. Which model (log or level) is the better model? Make sure to present all your steps.
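The difference between the interval for the expected log price and the interval for a new house's log price, and the step from a predicted log price back to dollars, can be sketched as follows. This is a minimal illustration under stated assumptions: the inputs are made-up placeholders rather than HPRICE3 results, 1.96 is a large-sample stand-in for the exact t critical value, the level prediction uses the normal-errors adjustment exp(sigma^2/2), and the function names are hypothetical.

```python
import math


def mean_and_prediction_intervals(y0_hat, se_mean, sigma_hat, crit=1.96):
    """Return (CI for the expected value, interval for a new outcome).

    se_mean is the standard error of the fitted value; sigma_hat is the
    regression standard error. A new outcome adds the error variance.
    """
    se_new = math.sqrt(se_mean ** 2 + sigma_hat ** 2)
    ci_mean = (y0_hat - crit * se_mean, y0_hat + crit * se_mean)
    pi_new = (y0_hat - crit * se_new, y0_hat + crit * se_new)
    return ci_mean, pi_new


def predicted_level(log_hat, sigma_hat):
    """Predicted y from a log(y) model, assuming normally distributed errors."""
    return math.exp(sigma_hat ** 2 / 2.0) * math.exp(log_hat)


# Placeholder inputs (assumed for illustration only, NOT real estimates).
log_price_hat = 11.0   # fitted log(price) at average characteristics
se_mean = 0.02         # standard error of that fitted value
sigma_hat = 0.25       # standard error of the regression

ci_mean, pi_new = mean_and_prediction_intervals(log_price_hat, se_mean, sigma_hat)
price_hat = predicted_level(log_price_hat, sigma_hat)

print("CI for expected log price:", ci_mean)
print("Interval for a new log price:", pi_new)
print("Predicted price in dollars:", round(price_hat, 2))
```

Because the second interval adds the error variance to the sampling variance of the fitted value, it is always at least as wide as the first. Simply exponentiating the fitted log price would understate the expected price, which is why the adjustment factor appears.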
