Binary Choice
Use the data in loanapp.dta for this exercise. The binary variable to be explained is approve, which equals one if an individual's mortgage loan application is approved. The key explanatory variable is white, a dummy variable equal to one if the applicant is white. The other applicants in the data set are black or Hispanic.
To test for discrimination in the mortgage loan market, the following model can be used:
\(\text { approve }=\beta_{0}+\beta_{1} \text { white }+\text { other factors }+e\)
1. If there is discrimination against non-white applicants, and the appropriate factors have been controlled for, what sign would you expect for \(\beta_{1}\)? Why?
2. Estimate a linear probability model (LPM) of approve on white. Interpret the results. Is there evidence of discrimination? Explain your answer.
3. Is the LPM an appropriate approach for estimating the above model? Explain your answer.
4. Estimate the model in (2) via probit regression.
- (a) What is the estimated probability of loan approval for whites?
- (b) What is the estimated probability of loan approval for non-whites?
Hint: you need to calculate margins (not marginal effects).
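The distinction in the hint can be sketched in Stata as follows. This is a minimal sketch rather than a full solution: it assumes loanapp.dta is in the working directory and enters white as a factor variable (i.white) so that margins can report a predicted probability at each of its levels.

```stata
* Sketch only: assumes loanapp.dta is in the current working directory
use loanapp.dta, clear

* i.white declares white as a factor variable
probit approve i.white

* Margins: predicted probability of approval at white = 0 and at white = 1
margins white

* For contrast, a marginal effect: the change in the predicted
* probability when white moves from 0 to 1
margins, dydx(white)
```

The first margins call returns probability levels for each group; the second returns the difference between them, which is not what the question asks for.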
5. Repeat (4) via logit regression. Are the answers to (a) and (b) the same in (4) and (5)?
6. Estimate a new model by adding the following control variables to the model in (2): hrat, obrat, loanprc, unem, male, married, dep, sch, cosign, chist, pubrec, mortlat1, mortlat2, and \(v\).
- Which variables are statistically significant? Interpret these variables.
- Is there still evidence of discrimination against non-whites? Explain.
7. Estimate the model in (6) using logit and probit. Present the estimated coefficients from all three models (LPM, logit, and probit) in a single table, formatted as a typical manuscript table, using the Stata commands estimates store and estimates table. Report results to two decimal places and use stars \((*)\) to denote the levels of statistical significance.
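The store-then-tabulate pattern named in item 7 might look like the sketch below. The short control list and the stored names (m_lpm, m_logit, m_probit) are placeholders chosen for illustration, and the significance cutoffs of 10%, 5%, and 1% are an assumption, since the assignment does not specify them.

```stata
* Sketch only: run as a do-file so the local macro stays defined
use loanapp.dta, clear

* Hypothetical short control list; replace with the full list from item 6
local controls hrat obrat loanprc

regress approve white `controls'
estimates store m_lpm

logit approve white `controls'
estimates store m_logit

probit approve white `controls'
estimates store m_probit

* b(%9.2f) prints coefficients with two decimals;
* star() takes three cutoffs in descending order (assumed here: 10%, 5%, 1%)
estimates table m_lpm m_logit m_probit, b(%9.2f) star(0.10 0.05 0.01)
```

Note that estimates table does not allow star() to be combined with the se, t, or p options, so the table shows starred coefficients only.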
8. Now consider the model \(\text { approve }=\beta_{0}+\beta_{1} \text { atotinc }+e\), which estimates the effect of monthly income (atotinc) on loan approval. Using results from this model, generate a single graph that shows the following:
- A scatter plot of the actual observations.
- Predicted probabilities from the LPM, logit, and probit models. Make sure each series is clearly labeled on the graph.
- What did you learn about the accuracy of the predictions from the generated graph? Explain briefly.
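One way to overlay the observed outcomes and the three sets of fitted values is sketched below. The new variable names (p_lpm, p_logit, p_probit) and the legend labels are illustrative choices, not requirements of the assignment.

```stata
* Sketch only: assumes loanapp.dta is in the current working directory
use loanapp.dta, clear

* Fitted values from each model, saved under hypothetical names
regress approve atotinc
predict p_lpm, xb

logit approve atotinc
predict p_logit, pr

probit approve atotinc
predict p_probit, pr

* Sort so the line plots are drawn from left to right
sort atotinc

twoway (scatter approve atotinc, msize(tiny))                      ///
       (line p_lpm atotinc)                                        ///
       (line p_logit atotinc)                                      ///
       (line p_probit atotinc),                                    ///
       legend(order(1 "Observed" 2 "LPM" 3 "Logit" 4 "Probit"))    ///
       ytitle("Pr(approve = 1)") xtitle("atotinc")
```

The LPM fitted values are not constrained to lie between 0 and 1, which is worth keeping in mind when reading the graph.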
Multinomial Choice Models
9. Using the data in nels_small.dat, estimate a multinomial logit model explaining PSECHOICE. Use the group that did not attend college as the base group, and use GRADES, FAMINC, FEMALE, and BLACK as explanatory variables. Are the estimated coefficients statistically significant? Pick two coefficient estimates and explain them in detail.
10. Compute the estimated probability that a white male student with median values of GRADES and FAMINC will attend a four-year college. Hint: You are estimating margins at specified values of the covariates. These are NOT marginal effects. Example 7 in the Stata manual entry for the margins command provides additional hints.
11. Compute the change in the probability of attending a four-year college for the individual in question 10, assuming that the individual's GRADES change from the median to the \(25^{\text {th }}\) percentile value.
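The hint in question 10 refers to the at() option of margins. A minimal sketch follows, under two assumptions that are not stated in the assignment: the data are already in memory, and the variables carry lowercase names (psechoice, grades, faminc, female, black). It also uses the coding given later in the assignment, where 1 is no college and 3 is a four-year college.

```stata
* Sketch only: assumes the NELS data are already loaded and that
* variable names are lowercase; adjust the names to match your file
mlogit psechoice grades faminc female black, baseoutcome(1)

* Margin: Pr(four-year college) with GRADES and FAMINC at their medians,
* female = 0 and black = 0 (black = 0 is read as "white", as in question 10)
margins, predict(outcome(3)) at((median) grades faminc female=0 black=0)

* Same individual, but with GRADES at the 25th percentile
margins, predict(outcome(3)) at((p25) grades (median) faminc female=0 black=0)
```

The difference between the two reported margins is the change in probability that question 11 asks about.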
Ordered Choice Models
12. Using the data in nels_small.dat, estimate an ordered probit model where the dependent variable, PSECHOICE, is treated as an ordered (ranked) variable, with 1 representing the least favored alternative (no college) and 3 denoting the most favored alternative (four-year college). The student's GRADES is the only independent variable in this model.
13. Calculate the probability that a student will choose no college, a two-year college, and a four-year college if the student's GRADES are at the median value. Recompute these probabilities assuming that GRADES are at the \(25^{\text {th }}\) percentile value. Discuss the probability changes. Are they what you anticipated? Explain.
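The probabilities for all three outcomes can be requested in a single margins call, as in the sketch below. It assumes lowercase variable names (psechoice, grades) and a Stata version that accepts several predict() options in one margins command (Stata 14 or later); with an older version, run one margins command per outcome.

```stata
* Sketch only: assumes the NELS data are already loaded and that
* variable names are lowercase; adjust the names to match your file
oprobit psechoice grades

* Pr(no college), Pr(two-year), Pr(four-year) at median GRADES
margins, at((median) grades) ///
    predict(outcome(1)) predict(outcome(2)) predict(outcome(3))

* The same three probabilities at the 25th percentile of GRADES
margins, at((p25) grades) ///
    predict(outcome(1)) predict(outcome(2)) predict(outcome(3))
```

Within each call the three probabilities should sum to one, which is a quick check on the output.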
14. Expand the ordered probit model to include family income (FAMINC), family size (FAMSIZ), and the indicator variables BLACK and PARCOLL. Discuss the signs and significance of estimates. (Hint: Recall that the sign indicates the direction of the effect for the highest category, but is opposite for the lowest category).
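The hint follows from how the ordered probit model assigns probabilities to the three categories. As a sketch in notation assumed here rather than taken from the assignment, let \(x'\beta\) be the index, \(\mu_{1}<\mu_{2}\) the two cut points, and \(\Phi\) and \(\phi\) the standard normal CDF and PDF. Then

\[
\begin{aligned}
P(y=1 \mid x) &= \Phi(\mu_{1}-x'\beta), \\
P(y=2 \mid x) &= \Phi(\mu_{2}-x'\beta)-\Phi(\mu_{1}-x'\beta), \\
P(y=3 \mid x) &= 1-\Phi(\mu_{2}-x'\beta),
\end{aligned}
\]

and, for a continuous regressor \(x_{k}\),

\[
\frac{\partial P(y=1 \mid x)}{\partial x_{k}}=-\phi(\mu_{1}-x'\beta)\,\beta_{k}, \qquad \frac{\partial P(y=3 \mid x)}{\partial x_{k}}=\phi(\mu_{2}-x'\beta)\,\beta_{k}.
\]

A positive \(\beta_{k}\) therefore lowers the probability of the lowest category and raises the probability of the highest, while the sign of the effect on the middle category depends on where the index lies relative to the cut points.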
15. Compute the probability that a black student from a household of four members, including a parent who went to college, with a household income of $52,000, will attend a four-year college if the student's GRADES are at the
- median value
- \(25^{\text {th }}\) percentile value.
16. Repeat (15) for a non-black student and discuss the differences in your findings.
17. Test the null hypothesis that the two cut points are equal. What is your conclusion?
Deliverable: Word Document
