Problem 1
Just as with point estimates in a univariate context, we can create sampling distributions for our estimates in a bivariate (and also multivariate) context. This problem will walk you through creating a sampling distribution for OLS estimates in particular. Load the trusty subprime data from the course website. Recall that these are data collected by the U.S. government on all home lending transactions in Cape Coral and Fort Myers. They contain information on each loan applicant and give information on whether that applicant received a subprime loan (high.rate) as well as on the amount of the loan (loan.amount). They also contain basic demographic information such as race, gender, and income.
Assume the data represent the "truth" (i.e., an entire population). We are going to look at the (fairly boring) regression in which we use income (income) to predict loan amounts (loan.amount).
- Our first step is to find the "true" intercept and slope for the population. Do so by regressing loan.amount on income. (You are welcome to use the lm function in R here.) Report the coefficients and interpret what they mean.
- After setting the seed to 12345, conduct 1000 simulations such that on every iteration you:
  - Draw a sample of size 250 with replacement. (Note: remember that each observation has two values, one for income and one for loan.amount.)
  - Regress loan.amount on income for each sample and store the intercept and slope.
- Plot the sampling distribution for the intercept and the sampling distribution for the slope. Describe these distributions (their means and variances) and offer a guess at why we might be finding these distributions.
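The steps above can be sketched in R as follows. This is only an illustration under stated assumptions, not a model answer. The data frame pop is a simulated stand-in (its size, coefficients, and noise level are invented) so that the code runs on its own; in your own work, replace it with the subprime data loaded from the course website. Only the column names income and loan.amount come from the problem text.

```r
# HYPOTHETICAL stand-in population. Everything in this block is invented for
# illustration; swap in the real subprime data frame in your own work.
set.seed(1)
n_pop <- 5000
income <- rlnorm(n_pop, meanlog = 4, sdlog = 0.6)
pop <- data.frame(
  income = income,
  loan.amount = 50 + 1.5 * income + rnorm(n_pop, sd = 60)
)

# Step 1: the "true" intercept and slope, fit on the full population.
true_fit <- lm(loan.amount ~ income, data = pop)
true_coef <- coef(true_fit)

# Step 2: 1000 samples of size 250, drawn with replacement.
set.seed(12345)
n_sims <- 1000
n_samp <- 250
estimates <- matrix(NA_real_, nrow = n_sims, ncol = 2,
                    dimnames = list(NULL, c("intercept", "slope")))
for (i in seq_len(n_sims)) {
  # Sample row indices so each income stays paired with its loan.amount.
  rows <- sample(nrow(pop), size = n_samp, replace = TRUE)
  estimates[i, ] <- coef(lm(loan.amount ~ income, data = pop[rows, ]))
}

# Step 3: plot both sampling distributions, marking the population values.
par(mfrow = c(1, 2))
hist(estimates[, "intercept"], main = "Intercept", xlab = "Estimate")
abline(v = true_coef[1], lwd = 2)
hist(estimates[, "slope"], main = "Slope", xlab = "Estimate")
abline(v = true_coef[2], lwd = 2)

# Means and variances of the stored estimates, for the written description.
colMeans(estimates)
apply(estimates, 2, var)
```

Sampling row indices, rather than sampling each column separately, is what keeps every applicant's income matched to that applicant's loan amount. Preallocating the estimates matrix is a design choice, not a requirement; appending to a vector inside the loop would also work at this scale.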
Deliverable: Word Document
