1. In class, I said that a regression equation with two independent variables defines a plane.

Let's use Microsoft Excel to demonstrate this:

  1. In Excel, set up a spreadsheet where \(X_{1}\) values from 0 to 10 in increments of 0.5 make up the first column, and \(X_{2}\) values from 0 to 10 in increments of 0.5 make up the first row. Then, in the cell where the row of the first \(X_{1}\) value meets the column of the first \(X_{2}\) value, create a formula that calculates the \(Y\) value associated with the coordinate pair \(\left(X_{11}, X_{21}\right)\) based on the following estimated regression equation:
    \(\hat{Y}=.7-.4 X_{1}+.2 X_{2}\)
    Copy this formula down the columns and across the rows so that a \(Y\) value is calculated for each coordinate pair \(\left(X_{1 i}, X_{2 j}\right)\). Then use Excel's graphing capabilities to graph \(\hat{Y}\) vs. \(X_{1}\) and \(X_{2}\) in three dimensions. You will want to use Insert > Other Charts > Surface (at least in Office 2007 or 2010). Your final graph should be rotated so that it looks nice, and it should have proper labels on all three axes. (If you click on your graph, a Chart Tools > Layout menu will become available near the top of the screen. You can use this to edit the rotation angle, axis titles, etc.)
    Note: I have posted some tips on working with cell references in Excel in the References section on Moodle. I have also posted an example Excel file that plots a graph for a different regression equation. Please feel free to consult with others on the mechanics of using Excel.
  2. Why does the estimated regression equation above define a "flat" plane rather than a curved surface or a surface with some other sort of peaks and valleys?
  3. If \(X_{1}\) were a dummy variable rather than a continuous variable, what would the graph of \(\hat{Y}\) look like? (You can graph an example if it helps you answer the question, but you don't have to. By all means, you can look for graphs in your book or the lecture notes, too.)
  4. Think of a hypothetical data generating process (DGP) in which two continuous independent variables (\(X_{1}\) and \(X_{2}\)) determine the value of \(Y\), but the graph of \(\hat{Y}\) vs. \(X_{1}\) and \(X_{2}\) does NOT define a plane. Give the equation for your hypothetical DGP. (The possibilities are endless!)
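
For anyone who wants to check their spreadsheet values outside Excel, the grid described in part 1 can be sketched in Python. This is only a minimal sketch and not part of the assignment: the function name `y_hat` and the nested-list layout (rows follow \(X_{1}\), columns follow \(X_{2}\)) are assumptions chosen to mirror the spreadsheet.

```python
# Hypothetical helper mirroring the spreadsheet in part 1; all names are illustrative.

def y_hat(x1, x2):
    """Predicted Y from the estimated equation Y-hat = 0.7 - 0.4*X1 + 0.2*X2."""
    return 0.7 - 0.4 * x1 + 0.2 * x2

# X values from 0 to 10 in increments of 0.5 (21 values each).
x_values = [i * 0.5 for i in range(21)]

# grid[i][j] holds Y-hat for the pair (X1_i, X2_j):
# rows follow X1 (the first column of the sheet), columns follow X2 (the first row).
grid = [[y_hat(x1, x2) for x2 in x_values] for x1 in x_values]

# Print the four corner cells as a quick sanity check against the spreadsheet.
print(round(grid[0][0], 2), round(grid[0][-1], 2))
print(round(grid[-1][0], 2), round(grid[-1][-1], 2))
```

The list comprehension plays the role of copying the formula down the columns and across the rows; the surface chart itself still has to be drawn in Excel.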

2. Suppose you estimate a regression equation as:

\(\widehat{Y}=70-4 X\)

If the standard deviation of \(\hat{Y}\) is 10 and the standard deviation of \(X\) is 7, calculate beta (the standardized coefficient) for \(X\). Do this two ways:

  1. First, calculate it without using a formula. To do this, you need to know that a one standard deviation (of \(X\)) change in \(X\) will cause a change of beta standard deviations (of \(\hat{Y}\)) in \(\hat{Y}\). Show your work.
  2. Second, calculate it using formula 6.5 on page 188 of Wooldridge.
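
The rescaling logic in both parts can be sketched in a few lines of Python. This is a hedged illustration: it uses the usual definition of a standardized coefficient (the raw slope times the standard deviation of \(X\), divided by the standard deviation of the outcome), which is assumed here to match the textbook formula cited above. The function name `standardized_beta` is hypothetical.

```python
def standardized_beta(b, sd_x, sd_y):
    """Rescale an unstandardized slope into standard-deviation units."""
    # A one-SD change in X moves the prediction by b * sd_x raw units;
    # dividing by sd_y expresses that movement in SDs of the outcome.
    return b * sd_x / sd_y

# Values taken from the problem statement: slope -4, SD of X = 7, SD of Y-hat = 10.
beta = standardized_beta(b=-4, sd_x=7, sd_y=10)
print(beta)
```

The comment inside the function follows the reasoning in part 1, and the return line is the one-step formula version asked for in part 2.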

3. Suppose I run a regression of Republican Party Thermometer on Religious Attendance. I get the following results.

I then realize, however, that Ideology might be correlated with Church Attendance and also affect Republican Party Thermometer. To avoid omitted variable bias, I run a regression with both Church Attendance and Ideology included, and I get the following results:

Suppose that the estimated coefficients in the second table are unbiased. In order to calculate how much the coefficient on Church Attendance was biased by in the first regression using formula 3.63 on page 114 of Wooldridge, I also ran an auxiliary regression of Ideology on Church Attendance and got the following results:

In other words, \(\widehat{\delta}_{1}=-0.175\).

  1. Using formula 3.63, the assumption of unbiasedness in the second regression, and the definition of bias, calculate how much the coefficient of Church Attendance in the first regression is biased.
  2. How could you get the result for part 1 using only the information given in the first two tables?
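
The arithmetic behind both parts can be sketched in Python. This sketch assumes that the cited formula is the standard omitted-variable identity, in which the short-regression slope equals the long-regression slope plus the omitted variable's coefficient times the auxiliary-regression slope. Because the regression tables are not reproduced here, `beta1_hat` and `beta2_hat` below are HYPOTHETICAL placeholders, not the actual estimates; only `delta1` comes from the text.

```python
def implied_bias(beta2_hat, delta1):
    """Bias in the short-regression slope: the omitted variable's coefficient
    times the slope from regressing the omitted variable on the included one."""
    return beta2_hat * delta1

delta1 = -0.175      # auxiliary-regression slope, from the text
beta2_hat = 2.0      # HYPOTHETICAL coefficient on Ideology (long regression)
beta1_hat = 1.5      # HYPOTHETICAL coefficient on Church Attendance (long regression)

bias = implied_bias(beta2_hat, delta1)

# Under the identity, this is what the short regression would report.
beta1_short = beta1_hat + bias

# The same bias can be recovered as the gap between the two Church Attendance slopes.
print(bias, beta1_short - beta1_hat)
```

Substituting the coefficients from the actual tables for the placeholders gives the numbers the question asks for.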

4. Using the 2008 NAES data (available on Moodle), choose one categorical variable and one interval-ratio (or "almost interval-ratio") variable that you think might affect thermometer score ratings for Barack Obama. Regress the Obama thermometer score variable on the two variables that you chose (you will have to make a series of dummy variables for your nominal variable).

  1. Make a nicely-formatted table using your regression results. You should look at tables of regression results in articles in some top journals to determine what kinds of things to include in your table.
  2. Interpret the statistical and substantive significance of your results.
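
The dummy-variable step mentioned in question 4 can be sketched as follows. This is an illustration only: the `region` variable and its values are invented, not taken from the NAES data, and `make_dummies` is a hypothetical helper. Most statistical packages have a built-in way to do the same thing.

```python
def make_dummies(values, reference):
    """Return one 0/1 column per category, omitting the reference category."""
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}

# Invented example data: a nominal variable with four categories.
region = ["south", "west", "south", "northeast", "midwest"]

# "south" is the reference category, so it gets no column of its own;
# each remaining coefficient is then read relative to "south".
dummies = make_dummies(region, reference="south")
for name, column in dummies.items():
    print(name, column)
```

One category is left out on purpose: including a dummy for every category alongside an intercept would make the columns perfectly collinear.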

5. Using the CCES 2010 data on Moodle, estimate a linear model of Tea Party favorability that includes four independent variables that you think are particularly important in determining one's level of support for the Tea Party, and interpret the results. Include your regression output with the interpretations (you don't need to make a nice table for this question). You can treat Tea Party favorability as a continuous variable. The level of measurement of your independent variables will determine how you incorporate them into the model.
