Lesson #7, Assignment #1:
Chapter 6 Discussion Questions
1. Imagine that you are examining the effects of ten scholarships on the subsequent performance of disadvantaged youths in college. There are fewer awards than applicants, and awards must be based on merit. Therefore, it is not possible to provide all disadvantaged students with awards; nor is it politically feasible to select students randomly, since this conflicts with the principle of merit. You decide to allocate the ten awards according to the following rules: No student will receive an award without scoring at least 86 on the examination; five awards will be given automatically to the top students in the 92-100 interval; and the remaining five awards will be given to a random sample of students in the 86-91 interval.
One year later you obtain information on the grade point averages of the ten award students and ten other disadvantaged students who did not receive an award. You have college entrance examination scores for all twenty students (below). Does the provision of scholarship awards to disadvantaged students improve subsequent achievement in college?
- Construct a regression-discontinuity graph that displays examination scores and grade point averages for the award and nonaward groups. Use Xs and Os to display data points for the experimental (X) and control (O) groups.
- Construct a worksheet and compute for each group the values of $a$ and $b$ in the equation $Y_c = a + b(X)$.
- For each group write the regression equation that describes the relation between merit (examination scores) and subsequent achievement (grade point averages).
- Compute the standard error of estimate at the 95 percent estimation interval (that is, two standard errors) for each group.
- Compute $r^2$ and $r$.
- Interpret information contained in (a) through (e) and answer the question: Does the provision of scholarship awards to disadvantaged students improve subsequent achievement in college? Justify your answer.
2. Subtract 0.5 from each student’s grade point average in the nonaward (control) group below.
- Does the Y intercept change for the control group? Why?
- Does the slope of the regression line change for the control group? Why?
- Does the standard error of estimate change for the control group? Why?
- Do $r^2$ and $r$ change for the control group? Why?
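The worksheet quantities in questions 1 and 2 can be cross-checked with a short script. The following is a minimal sketch, not a prescribed method: the function name `simple_regression`, the `ddof` parameter, and the sample values are all assumptions made for illustration. In particular, it divides the squared residuals by n - 2 when computing the standard error of estimate; if the required text divides by n, pass `ddof=0`.

```python
import math

def simple_regression(x, y, ddof=2):
    """Fit Y_c = a + b(X) by least squares and return summary statistics.

    ddof=2 divides the squared residuals by n - 2 (an assumption; use
    ddof=0 if your text divides by n instead).
    """
    n = len(x)
    if n != len(y) or n <= ddof:
        raise ValueError("x and y need equal lengths greater than ddof")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = mean_y - b * mean_x            # Y intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    see = math.sqrt(sse / (n - ddof))  # standard error of estimate
    r = sxy / math.sqrt(sxx * syy)     # correlation coefficient
    return {"a": a, "b": b, "see": see, "two_see": 2 * see, "r2": r * r, "r": r}

# Hypothetical scores and GPAs, for illustration only (not the assignment data).
scores = [86, 87, 89, 90, 91]
gpas = [2.7, 2.9, 2.8, 3.1, 3.2]
for name, value in simple_regression(scores, gpas).items():
    print(f"{name}: {value:.4f}")
```

Running the function once per group, and again after subtracting 0.5 from each control-group grade point average, gives a way to check the hand-computed answers to question 2.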
Lesson #7, Assignment #2:
Chapter 6 SPSS Demonstration Exercise
As we know, the simple (bivariate) linear regression equation is written as either:
$Y = a + b(x)$ or: $y = b_0 + b_1 x$ (Equation 6.1)
When we use this equation to estimate the value of a variable in a time series, the equation is written as:
$y_t = b_0 + b_1 x_t$ (Equation 6.2)
In Equation 6.1, the values of the variables y and x are not ordered in time. For example, the price and age of automobiles would be unrelated to the time at which price and age were measured. By contrast, Equation 6.2 expresses price as a function of time, for example, the year in which price was measured. The data would be arrayed like this:
NOTE: Years may be coded as (1, 2, ..., T), or centered on a midpoint year as (..., -2, -1, 0, 1, 2, ...).
| CASE | PRICE ($y_t$, in dollars) | YEAR ($x_t$) |
| --- | --- | --- |
| 1 | 10,000 | 1985 |
| 2 | 10,200 | 1986 |
| 3 | 10,300 | 1987 |
| 4 | 10,600 | 1988 |
| 5 | 10,500 | 1989 |
| 6 | 11,100 | 1990 |
| 7 | 11,100 | 1991 |
| 8 | 11,200 | 1992 |
| 9 | 11,500 | 1993 |
Equation 6.2 is a time-series regression equation. It is frequently used to forecast the value of a variable in future years, for example, the price of Nissans in the year 2000.
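As an illustration of Equation 6.2, the sketch below fits a linear trend to the nine prices in the table and extrapolates it to 2000. It is only one way to do this: the helper names `fit_trend` and `forecast` are made up for the example, and the years are coded around the midpoint year (1989 = 0), which is one of the coding options mentioned in the note above.

```python
# Price data from the table above (1985-1993).
years = list(range(1985, 1994))
prices = [10000, 10200, 10300, 10600, 10500, 11100, 11100, 11200, 11500]

def fit_trend(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Code the years so that the midpoint year is 0: -4, ..., 0, ..., 4.
midpoint = years[len(years) // 2]
coded = [yr - midpoint for yr in years]
b0, b1 = fit_trend(coded, prices)

def forecast(year):
    """Extrapolate the fitted trend line to a given calendar year."""
    return b0 + b1 * (year - midpoint)

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
print(f"Forecast for 2000: {forecast(2000):.2f}")
```

With these nine observations the slope works out to about 185 dollars per year, so the trend line extrapolated to 2000 is roughly 12,757 dollars. Extrapolating seven years beyond the data assumes the linear trend continues unchanged.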
Another type of time-series regression equation is one where there are two or more independent (predictor) variables, $x_{1t}, \ldots, x_{kt}$, which are presumed to be causes of a dependent (response) variable, $y_t$, which is presumed to be the effect. The term "presumed" is used here because regression analysis, per se, has nothing to do with causality. The equation is written as:
$y_t = b_0 + b_1 x_{1t} + b_2 x_{2t} + \cdots + b_k x_{kt}$ (Equation 6.3)
When using time-series regression analysis, it is important to remember the following points:
- We often want to estimate the effects of a policy intervention on a policy outcome. We do this by creating a so-called "dummy (categorical) variable," which takes the value 0 before the policy intervention and 1 after the policy intervention. For example, a dummy variable may be symbolized as $x_{2t}$ when the dummy variable is the second predictor variable. The first variable is time, $x_{1t}$, measured in years. The policy intervention regression equation would be written:
$y_t\ (\text{policy outcome}) = b_0 + b_1 x_{1t}\ (\text{time}) + b_2 x_{2t}\ (\text{policy intervention})$ (Equation 6.4)
- In linear regression, we must satisfy assumptions of linearity and homoskedasticity. An additional assumption must be satisfied in time-series analysis: the observations of $y$ in the time series must be independent (uncorrelated). This is called the non-autocorrelation assumption. Note that, just as there are tests for linearity (e.g., plotting the $y$ values against normal scores in a normal probability plot), there are tests for autocorrelation. One of these is the Durbin-Watson (D-W) test. We can apply this test with SPSS and other statistical packages.
- If the autocorrelation coefficient, $r$, is statistically significant at a specified level of α (usually α = 0.05), we reject the null hypothesis that adjacent observations in a time series are uncorrelated, and we thereby accept the alternative hypothesis that adjacent observations are autocorrelated. When there is statistically significant autocorrelation, we often can eliminate most of its effects by regressing the values of $y_t$ on their lagged values, $y_{t-1}$. This is a lag of one time period (e.g., one year). The lagged values (one or more time periods) of a variable can be easily computed with SPSS and other statistical packages.
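For readers working outside SPSS, the two computations mentioned in the points above can be sketched in a few lines of Python. This is an illustrative sketch only: `durbin_watson` and `lag` are hypothetical helper names, and the Durbin-Watson statistic is computed here in its usual form, from the residuals of a fitted regression (observed minus predicted values), as the sum of squared differences between adjacent residuals divided by the sum of squared residuals.

```python
def durbin_watson(residuals):
    """Durbin-Watson d statistic for a sequence of regression residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def lag(series, periods=1):
    """Shift a series forward in time; the first `periods` entries have no
    lagged value and are filled with None."""
    return [None] * periods + list(series[: len(series) - periods])

# Illustrative residuals only (not taken from any real regression).
example_residuals = [0.5, 0.7, 0.4, -0.2, -0.6, -0.3, 0.1, 0.4]
print(f"d = {durbin_watson(example_residuals):.3f}")
print(lag([10, 20, 30, 40]))
```

The statistic ranges from 0 to 4. Values near 2 are consistent with no first-order autocorrelation, values well below 2 suggest positive autocorrelation, and values well above 2 suggest negative autocorrelation. Judging significance still requires the Durbin-Watson critical-value tables or a statistical package.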
The regression equation with a lagged dependent variable (in this case, lagged one year) is written as follows:
$y_t = b_0 + b_1 y_{t-1}$ (Equation 6.5)
Equation 6.4 and Equation 6.5 can be combined to express the effects of a lagged policy outcome variable ($y_{t-1}$), time ($x_{2t}$), and a policy intervention ($x_{3t}$) on a policy outcome variable ($y_t$). Note that time and the policy intervention are now the second and third predictors. Here is the combined equation:
$y_t = b_0 + b_1 y_{t-1}\ (\text{lag 1 period}) + b_2 x_{2t}\ (\text{time}) + b_3 x_{3t}\ (\text{policy intervention})$ (Equation 6.6)
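To make the structure of Equation 6.6 concrete, here is a minimal sketch that builds the three predictors (lagged outcome, time, and a 0/1 intervention dummy) and estimates the coefficients by ordinary least squares. Everything in it is an assumption made for illustration: the helper names `solve`, `ols`, and `build_rows` are hypothetical, and the series is synthetic, generated from known coefficients so that the estimates can be checked against them. It is not the fatality data used in the demonstration case.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def ols(rows, y):
    """Least-squares coefficients [b0, b1, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def build_rows(y, years, intervention_year):
    """Return predictor rows [y_{t-1}, time, dummy] and the matching y_t values."""
    rows, target = [], []
    for t in range(1, len(y)):          # the first observation has no lag
        time = years[t] - years[0] + 1  # years coded 1, 2, ..., T
        dummy = 1 if years[t] >= intervention_year else 0
        rows.append([y[t - 1], time, dummy])
        target.append(y[t])
    return rows, target

# Synthetic series generated from known coefficients (not real fatality data).
TRUE_B = [10.0, 0.5, 2.0, -8.0]
years = list(range(1966, 1981))
y = [50.0]
for t in range(1, len(years)):
    time = years[t] - years[0] + 1
    dummy = 1 if years[t] >= 1974 else 0
    y.append(TRUE_B[0] + TRUE_B[1] * y[-1] + TRUE_B[2] * time + TRUE_B[3] * dummy)

rows, target = build_rows(y, years, intervention_year=1974)
b0, b1, b2, b3 = ols(rows, target)
print(f"b0={b0:.3f}  b1 (lag)={b1:.3f}  b2 (time)={b2:.3f}  b3 (policy)={b3:.3f}")
```

Because the synthetic series contains no random error, the estimates should match the generating coefficients up to rounding. With real data they will not, and a statistical package is still needed for standard errors and significance tests. Note also that lagging the outcome by one period drops the first observation from the regression.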
Demonstration Case: Rival Explanations of Policy Outcomes: The Political Economy of Traffic Fatalities in Europe and the United States
In this demonstration exercise you will use SPSS and a data file from the Fatal Accident Reporting System of the National Highway Traffic Safety Administration to perform the following tasks. The data is available as Exhibit 6.3 on pages 343-344 and Exhibit 6.4 on page 344 of the required text.
1. Estimate the effect of time on fatalities before (1966-1973) and after (1974-2000) the adoption of the 55 mph speed limit. Run two separate regressions. Interpret the two regression estimates.
2. Estimate the effect of employment on fatalities. Forecast traffic fatalities for 2000, using the actual value of employment, which is 125,331,000. Note that we either have to assume the 2000 value of employment or use time-series analysis to forecast it. Interpret the computer printout and compare the forecast with actual fatalities for 2000.
3. Estimate the effect of time (years) and the policy intervention (55 mph speed limit) on fatalities, using a dummy variable. Assess whether the effect of the policy intervention is statistically significant and interpret the computer output.
4. Re-estimate the equation in part 3 above. This time, control for the effect of autocorrelation by adding $y_{t-1}$ (the lagged value of the dependent variable) as a third predictor. Interpret the output and compare it to the output obtained in part 3.
5. Estimate the effect of the policy intervention on fatalities, after including miles traveled and employment as additional predictor variables. In this regression, do not include time as a predictor variable. Interpret your computer output.
6. Create SPSS graphs that display an interrupted time series for (a) the United States; (b) European countries that adopted 48 and 54 mph (80 and 90 kph) speed limits; and (c) European countries that did not adopt the speed limit. Interpret the graphs.
Deliverable: Word Document
