Statistical Analysis

The objective of this paper is to analyze data on cigarette consumption and related demographic variables. The goal is to use multiple regression analysis to draw meaningful conclusions about the relationship between cigarette consumption (CIG) and the other variables.

Description of the dataset


The data are the per capita numbers of cigarettes smoked (sold) in 43 states and the District of Columbia in 1960, together with death rates per 100,000 population from various forms of cancer.

Number of cases: 44

Variable names:


CIG = Number of cigarettes smoked (hundreds per capita)
BLAD = Deaths per 100K population from bladder cancer
LUNG = Deaths per 100K population from lung cancer
KID = Deaths per 100K population from kidney cancer
LEUK = Deaths per 100K population from leukemia
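
As an illustration of how the five variables above might be held and compared pairwise, here is a minimal sketch in plain Python. The numbers are made-up placeholders, not the 1960 state data, and the names `data`, `pearson`, and `correlations` are hypothetical helpers introduced only for this example.

```python
from itertools import combinations
from math import sqrt

# Made-up placeholder values for illustration only; NOT the 1960 state data.
data = {
    "CIG":  [18.0, 22.0, 25.0, 27.0, 30.0, 40.0],
    "BLAD": [3.0, 3.4, 4.1, 3.9, 4.6, 5.8],
    "LUNG": [14.0, 16.5, 19.0, 20.5, 22.0, 26.0],
    "KID":  [2.1, 2.9, 2.6, 3.0, 2.8, 3.3],
    "LEUK": [7.1, 6.6, 7.0, 6.8, 7.3, 6.9],
}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# One coefficient for every unordered pair of variables (10 pairs for 5 variables).
correlations = {
    (a, b): pearson(data[a], data[b]) for a, b in combinations(data, 2)
}

for (a, b), r in correlations.items():
    print(f"{a:>4} vs {b:<4}  r = {r:+.3f}")
```

The sign of each coefficient gives the direction of the linear association and its magnitude gives the strength; with the real 44 observations the same loop would apply unchanged.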


  1. Determine if there are any outliers.  If there are outliers, speculate why and decide whether you think it is permissible to remove the data points.  If you remove those observations, be able to document why you think it is allowable.

  2. Compute all bivariate correlations. Comment on direction and strength of each.
  3. Plot CIG against each of the other four variables and comment.
  4. Find the multiple regression equation relating cigarette smoking to the four variables.  Use the Enter method.
    a. What is \(R^2\)?
    b. Test \(H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0\).
    c. Test \(H_0: \beta_1 = 0 \mid \beta_2, \beta_3, \beta_4\).
    d. Are all predictor variables significant in the multiple regression model?  Explain.
    e. Test whether both bladder cancer and leukemia can be dropped from the model.



  5. Square all the predictor terms and rerun the multiple regression with the four terms and the four squared terms in the model using stepwise selection, backward elimination, or forward selection.  Choose a model and write the equation.
  6. Is there any improvement in this model over the model in #4?  Explain.
  7. Center the four predictor variables (not CIG).
  8. Square the centered terms.  Print the first 5 observations for the centered and centered squared variables.
  9. Rerun the multiple regression using stepwise selection, backward elimination, or forward selection.  Choose a model and write the equation.
    a. Is the constant significant in your model?
    b. If the constant is not significant, delete it from the model and rerun the multiple regression.
    c. Is the resulting model a better predictor?  Why or why not?
    d. Write the regression model.
    e. If all predictor variables in the model are held constant except for lung cancer, what is the impact on predicted cigarette consumption for each unit increase in lung cancer?
    f. Predict how many cigarettes per capita would be sold if the bladder cancer rate was 3, the lung cancer rate was 15, the kidney cancer rate was 2, and the leukemia rate was 8.
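
The regression steps listed above (fitting all four predictors at once, reading off \(R^2\), the overall F test, and the test for dropping bladder cancer and leukemia together) can be sketched in plain Python. This is only an illustration under stated assumptions: the data are randomly generated stand-ins with the same column names and 44 cases, not the actual dataset, and `ols` is a hypothetical helper written for this example rather than any package's API.

```python
import random

def ols(X, y):
    """Least-squares fit with an intercept via the normal equations.
    Returns (coefficients, residual sum of squares)."""
    rows = [[1.0] + list(r) for r in X]  # prepend the constant term
    p = len(rows[0])
    # Augmented system (X'X | X'y).
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, sse

# Synthetic stand-in data (assumption): columns are BLAD, LUNG, KID, LEUK.
random.seed(0)
n, k = 44, 4
X = [[random.uniform(3, 6), random.uniform(12, 27),
      random.uniform(2, 4), random.uniform(5, 8)] for _ in range(n)]
y = [5 + 2.0 * blad + 0.8 * lung + random.gauss(0, 2)
     for blad, lung, _, _ in X]

# Full model: CIG on all four predictors.
beta_full, sse_full = ols(X, y)
y_bar = sum(y) / n
sst = sum((yi - y_bar) ** 2 for yi in y)
r2 = 1 - sse_full / sst

# Overall F test of H0: beta_1 = beta_2 = beta_3 = beta_4 = 0.
f_overall = ((sst - sse_full) / k) / (sse_full / (n - k - 1))

# Partial F test: can BLAD and LEUK both be dropped? Fit LUNG and KID only.
X_reduced = [[row[1], row[2]] for row in X]
_, sse_reduced = ols(X_reduced, y)
q = 2  # number of predictors dropped
f_drop = ((sse_reduced - sse_full) / q) / (sse_full / (n - k - 1))

print("coefficients (const, BLAD, LUNG, KID, LEUK):", beta_full)
print("R^2:", r2)
print("overall F on (4, 39) df:", f_overall)
print("partial F on (2, 39) df:", f_drop)
```

Each F statistic would be compared with the critical value of the F distribution on the stated degrees of freedom. The same nested-model comparison applies when testing a single coefficient given the others (with q = 1).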
