Example 2
The SPSS file provided (COPD2011.sav) contains information about patients who were recruited as part of a study into chronic obstructive pulmonary disease (COPD).
The file contains the following variables:
Subject ID
Unique patient identification number
Status
Disease status, where Controls=0 and COPD=1
Height
Height in cm
SNP1
Genotype at SNP 1, where 0=AA, 1=AG and 2=GG
SNP2
Genotype at SNP 2, where 0=CC, 1=CT and 2=TT
FEV1
Forced expiratory volume – the volume of air which can be expired with force in 1 second. This is recorded on the database as a percentage of the expected FEV1 in healthy subjects of the same sex and age as the patient.
FVC
Forced vital capacity – the total volume of air which can be expired with force after a forced deep intake of breath. This is recorded on the database as a percentage of the expected FVC in healthy subjects of the same sex and age as the patient.
Alpha1
Plasma levels of alpha-1-antitrypsin (mg/mL)
Age
Age in years, recorded at recruitment
Gender
Sex of patient, where 1=Female and 2=Male
Packyear
Smoking habit recorded in pack-years. This is calculated by multiplying the number of packs of cigarettes smoked per day by the number of years the person has smoked. For example, 1 pack-year is equal to smoking 20 cigarettes (one pack) per day for 1 year, or 40 cigarettes per day for half a year.
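The pack-year arithmetic can be checked with a short sketch. This is an illustration only, not part of the SPSS analysis: the function name `pack_years` and the constant `CIGARETTES_PER_PACK` are hypothetical, and the pack size of 20 is taken from the worked example in the Packyear definition.

```python
# Pack size implied by the worked example (20 cigarettes = 1 pack).
CIGARETTES_PER_PACK = 20


def pack_years(cigarettes_per_day: float, years_smoked: float) -> float:
    """Return smoking exposure in pack-years (packs per day x years smoked)."""
    packs_per_day = cigarettes_per_day / CIGARETTES_PER_PACK
    return packs_per_day * years_smoked


# The two cases from the definition both give 1 pack-year.
print(pack_years(20, 1))    # 1.0
print(pack_years(40, 0.5))  # 1.0
```

Both calls reproduce the example in the definition: halving the duration while doubling the daily consumption leaves the exposure unchanged.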
Use SPSS to answer the following questions. Include relevant output from SPSS in your report. Guidance about what to include in the report is also given below.
1. Explore the variable "Alpha1" in the controls and cases separately. Without carrying out any formal tests, discuss whether you expect this variable to be following a Normal distribution in both groups. Using the most appropriate measures of location and spread, summarise the main characteristics of this variable and justify your choice of descriptive statistics. Graphical output can be included if this helps your explanation.
2. Is there any statistical evidence for a difference in smoking history in COPD patients compared to controls?
3. The height of individuals was also measured and recorded. Use the most appropriate statistical test to investigate whether there is a difference in height between males and females.
4. Is there any evidence to suggest that the two polymorphisms in this data set (SNP1 and SNP2) are associated with COPD? Can you quantify the effect of genotype on the risk of developing COPD?
5. FEV1 and FVC are related measures of lung function. Are they significantly correlated? Is there good reason to consider the COPD and control cases separately?
6. FEV1 can be measured very simply using a hand-held instrument. Is it possible to generate regression equations for the cases and controls to predict FVC from measured FEV1? Is there any evidence to support including additional predictor variables in the model?
7. Your colleague asks you to use the univariate regression analysis in Q6 to predict FVC values for two individuals using only their FEV1 values. The first individual is a COPD patient with FEV1=11%, and the second is a control patient with FEV1=91%. Use the regression equation to predict FVC values for these two individuals, and comment briefly on your results (including any limitations or reservations that you may have).
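As a rough illustration of the final prediction step, the sketch below applies a simple linear regression equation of the form FVC = intercept + slope × FEV1 and flags whether the FEV1 value lies outside the range seen in the fitted data. The function `predict_fvc` is hypothetical, and every number in the usage lines is a placeholder chosen for illustration, not a result from COPD2011.sav; the real coefficients and FEV1 ranges must come from the SPSS regression output for each group.

```python
from typing import Tuple


def predict_fvc(
    intercept: float,
    slope: float,
    fev1: float,
    fev1_range: Tuple[float, float],
) -> Tuple[float, bool]:
    """Predict FVC from FEV1 using a fitted line.

    Returns the predicted value and a flag that is True when fev1 falls
    outside the observed range, i.e. the prediction is an extrapolation.
    """
    lowest, highest = fev1_range
    predicted = intercept + slope * fev1
    extrapolated = not (lowest <= fev1 <= highest)
    return predicted, extrapolated


# PLACEHOLDER coefficients and range, NOT taken from the data set.
placeholder_intercept = 50.0
placeholder_slope = 0.5
placeholder_range = (30.0, 120.0)

for fev1 in (11.0, 91.0):
    value, extrapolated = predict_fvc(
        placeholder_intercept, placeholder_slope, fev1, placeholder_range
    )
    print(f"FEV1={fev1}: predicted FVC={value}, extrapolated={extrapolated}")
```

Checking each FEV1 value against the range observed in the corresponding group is one way to support a comment on limitations, since a fitted line says little about values far outside the data used to estimate it.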
Deliverable: Word Document
