Objective: To gain familiarity with the Chi-Square test for association between two categorical variables.
Topic 1: Understanding crosstabs and Chi-Square test
Using the "HealthSurvey2014" data from the Blackboard site, pick two dichotomous variables (Sports and Breakfast) to produce a crosstab. The researchers want to know whether there is an association between eating breakfast and participation in sports.
H0: There is no association between eating breakfast and participation in sports.
H1: There is an association between eating breakfast and participation in sports.
To produce a crosstab, click "Analyze", then "Descriptive Statistics", then "Crosstabs". Highlight the dichotomous variable you have picked for the row and click the arrow to move it to the right, then do the same for the variable you have picked for the column. Next, click "Cells" and select the row and column percentages, and select both the observed (default) and expected counts. Also click "Statistics" and select "Chi-square". Then click "OK".
Ate Breakfast * Participated in Sports Crosstabulation

| Ate Breakfast | | Participated in Sports: No | Participated in Sports: Yes | Total |
| --- | --- | --- | --- | --- |
| No | Count | 3 | 8 | 11 |
| | Expected Count | 1.9 | 9.1 | 11.0 |
| | % within Ate Breakfast | 27.3% | 72.7% | 100.0% |
| | % within Participated in Sports | 21.4% | 12.1% | 13.8% |
| Yes | Count | 11 | 58 | 69 |
| | Expected Count | 12.1 | 56.9 | 69.0 |
| | % within Ate Breakfast | 15.9% | 84.1% | 100.0% |
| | % within Participated in Sports | 78.6% | 87.9% | 86.3% |
| Total | Count | 14 | 66 | 80 |
| | Expected Count | 14.0 | 66.0 | 80.0 |
| | % within Ate Breakfast | 17.5% | 82.5% | 100.0% |
| | % within Participated in Sports | 100.0% | 100.0% | 100.0% |
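To make the crosstab cells less of a black box, here is a minimal Python sketch (not part of the SPSS workflow) that recomputes the expected counts and the row and column percentages from the four observed counts. The function name `crosstab_summary` is a hypothetical helper introduced only for this illustration. It assumes the usual formula: expected count = (row total × column total) / N.

```python
# Observed counts taken from the crosstab above:
# rows = Ate Breakfast (No, Yes), columns = Participated in Sports (No, Yes).
observed = [[3, 8], [11, 58]]


def crosstab_summary(table):
    """Return N, expected counts, row percentages and column percentages.

    Hypothetical helper for illustration; assumes a rectangular table of counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under no association: row total * column total / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Row percent: cell count as a share of its row total.
    row_pct = [[100 * cell / r for cell in row]
               for row, r in zip(table, row_totals)]
    # Column percent: cell count as a share of its column total.
    col_pct = [[100 * cell / c for cell, c in zip(row, col_totals)]
               for row in table]
    return {"n": n, "expected": expected, "row_pct": row_pct, "col_pct": col_pct}


summary = crosstab_summary(observed)
for key in ("expected", "row_pct", "col_pct"):
    print(key, [[round(value, 1) for value in row] for row in summary[key]])
```

Rounded to one decimal place, the printed values should agree with the "Expected Count", "% within Ate Breakfast" and "% within Participated in Sports" rows of the SPSS table.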
Chi-Square Tests

| | Value | df | Asymptotic Significance (2-sided) | Exact Sig. (2-sided) | Exact Sig. (1-sided) |
| --- | --- | --- | --- | --- | --- |
| Pearson Chi-Square | .844 (a) | 1 | .358 | | |
| Continuity Correction (b) | .241 | 1 | .623 | | |
| Likelihood Ratio | .764 | 1 | .382 | | |
| Fisher's Exact Test | | | | .397 | .294 |
| Linear-by-Linear Association | .833 | 1 | .361 | | |
| N of Valid Cases | 80 | | | | |

b. Computed only for a 2x2 table
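The output shows the cross-tabulation of the two categorical variables, with the corresponding "Count", "Expected Count", row percentages ("% within Ate Breakfast") and column percentages ("% within Participated in Sports"), as covered in the Lab 2 exercise. In addition, the Chi-Square Tests table shows the test result: Pearson chi-square statistic = 0.844, with df = 1 and p-value = 0.358. Since the p-value is greater than 0.05, we fail to reject the null hypothesis; that is, there is no statistically significant association between eating breakfast and participation in sports. Note that one cell has an expected count below 5 (1.9), a situation in which Fisher's exact test is commonly preferred; its two-sided p-value of 0.397 leads to the same conclusion.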
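As a cross-check on the SPSS output, the following Python sketch recomputes the Pearson chi-square statistic, the continuity-corrected statistic, their p-values, and the two-sided Fisher's exact p-value from the observed counts. It uses only the standard library. The function names are hypothetical helpers for this illustration, and the p-value shortcut `erfc(sqrt(x / 2))` is valid only for df = 1 (a 2x2 table). The two-sided Fisher p-value is assumed to follow the common convention of summing the probabilities of all tables that are no more likely than the observed one.

```python
import math

# Observed counts: rows = Ate Breakfast (No, Yes),
# columns = Participated in Sports (No, Yes).
observed = [[3, 8], [11, 58]]


def chi_square_2x2(table, continuity_correction=False):
    """Pearson chi-square statistic, optionally with the Yates correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            diff = abs(obs - exp)
            if continuity_correction:
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / exp
    return stat


def chi_square_p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with df = 1 only."""
    return math.erfc(math.sqrt(stat / 2))


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2x2 table."""
    (a, b), (c, d) = table
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(col1, x) * math.comb(n - col1, row1 - x) / math.comb(n, row1)

    p_observed = prob(a)
    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    # Sum every table at most as likely as the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


pearson = chi_square_2x2(observed)
corrected = chi_square_2x2(observed, continuity_correction=True)
p_pearson = chi_square_p_value_df1(pearson)
p_corrected = chi_square_p_value_df1(corrected)
p_fisher = fisher_exact_two_sided(observed)

print(f"Pearson chi-square = {pearson:.3f}, df = 1, p = {p_pearson:.3f}")
print(f"Continuity corrected = {corrected:.3f}, df = 1, p = {p_corrected:.3f}")
print(f"Fisher's exact (2-sided) p = {p_fisher:.3f}")
```

The values printed should match the "Pearson Chi-Square", "Continuity Correction" and "Fisher's Exact Test" rows of the Chi-Square Tests table to three decimal places.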
Lab Exercise for Chi-Square Test
Previously, we used the INCPAD trial data to confirm that randomization was successful for a continuous patient characteristic ("age"). Now we will revisit this trial data to assess whether the randomization was also successful for a categorical characteristic ("gender"). The incpad2.sav data contains the randomization assignment and one important demographic variable: gender. In the provided data, a variable named "group" defines the randomization assignment, with "1" for intervention and "0" for control. A variable named "gender" is coded "1" for male and "0" for female. Again, we want to see whether there is an association between "gender" and "group" to make sure the randomization worked. Use the incpad2.sav data to complete the following:
- Write the null hypothesis and the alternative hypothesis.
- Report the four conditional probabilities.
  - In the intervention group, what percentage are male patients?
  - In the control group, what percentage are male patients?
  - Of the female patients, what percentage are assigned to the intervention group?
  - Of the male patients, what percentage are assigned to the intervention group?
- Perform an appropriate statistical test of the hypotheses you wrote above. Which test procedure will you choose?
- What do you conclude? (Report the test statistic, the degrees of freedom, and the corresponding p-value.)
- Can you confirm that our randomization was successful (for the "gender" variable)?
Deliverable: Word Document
