Non-Parametric Statistics, or What to Do When the Assumptions for a Parametric Test Fail


Question 1: A medical researcher believes the number of ear infections in swimmers can be reduced if the swimmers use ear plugs. A sample of ten people was selected, and the number of ear infections for a four-month period was recorded. During the first two months, the swimmers did not use the ear plugs; during the second two months, they did. At the beginning of the second two-month period, each swimmer was examined to make sure that no infections were present. The data are shown below. At \(\alpha = 0.05\), can the researcher conclude that using ear plugs affects the number of ear infections?

Solution: We need to test the hypotheses

\[\begin{align} & {{H}_{0}}:\text{ ear infections are the same with or without the ear plugs} \\ &{{H}_{A}}:\text{ swimmers get fewer ear infections with ear plugs} \\ \end{align}\]

We use a sign test, with the computations carried out in Statdisk.

The test statistic \(x\) is equal to 2 (the number of times the less frequent sign occurs). The critical value is 1. Since \(x = 2\) is not less than or equal to the critical value, we fail to reject the null hypothesis. This means that we don't have enough evidence to support the claim that the number of ear infections in swimmers can be reduced if the swimmers use ear plugs.
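The decision rule above can be checked with a short sketch that uses the binomial distribution directly. The data table is not reproduced here, so the number of nonzero differences `n = 10` is an assumption (ten swimmers, no ties); the function names are illustrative and do not come from Statdisk.

```python
from math import comb

def binom_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def sign_test_critical_value(n, alpha):
    """Largest k with P(X <= k) <= alpha under H0 (one-tailed); -1 if none."""
    critical = -1
    for k in range(n + 1):
        if binom_cdf(k, n) <= alpha:
            critical = k
        else:
            break
    return critical

n = 10        # ASSUMPTION: all ten swimmers have a nonzero difference
x = 2         # count of the less frequent sign, taken from the text
alpha = 0.05

critical = sign_test_critical_value(n, alpha)
p_value = binom_cdf(x, n)      # one-tailed p-value for the observed x
reject = x <= critical

print(critical, round(p_value, 4), reject)
```

Under that assumption the sketch gives a critical value of 1, which agrees with the value quoted above, and a one-tailed p-value slightly above 0.05, so the null hypothesis is not rejected.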



Question 2: Research indicates that people who volunteer to participate in research studies tend to have higher intelligence than nonvolunteers. To test this phenomenon, a researcher obtains a sample of 200 high school students. The students are given a description of a psychological research study and asked whether they would volunteer to participate. The researcher also obtains an IQ score for each student and classifies the students into high, medium, and low IQ groups. Do the following data indicate a significant relationship between IQ and volunteering? Test at the .05 level of significance.

Solution: The following table shows the corresponding contingency table:

| Observed | High | Medium | Low | Total |
|---|---|---|---|---|
| Volunteer | 43 | 73 | 34 | 150 |
| Not Volunteer | 7 | 27 | 16 | 50 |
| Total | 50 | 100 | 50 | 200 |


We are interested in testing the following null and alternative hypotheses:

\[\begin{align}{{H}_{0}}:\,\,\, \text{Volunteer Status}\text{ and }\text {IQ}\text{ are independent} \\ {{H}_{A}}:\,\,\,\text{Volunteer Status}\text{ and }\text {IQ}\text{ are NOT independent} \\ \end{align}\]

From the table above we compute the table of expected frequencies:

| Expected | High | Medium | Low |
|---|---|---|---|
| Volunteer | 37.5 | 75 | 37.5 |
| Not Volunteer | 12.5 | 25 | 12.5 |


The way those expected frequencies are calculated is shown below:

\[\begin{align}
{E}_{1,1} &= \frac{{R}_{1} \times {C}_{1}}{T} = \frac{150 \times 50}{200} = 37.5, & {E}_{1,2} &= \frac{{R}_{1} \times {C}_{2}}{T} = \frac{150 \times 100}{200} = 75, & {E}_{1,3} &= \frac{{R}_{1} \times {C}_{3}}{T} = \frac{150 \times 50}{200} = 37.5, \\
{E}_{2,1} &= \frac{{R}_{2} \times {C}_{1}}{T} = \frac{50 \times 50}{200} = 12.5, & {E}_{2,2} &= \frac{{R}_{2} \times {C}_{2}}{T} = \frac{50 \times 100}{200} = 25, & {E}_{2,3} &= \frac{{R}_{2} \times {C}_{3}}{T} = \frac{50 \times 50}{200} = 12.5
\end{align}\]
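The same row-total times column-total rule can be written as a few lines of Python. This is only a sketch of the arithmetic above; the variable names are illustrative.

```python
# Observed counts: rows = Volunteer, Not Volunteer; columns = High, Medium, Low
observed = [
    [43, 73, 34],
    [7, 27, 16],
]

row_totals = [sum(row) for row in observed]          # R_1, R_2
col_totals = [sum(col) for col in zip(*observed)]    # C_1, C_2, C_3
total = sum(row_totals)                              # T

# E_ij = R_i * C_j / T
expected = [[r * c / total for c in col_totals] for r in row_totals]

for row in expected:
    print(row)
```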

Finally, we use the formula \(\frac{{{\left( O-E \right)}^{2}}}{E}\) to get

| (fo – fe)²/fe | High | Medium | Low |
|---|---|---|---|
| Volunteer | 0.8067 | 0.0533 | 0.3267 |
| Not Volunteer | 2.42 | 0.16 | 0.98 |


The calculations required are shown below:

\[\begin{align}
\frac{{\left( 43-37.5 \right)}^{2}}{37.5} &= 0.8067, & \frac{{\left( 73-75 \right)}^{2}}{75} &= 0.0533, & \frac{{\left( 34-37.5 \right)}^{2}}{37.5} &= 0.3267, \\
\frac{{\left( 7-12.5 \right)}^{2}}{12.5} &= 2.42, & \frac{{\left( 27-25 \right)}^{2}}{25} &= 0.16, & \frac{{\left( 16-12.5 \right)}^{2}}{12.5} &= 0.98
\end{align}\]

Hence, the value of the Chi-Square statistic is

\[{{\chi }^{2}}=\sum{\frac{{{\left( {{O}_{ij}}-{{E}_{ij}} \right)}^{2}}}{{{E}_{ij}}}}={0.8067} + {0.0533} + {0.3267} + {2.42} + {0.16} + {0.98} = 4.747\]

The critical Chi-Square value for \(\alpha =0.05\) and \(\left( 3-1 \right)\times \left( 2-1 \right)=2\) degrees of freedom is \(\chi _{C}^{2}= {5.991}\). Since \({{\chi }^{2}}= {4.747}\) < \(\chi _{C}^{2}= {5.991}\), we fail to reject the null hypothesis of independence. At the .05 level, the data do not indicate a significant relationship between IQ and volunteering.
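As a cross-check, the statistic, the critical value, and a p-value can be computed with the standard library alone. The sketch relies on one special fact: for exactly 2 degrees of freedom the chi-square upper-tail probability is \(e^{-x/2}\), so the closed forms below do not apply to tables of other sizes. Variable names are illustrative.

```python
from math import exp, log

observed = [[43, 73, 34], [7, 27, 16]]
expected = [[37.5, 75.0, 37.5], [12.5, 25.0, 12.5]]

# Sum of (O - E)^2 / E over all six cells
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)
alpha = 0.05

# Closed forms valid ONLY for df = 2: P(X > x) = exp(-x / 2)
assert df == 2
critical = -2 * log(alpha)     # upper-tail critical value
p_value = exp(-chi2 / 2)

reject = chi2 > critical
print(round(chi2, 3), round(critical, 3), reject)
```

The p-value comes out a little above 0.09, which is consistent with failing to reject independence at the .05 level.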



Question 3: Listed below are the numbers of years that U.S. presidents, popes since 1690, and British monarchs lived after they were inaugurated, elected, or coronated. As of this writing, the last president is Gerald Ford and the last pope is John Paul II. The times are based on data from Computer-Interactive Data Analysis, by Lunn and McNeil, John Wiley & Sons. Use a 0.05 significance level to test the claim that the two samples of longevity data from popes and monarchs are from populations with the same median.

Presidents: 10 29 26 28 15 23 17 25 0 20 4 1 24 16 12 4 10 17 16 0 7 24 12 4 18 21 11 2 9 36 12 28 3 16 9 25 23 32

Popes: 2 9 21 3 6 10 18 11 6 25 23 6 2 15 32 25 11 8 17 19 5 15 0 26

Monarchs: 17 6 13 12 13 33 59 10 7 63 9 25 36 15

Solution: We need to use a Wilcoxon test in order to assess the claim that the 2 samples are from populations with the same median. The following results are obtained:

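Wilcoxon – Mann/Whitney Test

| Group | n | Sum of ranks |
|---|---|---|
| Popes | 24 | 416 |
| Monarchs | 14 | 325 |
| Total | 38 | 741 |

| Quantity | Value |
|---|---|
| Expected value | 468.00 |
| Standard deviation | 33.00 |
| z, corrected for ties | -1.56 |
| p-value (two-tailed) | .1186 |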

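| No. | Label | Data | Rank |
|---|---|---|---|
| 1 | Popes | 2 | 2.5 |
| 2 | Popes | 9 | 12.5 |
| 3 | Popes | 21 | 28 |
| 4 | Popes | 3 | 4 |
| 5 | Popes | 6 | 7.5 |
| 6 | Popes | 10 | 14.5 |
| 7 | Popes | 18 | 26 |
| 8 | Popes | 11 | 16.5 |
| 9 | Popes | 6 | 7.5 |
| 10 | Popes | 25 | 31 |
| 11 | Popes | 23 | 29 |
| 12 | Popes | 6 | 7.5 |
| 13 | Popes | 2 | 2.5 |
| 14 | Popes | 15 | 22 |
| 15 | Popes | 32 | 34 |
| 16 | Popes | 25 | 31 |
| 17 | Popes | 11 | 16.5 |
| 18 | Popes | 8 | 11 |
| 19 | Popes | 17 | 24.5 |
| 20 | Popes | 19 | 27 |
| 21 | Popes | 5 | 5 |
| 22 | Popes | 15 | 22 |
| 23 | Popes | 0 | 1 |
| 24 | Popes | 26 | 33 |
| 25 | Monarchs | 17 | 24.5 |
| 26 | Monarchs | 6 | 7.5 |
| 27 | Monarchs | 13 | 19.5 |
| 28 | Monarchs | 12 | 18 |
| 29 | Monarchs | 13 | 19.5 |
| 30 | Monarchs | 33 | 35 |
| 31 | Monarchs | 59 | 37 |
| 32 | Monarchs | 10 | 14.5 |
| 33 | Monarchs | 7 | 10 |
| 34 | Monarchs | 63 | 38 |
| 35 | Monarchs | 9 | 12.5 |
| 36 | Monarchs | 25 | 31 |
| 37 | Monarchs | 36 | 36 |
| 38 | Monarchs | 15 | 22 |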


Since we are comparing two independent groups (Popes and Monarchs), we can use the Wilcoxon rank-sum test.

The null hypothesis tested is

H0: The two samples are from populations with the same median.

The alternative hypothesis is

H1: The two samples are from populations with different medians.

Significance level = 0.05

Test statistic: The observed values from the pooled sample results are ranked from smallest to largest. After the rankings are obtained, the samples are separated, and the sum of the rankings is calculated for each.
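The ranking step can be sketched in Python as follows. Tied values receive the average of the ranks they occupy (midranks), which is how the half-ranks such as 2.5 and 7.5 in the table arise. The helper name `midranks` is illustrative.

```python
popes = [2, 9, 21, 3, 6, 10, 18, 11, 6, 25, 23, 6,
         2, 15, 32, 25, 11, 8, 17, 19, 5, 15, 0, 26]
monarchs = [17, 6, 13, 12, 13, 33, 59, 10, 7, 63, 9, 25, 36, 15]

def midranks(values):
    """Rank values from smallest to largest, averaging ranks within ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Sorted positions i..j are 0-based, ranks are 1-based
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

pooled = popes + monarchs
ranks = midranks(pooled)

# Separate the pooled ranks again and total them per sample
T_popes = sum(ranks[:len(popes)])
T_monarchs = sum(ranks[len(popes):])

print(T_popes, T_monarchs)
```

The two totals should match the sums of ranks reported in the output table (416 for Popes, 325 for Monarchs), and together they add up to \(38 \times 39 / 2 = 741\).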

The test statistic used is

\[Z=\frac{{{T}_{A}}-\frac{{{n}_{1}}\left( {{n}_{1}}+{{n}_{2}}+1 \right)}{2}}{\sqrt{\frac{{{n}_{1}}{{n}_{2}}\left( {{n}_{1}}+{{n}_{2}}+1 \right)}{12}}},\]

where \({T}_{A}\) is the sum of ranks of the first sample (Popes), \({n}_{1}\) is the size of that sample, and \({n}_{2}\) is the size of the second sample (Monarchs). Here \({n}_{1} = 24\), \({n}_{2} = 14\), and \({T}_{A} = 416\).

Therefore,

\[Z=\frac{416-\frac{24\left( 24+14+1 \right)}{2}}{\sqrt{\frac{24\times 14\left( 24+14+1 \right)}{12}}}=\frac{416-468}{33.05}=-1.57\]
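The hand computation can be reproduced in a few lines. The sketch also shows one plausible reason the software output lists \(z = -1.56\) rather than \(-1.57\): applying the usual tie correction to the standard deviation and a continuity correction of 0.5 gives values that agree with the output table. That the software applies exactly these two corrections is an assumption, and the variable names are illustrative.

```python
from collections import Counter
from math import erfc, sqrt

popes = [2, 9, 21, 3, 6, 10, 18, 11, 6, 25, 23, 6,
         2, 15, 32, 25, 11, 8, 17, 19, 5, 15, 0, 26]
monarchs = [17, 6, 13, 12, 13, 33, 59, 10, 7, 63, 9, 25, 36, 15]

n1, n2 = len(popes), len(monarchs)    # 24 and 14
N = n1 + n2
T_A = 416.0                           # rank sum of the popes, from the text

# Plain normal approximation, as in the hand computation
mean_T = n1 * (N + 1) / 2
sd_plain = sqrt(n1 * n2 * (N + 1) / 12)
z_plain = (T_A - mean_T) / sd_plain

# ASSUMED corrections: each group of t tied values contributes t^3 - t
tie_term = sum(t**3 - t for t in Counter(popes + monarchs).values())
sd_ties = sqrt(n1 * n2 / 12 * ((N + 1) - tie_term / (N * (N - 1))))
# Continuity correction: move T_A half a unit toward its expected value
shift = 0.5 if T_A < mean_T else -0.5
z_corrected = (T_A + shift - mean_T) / sd_ties

def two_tailed_p(z):
    """Two-tailed p-value under the standard normal distribution."""
    return erfc(abs(z) / sqrt(2))

print(round(z_plain, 2), round(z_corrected, 2), round(two_tailed_p(z_corrected), 4))
```

Either version of \(z\) lies well inside \(\pm 1.96\), so the choice of correction does not change the decision.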

Rejection criteria: Reject the null hypothesis if the absolute value of the test statistic is greater than the critical value at the 0.05 significance level.

Lower critical value = -1.96

Upper critical value = 1.96

Conclusion: We fail to reject the null hypothesis, since the absolute value of the test statistic (1.57) is less than the critical value (1.96). The samples do not provide enough evidence to reject the claim that the two samples are from populations with the same median.

This tutorial is brought to you courtesy of MyGeekyTutor.com


