Assignment
Note: All answers are to be completed in SAS. All SAS input and output must be pasted into the solutions.
Other statistical software, such as Minitab, JMP, SPSS, R, and MATLAB, may also be used.
1. A psychologist has tested 25 independent hypotheses. The 25 unadjusted p-values she obtained are as follows: 0.0042, 0.3303, 0.0025, 0.0018, 0.00001, 0.8329, 0.0002, 0.1364, 0.0006, 0.0407, 0.1523, 0.0022, 0.2276, 0.0014, 0.0036, 0.0123, 0.3103, 0.0067, 0.8628, 0.6282, 0.0053, 0.00003, 0.0173, 0.4636, and 0.0082. Which, if any, hypotheses can she reject controlling the false discovery rate at 0.05? Which, if any, hypotheses can she reject controlling the experimentwise error rate at no more than 0.05?
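As a cross-check on the software output for this problem, the two kinds of error control can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the problem does not name a procedure, so Benjamini-Hochberg is assumed for the false discovery rate and Bonferroni for the experimentwise error rate, and hypotheses are numbered 1 to 25 in the order the p-values are listed. The function names are illustrative.

```python
# p-values in the order listed in the problem (hypothesis 1 first)
P_VALUES = [
    0.0042, 0.3303, 0.0025, 0.0018, 0.00001, 0.8329, 0.0002, 0.1364, 0.0006,
    0.0407, 0.1523, 0.0022, 0.2276, 0.0014, 0.0036, 0.0123, 0.3103, 0.0067,
    0.8628, 0.6282, 0.0053, 0.00003, 0.0173, 0.4636, 0.0082,
]


def benjamini_hochberg(pvals, q=0.05):
    """Return 1-based indices rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    # Find the largest rank k with p_(k) <= k * q / m
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank * q / m:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff
    return sorted(i + 1 for i in order[:cutoff])


def bonferroni(pvals, alpha=0.05):
    """Return 1-based indices with p <= alpha / m (assumed FWER procedure)."""
    m = len(pvals)
    return [i + 1 for i, p in enumerate(pvals) if p <= alpha / m]


fdr_rejections = benjamini_hochberg(P_VALUES)
fwer_rejections = bonferroni(P_VALUES)
print("BH (FDR 0.05):", fdr_rejections)
print("Bonferroni (FWER 0.05):", fwer_rejections)
```

In SAS, PROC MULTTEST can read raw p-values and apply both adjustments; the sketch above is meant only for verifying that output by hand.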
2. Montgomery 3.54, parts (a)-(e) only. Part (f) refers to fitting the model using residual maximum likelihood (REML). You can use either REML or least squares estimation approaches for this problem.
The DAT file for 3.54 is attached with the email.
3.54. A textile mill has a large number of looms. Each loom is supposed to provide the same output of cloth per minute. To investigate this assumption, five looms are chosen at random, and their output is noted at different times. The following data are obtained:
(a) Explain why this is a random effects experiment. Are the looms equal in output? Use \(\alpha=0.05\).
(b) Estimate the variability between looms.
(c) Estimate the experimental error variance.
(d) Find a 95 percent confidence interval for \(\sigma_{\tau}^{2} /\left(\sigma_{\tau}^{2}+\sigma^{2}\right)\).
(e) Analyze the residuals from this experiment. Do you think that the analysis of variance assumptions are satisfied?
(f) Use the REML method to analyze this data. Compare the 95 percent confidence interval on the error variance from REML with the exact chi-square confidence interval.
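For the variance estimates and the confidence interval asked for above, the usual analysis of variance formulas for a balanced one-way random effects model are sketched here. The notation is an assumption rather than part of the assignment: \(a\) looms, \(n\) observations per loom, \(N=an\), and \(F_{p,\,a-1,\,N-a}\) denotes the upper-\(p\) percentage point of the \(F\) distribution.

\[
\hat{\sigma}^{2}=\mathrm{MS}_{\mathrm{E}}, \qquad
\hat{\sigma}_{\tau}^{2}=\frac{\mathrm{MS}_{\mathrm{Treatments}}-\mathrm{MS}_{\mathrm{E}}}{n}
\]

\[
L=\frac{1}{n}\left(\frac{\mathrm{MS}_{\mathrm{Treatments}}}{\mathrm{MS}_{\mathrm{E}}} \cdot \frac{1}{F_{\alpha/2,\,a-1,\,N-a}}-1\right), \qquad
U=\frac{1}{n}\left(\frac{\mathrm{MS}_{\mathrm{Treatments}}}{\mathrm{MS}_{\mathrm{E}}} \cdot \frac{1}{F_{1-\alpha/2,\,a-1,\,N-a}}-1\right)
\]

\[
\frac{L}{1+L} \leq \frac{\sigma_{\tau}^{2}}{\sigma_{\tau}^{2}+\sigma^{2}} \leq \frac{U}{1+U}
\]

With \(\alpha=0.05\) this gives the 95 percent interval; the mean squares come from the ANOVA table produced by the software.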
3. A sociologist is interested in studying the ability of teachers from low income areas of major cities to cope with stress. Seven schools were randomly chosen from low income areas, and from each of these schools, four teachers were randomly chosen. The following table summarizes the average coping score (the higher the score, the better the ability to cope) for each of these schools.
a If \(\mathrm{MS}_{\mathrm{E}}=74\), is there significant variability in average coping scores among schools in low income areas (use \(\alpha=.05\) )?
b Estimate all variance components.
c Suppose the national average coping score for teachers is 105. Test to see if the data support the hypothesis that the average coping score of these teachers is lower than the national average \((\alpha=.05)\).
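Because only the school averages and \(\mathrm{MS}_{\mathrm{E}}\) are given, the test statistics have to be built from the means. A sketch of the standard quantities follows, assuming a balanced one-way random effects model with \(a=7\) schools and \(n=4\) teachers per school, and writing \(\bar{y}_{i.}\) for the school averages and \(\bar{y}_{..}\) for their grand mean (notation not used in the problem itself).

\[
\mathrm{MS}_{\mathrm{Treatments}}=\frac{n \sum_{i=1}^{a}\left(\bar{y}_{i.}-\bar{y}_{..}\right)^{2}}{a-1}, \qquad
F_{0}=\frac{\mathrm{MS}_{\mathrm{Treatments}}}{\mathrm{MS}_{\mathrm{E}}} \sim F_{a-1,\,a(n-1)} \text{ under } H_{0}: \sigma_{\tau}^{2}=0
\]

\[
t_{0}=\frac{\bar{y}_{..}-105}{\sqrt{\mathrm{MS}_{\mathrm{Treatments}} /(a n)}}, \qquad \text{reject } H_{0}: \mu=105 \text{ in favor of } \mu<105 \text{ if } t_{0}<-t_{0.05,\,a-1}
\]

The second statistic uses \(\mathrm{MS}_{\mathrm{Treatments}}\) rather than \(\mathrm{MS}_{\mathrm{E}}\) because, with random schools, the variance of \(\bar{y}_{..}\) is \(\left(n \sigma_{\tau}^{2}+\sigma^{2}\right) /(a n)\).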
4. In Montgomery 3.13, there was not enough evidence to support the hypothesis that different types of cars had different mean rental lengths. Using the estimated variance from this study, what is the power of this study to detect a difference in means of 1 day? What sample size is necessary for this power to be at least \(80\%\)?
3.13. A rental car company wants to investigate whether the type of car rented affects the length of the rental period. An experiment is run for one week at a particular location, and 10 rental contracts are selected at random for each car type. The results are shown in the following table.
- Is there evidence to support a claim that the type of car rented affects the length of the rental contract? Use \(\alpha=0.05\). If so, which types of cars are responsible for the difference?
- Analyze the residuals from this experiment and comment on model adequacy.
- Notice that the response variable in this experiment is a count. Should this cause any potential concerns about the validity of the analysis of variance?
The DAT file for 3.13 is attached with the email.
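For the power question in Problem 4, one common way to set up the calculation is through the noncentrality parameter of the \(F\) test. The sketch below assumes a fixed effects model with \(a\) car types and \(n\) contracts per type, takes \(\sigma^{2}\) to be estimated by \(\mathrm{MS}_{\mathrm{E}}\) from the 3.13 analysis, and interprets "a difference in means of 1 day" as a maximum difference \(D=1\) between any two type means, which gives the least favorable (smallest) noncentrality.

\[
\lambda=\frac{n \sum_{i=1}^{a} \tau_{i}^{2}}{\sigma^{2}} \geq \frac{n D^{2}}{2 \sigma^{2}}, \qquad
\text{Power}=P\left(F_{a-1,\,a(n-1)}^{\prime}(\lambda)>F_{\alpha,\,a-1,\,a(n-1)}\right)
\]

Here \(F^{\prime}(\lambda)\) is a noncentral \(F\) random variable. The required sample size is the smallest \(n\) for which this probability reaches 0.80.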
5. The results from the study in Montgomery 3.14 were somewhat inconclusive. The data suggested that the average score in the winter season was about 2-3 shots higher than in the other two seasons, but the difference was not statistically significant. This season, Montgomery plans to golf more. Assuming that \(\mu_{\text {Shoulder }}=\mu_{\text {Winter }}-3\) and \(\mu_{\text {Summer }}=\mu_{\text {Winter }}-2\), what total sample size \(N\) is needed for the power of the \(F\) test to be at least \(90\%\)?
3.14. I belong to a golf club in my neighborhood. I divide the year into three golf seasons: summer (June-September), winter (November-March), and shoulder (October, April, and May). I believe that I play my best golf during the summer (because I have more time and the course isn't crowded) and shoulder (because the course isn't crowded) seasons, and my worst golf is during the winter (because when all of the part-year residents show up, the course is crowded, play is slow, and I get frustrated). Data from the last year are shown in the following table.
- Do the data indicate that my opinion is correct? Use \(\alpha=0.05\).
- Analyze the residuals from this experiment and comment on model adequacy.
The DAT file for 3.14 is attached with the email.
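For Problem 5, the stated means determine the treatment effects directly, which is the main input a power routine needs. The worked step below assumes equal sample sizes \(n\) per season (so \(N=3n\)) and that \(\sigma^{2}\) is taken from the \(\mathrm{MS}_{\mathrm{E}}\) of the 3.14 analysis; neither assumption is stated in the problem.

\[
\bar{\mu}=\mu_{\text{Winter}}-\tfrac{5}{3}, \qquad
\tau_{\text{Winter}}=\tfrac{5}{3}, \quad \tau_{\text{Shoulder}}=-\tfrac{4}{3}, \quad \tau_{\text{Summer}}=-\tfrac{1}{3}
\]

\[
\sum_{i} \tau_{i}^{2}=\frac{25+16+1}{9}=\frac{14}{3}, \qquad
\lambda=\frac{n \sum_{i} \tau_{i}^{2}}{\sigma^{2}}=\frac{14 n}{3 \sigma^{2}}
\]

The power is then the probability that a noncentral \(F\) variable with \(2\) and \(N-3\) degrees of freedom and noncentrality \(\lambda\) exceeds \(F_{0.05,\,2,\,N-3}\); \(n\) is increased until this reaches 0.90.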
6. Refer to Problem 3. There were \(a=7\) randomly chosen schools each with \(n=4\) teachers.
a How much power does this study have if the true variances were such that \(1.5 \sigma_{\tau}^{2}=\sigma^{2}\) ? Make sure you show your software inputs or hand calculations to receive partial credit.
b In a random effects situation you can either increase \(a\) and/or \(n\) to increase the power. You investigate different combinations and find the following:
- \(a=11\), \(n=5\): 89.8% power
- \(a=9\), \(n=6\): 89.7% power
- \(a=6\), \(n=9\): 88.4% power
- \(a=5\), \(n=11\): 87.2% power
If it costs \$25 in time and resources to evaluate each teacher once at a school and \$100 in time and resources to access a school, which of these options would you choose? Explain your answer.
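The cost side of this comparison is simple arithmetic and can be tabulated as below. This is a sketch under one reading of the cost statement, labeled as an assumption: each school incurs the \$100 access cost once, and each of its \(n\) teachers is evaluated once at \$25, so the total is \(100a + 25an\). The names `total_cost` and `DESIGNS` are illustrative.

```python
# Assumed cost model: $100 per school accessed, $25 per teacher evaluated once
SCHOOL_COST = 100
TEACHER_COST = 25

# (a schools, n teachers per school, power in percent) from the problem statement
DESIGNS = [(11, 5, 89.8), (9, 6, 89.7), (6, 9, 88.4), (5, 11, 87.2)]


def total_cost(a, n):
    """Total cost in dollars for a schools with n teachers each."""
    return a * SCHOOL_COST + a * n * TEACHER_COST


costs = {(a, n): total_cost(a, n) for a, n, _ in DESIGNS}

for a, n, power in DESIGNS:
    print(f"a={a:2d}, n={n:2d}: power {power}%, cost ${total_cost(a, n)}")
```

The table only supplies the numbers; weighing the drop in power against the savings is the judgment the question asks for.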
Problem 3 is listed again below
3. A sociologist is interested in studying the ability of teachers from low income areas of major cities to cope with stress. Seven schools were randomly chosen from low income areas, and from each of these schools, four teachers were randomly chosen. The following table summarizes the average coping score (the higher the score, the better the ability to cope) for each of these schools.
a If \(\mathrm{MS}_{\mathrm{E}}=74\), is there significant variability in average coping scores among schools in low income areas (use \(\alpha=.05\) )?
b Estimate all variance components.
c Suppose the national average coping score for teachers is 105. Test to see if the data support the hypothesis that the average coping score of these teachers is lower than the national average \((\alpha=.05)\).
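For the power calculation in Problem 6(a), the random effects case needs only the central \(F\) distribution. A sketch of the standard result follows, assuming the balanced design restated above (\(a=7\), \(n=4\)) and \(\alpha=0.05\), which the problem does not restate explicitly.

\[
\frac{\mathrm{MS}_{\mathrm{Treatments}}}{\mathrm{MS}_{\mathrm{E}}} \sim\left(1+\frac{n \sigma_{\tau}^{2}}{\sigma^{2}}\right) F_{a-1,\,a(n-1)}, \qquad
\text{Power}=P\left(F_{a-1,\,a(n-1)}>\frac{F_{\alpha,\,a-1,\,a(n-1)}}{1+n \sigma_{\tau}^{2} / \sigma^{2}}\right)
\]

\[
1.5\, \sigma_{\tau}^{2}=\sigma^{2} \;\Rightarrow\; \frac{\sigma_{\tau}^{2}}{\sigma^{2}}=\frac{2}{3}, \qquad
1+\frac{n \sigma_{\tau}^{2}}{\sigma^{2}}=1+\frac{4 \cdot 2}{3}=\frac{11}{3}
\]

The degrees of freedom are \(a-1=6\) and \(a(n-1)=21\), so the power is the upper-tail probability of an \(F_{6,21}\) variable beyond \(F_{0.05,\,6,\,21} \cdot \tfrac{3}{11}\).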
Deliverable: Word Document
