Assignment 1 – Types of Data & Graphing
The goal of this assignment is to understand and present distributions of both categorical and quantitative variables. Specifically: 1) Be able to distinguish between quantitative and categorical variables - and to explain why this distinction is important in statistics; 2) Know how to interpret histograms and be able to describe their key features (shape, center, spread, and outliers).
- Figures 1a through 1d display four histograms without axis markings. They show the distributions of these four variables:
- The gender of the students in a large college course, recorded as 0 for male and 1 for female.
- The heights of the students in the same class.
- The handedness of students in the class, recorded as 0 for right-handed and 1 for left-handed.
- The lengths of words used in Shakespeare's plays.
Without further information, match each histogram to the appropriate variable, and explain your reasoning.
Figure 1:
- Births are not, as you might think, evenly distributed across the days of the week. Table 1 shows the average number of babies born on each day of the week in 1999 (National Center for Health Statistics, Births: Final Data for 1999, National Vital Statistics Reports, Vol. 49, No. 1, 2001).
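Calculate the percent values for each day.

Table 1:

| Day       | Births | Percent |
|-----------|--------|---------|
| Sunday    | 7,731  |         |
| Monday    | 11,018 |         |
| Tuesday   | 12,424 |         |
| Wednesday | 12,183 |         |
| Thursday  | 11,893 |         |
| Friday    | 12,012 |         |
| Saturday  | 8,654  |         |
| Total     |        |         |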
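Each percent value is that day's count divided by the total across all seven days, times 100. As a check on the spreadsheet arithmetic, the calculation might be sketched in Python as follows. The counts are copied from Table 1; the variable names are illustrative only and are not part of the assignment.

```python
# Average daily births in 1999, copied from Table 1.
births = {
    "Sunday": 7731,
    "Monday": 11018,
    "Tuesday": 12424,
    "Wednesday": 12183,
    "Thursday": 11893,
    "Friday": 12012,
    "Saturday": 8654,
}

# The denominator is the sum of all seven daily averages.
total = sum(births.values())

# Percent share for each day: 100 * (day count / total).
percents = {day: 100 * count / total for day, count in births.items()}

for day, pct in percents.items():
    print(f"{day:<10}{births[day]:>8,}{pct:>8.1f}%")
print(f"{'Total':<10}{total:>8,}{sum(percents.values()):>8.1f}%")
```

The percent column should sum to 100 (up to rounding), which is a quick way to catch a wrong denominator.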
Is the distribution uniform or normal? Based on these data, give a detailed description of how births depend on the day of the week, and suggest possible reasons for the pattern.
- Take a look at the example histograms from the National Center for Education Statistics (NCES) "Create a Graph" tool ( http://nces.ed.gov/nceskids/createAgraph/ ). I suggest looking at all of them. Critique at least one of the "Examples" graphs, e.g., U.S. Public School Student Membership, International Per Capita Consumption of Turkey, Percentage of students who reported being bullied at school, Airline On-Time Statistics and Delay Causes, or Air Passenger Travel Arrivals in the United States.
- Look at how different groupings in a histogram can lead to different conclusions, using the Interactive Histograms activity ( http://www.shodor.org/interactivate/activities/Histogram/?version=1.5.0_16&browser=safari&vendor=Apple_Inc.&flash=10.0.22 ).
Enter interval sizes of 10, 20, and 50, and discuss how each choice changes the findings. Which do you think is most appropriate? Why?
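To see why the interval size matters, it can help to bin one small data set three ways and compare the shapes. The following is a minimal Python sketch, not a reproduction of the interactive activity: the values are made up for illustration, and `bin_counts` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical values, made up for illustration; not taken from the activity.
values = [3, 8, 12, 15, 18, 22, 27, 31, 34, 38,
          41, 45, 47, 52, 58, 63, 71, 88, 95]


def bin_counts(data, width):
    """Count values in half-open bins [k*width, (k+1)*width)."""
    counts = Counter((x // width) * width for x in data)
    return dict(sorted(counts.items()))


# Draw a text histogram for each interval size.
for width in (10, 20, 50):
    print(f"interval size {width}")
    for start, n in bin_counts(values, width).items():
        print(f"  {start:>3}-{start + width - 1:<3} {'#' * n}")
```

With these made-up values, narrow intervals show many small bars, while an interval size of 50 collapses everything into two bars and hides the thinning out at the high end.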
Excel Exercise:
In this assignment, you will analyze data from a made-up sample of social service users in New York City. The data set includes 324 cases and 10 variables covering demographic and usage data for a NYC social service agency.
Please download the data file named "made up social service agency database.xls" from the resources area to a folder on your computer. The file has two tabs: one contains the data, and the other contains a data dictionary. Answer the questions on the spreadsheet, and submit it under the Assignments section of iLearn.
Types of Data and Graphs
Once you open the data:
- State whether each of the variables is categorical or quantitative. If a variable is categorical, identify it as either ordinal or nominal.
- Identify at least three variables for your analysis. At least two of the variables must be quantitative (interval or ratio) and one must be categorical (nominal or ordinal). Using Excel, graph each of your three variables with an appropriate graph for the level of measurement, and add all necessary labels and titles to your data and graphs.
- Are there any outliers or unusual results? Does the data look as you would expect? Describe the overall patterns you observe in each graph.
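One common convention for flagging possible outliers in a quantitative variable is the 1.5 × IQR rule: values more than 1.5 interquartile ranges below the first quartile or above the third quartile are flagged for a closer look. The assignment does not prescribe this rule, so treat the Python sketch below as one option. The `ages` values are hypothetical and are not taken from the data file, and `iqr_outliers` is an illustrative helper name.

```python
import statistics


def iqr_outliers(data):
    """Return the values lying outside the 1.5 * IQR fences."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]


# Hypothetical ages, made up for illustration; not from the assignment data.
ages = [19, 22, 24, 25, 27, 28, 30, 31, 33, 35, 36, 94]

print(iqr_outliers(ages))
```

A flagged value is not automatically an error. It is a prompt to check the data dictionary and decide whether the value is plausible for that variable.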
Deliverable: Word Document
