Editing Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 6
From CommunityData
Warning: You are not logged in. Your IP address will be publicly visible if you make any edits. If you log in or create an account, your edits will be attributed to your username, along with other benefits.
The edit can be undone. Please check the comparison below to verify that this is what you want to do, and then publish the changes below to finish undoing the edit.
Latest revision | Your text | ||
Line 5: | Line 5: | ||
: Lagakos, S., & Mosteller, F. (1981). A case study of statistics in the regulatory process: the FD&C Red No. 40 experiments. ''Journal of the National Cancer Institute'', 66(1), 197–212. [[https://www.gwern.net/docs/statistics/1981-lagakos.pdf PDF]] | : Lagakos, S., & Mosteller, F. (1981). A case study of statistics in the regulatory process: the FD&C Red No. 40 experiments. ''Journal of the National Cancer Institute'', 66(1), 197–212. [[https://www.gwern.net/docs/statistics/1981-lagakos.pdf PDF]] | ||
: '''PC0.''' Download the dataset by clicking through on the "Red Dye | : '''PC0.''' Download the dataset by clicking through on the "Red Dye" link on [http://college.cengage.com/mathematics/brase/understandable_statistics/7e/students/datasets/owan/frames/frame.html this webpage]. You'll find that the it's not in an ideal setup. It's an Excel file (XLS) with a series of columns labeled X1.. X4. The format is not exactly tabular. | ||
: '''PC1.''' Load the data into R. Now get to work on reshaping the dataset. I think a good format would be a data frame with two columns: group, time of death (i.e., lifespan). | : '''PC1.''' Load the data into R. Now get to work on reshaping the dataset. I think a good format would be a data frame with two columns: group, time of death (i.e., lifespan). | ||
: '''PC2.''' Create summary statistics and visualizations for each group. Write code that allows you to generate a useful way to both (a) get a visual sense both for the shape of the data and its relationships and (b) the degree to which the assumptions for t-tests and ANOVA hold. What is the global mean of your dependent variable? | : '''PC2.''' Create summary statistics and visualizations for each group. Write code that allows you to generate a useful way to both (a) get a visual sense both for the shape of the data and its relationships and (b) the degree to which the assumptions for t-tests and ANOVA hold. What is the global mean of your dependent variable? |