Editing Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 6
From CommunityData
Latest revision | Your text | ||
Line 5: | Line 5: | ||
: Lagakos, S., & Mosteller, F. (1981). A case study of statistics in the regulatory process: the FD&C Red No. 40 experiments. ''Journal of the National Cancer Institute'', 66(1), 197–212. [[https://www.gwern.net/docs/statistics/1981-lagakos.pdf PDF]] | : Lagakos, S., & Mosteller, F. (1981). A case study of statistics in the regulatory process: the FD&C Red No. 40 experiments. ''Journal of the National Cancer Institute'', 66(1), 197–212. [[https://www.gwern.net/docs/statistics/1981-lagakos.pdf PDF]] | ||
: '''PC0.''' Download the dataset | I found a copy of the dataset [http://college.cengage.com/mathematics/brase/understandable_statistics/7e/students/datasets/owan/frames/frame.html at this link]. | ||
: '''PC1.''' Load the data | |||
: '''PC0.''' Download the dataset from [http://college.cengage.com/mathematics/brase/understandable_statistics/7e/students/datasets/owan/frames/frame.html this webpage]. You'll find that it's not in an ideal setup. It's an Excel file (XLS) with a series of columns labeled X1 through X4. The format is not exactly tabular. | |||
: '''PC1.''' Load the data. Now get to work on reshaping the dataset. I think a good format would be a data frame with two columns: group and time of death (i.e., lifespan). | |||
: '''PC2.''' Create summary statistics and visualizations for each group. Write code that lets you (a) get a visual sense of both the shape of the data and its relationships and (b) assess the degree to which the assumptions for t-tests and ANOVA hold. What is the global mean of your dependent variable? | : '''PC2.''' Create summary statistics and visualizations for each group. Write code that lets you (a) get a visual sense of both the shape of the data and its relationships and (b) assess the degree to which the assumptions for t-tests and ANOVA hold. What is the global mean of your dependent variable? | ||
: '''PC3.''' Do a t-test between mice with '' | : '''PC3.''' Do a t-test between mice with ''any'' RD40 and mice with at least a small amount. Run a t-test between the group with a high dosage and control group. | ||
: '''PC4.''' | : '''PC4.''' Run an ANOVA using aov() to see if there is a difference between the groups. | ||
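To make the arithmetic behind PC1 through PC4 concrete, here is a minimal sketch in plain Python rather than R. It is not a solution to the problem set. The values in <code>wide</code> are made-up toy numbers, not the real dataset, and the names <code>wide</code>, <code>long_rows</code>, <code>welch_t</code>, and <code>one_way_anova</code> are hypothetical helpers introduced only for illustration. The sketch reshapes one-column-per-group data into (group, lifespan) rows, then computes the global mean, a Welch t statistic for two groups, and the one-way ANOVA F statistic by hand. It does not compute p-values.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical toy values, NOT the real dataset: one list per spreadsheet column.
wide = {
    "X1": [1, 2, 3],
    "X2": [2, 3, 4],
    "X3": [3, 4, 5],
}

# Reshape from wide to long: one (group, lifespan) pair per observation.
long_rows = [(group, value) for group, values in wide.items() for value in values]


def welch_t(a, b):
    """Welch's t statistic for two samples (equal variances are not assumed)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def one_way_anova(rows):
    """Return (global mean, F, df between, df within) for (group, value) rows."""
    groups = {}
    for group, value in rows:
        groups.setdefault(group, []).append(value)
    values = [value for _, value in rows]
    global_mean = mean(values)
    # Between-group sum of squares: group means around the global mean.
    ss_between = sum(len(v) * (mean(v) - global_mean) ** 2 for v in groups.values())
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return global_mean, f_stat, df_between, df_within


global_mean, f_stat, df_between, df_within = one_way_anova(long_rows)
t_stat = welch_t(wide["X1"], wide["X3"])
print(global_mean, f_stat, df_between, df_within, t_stat)
```

The long format is what makes the ANOVA step simple: every observation carries its own group label, so grouping, group means, and the two sums of squares all come from a single pass over the same rows.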
== Statistical Questions from OpenIntro §6 == | == Statistical Questions from OpenIntro §6 == | ||
Line 29: | Line 32: | ||
:* (d) Why weren't we happy just leaving it where we did in week 2? Why bother with the statistical test? | :* (d) Why weren't we happy just leaving it where we did in week 2? Why bother with the statistical test? | ||
: '''Q6.''' Do the same as above but for ''Study 2''. | : '''Q6.''' Do the same as above but for ''Study 2''. | ||