Latest revision |
Your text |
Line 1: |
Line 1: |
| == Programming Challenges == | | == Programming Challenges == |
|
| |
|
| Let's re-evaluate some data from this paper:
| | : '''PC0.''' I've provided the full dataset from which I drew each of your samples in a TSV file in the directory <code>week_05</code> in the [https://github.com/makoshark/uwcom521-assignments/ class assignment git repository]. These files are ''tab delimited'', not comma delimited. TSV is closely related to CSV and is also a common format. Go ahead and load it into R (''HINT: <code>read.delim()</code>''). Take the mean of the variable <code>x</code> in that dataset. That is the true population mean, the thing we have been creating estimates of in week 2 and week 3. |
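A minimal sketch of the loading step, assuming nothing about the real file beyond what is stated above (tab delimited, with a column named <code>x</code>). The temporary file and its values are made up so that the example runs on its own; in practice the path would point at the TSV in <code>week_05</code>.

```r
# Hypothetical stand-in for the TSV in week_05: the real filename and
# values are not given here, so a tiny made-up file is written first.
tsv.path <- tempfile(fileext = ".tsv")
toy <- data.frame(x = c(2.5, 3.0, 4.5, 6.0), y = c("a", "b", "a", "b"))
write.table(toy, tsv.path, sep = "\t", row.names = FALSE, quote = FALSE)

# read.delim() defaults to sep = "\t" and header = TRUE,
# which is why it suits tab-delimited files.
population <- read.delim(tsv.path)

# Mean of x over the full dataset (the population mean in the exercise)
pop.mean <- mean(population$x, na.rm = TRUE)
print(pop.mean)
```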
| | |
| : Lagakos, S., & Mosteller, F. (1981). A case study of statistics in the regulatory process: the FD&C Red No. 40 experiments. ''Journal of the National Cancer Institute'', 66(1), 197–212. [[https://www.gwern.net/docs/statistics/1981-lagakos.pdf PDF]]
| |
| | |
| : '''PC0.''' Download the dataset by clicking through on the "Red Dye Number 40" link on [http://college.cengage.com/mathematics/brase/understandable_statistics/7e/students/datasets/owan/frames/frame.html this webpage]. You'll find that it's not in an ideal format. It's an Excel file (XLS) with a series of columns labeled X1 through X4, and the layout is not exactly tabular.
| |
| : '''PC1.''' Load the data into R. Now get to work on reshaping the dataset. I think a good format would be a data frame with two columns: group and time of death (i.e., lifespan).
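One possible way to do the reshaping is sketched below with base R's <code>stack()</code>. The values are invented and only mimic the layout described above (columns X1 through X4 of unequal length); they are not the real data. Reading the XLS itself is not shown; a package such as <code>readxl</code> could be used for that step.

```r
# Made-up values in the shape the assignment describes: columns X1..X4 of
# unequal length, padded with NA. These are NOT the real Red Dye No. 40 data.
wide <- data.frame(
  X1 = c(70, 77, 83, 87),
  X2 = c(49, 60, 63, NA),
  X3 = c(30, 37, 56, 65),
  X4 = c(34, 36, 48, NA)
)

# stack() collapses the columns into one 'values' column plus an 'ind'
# factor that records which column each value came from.
long <- stack(wide)
names(long) <- c("lifespan", "group")

# Drop the NA padding and put the columns in (group, lifespan) order
long <- long[!is.na(long$lifespan), c("group", "lifespan")]
rownames(long) <- NULL
print(long)
```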
| |
| : '''PC2.''' Create summary statistics and visualizations for each group. Write code that gives you (a) a visual sense of the shape of the data and the relationships in it and (b) a sense of the degree to which the assumptions for t-tests and ANOVA hold. What is the global mean of your dependent variable?
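A rough sketch of per-group summaries and two diagnostic plots, assuming the long format from PC1. The data frame <code>d</code>, its group labels, and its values are all hypothetical.

```r
# Made-up lifespans in long format (NOT the real data); the group labels
# are an assumption, since the spreadsheet only names columns X1..X4.
d <- data.frame(
  group = factor(rep(c("control", "low", "medium", "high"), each = 4),
                 levels = c("control", "low", "medium", "high")),
  lifespan = c(70, 77, 83, 87, 49, 60, 63, 67, 30, 37, 56, 65, 34, 36, 48, 50)
)

# Per-group sample size, mean, and standard deviation
group.stats <- data.frame(
  n = tapply(d$lifespan, d$group, length),
  mean = tapply(d$lifespan, d$group, mean),
  sd = tapply(d$lifespan, d$group, sd)
)
print(group.stats)

# Global mean of the dependent variable
global.mean <- mean(d$lifespan)
print(global.mean)

# Boxplots show center, spread, and skew by group; boxes of very
# different heights hint at unequal variances.
boxplot(lifespan ~ group, data = d, ylab = "lifespan")

# Normal Q-Q plot of deviations from each group's mean, as a rough
# check on the normality assumption.
resid.by.group <- d$lifespan - ave(d$lifespan, d$group)
qqnorm(resid.by.group)
qqline(resid.by.group)
```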
| |
| : '''PC3.''' Do a t-test between mice with ''no'' RD40 and mice with ''any'' (i.e., at least a small amount). Next, run a t-test between the group with a high dosage and the control group. How would you go about doing it using formula notation? Be ready to report, interpret, and discuss the results in substantive terms.
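Both comparisons can be written in formula notation, as in the sketch below. The data, the group labels, and the helper column <code>any.dye</code> are hypothetical. Note that <code>t.test()</code> runs Welch's unequal-variance test unless <code>var.equal = TRUE</code> is passed.

```r
# Made-up lifespans in long format (NOT the real data)
d <- data.frame(
  group = factor(rep(c("control", "low", "medium", "high"), each = 4),
                 levels = c("control", "low", "medium", "high")),
  lifespan = c(70, 77, 83, 87, 49, 60, 63, 67, 30, 37, 56, 65, 34, 36, 48, 50)
)

# Comparison 1: no dye versus any dye. The formula interface needs a
# grouping variable with exactly two levels, so build one.
d$any.dye <- ifelse(d$group == "control", "none", "any")
t.any <- t.test(lifespan ~ any.dye, data = d)
print(t.any)

# Comparison 2: high dosage versus control. Subset to the two groups
# and drop the unused factor levels before testing.
high.vs.control <- droplevels(subset(d, group %in% c("control", "high")))
t.high <- t.test(lifespan ~ group, data = high.vs.control)
print(t.high)
```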
| |
| : '''PC4.''' Estimate an ANOVA using <code>aov()</code> to see if there is a difference between the groups. Be ready to report, interpret, and discuss the results in substantive terms.
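A one-way ANOVA with <code>aov()</code> might look like the following sketch, again on invented data with assumed group labels.

```r
# Made-up lifespans in long format (NOT the real data)
d <- data.frame(
  group = factor(rep(c("control", "low", "medium", "high"), each = 4),
                 levels = c("control", "low", "medium", "high")),
  lifespan = c(70, 77, 83, 87, 49, 60, 63, 67, 30, 37, 56, 65, 34, 36, 48, 50)
)

# One-way ANOVA: does mean lifespan differ across the four groups?
fit <- aov(lifespan ~ group, data = d)

# The summary table reports degrees of freedom, sums of squares,
# the F statistic, and its p-value.
fit.summary <- summary(fit)
print(fit.summary)
```

With four groups and sixteen observations, the table should show 3 degrees of freedom for <code>group</code> and 12 for the residuals.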
| |
|
| |
|
| == Statistical Questions from OpenIntro §6 == | | == Statistical Questions from OpenIntro §6 == |
Line 29: |
Line 21: |
| :* (d) Why weren't we happy just leaving it where we did in week 2? Why bother with the statistical test? | | :* (d) Why weren't we happy just leaving it where we did in week 2? Why bother with the statistical test? |
| : '''Q6.''' Do the same as above but for ''Study 2''. | | : '''Q6.''' Do the same as above but for ''Study 2''. |
|
| |
| == Questions on Gelman and Loken ==
| |
|
| |
| : '''Q7.''' Be ready to summarize the main point of, and share some reflections on, the paper. There are no specific questions.
| |