Statistics and Statistical Programming (Spring 2019)/Problem Set: Week 8: Difference between revisions
From CommunityData
Revision as of 16:35, 15 May 2019
Programming Challenges
The first set of programming challenges will once again use your dataset from the week 3 problem set:
- PC0. Load up your dataset as you did in Week 3 PC1.
- PC1. Refamiliarize yourself with the data and recode your variables as you did for Week 3 PC8. You may recall that x and y looked like they might be related. We now have the tools and terminology to describe this relationship and to estimate just how related they are.
- PC2. Run a t.test between x and y in the dataset and be prepared to interpret the results.
- PC3. Estimate how correlated x and y are with each other.
- PC4. Fit a linear model corresponding to the following formula and be ready to interpret its coefficients, standard errors, t-statistics, p-values, and R²:
- PC5. Generate the following diagnostic plots and be prepared to explain (a) what issue(s) and/or assumptions each one can help you evaluate; (b) what conclusions you draw from them:
- (a) A histogram of the residuals.
- (b) Plot the residuals by your values of x.
- (c) A QQ plot.
- PC6. Generate a nice-looking, publication-ready table with the fitted model formatted as HTML or LaTeX.
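A minimal sketch of how PC2 through PC5 might look in R. Everything here is an assumption for illustration: the data frame `d` is simulated rather than your Week 3 dataset, and the formula `y ~ x` is a guess at a simple bivariate model, so substitute your own data and the formula given for PC4.

```r
# Simulated stand-in for the Week 3 dataset (hypothetical values, not real data)
set.seed(42)
d <- data.frame(x = rnorm(100, mean = 5, sd = 2))
d$y <- 1 + 0.5 * d$x + rnorm(100, sd = 1)

# PC2: Welch two-sample t-test (R's default) comparing the means of x and y
tt <- t.test(d$x, d$y)
print(tt)

# PC3: Pearson correlation between x and y
r <- cor(d$x, d$y)
print(r)

# PC4: linear model; the formula y ~ x is an assumption
m <- lm(y ~ x, data = d)
print(summary(m))

# PC5: residual diagnostics
res <- residuals(m)
hist(res, main = "Histogram of residuals", xlab = "Residual")  # (a)
plot(d$x, res, xlab = "x", ylab = "Residual")                  # (b)
abline(h = 0, lty = 2)
qqnorm(res)                                                    # (c)
qqline(res)
```

With a single predictor, the R² reported by `summary(m)` equals the square of the correlation from PC3, which is a handy sanity check on your output.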
Now, let's go back to the Michelle Obama dataset we used last week as part of the week 7 problem set.
- PC7. Load up the full dataset and fit the following linear model. Be ready to interpret the results in the same way you did for PC4 above:
- PC8. Examine the residuals for your model from PC7 and try to interpret them as you did in PC5 above. What do you notice?
- PC9. Run the model on three subsets of the dataset: just 2012, 2014, and 2015. Be prepared to talk through the results.
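One way to approach PC9 is to fit the same formula once per year and line the coefficients up side by side. The sketch below uses a simulated data frame; the column names `outcome`, `predictor`, and `year` are hypothetical placeholders, not the actual variables in the dataset, and the formula is likewise only an illustration.

```r
# Simulated stand-in; `outcome`, `predictor`, and `year` are invented names
set.seed(1)
d <- data.frame(
  year = rep(c(2012, 2014, 2015), each = 50),
  predictor = rnorm(150)
)
d$outcome <- 2 + 0.3 * d$predictor + rnorm(150)

# Fit the same (assumed) formula on each yearly subset
years <- c(2012, 2014, 2015)
models <- lapply(years, function(yr) {
  lm(outcome ~ predictor, data = d[d$year == yr, ])
})
names(models) <- years

# One column of coefficients per year, for easy comparison
coefs <- sapply(models, coef)
print(round(coefs, 3))
```

Call `summary(models[["2012"]])` (and likewise for the other years) to see the standard errors and p-values for each subset.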
Statistics Questions
- SQ0. Any questions or clarifications from the PSU material or the OpenIntro text?
- SQ1. Exercise 8.4 on school absenteeism
- SQ2. Exercise 8.10 on school absenteeism again (no sub-parts)
- SQ3. Exercise 8.14 on evaluating regression residuals (no sub-parts)
- SQ4. Exercise 8.16 on Challenger o-rings.
- SQ5. Exercise 8.18 which is more on Challenger o-rings.
Empirical Paper Questions
These questions are about the Lampe and Resnick paper once again. For this week, we'll focus on the logistic regression table in Table 4.
- EQ0. Any questions or clarifications from the paper that we didn't cover last week?
- EQ1. Be ready to explain what all of Table 5 means in both statistical and substantive terms. In particular, be ready to interpret all of the coefficients and to explain what the t-statistics, R², and p-values mean.
- EQ2. Be ready to explain what Table 4 means in both statistical and substantive terms. In particular, be ready to interpret the coefficients in substantive terms and be ready to explain what the Z-statistics, Pseudo R², and p-values mean.
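If it helps to see where the quantities in a logistic regression table come from, here is a small sketch on simulated data. It is not a reproduction of the paper's analysis: the variables are made up, and McFadden's pseudo R² is only one of several pseudo R² measures, chosen here as an assumption because the measure the paper uses is not stated in this problem set.

```r
# Simulated binary outcome (hypothetical data, unrelated to the paper)
set.seed(7)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x))

# Logistic regression
fit <- glm(y ~ x, family = binomial)

# Coefficient table: estimate, standard error, z value, p-value
ctab <- summary(fit)$coefficients
print(ctab)

# Coefficients are log-odds; exponentiate to read them as odds ratios
odds_ratios <- exp(coef(fit))
print(odds_ratios)

# McFadden's pseudo R-squared: 1 - logLik(model) / logLik(intercept-only model)
null_fit <- glm(y ~ 1, family = binomial)
mcfadden <- 1 - as.numeric(logLik(fit)) / as.numeric(logLik(null_fit))
print(mcfadden)
```

Unlike R² in a linear model, a pseudo R² is not the share of variance explained; it compares the fitted model's log-likelihood to that of an intercept-only model.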
And these questions focus on issues raised by Reinhart in §8 and §9.