Statistics and Statistical Programming (Spring 2019)/Problem Set: Week 8

From CommunityData

Revision as of 16:28, 14 May 2019

Programming Challenges

The first set of programming challenges will once again use your dataset from the week 3 problem set:

PC0. Load up your dataset as you did in Week 3 PC1.
PC1. Refamiliarize yourself with the data and recode your variables as you did for Week 3 PC8. You may recall from that problem set that x and y looked like they might be related. We now have the tools and terminology to describe this relationship and to estimate just how related they are.
PC2. Run a t.test between x and y in the dataset and be prepared to interpret the results.
PC3. Estimate how correlated x and y are with each other.
PC4. Fit a linear model corresponding to the following formula and be ready to interpret the coefficients, standard errors, t-statistics, p-values, and <math>\mathrm{R}^2</math> of it:
<math>\hat{y} = \beta_0 + \beta_1 x + \varepsilon</math>
PC5. Generate the following diagnostic plots and be prepared to explain (a) what issue(s) and/or assumptions each one can help you evaluate; (b) what conclusions you draw from them:
(a) A histogram of the residuals.
(b) Plot the residuals by your values of x.
(c) A QQ plot.
PC6. Generate a nice-looking, publication-ready table with the fitted model formatted as HTML or LaTeX.
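
As a rough illustration of the quantities PC3 through PC5 ask about, here is a minimal sketch that computes a Pearson correlation, a simple least-squares fit, and the residuals by hand. It is written in plain Python rather than the R used in the course, the data are made up (they are not the course dataset), and the function name `fit_simple_ols` is hypothetical. It is meant only to show what the numbers in a regression summary are built from, not to stand in for the assignment.

```python
import math

# Made-up illustrative data; NOT the course dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]


def fit_simple_ols(x, y):
    """Least-squares estimates for a one-predictor linear model (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residuals: observed y minus fitted y.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)

    r = sxy / math.sqrt(sxx * syy)        # Pearson correlation
    r_squared = 1 - sse / syy             # share of variance explained
    resid_var = sse / (n - 2)             # residual variance, n - 2 degrees of freedom
    se_slope = math.sqrt(resid_var / sxx) # standard error of the slope
    t_slope = slope / se_slope            # t-statistic for H0: slope = 0

    return {
        "intercept": intercept,
        "slope": slope,
        "r": r,
        "r_squared": r_squared,
        "se_slope": se_slope,
        "t_slope": t_slope,
        "residuals": residuals,
    }


fit = fit_simple_ols(x, y)
print(fit["slope"], fit["intercept"], fit["r"], fit["r_squared"])
```

In a one-predictor model the squared correlation equals the model's R-squared, and the residuals sum to (numerically) zero. Both are quick sanity checks before looking at a histogram, a residuals-versus-x plot, or a QQ plot of `fit["residuals"]`.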

Now, let's go back to the Michelle Obama dataset we used last week as part of the week 7 problem set.

PC7. Load up the full dataset and fit the following linear model. Be ready to interpret the results in the same way you did for PC4 above:
PC8. Examine the residuals for your model from PC7 and try to interpret these as you did in PC4 above. What do you notice?
PC9. Run the model on three subsets of the dataset: just 2012, 2014, and 2015. Be prepared to talk through the results.
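
The subset step in PC9 can be sketched generically as "split the rows by year, then fit the same model to each piece." The example below is a minimal Python sketch under loud assumptions: the rows, the field names `year`, `x`, and `y`, and the helper names are all invented for illustration and do not reflect the real dataset or the model from PC7.

```python
from collections import defaultdict

# Hypothetical rows; field names and values are invented for illustration only.
rows = [
    {"year": 2012, "x": 1.0, "y": 2.0},
    {"year": 2012, "x": 2.0, "y": 4.0},
    {"year": 2012, "x": 3.0, "y": 6.0},
    {"year": 2013, "x": 1.0, "y": 9.0},  # not in the requested subsets
    {"year": 2014, "x": 1.0, "y": 1.0},
    {"year": 2014, "x": 2.0, "y": 2.0},
    {"year": 2014, "x": 3.0, "y": 3.0},
    {"year": 2015, "x": 1.0, "y": 3.0},
    {"year": 2015, "x": 2.0, "y": 3.0},
    {"year": 2015, "x": 3.0, "y": 3.0},
]


def intercept_and_slope(xs, ys):
    """Least-squares intercept and slope for one predictor (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_by_year(rows, years):
    """Fit the same model separately within each requested year."""
    groups = defaultdict(list)
    for row in rows:
        if row["year"] in years:
            groups[row["year"]].append(row)
    return {
        year: intercept_and_slope([r["x"] for r in grp], [r["y"] for r in grp])
        for year, grp in sorted(groups.items())
    }


fits = fit_by_year(rows, years={2012, 2014, 2015})
for year, (intercept, slope) in fits.items():
    print(year, intercept, slope)
```

Comparing the per-year estimates side by side, together with their sample sizes, is usually the starting point for talking through how the relationship changes across subsets.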

Statistics Questions

SQ0. Any questions or clarifications from the PSU material or the OpenIntro text?
SQ1. Exercise 8.4 on school absenteeism
SQ2. Exercise 8.10 on school absenteeism again (no sub-parts)
SQ3. Exercise 8.14 on evaluating regression residuals (no sub-parts)
SQ4. Exercise 8.16 on Challenger o-rings.
SQ5. Exercise 8.18 which is more on Challenger o-rings.

Empirical Paper Questions

These questions are about the Lampe and Resnick paper once again. For this week, we'll focus on the logistic regression table in Table 4.

EQ0. Any questions or clarifications from the paper that we didn't cover last week?
EQ1. Be ready to explain what Table 4 means in both statistical and substantive terms. In particular, be ready to interpret the coefficients in substantive terms and be ready to explain what the Z-statistics, Pseudo <math>\mathrm{R}^2</math>, and p-values mean. Be ready to provide a sentence for each that interprets each number in the table in substantive terms. This will mean understanding what every variable actually measures.
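
When reading a logistic regression table, it can help to translate each coefficient from the log-odds scale into an odds ratio and to see how a Z-statistic and its p-value follow from the coefficient and its standard error. The sketch below does that arithmetic in plain Python. The coefficient and standard error are hypothetical numbers, not values from Table 4, and the function name is invented for illustration.

```python
import math


def describe_logit_coefficient(coef, std_err):
    """Summarize one logistic regression coefficient (hypothetical helper).

    coef is on the log-odds scale; std_err is its standard error.
    """
    odds_ratio = math.exp(coef)  # multiplicative change in odds per one-unit increase
    z = coef / std_err           # Wald Z-statistic
    # Two-sided p-value under a standard normal reference distribution:
    # 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return odds_ratio, z, p_value


# Hypothetical inputs, NOT taken from the paper.
odds_ratio, z, p_value = describe_logit_coefficient(coef=0.5, std_err=0.2)
print(odds_ratio, z, p_value)
```

A sentence built from this output might read: "a one-unit increase in the predictor is associated with the odds of the outcome being multiplied by the odds ratio, holding the other variables constant." The substantive meaning still depends on what each variable in the table actually measures.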

And these questions focus on issues raised by Reinhart in §8 and §9.