
== Programming Challenges ==

The first set of programming challenges will once again use your dataset from [[Statistics and Statistical Programming (Spring 2019)/Problem Set: Week 3|the week 3 problem set]]:

: '''PC0.''' Load up your dataset as you did in [[Statistics and Statistical Programming (Spring 2019)/Problem Set: Week 3|Week 3 PC1]]. 
: '''PC1.''' Refamiliarize yourself with the data and recode your variables as you did for [[Statistics and Statistical Programming (Spring 2019)/Problem Set: Week 3|Week 3 PC8]]. You may recall from that problem set that x and y looked like they might be related. We now have the tools and terminology to describe this relationship and to estimate just how related they are.
: '''PC2.''' Run a t.test between x and y in the dataset and be prepared to interpret the results.
: '''PC3.''' Estimate how correlated x and y are with each other.
: '''PC4.''' Fit a linear model corresponding to the following formula and be ready to interpret its coefficients, standard errors, t-statistics, p-values, and <math>\mathrm{R}^2</math>:
:: <math>y = \beta_0 + \beta_1 x + \varepsilon</math>
: '''PC5.''' Generate the following diagnostic plots and be prepared to explain what issue(s) and/or assumptions each one can help you evaluate and what conclusions you draw from it:
:: (a) A histogram of the residuals.
:: (b) A plot of the residuals against your values of x.
:: (c) A QQ plot of the residuals.
: '''PC6.''' Generate a nice-looking, publication-ready table with the fitted model formatted as raw text, HTML, or LaTeX.
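
The workflow for PC2 through PC5 can be sketched in base R as follows. This is only an illustration under stated assumptions: the data are simulated, the data frame name <code>d</code> and the object names are hypothetical, and none of the numbers it produces say anything about your own dataset.

```r
# Simulated stand-in data: these values are NOT from the week 3 dataset.
set.seed(42)
n <- 100
x <- rnorm(n, mean = 5, sd = 2)
y <- 1 + 0.5 * x + rnorm(n, sd = 1)
d <- data.frame(x = x, y = y)

# PC2: two-sample t-test comparing the means of x and y
# (t.test() defaults to the unpaired Welch test)
tt <- t.test(d$x, d$y)
print(tt)

# PC3: Pearson correlation, with a test of the null that it is zero
ct <- cor.test(d$x, d$y)
print(ct$estimate)

# PC4: simple linear regression of y on x
m <- lm(y ~ x, data = d)
m.sum <- summary(m)
print(m.sum$coefficients)  # estimates, standard errors, t-statistics, p-values
print(m.sum$r.squared)

# PC5: residual diagnostics
r <- residuals(m)
hist(r, main = "Histogram of residuals", xlab = "Residual")  # (a)
plot(d$x, r, xlab = "x", ylab = "Residual")                  # (b)
abline(h = 0, lty = 2)
qqnorm(r)                                                    # (c)
qqline(r)
```

One useful sanity check when there is a single predictor: the <math>\mathrm{R}^2</math> reported by <code>summary()</code> equals the square of the Pearson correlation between x and y.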


Now, let's go back to the Michelle Obama dataset we used last week as part of the [[Statistics and Statistical Programming (Spring 2019)/Problem Set: Week 7|week 7 problem set]].
: '''PC7.''' Load up the full dataset and fit the following linear model. Be ready to interpret the results in the same way you did for PC4 above:
:: <math>\mathrm{fruit} = \beta_0 + \beta_1 \mathrm{obama} + \varepsilon</math>
: '''PC8.''' Examine the residuals for your model and try to interpret them as you did in PC5 above. What do you notice? (Note: treat the dichotomous measures as continuous for the moment. We'll discuss the implications of that in class.)
: '''PC9.''' Run the model on three subsets of the dataset: just 2012, 2014, and 2015. Be prepared to talk through the results.
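
One possible way to organize PC7 and PC9 is sketched below. It runs on simulated data, and the data frame name <code>sim</code> and the column name <code>year</code> are assumptions made for illustration (<code>fruit</code> and <code>obama</code> follow the formula above), so adjust the names to match the real dataset.

```r
# Simulated stand-in data: NOT the real dataset.
# The column names fruit, obama, and year are assumptions.
set.seed(7)
years <- c(2012, 2014, 2015)
sim <- data.frame(
  year  = rep(years, each = 200),
  obama = rbinom(600, 1, 0.5)
)
sim$fruit <- rbinom(600, 1, 0.25 + 0.05 * sim$obama)

# PC7: full-sample model, treating the 0/1 outcome as continuous
m.full <- lm(fruit ~ obama, data = sim)
print(summary(m.full))

# PC9: refit the same model separately within each year
m.by.year <- lapply(split(sim, sim$year),
                    function(d) lm(fruit ~ obama, data = d))

# Collect the coefficients side by side, one column per year
coefs <- sapply(m.by.year, coef)
print(coefs)
```

Using <code>split()</code> with <code>lapply()</code> keeps the three fits in one named list, which makes it easy to compare coefficients across years. Calling <code>lm()</code> three times with the <code>subset</code> argument would work just as well.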