
== Programming Challenges ==

The first set of programming challenges will use the individual dataset we used in [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 3|the week 3 problem set's programming challenges]]:

: '''PC0.''' Load up your dataset as you did in [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 3|Week 3 PC2]].
: '''PC1.''' If you recall from [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 3|Week 3 PC6]], x and y seemed like they were linearly related. We now have the tools and terminology to describe this relationship and to estimate just how related they are. Run a t.test between x and y in the dataset and be ready to interpret the results for the class.
: '''PC2.''' Estimate how correlated x and y are with each other.
: '''PC3.''' Recode your data in the way that I laid out in [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 3|Week 3 PC7]].
: '''PC4.''' Generate a set of three linear models and be ready to interpret the coefficients, standard errors, t-statistics, p-values, and <math>\mathrm{R}^2</math> for each:
:: (a) <math>\hat{y} = \beta_0 + \beta_1 x + \varepsilon</math>
:: (b) <math>\hat{y} = \beta_0 + \beta_1 x + \beta_2 i + \beta_3 j + \varepsilon</math>
:: (c) <math>\hat{y} = \beta_0 + \beta_1 x + \beta_2 i + \beta_3 j + \beta_4 k + \varepsilon</math>
: '''PC5.''' Generate a set of residual plots for the final model (c) and be ready to interpret your model in terms of each of these:
:: (a) A histogram of the residuals.
:: (b) Plot the residuals by your values of x, i, j, and k (four different plots).
:: (c) A QQ plot to evaluate the normality of residuals assumption.
: '''PC6.''' Generate a nice-looking, publication-ready table with a series of fitted models and put it in a Word document.
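
As a rough illustration of the workflow in PC1 through PC5, the sketch below runs the same steps on simulated data. The data frame <code>d</code>, the way i, j, and k are coded (two logical variables and a three-level factor), and every number in it are assumptions made up for this example. Your own dataset and your Week 3 recoding will differ, so treat this as a starting point rather than a solution.

<syntaxhighlight lang="r">
# Simulated stand-in for the individual dataset; all values are invented.
set.seed(42)
n <- 100
d <- data.frame(
  x = rnorm(n, mean = 5, sd = 2),
  i = rbinom(n, 1, 0.5) == 1,   # assumed to be logical after recoding
  j = rbinom(n, 1, 0.3) == 1,   # assumed to be logical after recoding
  k = factor(sample(c("none", "some", "lots"), n, replace = TRUE))
)
d$y <- 1 + 2 * d$x + 0.5 * d$i - 1 * d$j + rnorm(n)

# PC1: two-sample t-test comparing the means of x and y
t.test(d$x, d$y)

# PC2: Pearson correlation, with a test and confidence interval
cor(d$x, d$y)
cor.test(d$x, d$y)

# PC4: three nested linear models
m.a <- lm(y ~ x, data = d)
m.b <- lm(y ~ x + i + j, data = d)
m.c <- lm(y ~ x + i + j + k, data = d)
summary(m.c)

# PC5: residual diagnostics for model (c)
res <- residuals(m.c)
hist(res)                      # (a) histogram of the residuals
plot(d$x, res)                 # (b) residuals by x
boxplot(res ~ d$i)             #     residuals by i
boxplot(res ~ d$j)             #     residuals by j
boxplot(res ~ d$k)             #     residuals by k
qqnorm(res); qqline(res)       # (c) QQ plot against a normal distribution
</syntaxhighlight>

Because k is a factor with three levels in this made-up data, <code>lm()</code> expands it into two dummy variables, so model (c) reports more coefficients than the single <math>\beta_4</math> term in the formula suggests.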

Now, let's go back to the Michelle Obama dataset we used last week in [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 7|the week 7 problem set's programming challenges]].
: '''PC7.''' Load up the dataset once again, fit the following linear models, and be ready to interpret them in the same way you did above in PC4:
:: (a) <math>\widehat{\mathrm{fruit}} = \beta_0 + \beta_1 \mathrm{obama} + \varepsilon</math>
:: (b) Add a control for age and a categorical version of a control for year to the model in (a).
: '''PC8.''' Take a look at the residuals for your model in (a) and try to interpret these as you did in PC5 above. What do you notice?
: '''PC9.''' Run the simple model in (a) three times on three subsets of the dataset: just 2012, 2014, and 2015. Be ready to talk through the results.
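
One possible way to organize PC7 through PC9 is sketched below on simulated data. The data frame name <code>halloween</code>, the column names <code>fruit</code>, <code>obama</code>, <code>age</code>, and <code>year</code>, and the 0/1 coding of the first two are all assumptions made for this example. Check them against the dataset you loaded in Week 7 before reusing any of it.

<syntaxhighlight lang="r">
# Simulated stand-in for the Week 7 dataset; names and values are invented.
set.seed(7)
n <- 600
halloween <- data.frame(
  obama = rbinom(n, 1, 0.5),                       # assumed 0/1 indicator
  age   = sample(3:15, n, replace = TRUE),
  year  = sample(c(2012, 2014, 2015), n, replace = TRUE)
)
halloween$fruit <- rbinom(n, 1, 0.25)              # assumed 0/1 outcome

# PC7 (a): the simple model
m.a <- lm(fruit ~ obama, data = halloween)

# PC7 (b): add age and treat year as categorical with factor()
m.b <- lm(fruit ~ obama + age + factor(year), data = halloween)
summary(m.b)

# PC8: residual diagnostics for model (a)
hist(residuals(m.a))
qqnorm(residuals(m.a)); qqline(residuals(m.a))

# PC9: fit the simple model separately within each year
by.year <- lapply(split(halloween, halloween$year),
                  function(df) lm(fruit ~ obama, data = df))
lapply(by.year, function(m) coef(summary(m)))
</syntaxhighlight>

Splitting the data frame with <code>split()</code> and fitting inside <code>lapply()</code> avoids writing the same <code>lm()</code> call three times. Three explicit calls using the <code>subset</code> argument of <code>lm()</code> would work just as well.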