Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 8

== Programming Challenges ==

The first set of programming challenges will use the individual dataset we used in [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 3|the week 3 problem set's programming challenges]]:

: '''PC0.''' Load up your dataset as you did in [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 3|Week 3 PC2]].
: '''PC1.''' If you recall from [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 3|Week 3 PC6]], x and y seemed to be linearly related. We now have the tools and terminology to describe this relationship and to estimate how strongly they are related. Run a t.test between x and y in the dataset and be ready to interpret the results for the class.
: '''PC2.''' Estimate how correlated x and y are with each other.
: '''PC3.''' Recode your data in the way that I laid out in [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 3|Week 3 PC7]].
: '''PC4.''' Generate a set of three linear models and be ready to interpret the coefficients, standard errors, t-statistics, p-values, and <math>\mathrm{R}^2</math> for each:
:: (a) <math>\hat{y} = \beta_0 + \beta_1 x + \varepsilon</math>
:: (b) <math>\hat{y} = \beta_0 + \beta_1 x + \beta_2 i + \beta_3 j + \varepsilon</math>
:: (c) <math>\hat{y} = \beta_0 + \beta_1 x + \beta_2 i + \beta_3 j + \beta_4 k + \varepsilon</math>
: '''PC5.''' Generate a set of residual plots for the final model (c) and be ready to interpret your model in terms of each of these:
:: (a) A histogram of the residuals.
:: (b) Plot the residuals by your values of x, i, j, and k (four different plots).
:: (c) A QQ plot to evaluate the normality of residuals assumption.
: '''PC6.''' Generate a nice-looking, publication-ready table with a series of fitted models and put it in a Word document.
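
The steps in PC1 through PC5 can be sketched roughly as follows. This is only an illustration on simulated data, not a solution: the data frame <code>d</code>, the seed, and the types assumed for i, j, and k (two logical variables and a three-level factor) are assumptions made for the example, and your recoded week 3 dataset may look different.

```r
# Simulated stand-in for the week 3 dataset (hypothetical values and types).
set.seed(521)
n <- 200
d <- data.frame(
  x = rnorm(n, mean = 5, sd = 2),
  i = rbinom(n, 1, 0.5) == 1,   # assumed to be logical after recoding
  j = rbinom(n, 1, 0.3) == 1,   # assumed to be logical after recoding
  k = factor(sample(c("none", "some", "lots"), n, replace = TRUE))
)
d$y <- 1 + 2 * d$x + 0.5 * d$i + rnorm(n)

# PC1: t-test comparing the means of x and y (Welch two-sample by default)
print(t.test(d$x, d$y))

# PC2: correlation between x and y, with a test and confidence interval
print(cor(d$x, d$y))
print(cor.test(d$x, d$y))

# PC4: three nested linear models
m.a <- lm(y ~ x, data = d)
m.b <- lm(y ~ x + i + j, data = d)
m.c <- lm(y ~ x + i + j + k, data = d)
print(summary(m.c))

# PC5: residual diagnostics for model (c)
r <- residuals(m.c)
hist(r, main = "Residuals of model (c)")

par(mfrow = c(2, 2))            # four residual plots on one page
plot(d$x, r, xlab = "x", ylab = "residual")
boxplot(r ~ d$i, xlab = "i", ylab = "residual")
boxplot(r ~ d$j, xlab = "j", ylab = "residual")
boxplot(r ~ d$k, xlab = "k", ylab = "residual")
par(mfrow = c(1, 1))

qqnorm(r)                       # QQ plot against a normal distribution
qqline(r)
```

Scatterplots suit the continuous predictor and boxplots suit the categorical ones, which is why the four residual plots are not all the same type.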

Now, let's go back to the Michelle Obama dataset we used last week in [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 7|the week 7 problem set's programming challenges]].
: '''PC7.''' Load up the dataset once again, fit the following linear models, and be ready to interpret them as you did in PC4 above:
:: (a) <math>\widehat{\mathrm{fruit}} = \beta_0 + \beta_1 \mathrm{obama} + \varepsilon</math>
:: (b) Add a control for age and a categorical version of a control for year to the model in (a).
: '''PC8.''' Take a look at the residuals for your model in (a) and try to interpret these as you would in PC5 above. What do you notice?
: '''PC9.''' Run the simple model in (a) three times on three subsets of the dataset: just 2012, 2014, and 2015. Be ready to talk through the results.
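
One possible way to organize PC7 and PC9 is sketched below. The data are simulated and the data frame name <code>halloween</code>, the column names, and the coding of each variable are all assumptions made for illustration. Substitute whatever names and coding your week 7 dataset actually uses.

```r
# Simulated stand-in for the week 7 dataset (hypothetical names and values).
set.seed(8)
n <- 900
halloween <- data.frame(
  obama = rbinom(n, 1, 0.5),                           # assumed 0/1 indicator
  age   = sample(3:14, n, replace = TRUE),
  year  = sample(c(2012, 2014, 2015), n, replace = TRUE)
)
halloween$fruit <- rbinom(n, 1, 0.25 + 0.05 * halloween$obama)

# PC7 (a): the simple model
m.a <- lm(fruit ~ obama, data = halloween)

# PC7 (b): add age, and year as a categorical control via factor()
m.b <- lm(fruit ~ obama + age + factor(year), data = halloween)
print(summary(m.b))

# PC9: fit the simple model separately within each year
by.year <- lapply(split(halloween, halloween$year),
                  function(s) lm(fruit ~ obama, data = s))

# One column of coefficients per year, for side-by-side comparison
print(sapply(by.year, coef))
```

Wrapping <code>year</code> in <code>factor()</code> makes R estimate a separate coefficient for each year beyond the first rather than a single linear trend.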

== Statistics Questions ==

: '''Q0.''' Any questions or clarifications from the PSU material or the OpenIntro text?
: '''Q1-Q4.''' The next four questions are all of the form "interpret this model" and use the example from the text. They are listed on [https://faculty.washington.edu/makohill/com521/week_06_statistics_questions.nb.html this page I've created] (it requires a UW NetID). If it's helpful, that page also includes all the R code so you can try things out yourself.
: '''Q5.''' Exercise 8.16 on the Challenger O-rings.
: '''Q6.''' Exercise 8.18, which continues with the Challenger O-rings.

== Empirical Paper Questions == 

These questions are about the [http://dx.doi.org/10.1145/985692.985761 Lampe and Resnick] paper once again. For this week, we'll focus on the logistic regression results in Table 4.

: '''Q7.''' Be ready to explain what Table 4 means in both statistical and substantive terms. In particular, be ready to interpret the coefficients in substantive terms and to explain what the Z-statistics, Pseudo <math>R^2</math>, and p-values mean. Be ready to provide a sentence that interprets each number in the table in substantive terms. This will mean understanding what every variable actually measures.
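
As a reminder of where these quantities come from, here is a minimal sketch of a logistic regression in R. It uses simulated data with made-up variable names (<code>x1</code>, <code>x2</code>, <code>outcome</code>) and has nothing to do with the paper's actual data. The pseudo <math>R^2</math> computed here is McFadden's version, which is only one of several definitions and may not be the one reported in Table 4.

```r
# Simulated data for illustration only (hypothetical variables).
set.seed(4)
n <- 500
d <- data.frame(x1 = rnorm(n), x2 = rbinom(n, 1, 0.4))
d$outcome <- rbinom(n, 1, plogis(-0.5 + 0.8 * d$x1 + 0.3 * d$x2))

# Logistic regression: coefficients are on the log-odds scale
fit <- glm(outcome ~ x1 + x2, data = d, family = binomial)

# Columns: Estimate, Std. Error, z value, Pr(>|z|)
coefs <- summary(fit)$coefficients
print(coefs)

# Exponentiating a coefficient gives an odds ratio: the multiplicative
# change in the odds of outcome == 1 for a one-unit increase in the
# predictor, holding the other predictor constant
odds.ratios <- exp(coef(fit))
print(odds.ratios)

# McFadden's pseudo R^2 for ungrouped binary data
pseudo.r2 <- 1 - fit$deviance / fit$null.deviance
print(pseudo.r2)
```

Each z value is the estimate divided by its standard error, and the p-value comes from comparing that ratio to a standard normal distribution.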

== Questions on Ioannidis (2005) ==

: '''Q8.''' Be ready to summarize the main point of, and share some reflections on, the paper. There are no specific questions.