Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 8
Revision as of 06:50, 16 February 2017
The first set of programming challenges will use the individual dataset you used in the week 3 problem set's programming challenges:
- PC0. Load up your dataset as you did in Week 3 PC2.
- PC1. If you recall from Week PC6, x and y seemed to be linearly related. We now have the tools and terminology to describe this relationship and to estimate just how related they are. Run a t.test between x and y in the dataset and be ready to interpret the results for the class.
- PC2. Estimate how correlated x and y are with each other.
- PC3. Recode your data in the way that I laid out in Week 3 PC7.
- PC4. Generate a set of three linear models and be ready to interpret the coefficients, standard errors, t-statistics, p-values, and for each:
- (a)
- (b)
- (c)
- PC5. Generate a set of residual plots for the final model (c) and be ready to interpret your model in terms of each of these:
- (a) A histogram of the residuals.
- (b) Plot the residuals by your values of x, i, j, and k (four different plots).
- (c) A QQ plot to evaluate the normality of residuals assumption.
- PC6. Generate a nice-looking, publication-ready table that presents a series of fitted models.
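As a rough sketch of the workflow in PC1, PC2, PC4, and PC5, the R code below runs the same kinds of steps on simulated data. The data frame `d`, its column types, and the coefficients used to generate it are made up for illustration and are not the course dataset. The model formula is only an assumption based on the variables named in PC5(b), not the models the problem set specifies.

```r
# Simulated stand-in for the individual dataset (hypothetical values).
set.seed(42)
n <- 200
d <- data.frame(
  x = rnorm(n, mean = 10, sd = 2),
  i = rbinom(n, 1, 0.5),
  j = rbinom(n, 1, 0.3),
  k = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
d$y <- 1 + 2 * d$x + 0.5 * d$i + rnorm(n)

# PC1: two-sample t-test comparing the means of x and y.
tt <- t.test(d$x, d$y)
print(tt)

# PC2: Pearson correlation between x and y.
r <- cor(d$x, d$y)
print(r)

# PC4: one linear model; summary() reports coefficients, standard errors,
# t-statistics, and p-values.
m <- lm(y ~ x + i + j + k, data = d)
print(summary(m))

# PC5: residual diagnostics.
res <- residuals(m)
hist(res, main = "Histogram of residuals")       # (a) histogram
plot(d$x, res, xlab = "x", ylab = "Residuals")   # (b) residuals by x
qqnorm(res)                                      # (c) QQ plot
qqline(res)
```

The same `plot()` call can be repeated with `d$i`, `d$j`, and `d$k` on the horizontal axis to produce the other three plots asked for in PC5(b).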
Now, let's go back to the Michelle Obama dataset we used last week in the week 7 problem set's programming challenges.
- PC7. Load the dataset once again, fit the following linear models, and be ready to interpret them much as you did in PC4 above:
- (a) <math>\widehat{\mathrm{fruit}} = \beta_0 + \beta_1 \mathrm{obama} + \varepsilon</math>
- (b) Add a control for age and a categorical version of a control for year to the model in (a).
- (c) Take a look at the residuals and try to interpret these as you would in PC4 above. What do you notice?
- (d) Run the simple model in (a) three times on three subsets of the dataset: just 2012, 2014, and 2015. Be ready to talk through the results.
- PC8.
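One possible way to set up the models in PC7 (a), (b), and (d) is sketched below in R on simulated data. The data frame `h`, the column names `fruit`, `obama`, `age`, and `year`, and all values are assumptions made for illustration. The real dataset's column names and coding may differ.

```r
# Simulated stand-in for the week 7 dataset (hypothetical names and values).
set.seed(7)
n <- 300
h <- data.frame(
  obama = rbinom(n, 1, 0.5),
  age   = sample(2:12, n, replace = TRUE),
  year  = sample(c(2012, 2014, 2015), n, replace = TRUE)
)
h$fruit <- rbinom(n, 1, 0.3)

# (a) Simple model.
m.a <- lm(fruit ~ obama, data = h)
print(summary(m.a))

# (b) Add age and year; factor() makes lm() treat year as categorical.
m.b <- lm(fruit ~ obama + age + factor(year), data = h)
print(summary(m.b))

# (d) Refit the simple model separately within each year.
by.year <- lapply(split(h, h$year), function(sub) lm(fruit ~ obama, data = sub))
print(lapply(by.year, function(m) coef(summary(m))))
```

Wrapping `year` in `factor()` estimates a separate intercept shift for each year rather than a single linear trend, which is what a categorical control amounts to.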