Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 8: Difference between revisions

From CommunityData
:: (b) <math>\hat{y} = \beta_0 + \beta_1 x + \beta_2 i + \beta_3 j + \varepsilon</math>
:: (c) <math>\hat{y} = \beta_0 + \beta_1 x + \beta_2 i + \beta_3 j + \beta_4 k + \varepsilon</math>

Revision as of 06:38, 16 February 2017

The first set of programming challenges will use the individual dataset you used in the Week 3 problem set's programming challenges:

PC0. Load up your dataset as you did in Week 3 PC2.
PC1. If you recall from Week PC6, x and y seemed to be linearly related. We now have the tools and terminology to describe this relationship and to estimate just how related they are. Run a t.test between x and y in the dataset and be ready to interpret the results for the class.
PC2. Estimate how correlated x and y are with each other.
PC3. Recode your data in the way that I laid out in Week 3 PC7.
PC4. Generate a set of three linear models and be ready to interpret the coefficients, standard errors, t-statistics, and p-values for each:
(a)
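(b) <math>\hat{y} = \beta_0 + \beta_1 x + \beta_2 i + \beta_3 j + \varepsilon</math>
(c) <math>\hat{y} = \beta_0 + \beta_1 x + \beta_2 i + \beta_3 j + \beta_4 k + \varepsilon</math>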
PC5. Generate a set of residual plots for the final model (c) and be ready to interpret your model in terms of each of these:
(a) A histogram of the residuals.
(b) Plot the residuals by your values of x, i, j, and k (four different plots).
(c) A QQ plot to evaluate the normality of residuals assumption.
PC6. Generate a nice-looking, publication-ready table that presents your series of fitted models.
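
As a rough illustration of PC1, PC2, PC4, and PC5, here is a minimal sketch in base R. It does not use the course dataset: the data frame `d` and its columns are simulated stand-ins, and the coefficients used to generate `y` are arbitrary assumptions chosen only so that the code runs. Only models (b) and (c) are fitted, and `k` is treated as numeric so that it receives a single coefficient.

```r
# Simulated stand-in data (an assumption; replace with your Week 3 dataset).
set.seed(42)
n <- 200
d <- data.frame(
  x = rnorm(n, mean = 5, sd = 2),
  i = rbinom(n, 1, 0.5),
  j = rbinom(n, 1, 0.3),
  k = sample(0:3, n, replace = TRUE)
)
# Arbitrary coefficients, used only to produce a plausible outcome variable.
d$y <- 1 + 2 * d$x + 0.5 * d$i - 1 * d$j + 0.3 * d$k + rnorm(n)

# PC1: two-sample t-test comparing the means of x and y (Welch by default).
tt <- t.test(d$x, d$y)
print(tt)

# PC2: Pearson correlation between x and y, with a test and interval.
r <- cor(d$x, d$y)
print(cor.test(d$x, d$y))

# PC4: fit models (b) and (c); summary() reports coefficients,
# standard errors, t-statistics, and p-values.
m.b <- lm(y ~ x + i + j, data = d)
m.c <- lm(y ~ x + i + j + k, data = d)
print(summary(m.b))
print(summary(m.c))

# PC5 (a): histogram of the residuals from model (c).
res <- residuals(m.c)
hist(res, main = "Residuals of model (c)", xlab = "Residual")

# PC5 (b): residuals against each predictor, four panels.
par(mfrow = c(2, 2))
for (v in c("x", "i", "j", "k")) {
  plot(d[[v]], res, xlab = v, ylab = "Residual")
  abline(h = 0, lty = 2)
}
par(mfrow = c(1, 1))

# PC5 (c): QQ plot to check whether the residuals look normal.
qqnorm(res)
qqline(res)
```

Note that `t.test(d$x, d$y)` compares the two means; it says nothing about how strongly x and y move together, which is what `cor()` and the `lm()` coefficient on x describe.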

Now, let's go back to the Michelle Obama dataset we used last week (see the Week 3 problem set's programming challenges).