Statistics and Statistical Programming (Spring 2019)/Problem Set: Week 8
From CommunityData
== Programming Challenges ==

The first set of programming challenges will once again use your dataset from [[Statistics and Statistical Programming (Spring 2019)/Problem Set: Week 3|the week 3 problem set]]:

: '''PC0.''' Load up your dataset as you did in [[Statistics and Statistical Programming (Spring 2019)/Problem Set: Week 3|Week 3 PC1]].
: '''PC1.''' Refamiliarize yourself with the data and recode your variables as you did for [[Statistics and Statistical Programming (Spring 2019)/Problem Set: Week 3|Week 3 PC8]]. You may recall from that exercise that x and y looked like they might be related. We now have the tools and terminology to describe this relationship and to estimate just how related they are.
: '''PC2.''' Run a t.test between x and y in the dataset and be prepared to interpret the results.
: '''PC3.''' Estimate how correlated x and y are with each other.
: '''PC4.''' Fit a linear model corresponding to the following formula and be ready to interpret its coefficients, standard errors, t-statistics, p-values, and <math>\mathrm{R}^2</math>:
:: <math>\hat{y} = \beta_0 + \beta_1 x + \varepsilon</math>
: '''PC5.''' Generate the following diagnostic plots and be prepared to explain, for each one, (1) what issue(s) and/or assumptions it can help you evaluate and (2) what conclusions you draw from it:
:: (a) A histogram of the residuals.
:: (b) A plot of the residuals against your values of x.
:: (c) A QQ plot of the residuals.
: '''PC6.''' Generate a nice-looking, publication-ready table of the fitted model, formatted as raw text, HTML, or LaTeX.

Now, let's go back to the Michelle Obama dataset we used last week as part of the [[Statistics and Statistical Programming (Spring 2019)/Problem Set: Week 7|week 7 problem set]].

: '''PC7.''' Load up the full dataset and fit the following linear model. Be ready to interpret the results in the same way you did for PC4 above:
:: <math>\widehat{\mathrm{fruit}} = \beta_0 + \beta_1 \mathrm{obama} + \varepsilon</math>
: '''PC8.''' Examine the residuals for your model and try to interpret them as you did in PC4 above. What do you notice? (Note: treat the dichotomous measures as continuous for the moment. We'll discuss the implications of that in class.)
: '''PC9.''' Fit the model separately on three subsets of the dataset: 2012 only, 2014 only, and 2015 only. Be prepared to talk through the results.
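For orientation, here is a minimal sketch in R of the kinds of calls the challenges above involve: a t-test, a correlation estimate, a simple linear model, and the three residual diagnostics. It runs on simulated data, not on your problem set dataset. The data frame <code>d</code>, its columns <code>x</code> and <code>y</code>, the seed, and the simulated relationship are all assumptions made up for illustration, so treat this as one possible starting point and not as the expected solution.

```r
# Simulated stand-in data. The names d, x, y and the true relationship
# (intercept 1, slope 0.5) are assumptions for illustration only.
set.seed(42)
d <- data.frame(x = rnorm(100, mean = 5, sd = 2))
d$y <- 1 + 0.5 * d$x + rnorm(100)

# t-test comparing the means of x and y (Welch two-sample test by default)
print(t.test(d$x, d$y))

# Pearson correlation, plus a test that also reports a confidence interval
print(cor(d$x, d$y))
print(cor.test(d$x, d$y))

# Simple linear regression of y on x; summary() reports coefficients,
# standard errors, t-statistics, p-values, and R-squared
m <- lm(y ~ x, data = d)
print(summary(m))

# Residual diagnostics
res <- residuals(m)

# (a) Histogram of the residuals
hist(res, main = "Histogram of residuals", xlab = "Residual")

# (b) Residuals plotted against x, with a reference line at zero
plot(d$x, res, xlab = "x", ylab = "Residual")
abline(h = 0, lty = 2)

# (c) Normal QQ plot of the residuals, with a reference line
qqnorm(res)
qqline(res)
```

The same pattern should carry over to the second dataset by swapping in its data frame and the formula <code>fruit ~ obama</code>. Fitting on a subset can be done by passing a filtered data frame to <code>lm()</code>, though the name of the year column depends on how your copy of the data is structured.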