Statistics and Statistical Programming (Winter 2021)/Problem set 14
From CommunityData
== Programming challenges ==

=== PC1 Import and update data ===

Data for all U.S. presidential elections 1952-2012 are [https://github.com/avehtari/ROS-Examples/raw/master/ElectionsEconomy/data/hibbs.dat available here]. Note that this points to a ".dat" file, which in this case is just a raw text file that you can import using the following command, inserting the URL for the dataset in the appropriate spot: <code>read.table(url(<insert.url.here>), header=TRUE)</code>.

Each row corresponds to one presidential election since 1952. The variables provided are:
* <code>year</code> The year of the presidential election.
* <code>growth</code> Economic growth during the preceding four years (increase in per-capita income).
* <code>vote</code> Proportion of the popular vote won by the incumbent party candidate.
* <code>inc_party_candidate</code> Incumbent party candidate.
* <code>other_candidate</code> Out-party candidate.

The dataset does not include 2016 or 2020, so let's add 2016 by hand. You will likely recall that Hillary Clinton was the incumbent party candidate and Donald Trump was the out-party candidate that year. Clinton won approximately 51.1% of the popular vote and a reasonable estimate for per-capita income growth 2012-2016 is 2.2%.

You can append the row for the 2016 election to the imported dataset in a bunch of different ways. I would personally do so using a call to <code>list()</code> nested inside a call to <code>rbind()</code> (e.g., <code>rbind(<hibbs_data>, list(<2016 row>))</code>). You could also explore the <code>add_row()</code> function in the tidyverse. As usual, your mileage may vary.

=== PC2 Summarize and visualize data ===

You should be familiar with how to do this by now. Make sure to include a scatterplot of <code>growth</code> against <code>vote</code>.

=== PC3 Calculate covariance and correlation ===

Calculate the covariance and correlation of <code>growth</code> and <code>vote</code>.
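As a minimal sketch of the append step from PC1 and the calculations in PC3, the following uses a small made-up data frame. The name <code>toy</code> and all of its values are illustrative assumptions, not the Hibbs data. It appends a row with <code>rbind()</code> and <code>list()</code>, then checks that the correlation equals the covariance divided by the product of the two standard deviations.

```r
# Made-up values for illustration only; these are NOT the Hibbs data.
toy <- data.frame(
  year   = c(2000, 2004, 2008),
  growth = c(1.0, 2.0, 3.0),
  vote   = c(48.0, 51.0, 53.0)
)

# Append one row. An unnamed list is matched to the columns by position,
# so the values must appear in the same order as the columns.
toy <- rbind(toy, list(2012, 4.0, 56.0))

# Sample covariance and Pearson correlation of the two variables.
cov_gv <- cov(toy$growth, toy$vote)
cor_gv <- cor(toy$growth, toy$vote)

# The correlation is the covariance rescaled by both standard deviations.
cor_by_hand <- cov_gv / (sd(toy$growth) * sd(toy$vote))

print(c(covariance = cov_gv, correlation = cor_gv, by_hand = cor_by_hand))
```

The same calls should work on the imported dataset once the column names match; with character columns in the data, the appended <code>list()</code> needs a value for each of those columns too.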
See this week's R tutorial for example commands and the Wikipedia articles on correlation and covariance for details about the underlying calculations.

=== PC4 Fit and summarize a linear model ===

Use the <code>lm()</code> function to fit a least squares regression of incumbent party vote share on economic growth (that is, with <code>vote</code> as the outcome and <code>growth</code> as the predictor). Use the <code>summary()</code> function to present a summary of the model results.

=== PC5 Assess the model fit ===

Evaluate the conditions for least squares regression (linearity, normal residuals, constant variability, independent observations). Wherever possible, present plots and/or calculations to support your evaluations. In particular, you probably want to produce the following (examples provided in this week's R tutorial): (a) a histogram of the residuals, (b) a plot of the residuals against the (sequential) values of X, and (c) a quantile-quantile plot.

=== PC6 Calculate confidence interval for a coefficient ===

The very last part of ''OpenIntro'' §8 provides detailed instructions for estimating a confidence interval around a regression coefficient. Please calculate the confidence interval for the coefficient on <code>growth</code> from the results of your regression model.

=== PC7 Calculate an out-of-sample prediction and 95% prediction interval ===

What was/is the predicted vote share for Donald Trump in 2020 based on this model? The online supplement to ''OpenIntro'' §8 assigned this week provides detailed examples for how to produce an out-of-sample prediction from a regression model. Please calculate the point estimate and 95% prediction interval for the incumbent party candidate's share of the vote in 2020 given that (a [https://osf.io/preprints/socarxiv/xrf3t/ reasonable estimate] of) the per-capita income growth 2016-2020 is 2.5%.
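The confidence-interval and prediction steps in PC6 and PC7 can be sketched on made-up data as follows. The data frame <code>toy</code>, its values, and the growth value in <code>new_obs</code> are illustrative assumptions rather than the Hibbs data; <code>lm()</code>, <code>confint()</code>, and <code>predict()</code> are base R functions. The by-hand interval assumes the usual estimate plus or minus t* times the standard error, with n - 2 degrees of freedom for a single predictor.

```r
# Made-up values for illustration only; these are NOT the Hibbs data.
toy <- data.frame(
  growth = c(0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0),
  vote   = c(46.0, 49.5, 48.0, 52.0, 51.5, 55.0, 54.0, 58.5)
)

# Outcome goes on the left of ~ and the predictor on the right.
fit <- lm(vote ~ growth, data = toy)

# 95% confidence interval for the slope, computed by hand.
coefs  <- coef(summary(fit))
est    <- coefs["growth", "Estimate"]
se     <- coefs["growth", "Std. Error"]
t_star <- qt(0.975, df = df.residual(fit))
ci_by_hand <- c(est - t_star * se, est + t_star * se)

# The built-in equivalent, for comparison.
ci_builtin <- confint(fit, "growth", level = 0.95)

# Point estimate and 95% prediction interval for one new observation
# (the growth value here is hypothetical).
new_obs <- data.frame(growth = 2.25)
pred <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.95)

print(ci_by_hand)
print(ci_builtin)
print(pred)
```

Note that <code>interval = "prediction"</code> gives an interval for a single new outcome, which is wider than the <code>interval = "confidence"</code> interval for the mean outcome at that value of the predictor.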