| This problem set asks you to apply, extend, and interpret the widely influential "bread and peace" model of U.S. electoral behavior from the work of [https://douglas-hibbs.com/ Douglas Hibbs]. In brief, Hibbs argues that two variables almost perfectly predict U.S. presidential election vote share for incumbent party candidates since 1950: economic growth and U.S. military fatalities (both calculated over the duration of the previous president's term). Since we're doing univariate (one predictor variable) regression this week, I ask you to work with the income measure (predictor) and the incumbent party vote share (outcome).
| |
|
| |
|
| == Programming challenges ==
| |
| === PC1 Import and update data ===
| |
| Data for all U.S. presidential elections 1952-2012 are [https://github.com/avehtari/ROS-Examples/raw/master/ElectionsEconomy/data/hibbs.dat available here]. Note that this points to a ".dat" file, which in this case is just a raw text file that you can import with the following command (inserting the URL for the dataset, in quotes, in the appropriate spot): <code>read.table(url(<insert.url.here>), header=TRUE)</code>.
| |
|
| |
|
| Each row corresponds to one presidential election since 1952. The variables provided are:
| |
| * <code>year</code> The year of the presidential election.
| |
| * <code>growth</code> Economic growth during the preceding four years (increase in per-capita income).
| |
| * <code>vote</code> Percentage of the popular vote won by the incumbent party candidate.
| |
| * <code>inc_party_candidate</code> Incumbent party candidate.
| |
| * <code>other_candidate</code> Out-party candidate.
| |
|
| |
|
| The dataset does not include 2016, so we can add that by hand. You might recall that Hillary Clinton was the incumbent party candidate and Donald Trump was the out-party candidate that year. Clinton won approximately 51.1% of the popular vote and a reasonable estimate for per-capita income growth 2012-2016 is 2.2%. You can append this information to the imported dataset in a bunch of different ways. (I would personally do so using a call to <code>list()</code> nested inside a call to <code>rbind()</code> (e.g., <code>rbind(<hibbs_data>, list(<2016 row>))</code>). You could also explore the <code>add_row()</code> function in the tidyverse. As usual, your mileage may vary.)
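The <code>rbind()</code>/<code>list()</code> approach mentioned above might look something like the sketch below. To keep the snippet runnable without a network connection, <code>toy</code> is a made-up stand-in with placeholder rows (not real election data); in your own work you would use the data frame returned by <code>read.table()</code> instead. Only the 2016 values come from the text above.

```r
# Placeholder data frame with the same columns as the Hibbs dataset.
# These two rows are invented for illustration only.
toy <- data.frame(
  year = c(1900, 1904),
  growth = c(1.0, 2.0),
  vote = c(50.0, 55.0),
  inc_party_candidate = c("A", "C"),
  other_candidate = c("B", "D"),
  stringsAsFactors = FALSE
)

# The 2016 row, using the figures given in the problem set text.
row_2016 <- list(
  year = 2016,
  growth = 2.2,
  vote = 51.1,
  inc_party_candidate = "Clinton",
  other_candidate = "Trump"
)

# rbind() accepts a list whose elements line up with the columns.
toy_updated <- rbind(toy, row_2016)
tail(toy_updated, 1)
```

Checking the last row with <code>tail()</code> is a quick way to confirm that the new observation landed in the right columns with the right types.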
| |
| === PC2 Summarize and visualize data ===
| |
| You should be familiar with how to do this by now. Make sure to include a scatterplot of <code>growth</code> against <code>vote</code>.
| |
| === PC3 Calculate covariance and correlation ===
| |
| Calculate the covariance and correlation of <code>growth</code> and <code>vote</code>.
| |
| | |
| See this week's R tutorial for example commands here and the Wikipedia articles on correlation and covariance for details about the underlying calculations.
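As a rough illustration of how the two quantities relate, the sketch below uses simulated vectors <code>x</code> and <code>y</code> (stand-ins, not the Hibbs data) and checks that the correlation is the covariance rescaled by the two standard deviations.

```r
# Simulated stand-ins for a predictor and an outcome (NOT the Hibbs data).
set.seed(1)
x <- rnorm(16, mean = 2, sd = 1)
y <- 46 + 3 * x + rnorm(16, sd = 4)

cov_xy <- cov(x, y)  # sample covariance, in (units of x) * (units of y)
cor_xy <- cor(x, y)  # Pearson correlation, unitless, between -1 and 1

# The correlation is the covariance divided by both standard deviations.
all.equal(cor_xy, cov_xy / (sd(x) * sd(y)))
```

Because the covariance depends on the units of both variables, rescaling either one changes the covariance but leaves the correlation untouched.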
| |
| | |
| === PC4 Fit and summarize a linear model ===
| | |
| Use the <code>lm()</code> function to fit a least squares regression of incumbent party vote share (outcome) on economic growth (predictor). Use the <code>summary()</code> function to present a summary of the model results.
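A minimal sketch of the <code>lm()</code>/<code>summary()</code> workflow, using a simulated data frame <code>sim</code> with hypothetical columns <code>x</code> (predictor) and <code>y</code> (outcome) rather than the election data. The formula puts the outcome on the left of the <code>~</code> and the predictor on the right.

```r
# Simulated data (NOT the Hibbs data): x is the predictor, y the outcome.
set.seed(2)
sim <- data.frame(x = rnorm(16, mean = 2, sd = 1))
sim$y <- 46 + 3 * sim$x + rnorm(16, sd = 4)

# Formula syntax is outcome ~ predictor.
fit <- lm(y ~ x, data = sim)

summary(fit)  # coefficients, standard errors, R-squared, etc.
coef(fit)     # just the intercept and slope

# With one predictor, the slope equals cov(x, y) / var(x).
slope_by_hand <- cov(sim$x, sim$y) / var(sim$x)
all.equal(unname(coef(fit)["x"]), slope_by_hand)
```

The last two lines tie the fitted slope back to the covariance, which may be useful when comparing the different ways of characterizing the relationship.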
| |
| | |
| === PC5 Assess the model fit ===
| | |
| Evaluate the conditions for least squares regression (linearity, normal residuals, constant variability, independent observations). Wherever possible, present plots and/or calculations to support your evaluations. In particular, you probably want to produce the following (examples provided in this week's R tutorial):
| |
| (a) a histogram of the residuals
| |
| (b) a plot of the residuals against the (sequential) values of X
| |
| (c) a quantile-quantile plot
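The three plots listed above can be produced with base R graphics along the following lines. This is only a sketch on simulated data (the data frame <code>sim</code> and its columns <code>x</code> and <code>y</code> are hypothetical stand-ins); the R tutorial may use different functions.

```r
# Simulated data and model (NOT the Hibbs data).
set.seed(3)
sim <- data.frame(x = rnorm(16, mean = 2, sd = 1))
sim$y <- 46 + 3 * sim$x + rnorm(16, sd = 4)
fit <- lm(y ~ x, data = sim)

res <- resid(fit)

# (a) Histogram of the residuals: look for rough symmetry around zero.
hist(res, main = "Residuals", xlab = "Residual")

# (b) Residuals against the predictor: look for curvature or fanning.
plot(sim$x, res, xlab = "x", ylab = "Residual")
abline(h = 0, lty = 2)

# (c) Normal quantile-quantile plot of the residuals.
qqnorm(res)
qqline(res)
```

With only a handful of observations, these plots can look noisy even when the conditions hold, so treat them as suggestive rather than conclusive.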
| |
| | |
| === PC6 Calculate confidence interval for a coefficient ===
| | |
| The very last part of `OpenIntro` §8 provides detailed instructions for estimating a confidence interval around a regression coefficient. Please calculate the confidence interval for the coefficient on <code>growth</code> from the results of your regression model.
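One common way to do this is estimate ± t* × standard error, with t* taken from a t-distribution on the model's residual degrees of freedom. The sketch below assumes that approach and a 95% level, uses simulated data (hypothetical <code>sim</code>, <code>x</code>, <code>y</code>), and compares the hand calculation with R's built-in <code>confint()</code>.

```r
# Simulated data and model (NOT the Hibbs data).
set.seed(4)
sim <- data.frame(x = rnorm(16, mean = 2, sd = 1))
sim$y <- 46 + 3 * sim$x + rnorm(16, sd = 4)
fit <- lm(y ~ x, data = sim)

# Pull the slope estimate and its standard error from the summary table.
coefs <- coef(summary(fit))
est <- coefs["x", "Estimate"]
se <- coefs["x", "Std. Error"]

# Critical value for a 95% interval on n - 2 degrees of freedom.
t_star <- qt(0.975, df = df.residual(fit))

ci_manual <- c(est - t_star * se, est + t_star * se)
ci_manual

# Built-in equivalent for comparison.
confint(fit, "x", level = 0.95)
```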
| |
| | |
| === PC7 Calculate an out-of-sample prediction and 95% prediction interval ===
| | |
| What was/is the predicted vote share for Donald Trump in 2020 based on this model? The online supplement to `OpenIntro` §8 assigned this week provides detailed examples of how to produce an out-of-sample prediction from a regression model. Please calculate the point estimate and 95% prediction interval for the incumbent party candidate's share of the vote in 2020 given that (a [https://osf.io/preprints/socarxiv/xrf3t/ reasonable estimate] of) the per-capita income growth 2016-2020 is 2.5%.
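In R, one way to get both the point estimate and a prediction interval is <code>predict()</code> with <code>interval = "prediction"</code>. The sketch below is fit to simulated data (hypothetical <code>sim</code>, <code>x</code>, <code>y</code>), so its numbers say nothing about the actual election; only the new predictor value of 2.5 is taken from the text above.

```r
# Simulated data and model (NOT the Hibbs data).
set.seed(5)
sim <- data.frame(x = rnorm(16, mean = 2, sd = 1))
sim$y <- 46 + 3 * sim$x + rnorm(16, sd = 4)
fit <- lm(y ~ x, data = sim)

# New observation: the column name must match the predictor in the model.
new_obs <- data.frame(x = 2.5)

# Returns a one-row matrix with columns fit, lwr, and upr.
pred <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.95)
pred

# The point estimate is just intercept + slope * 2.5.
by_hand <- unname(coef(fit)[1] + coef(fit)[2] * 2.5)
all.equal(unname(pred[1, "fit"]), by_hand)
```

Note that <code>interval = "confidence"</code> would instead give an interval for the mean outcome at that predictor value, which is narrower than the interval for a single new observation.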
| |
| | |
| == Statistical questions ==
| |
| The questions below refer to the univariate regression analysis you completed in the programming challenges above.
| |
| | |
| === SQ1 Describe and interpret the results ===
| |
| Do this for any/all of the analysis you conducted in the programming challenges. In particular, be sure to:
| |
| * address any noteworthy observations from the descriptive summaries and plots
| |
| * summarize the regression results effectively (including the coefficients and <math>R^2</math> value)
| |
| * summarize the confidence interval around the estimate for <code>growth</code> that you calculated
| |
| * provide a substantive interpretation of the results in terms of the variables/concepts included in the analysis.
| |
|
| |
| === SQ2 Discuss regression diagnostics ===
| |
| Describe the regression diagnostics and whether the conditions necessary for a least squares fit seem to hold. If any of these conditions are violated, consider how that might bias the results.
| |
| | |
| === SQ3 Disambiguate: correlation vs. covariance vs. OLS estimate ===
| |
| You characterized the relationship between <code>growth</code> and <code>vote</code> in three different ways. What do you make of each of these? What are the similarities and differences between them?
| |
| | |
| === SQ4 Interpret out-of-sample prediction ===
| |
| Discuss and interpret the out-of-sample prediction you calculated for Trump's vote share in 2020. As of the writing of the problem set, Trump seems to have received about [https://en.wikipedia.org/w/index.php?title=2020_United_States_presidential_election&oldid=988030609 47.6% of the popular vote]. How does this (not-yet-final) observed value relate to your prediction? How do you interpret this relationship?
| |
| | |
| === SQ5 Revisit (vaguely stated) theory ===
| |
| | |
| Insofar as we've only considered one part of the "bread and peace" theory here, how would you interpret your results in light of the prior theory/findings described at the beginning of the problem set? Are there any confounding factors not present in the original theory/models that you think might be important to include? Why would you argue for including them (or not)?
| |