Statistics and Statistical Programming (Fall 2020)/w10 session plan

From CommunityData

11/17 Agenda

  • Planning documents
    • In general, very good. Mostly, we wanted more detail about your measures and variables and about how you plan to analyze them. It will really benefit you to get very specific about this sooner rather than later!
    • Please come meet with us as you continue to develop these projects. This is a small class, so we can give your projects "bespoke" attention. Take advantage of this! We have office hours and can make additional time/appointments as needed!
    • In Aaron's case, I'm also happy to meet after Thanksgiving, but the price of admission to any such meeting is a draft table and/or figure with (preliminary) results! ;)
    • One more thing you might consider: identify a "model paper" that can serve as a rough template for the kinds of analysis you plan to do and how to report them. Note that Aaron and Nick can help you assess whether your model paper meets whatever expectations/hopes we might have for how you analyze your data and report your results.
    • The next deadlines on this project are a (recorded) presentation and the paper itself. We'll talk a bit more about these (likely on Thursday?), but for now please note that I have asked you to submit data + code alongside your finished paper (if possible). If, for some reason, doing so is not possible, please get in touch.
  • PS7: Univariate regression and the "bread and peace" model of U.S. elections.
    • SQ1: Walk through interpretation of results.
    • SQ2: How normal do residuals need to be? How do we diagnose issues?
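
For the SQ2 discussion, here is a minimal sketch of what "diagnosing" residuals can look like in code. It is written in Python with only the standard library rather than the R used for the problem sets, and the data are made up for illustration (they are NOT the PS7 election data). The helper names `fit_line` and `qq_points` are hypothetical. The sketch fits a line, computes fitted values and residuals, and pairs the standardized, sorted residuals with the normal quantiles we would expect if the residuals were roughly normal.

```python
import math
from statistics import NormalDist

# Made-up illustrative data (an assumption, not the PS7 data).
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0]
y = [46.1, 48.3, 47.2, 50.9, 51.4, 50.2, 54.8, 53.1, 56.0, 55.2, 58.9, 57.4]


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def qq_points(res):
    """Pair theoretical normal quantiles with sorted, standardized residuals."""
    n = len(res)
    # Residual standard deviation with n - 2 degrees of freedom.
    s = math.sqrt(sum(e ** 2 for e in res) / (n - 2))
    observed = sorted(e / s for e in res)
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


a, b = fit_line(x, y)
fitted = [a + b * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Residuals vs. fitted values: look for curvature or a funnel shape.
for f, e in zip(fitted, resid):
    print(f"fitted={f:6.2f}  residual={e:6.2f}")

# Q-Q pairs: points far from the line observed = theoretical suggest non-normality.
for t, o in qq_points(resid):
    print(f"theoretical={t:6.2f}  observed={o:6.2f}")
```

In practice you would plot these pairs rather than print them. With a sample this small, expect some wobble around the line even when nothing is wrong.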

Additional comments from Nick:

  • Revisiting confidence intervals for regression coefficients.
    • What precisely do they mean?
    • Can we use 1.96 × SE?
  • If we flip the independent and dependent variables in a bivariate regression analysis, what does that tell us?
  • Review exactly what we’re looking for in residual plots.
  • (Optional thing Nick thought might be interesting) Everyone suggested a great number of factors our model might be missing (voting public’s perceptions of the economy, attributes of the candidates, effects of pandemic, etc.). How could we go about capturing these? Should we make a new linear regression model and post it on Twitter?
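
To make the confidence interval and "flipping" questions above concrete, here is a rough sketch in Python (standard library only, not the R used in the problem sets). The data are made up for illustration, the helper name `ols` is hypothetical, and the critical value 2.228 is the 97.5th percentile of the t distribution with 10 degrees of freedom, taken from a standard t table. The sketch compares a slope interval built with 1.96 × SE against one built with the t critical value, then refits the model with x and y swapped.

```python
import math

# Made-up illustrative data (an assumption, not the PS7 data): 12 observations.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0]
y = [46.1, 48.3, 47.2, 50.9, 51.4, 50.2, 54.8, 53.1, 56.0, 55.2, 58.9, 57.4]


def ols(x, y):
    """Fit y = a + b*x by least squares; return (intercept, slope, slope SE, r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    r = sxy / math.sqrt(sxx * syy)         # Pearson correlation
    return a, b, se_b, r


a, b, se_b, r = ols(x, y)

# With n = 12 there are n - 2 = 10 residual degrees of freedom.
T_CRIT = 2.228  # t table value for 95% two-sided interval, 10 df
ci_normal = (b - 1.96 * se_b, b + 1.96 * se_b)
ci_t = (b - T_CRIT * se_b, b + T_CRIT * se_b)
print("95% CI using 1.96 x SE:  ", ci_normal)
print("95% CI using t critical: ", ci_t)

# Flip the roles of the variables: regress x on y.
_, b_flipped, _, r_flipped = ols(y, x)
print("slope of y on x:", b)
print("slope of x on y:", b_flipped, " (compare with 1/b =", 1 / b, ")")
print("product of slopes:", b * b_flipped, " r squared:", r ** 2)
```

Two things to notice. In a small sample, the 1.96 × SE interval is narrower than the t-based one, so it understates the uncertainty. And the flipped slope is not 1/b unless the fit is perfect: the correlation is unchanged by the flip, and the product of the two slopes equals r squared.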