CommunityData:StatsGaps: Difference between revisions

From CommunityData
=== Week 8 ===
====Polynomial Terms, Interactions, and Logistic Regression ====

====All:====
* Diez, Barr, and Çetinkaya-Rundel: §8 (Multiple and logistic regression)
* [https://onlinecourses.science.psu.edu/stat501/node/301 Lesson 8: Categorical Predictors] and [https://onlinecourses.science.psu.edu/stat501/node/318 Lesson 9: Data Transformations] from the Penn State Eberly College of Science STAT 501 Regression Methods course. There are several subparts (many quite short); please read them all carefully.
* Mako Hill wrote this document, which will likely be useful for many of you: [https://communitydata.cc/~mako/2017-COM521/logistic_regression_interpretation.html Interpreting Logistic Regression Coefficients with Examples in R]

====Learn Stats:====
* [[Statistics and Statistical Programming (Spring 2019)/Problem Set: Week 8]]

====Learn R:====
* [https://communitydata.science/~ads/teaching/2019/stats/r_lectures/w08-R_lecture.Rmd Week 8 R lecture materials]

====Resources====
* Verzani: §11.3 (Linear regression), §13.1 (Logistic regression)
* Ioannidis, John P. A. 2005. “Why Most Published Research Findings Are False.” ''PLoS Medicine'' 2(8):e124. [http://dx.doi.org/10.1371%2Fjournal.pmed.0020124 Open Access]

Revision as of 06:44, 7 August 2019

Welcome to the StatsGaps study group page: a set of suggested learning pathways that draw on course resources produced by community data faculty, intended for people with varying levels of familiarity with R, statistics, and research processes. The primary text is OpenIntro Statistics.

We borrow heavily from the course most recently taught by Aaron: Statistics and Statistical Programming (Spring 2019).

To support participation from people with a range of prior experience and learning goals this summer, we've organized the content into the following strands:

  • Learn R -- you don't know R
  • Learn Stats -- you haven't taken much if any statistics, or otherwise feel you're mostly starting from scratch
  • Refresh -- overview and shore up your stats knowledge if it feels rusty
  • Stronger -- your stats knowledge is strong, but your class stopped before reaching the good stuff used in many of this group's papers (e.g., regression)

Depending on which strand(s) best apply to you, we provide different recommended readings and assignments each week.

Meet at http://meet.jit.si/cdsc on Monday at 11:00 Pacific, 1:00 Central.

Schedule:

Week 1

All:

  • Read: Matias, J. Nathan. 2019. “Preventing harassment and increasing group participation through social norms in 2,190 online science discussions.” PNAS 116(20): 9785–9789. First published April 29, 2019.

Learn R:

Learn Stats:

Refresh:

Stronger:

  • Skim Problem Set 1, since we may discuss it face to face. Take a look at Chapter 1 of the text if you find any of the questions confusing or if your answer differs from the key.


Extra Resources:

Week 2: Probability and Visualization

All:

  • Shaw, Aaron and Yochai Benkler. 2012. A tale of two blogospheres: Discursive practices on the left and right. American Behavioral Scientist. 56(4): 459-487.

Learn R:

Learn Stats:

Refresh:

Stronger:


Extra Resources:

  • Seeing Theory §1 (Basic Probability) and §2 (Compound Probability). (Note: this site provides a beautiful visual introduction to core concepts in probability and statistics).
  • Buechley, Leah and Benjamin Mako Hill. 2010. “LilyPad in the Wild: How Hardware’s Long Tail Is Supporting New Engineering and Design Communities.” Pp. 199–207 in Proceedings of the 8th ACM Conference on Designing Interactive Systems. Aarhus, Denmark: ACM. PDF
  • Mine Çetinkaya-Rundel's OpenIntro §2 Lecture Notes
  • Video Lectures including 2 short videos for §2

Week 3: Distributions

All: (N/A)

Learn R:

Learn Stats:

Refresh:

Stronger:


Extra Resources:

Week 4: Statistical significance and hypothesis testing

All:

  • Read Diez, Barr, and Çetinkaya-Rundel: §4 (Foundations for inference) (I suggest everyone read this chapter -- this topic is a source of much confusion. -khc)
  • Gelman, Andrew and Hal Stern. 2006. “The Difference Between ‘Significant’ and ‘Not Significant’ Is Not Itself Statistically Significant.” The American Statistician 60(4):328–31. [Available via your library]

Learn R: N/A

Learn Stats:

Refresh:

Stronger:

Resources:

Week 5: Continuous Numeric Data & ANOVA

All:

  • Sweetser, K. D., & Metzgar, E. (2007). Communicating during crisis: Use of blogs as a relationship management tool. Public Relations Review, 33(3), 340–342. [Available through NU Libraries]

Learn R:

Learn Stats:

Refresh:

Stronger:

Resources:

Week 6: Categorical data

All:

Learn R:

Learn Stats:

Refresh and get Stronger:

Resources:

Week 7: Linear Regression

All:

  • Diez, Barr, and Çetinkaya-Rundel: §7 (Introduction to linear regression)
  • OpenIntro eschews a mathematical approach to correlation. Look over the Wikipedia article on correlation and dependence and pay attention to the formulas. The correlation coefficient is tedious to compute by hand, but you should be aware of what goes into it.
  • Lampe, Cliff, and Paul Resnick. 2004. “Slash(Dot) and Burn: Distributed Moderation in a Large Online Conversation Space.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '04), 543–550. New York, NY, USA: ACM. doi:10.1145/985692.985761. [Available via library]
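
For anyone who wants to see what goes into the correlation formula mentioned above, here is a minimal sketch of Pearson's r computed step by step. Python is used purely for illustration; the function name `pearson_r` and the toy data are made up for this example and do not come from the course materials. In R, the built-in `cor(x, y)` computes the same quantity.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of products of paired deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator pieces: sum of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical toy data: score is an exact linear function of hours,
# so the correlation should be 1.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 6, 8, 10]
print(pearson_r(hours, score))
```

The point of writing it out is to see that r is just the co-variation of the two variables scaled by how much each varies on its own, which is why it always falls between -1 and 1.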

Learn Stats:

Learn R:

Resources:


Week 8: Polynomial Terms, Interactions, and Logistic Regression

All:

Learn Stats:

Learn R:

Resources: