Editing Statistics and Statistical Programming (Spring 2019)

Latest revision Your text
Line 156: Line 156:
 
;Maximum length: 6000 words (~20 pages)
 
;Maximum length: 6000 words (~20 pages)
  
;Presentation due date: Thursday, May 30 or Thursday, June 6, 2019
+
;Presentation due date: Thursday, June 6, 2019
;Maximum length: 8 minutes
+
;Maximum length: 12 minutes
  
  
Line 170: Line 170:
 
I do not have strong preferences about the style or formatting guidelines you follow for the paper and its bibliography. However, ''your paper must follow a standard format'' (e.g., [https://cscw.acm.org/2019/submit-papers.html ACM SIGCHI CSCW format] or [https://www.apastyle.org/index APA 6th edition] ([https://templates.office.com/en-us/APA-style-report-6th-edition-TM03982351 Word] and [https://www.overleaf.com/latex/templates/sample-apa-paper/fswjbwygndyq LaTeX] templates)) that is applicable for a peer-reviewed journal or conference proceedings in which you aim to publish the work (they all have formatting or submission guidelines published online and you should follow them). This includes the references. I also strongly recommend that you use reference management software to handle your bibliographic sources.
 
I do not have strong preferences about the style or formatting guidelines you follow for the paper and its bibliography. However, ''your paper must follow a standard format'' (e.g., [https://cscw.acm.org/2019/submit-papers.html ACM SIGCHI CSCW format] or [https://www.apastyle.org/index APA 6th edition] ([https://templates.office.com/en-us/APA-style-report-6th-edition-TM03982351 Word] and [https://www.overleaf.com/latex/templates/sample-apa-paper/fswjbwygndyq LaTeX] templates)) that is applicable for a peer-reviewed journal or conference proceedings in which you aim to publish the work (they all have formatting or submission guidelines published online and you should follow them). This includes the references. I also strongly recommend that you use reference management software to handle your bibliographic sources.
  
'' [[Statistics_and_Statistical_Programming_(Spring_2019)/Final_project_presentations|The presentation:]]'' The presentation will provide an opportunity to share a brief summary of your project and findings with the other members of the class. Since you will all give other research presentations throughout your career, I strongly encourage you to take the opportunity to refine your academic presentation skills. The document [https://canvas.northwestern.edu Creating a Successful Scholarly Presentation] (file will be posted to Canvas) may be useful.
+
'' The presentation:'' The presentation will provide an opportunity to share a brief summary of your project and findings with the other members of the class. Since you will all give other research presentations throughout your career, I strongly encourage you to take the opportunity to refine your academic presentation skills. The document [https://canvas.northwestern.edu Creating a Successful Scholarly Presentation] (file will be posted to Canvas) may be useful.
 
 
: More details about the presentation goals, format suggestions, and more are available [[Statistics_and_Statistical_Programming_(Spring_2019)/Final_project_presentations|on this page]]
 
  
 
=== Grading ===
 
=== Grading ===
Line 313: Line 311:
  
 
=== Week 4: Thursday April 25: Statistical significance and hypothesis testing ===
 
=== Week 4: Thursday April 25: Statistical significance and hypothesis testing ===
* [[Statistics and Statistical Programming (Spring 2019)/Session plan: Week 4]]
+
* [[Statistics and Statistical Programming (Spring 2019)/Session plan: Week 4|Session plan]]
  
 
'''Required Readings:'''
 
'''Required Readings:'''
Line 370: Line 368:
 
=== Week 6: Thursday May 9: Categorical data ===
 
=== Week 6: Thursday May 9: Categorical data ===
  
* [[Statistics and Statistical Programming (Spring 2019)/Session plan: Week 6|Session plan]]
 
 
'''Required Readings:'''
 
'''Required Readings:'''
  
* Diez, Barr, and Çetinkaya-Rundel: §6.1-6.4 (Inference for categorical data).
+
* Diez, Barr, and Çetinkaya-Rundel: §6 (Inference for categorical data)
 
* Buechley, Leah and Benjamin Mako Hill. 2010. “LilyPad in the Wild: How Hardware’s Long Tail Is Supporting New Engineering and Design Communities.” Pp. 199–207 in ''Proceedings of the 8th ACM Conference on Designing Interactive Systems.'' Aarhus, Denmark: ACM. [[https://mako.cc/academic/buechley_hill_DIS_10.pdf PDF available on Hill's personal website]]
 
* Buechley, Leah and Benjamin Mako Hill. 2010. “LilyPad in the Wild: How Hardware’s Long Tail Is Supporting New Engineering and Design Communities.” Pp. 199–207 in ''Proceedings of the 8th ACM Conference on Designing Interactive Systems.'' Aarhus, Denmark: ACM. [[https://mako.cc/academic/buechley_hill_DIS_10.pdf PDF available on Hill's personal website]]
 
* Reinhart, §4 and §5.
 
* Reinhart, §4 and §5.
  
'''Recommended Readings:
+
'''Recommended Readings:'''
* Diez, Barr, and Çetinkaya-Rundel: §6.5-6.6 (Small samples and randomization inference)
 
 
* Verzani: §3.4 (Bivariate categorical data); §10.1-10.2 (Goodness of fit)
 
* Verzani: §3.4 (Bivariate categorical data); §10.1-10.2 (Goodness of fit)
 
* Gelman, Andrew and Eric Loken. 2014. “The Statistical Crisis in Science Data-Dependent Analysis—a ‘garden of Forking Paths’—explains Why Many Statistically Significant Comparisons Don’t Hold Up.” ''American Scientist'' 102(6):460. [[https://www.americanscientist.org/issues/pub/2014/6/the-statistical-crisis-in-science/1 Available through NU Libraries]] (This is a reworked version of [http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf this unpublished manuscript] which provides more detailed examples.)
 
* Gelman, Andrew and Eric Loken. 2014. “The Statistical Crisis in Science Data-Dependent Analysis—a ‘garden of Forking Paths’—explains Why Many Statistically Significant Comparisons Don’t Hold Up.” ''American Scientist'' 102(6):460. [[https://www.americanscientist.org/issues/pub/2014/6/the-statistical-crisis-in-science/1 Available through NU Libraries]] (This is a reworked version of [http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf this unpublished manuscript] which provides more detailed examples.)
Line 395: Line 391:
  
 
=== Week 7: Thursday May 16: Linear Regression ===
 
=== Week 7: Thursday May 16: Linear Regression ===
* [[Statistics and Statistical Programming (Spring 2019)/Session plan: Week 7|Session plan]]
+
 
 
'''Required Readings:'''
 
'''Required Readings:'''
  
* Diez, Barr, and Çetinkaya-Rundel: §7 (Introduction to linear regression)
+
* Diez, Barr, and Çetinkaya-Rundel: §7 (Introduction to linear regression); §8.1-8.3 (Multiple regression)
* OpenIntro eschews a mathematical approach to correlation. Look over [https://en.wikipedia.org/wiki/Correlation_and_dependence the Wikipedia article on correlation and dependence] and pay attention to the formulas. It's tedious to compute, but you should be aware of what goes into it.
+
* OpenIntro eschews a mathematical introduction to correlation. Look over [https://en.wikipedia.org/wiki/Correlation_and_dependence the Wikipedia article on correlation and dependence] and pay attention to the formulas. It's tedious to compute, but I'd like you to at least see what goes into it.
 
* Lampe, Cliff, and Paul Resnick. 2004. “Slash(Dot) and Burn: Distributed Moderation in a Large Online Conversation Space.” In ''Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '04)'', 543–550. New York, NY, USA: ACM. doi:10.1145/985692.985761. [[http://dx.doi.org/10.1145/985692.985761 Available via NU libraries]]
 
* Lampe, Cliff, and Paul Resnick. 2004. “Slash(Dot) and Burn: Distributed Moderation in a Large Online Conversation Space.” In ''Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '04)'', 543–550. New York, NY, USA: ACM. doi:10.1145/985692.985761. [[http://dx.doi.org/10.1145/985692.985761 Available via NU libraries]]
 +
* Reinhart, §8 and §9.
  
 
'''Recommended Readings:'''
 
'''Recommended Readings:'''
Line 412: Line 409:
  
 
'''Lectures:'''
 
'''Lectures:'''
* [https://communitydata.cc/~ads/teaching/2019/stats/r_lectures/w07-R_lecture.Rmd Week 7 R lecture materials]
+
<!---
 +
* [[Statistics and Statistical Programming (Spring 2019)/R lecture outline: Week 7]]
 +
* [https://communitydata.cc/~mako/2017-COM521/com521-week_07-linear_regression.ogv Week 7 R lecture screencast: linear regression] (~42 minutes)
 +
--->
  
 
'''Resources:'''
 
'''Resources:'''
Line 420: Line 420:
  
 
=== Week 8: Thursday May 23: Polynomial Terms, Interactions, and Logistic Regression ===
 
=== Week 8: Thursday May 23: Polynomial Terms, Interactions, and Logistic Regression ===
* [[Statistics_and_Statistical_Programming_(Spring_2019)/Session plan: Week 8|Session plan]]
 
  
 
'''Required Readings:'''
 
'''Required Readings:'''
* Diez, Barr, and Çetinkaya-Rundel: §8 (Multiple and logistic regression)
+
 
 
* [https://onlinecourses.science.psu.edu/stat501/node/301 Lesson 8: Categorical Predictors] and [https://onlinecourses.science.psu.edu/stat501/node/318 Lesson 9: Data Transformations] from the Penn State Eberly College of Science STAT 501 Regression Methods course. There are several subparts (many quite short); please read them all carefully.
 
* [https://onlinecourses.science.psu.edu/stat501/node/301 Lesson 8: Categorical Predictors] and [https://onlinecourses.science.psu.edu/stat501/node/318 Lesson 9: Data Transformations] from the Penn State Eberly College of Science STAT 501 Regression Methods course. There are several subparts (many quite short); please read them all carefully.
* (Revisit) Lampe, Cliff, and Paul Resnick. 2004. “Slash(Dot) and Burn: Distributed Moderation in a Large Online Conversation Space.” In ''Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '04)'', 543–550. New York, NY, USA: ACM. doi:10.1145/985692.985761. [[http://dx.doi.org/10.1145/985692.985761 Available via NU libraries]]
+
* Diez, Barr, and Çetinkaya-Rundel: §8.4 (Multiple and logistic regression)
* Reinhart, §8 and §9.
+
* Lampe, Cliff, and Paul Resnick. 2004. “Slash(Dot) and Burn: Distributed Moderation in a Large Online Conversation Space.” In ''Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '04)'', 543–550. New York, NY, USA: ACM. doi:10.1145/985692.985761. [[http://dx.doi.org/10.1145/985692.985761 Available via NU libraries]]
  
 
'''Recommended Readings:'''
 
'''Recommended Readings:'''
Line 438: Line 437:
  
 
'''Lectures:'''
 
'''Lectures:'''
*[https://communitydata.science/~ads/teaching/2019/stats/r_lectures/w08-R_lecture.Rmd Week 8 R lecture materials]
+
 
 +
<!---
 +
* [[Statistics and Statistical Programming (Spring 2019)/R lecture outline: Week 8]]
 +
* [https://communitydata.cc/~mako/2017-COM521/com521-week_08-more_regression_anova_redux.ogv Week 8 R lecture screencast: more on linear regression, including interactions, polynomials, log transformations; anova] (~28 minutes)
 +
--->
  
 
'''Resources:'''
 
'''Resources:'''
Line 446: Line 449:
 
* Mako Hill wrote this document which will likely be useful for many of you: [https://communitydata.cc/~mako/2017-COM521/logistic_regression_interpretation.html Interpreting Logistic Regression Coefficients with Examples in R]
 
* Mako Hill wrote this document which will likely be useful for many of you: [https://communitydata.cc/~mako/2017-COM521/logistic_regression_interpretation.html Interpreting Logistic Regression Coefficients with Examples in R]
  
=== Week 9: Thursday May 30: Loose ends and Final Presentations (part 1)  ===
+
=== Week 9: Thursday May 30: TBA ===
  
* [[Statistics_and_Statistical_Programming_(Spring_2019)/Session plan: Week 9|Session plan]]
+
Reserved for catch-up, supplementary topics, and maybe some final presentations.
  
 
'''Required readings:'''
 
'''Required readings:'''
 
* Reinhart, §10 and §11.
 
* Reinhart, §10 and §11.
  
'''[[Statistics_and_Statistical_Programming_(Spring_2019)/Final_project_presentations|Final presentations]]: (part 1)'''
+
=== Week 10: Thursday June 6: Final Presentations ===
* First batch today. The rest next week.
 
 
 
'''Resources:'''
 
* [https://communitydata.cc/~ads/teaching/2019/stats/r_lectures/w09-R_lecture.html Week 9 R-lecture] (we will use this in class)
 
 
 
=== Week 10: Thursday June 6: Fully reproducible research example, Replications, Final Presentations (part 2), and wrap-up ===
 
 
 
* Fully [https://www.overleaf.com/read/tkdpdcspwtkp reproducible research example].
 
* [https://canvas.northwestern.edu/courses/90927/files/folder/resources/Straub-Cook%20Replication Research replication study] by Polly Straub-Cook (UW Comm. Ph.D. student)
 
:: (n.b.: cluster & heteroscedasticity robust standard errors!)
 
 
 
* '''[[Statistics_and_Statistical_Programming_(Spring_2019)/Final_project_presentations|Final presentations]]: (part 2)'''
 
:: Second batch of presenters today.
 
* Closing thoughts
 
:: What next? Beyond your final projects...
 
:: Class social gathering
 
  
 
Followed by much rejoicing!
 
Followed by much rejoicing!
