Editing Statistics and Statistical Programming (Spring 2019)
Latest revision | Your text | ||
Line 142: | Line 142: | ||
;Due date: Thursday, May 16, 2019 | ;Due date: Thursday, May 16, 2019 | ||
;Maximum length: | ;Maximum length: 5 pages | ||
The project planning document is a basic shell/outline of an empirical quantitative research paper. Your planning document should have the following sections: (a) Rationale, (b) Objectives; (b.1) General objectives; (b.2) Specific objectives; (c) (Null) hypotheses; (d) Conceptual diagram and explanation of the relationship | The project planning document is a basic shell/outline of an empirical quantitative research paper. Your planning document should have the following sections: (a) Rationale; (b) Objectives; (b.1) General objectives; (b.2) Specific objectives; (c) (Null) hypotheses; (d) Conceptual diagram and/or explanation of the relationship you plan to test; (e) Measures; (f) Dummy tables/figures. Descriptions of each of these planning document sections as well as an exemplary planning document will be available [[TODO-planningdoc|on this wiki page]]. | ||
<!--- | |||
An exemplary planning document from public health researcher Mika Matsuzaki is [https://canvas.northwestern.edu online in Canvas]. Your diagram will likely be much less complicated than Matsuzaki's. Also, please don't be distracted by the fact that Matsuzaki does public health research. You can (and should!) emulate the form rather than the content. You can also check out [http://ajcn.nutrition.org/content/99/6/1450.full the published paper] to see how the project wound up. | |||
Please note that the Matsuzaki planning document includes everything except a "Measures" section. Your Measures section should include a two-column table where column 1 is the name of each variable in your analysis and column 2 describes the operationalization of each measure and (if necessary) how you will create it. | |||
---> | |||
==== Project presentation and paper ==== | ==== Project presentation and paper ==== | ||
Line 156: | Line 157: | ||
;Maximum length: 6000 words (~20 pages) | ;Maximum length: 6000 words (~20 pages) | ||
;Presentation due date: | ;Presentation due date: Thursday, June 6, 2019 | ||
;Maximum length: | ;Maximum length: 12 minutes | ||
Line 170: | Line 171: | ||
I do not have strong preferences about the style or formatting guidelines you follow for the paper and its bibliography. However, ''your paper must follow a standard format'' (e.g., [https://cscw.acm.org/2019/submit-papers.html ACM SIGCHI CSCW format] or [https://www.apastyle.org/index APA 6th edition] ([https://templates.office.com/en-us/APA-style-report-6th-edition-TM03982351 Word] and [https://www.overleaf.com/latex/templates/sample-apa-paper/fswjbwygndyq LaTeX] templates)) that is applicable for a peer-reviewed journal or conference proceedings in which you aim to publish the work (they all have formatting or submission guidelines published online and you should follow them). This includes the references. I also strongly recommend that you use reference management software to handle your bibliographic sources. | I do not have strong preferences about the style or formatting guidelines you follow for the paper and its bibliography. However, ''your paper must follow a standard format'' (e.g., [https://cscw.acm.org/2019/submit-papers.html ACM SIGCHI CSCW format] or [https://www.apastyle.org/index APA 6th edition] ([https://templates.office.com/en-us/APA-style-report-6th-edition-TM03982351 Word] and [https://www.overleaf.com/latex/templates/sample-apa-paper/fswjbwygndyq LaTeX] templates)) that is applicable for a peer-reviewed journal or conference proceedings in which you aim to publish the work (they all have formatting or submission guidelines published online and you should follow them). This includes the references. I also strongly recommend that you use reference management software to handle your bibliographic sources. | ||
'' | '' The presentation:'' The presentation will provide an opportunity to share a brief summary of your project and findings with the other members of the class. Since you will all give other research presentations throughout your careers, I strongly encourage you to take the opportunity to refine your academic presentation skills. The document [https://canvas.northwestern.edu Creating a Successful Scholarly Presentation] (file will be posted to Canvas) may be useful. | ||
=== Grading === | === Grading === | ||
Line 313: | Line 312: | ||
=== Week 4: Thursday April 25: Statistical significance and hypothesis testing === | === Week 4: Thursday April 25: Statistical significance and hypothesis testing === | ||
* [[Statistics and Statistical Programming (Spring 2019)/Session plan: Week 4]] | * [[Statistics and Statistical Programming (Spring 2019)/Session plan: Week 4|Session plan]] | ||
'''Required Readings:''' | '''Required Readings:''' | ||
Line 370: | Line 369: | ||
=== Week 6: Thursday May 9: Categorical data === | === Week 6: Thursday May 9: Categorical data === | ||
'''Required Readings:''' | '''Required Readings:''' | ||
* Diez, Barr, and Çetinkaya-Rundel: §6 | * Diez, Barr, and Çetinkaya-Rundel: §6 (Inference for categorical data) | ||
* Buechley, Leah and Benjamin Mako Hill. 2010. “LilyPad in the Wild: How Hardware’s Long Tail Is Supporting New Engineering and Design Communities.” Pp. 199–207 in ''Proceedings of the 8th ACM Conference on Designing Interactive Systems.'' Aarhus, Denmark: ACM. [[https://mako.cc/academic/buechley_hill_DIS_10.pdf PDF available on Hill's personal website]] | * Buechley, Leah and Benjamin Mako Hill. 2010. “LilyPad in the Wild: How Hardware’s Long Tail Is Supporting New Engineering and Design Communities.” Pp. 199–207 in ''Proceedings of the 8th ACM Conference on Designing Interactive Systems.'' Aarhus, Denmark: ACM. [[https://mako.cc/academic/buechley_hill_DIS_10.pdf PDF available on Hill's personal website]] | ||
* Reinhart, §4 and §5. | * Reinhart, §4 and §5. | ||
'''Recommended Readings: | '''Recommended Readings:''' | ||
* Verzani: §3.4 (Bivariate categorical data); §10.1-10.2 (Goodness of fit) | * Verzani: §3.4 (Bivariate categorical data); §10.1-10.2 (Goodness of fit) | ||
* Gelman, Andrew and Eric Loken. 2014. “The Statistical Crisis in Science Data-Dependent Analysis—a ‘garden of Forking Paths’—explains Why Many Statistically Significant Comparisons Don’t Hold Up.” ''American Scientist'' 102(6):460. [[https://www.americanscientist.org/issues/pub/2014/6/the-statistical-crisis-in-science/1 Available through NU Libraries]] (This is a reworked version of [http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf this unpublished manuscript] which provides more detailed examples.) | * Gelman, Andrew and Eric Loken. 2014. “The Statistical Crisis in Science Data-Dependent Analysis—a ‘garden of Forking Paths’—explains Why Many Statistically Significant Comparisons Don’t Hold Up.” ''American Scientist'' 102(6):460. [[https://www.americanscientist.org/issues/pub/2014/6/the-statistical-crisis-in-science/1 Available through NU Libraries]] (This is a reworked version of [http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf this unpublished manuscript] which provides more detailed examples.) | ||
Line 387: | Line 384: | ||
'''Lectures:''' | '''Lectures:''' | ||
*[https://communitydata.cc/~ | <!--- | ||
* [[Statistics and Statistical Programming (Spring 2019)/R lecture outline: Week 6]] | |||
* [https://communitydata.cc/~mako/2017-COM521/com521-week_06-tables_chisq_debugging.ogv Week 6 R lecture screencast: Tables, <math>\chi^2</math>-tests, and debugging.] (~40 minutes) | |||
---> | |||
'''Resources:''' | '''Resources:''' | ||
Line 395: | Line 394: | ||
=== Week 7: Thursday May 16: Linear Regression === | === Week 7: Thursday May 16: Linear Regression === | ||
'''Required Readings:''' | '''Required Readings:''' | ||
* Diez, Barr, and Çetinkaya-Rundel: §7 (Introduction to linear regression) | * Diez, Barr, and Çetinkaya-Rundel: §7 (Introduction to linear regression); §8.1-8.3 (Multiple regression) | ||
* OpenIntro eschews a mathematical | * OpenIntro eschews a mathematical introduction to correlation. Look over [https://en.wikipedia.org/wiki/Correlation_and_dependence the Wikipedia article on correlation and dependence] and pay attention to the formulas. It's tedious to compute, but I'd like you to at least see what goes into it. | ||
* Lampe, Cliff, and Paul Resnick. 2004. “Slash(Dot) and Burn: Distributed Moderation in a Large Online Conversation Space.” In ''Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '04)'', 543–550. New York, NY, USA: ACM. doi:10.1145/985692.985761. [[http://dx.doi.org/10.1145/985692.985761 Available via NU libraries]] | * Lampe, Cliff, and Paul Resnick. 2004. “Slash(Dot) and Burn: Distributed Moderation in a Large Online Conversation Space.” In ''Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '04)'', 543–550. New York, NY, USA: ACM. doi:10.1145/985692.985761. [[http://dx.doi.org/10.1145/985692.985761 Available via NU libraries]] | ||
* Reinhart, §8 and §9. | |||
'''Recommended Readings:''' | '''Recommended Readings:''' | ||
Line 412: | Line 412: | ||
'''Lectures:''' | '''Lectures:''' | ||
* [https://communitydata.cc/~ | <!--- | ||
* [[Statistics and Statistical Programming (Spring 2019)/R lecture outline: Week 7]] | |||
* [https://communitydata.cc/~mako/2017-COM521/com521-week_07-linear_regression.ogv Week 7 R lecture screencast: linear regression] (~42 minutes) | |||
---> | |||
'''Resources:''' | '''Resources:''' | ||
Line 420: | Line 423: | ||
=== Week 8: Thursday May 23: Polynomial Terms, Interactions, and Logistic Regression === | === Week 8: Thursday May 23: Polynomial Terms, Interactions, and Logistic Regression === | ||
'''Required Readings:''' | '''Required Readings:''' | ||
* [https://onlinecourses.science.psu.edu/stat501/node/301 Lesson 8: Categorical Predictors] and [https://onlinecourses.science.psu.edu/stat501/node/318 Lesson 9: Data Transformations] from the PennState Eberly College of Science STAT 501 Regression Methods Course. There are several subparts (many quite short); please read them all carefully. | * [https://onlinecourses.science.psu.edu/stat501/node/301 Lesson 8: Categorical Predictors] and [https://onlinecourses.science.psu.edu/stat501/node/318 Lesson 9: Data Transformations] from the PennState Eberly College of Science STAT 501 Regression Methods Course. There are several subparts (many quite short); please read them all carefully. | ||
* ( | * Diez, Barr, and Çetinkaya-Rundel: §8.4 (Multiple and logistic regression) | ||
* Lampe, Cliff, and Paul Resnick. 2004. “Slash(Dot) and Burn: Distributed Moderation in a Large Online Conversation Space.” In ''Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '04)'', 543–550. New York, NY, USA: ACM. doi:10.1145/985692.985761. [[http://dx.doi.org/10.1145/985692.985761 Available via NU libraries]] | |||
'''Recommended Readings:''' | '''Recommended Readings:''' | ||
Line 438: | Line 440: | ||
'''Lectures:''' | '''Lectures:''' | ||
*[https://communitydata. | |||
<!--- | |||
* [[Statistics and Statistical Programming (Spring 2019)/R lecture outline: Week 8]] | |||
* [https://communitydata.cc/~mako/2017-COM521/com521-week_08-more_regression_anova_redux.ogv Week 8 R lecture screencast: more on linear regression, including interactions, polynomials, log transformations; anova] (~28 minutes) | |||
---> | |||
'''Resources:''' | '''Resources:''' | ||
Line 446: | Line 452: | ||
* Mako Hill wrote this document which will likely be useful for many of you: [https://communitydata.cc/~mako/2017-COM521/logistic_regression_interpretation.html Interpreting Logistic Regression Coefficients with Examples in R] | * Mako Hill wrote this document which will likely be useful for many of you: [https://communitydata.cc/~mako/2017-COM521/logistic_regression_interpretation.html Interpreting Logistic Regression Coefficients with Examples in R] | ||
=== Week 9: Thursday May 30: | === Week 9: Thursday May 30: TBA === | ||
Reserved for catch-up, supplementary topics, and maybe some final presentations. | |||
'''Required readings:''' | '''Required readings:''' | ||
* Reinhart, §10 and §11. | * Reinhart, §10 and §11. | ||
=== Week 10: Thursday June 6: Final Presentations === | |||
=== Week 10: Thursday June 6: | |||
Followed by much rejoicing! | Followed by much rejoicing! |