Editing Statistics and Statistical Programming (Fall 2020)


Latest revision Your text
Line 18: Line 18:
:Also usually available via chat during "business hours."
:Also usually available via chat during "business hours."


;'''Teaching Assistant:''' [http://nickmvincent.com Nick Vincent] ([mailto:nickvincent@u.northwestern.edu nickvincent@u.northwestern.edu])
:'''Teaching Assistant:''' [http://nickmvincent.com Nick Vincent] ([mailto:nickvincent@u.northwestern.edu nickvincent@u.northwestern.edu])
:Office Hours: Monday 10am-12pm and by appointment. I'll try to respond to any asynchronous questions in a timely fashion during "business hours" (9a-5p Central Time), and will also have OH by appointment. I'll respond best to email (above), but am also happy to use Discord for quicker back-and-forth.
::Office Hours: Monday 10am-12pm and by appointment. I'll try to respond to any asynchronous questions in a timely fashion during "business hours" (9a-5p Central Time), and will also have OH by appointment. I'll respond best to email (above), but am also happy to use Discord for quicker back-and-forth.
:I am happy to try out alternative communication software for OH!
::I am happy to try out alternative communication software for OH!


<br>
<br>
Line 223: Line 223:
==== Research project paper ====
==== Research project paper ====


;Paper due date: December 10, 2020, 5pm CT
;Paper due date: December 8, 2020, 5pm CT
;Maximum length: 6000 words (~20 pages)
;Maximum length: 6000 words (~20 pages)


Line 235: Line 235:


I do not have strong preferences about the style or formatting guidelines you follow for the paper and its bibliography. However, ''your paper must follow a standard format'' (e.g., [https://cscw.acm.org/2019/submit-papers.html ACM SIGCHI CSCW format] or [https://www.apastyle.org/index APA 6th edition] ([https://templates.office.com/en-us/APA-style-report-6th-edition-TM03982351 Word] and [https://www.overleaf.com/latex/templates/sample-apa-paper/fswjbwygndyq LaTeX] templates)) that is applicable for a peer-reviewed journal or conference proceedings in which you might aim to publish the work (they all have formatting or submission guidelines published online and you should follow them). This includes the references. I also strongly recommend that you use reference management software like Zotero to handle your bibliographic sources.
I do not have strong preferences about the style or formatting guidelines you follow for the paper and its bibliography. However, ''your paper must follow a standard format'' (e.g., [https://cscw.acm.org/2019/submit-papers.html ACM SIGCHI CSCW format] or [https://www.apastyle.org/index APA 6th edition] ([https://templates.office.com/en-us/APA-style-report-6th-edition-TM03982351 Word] and [https://www.overleaf.com/latex/templates/sample-apa-paper/fswjbwygndyq LaTeX] templates)) that is applicable for a peer-reviewed journal or conference proceedings in which you might aim to publish the work (they all have formatting or submission guidelines published online and you should follow them). This includes the references. I also strongly recommend that you use reference management software like Zotero to handle your bibliographic sources.


==== Human subjects research, IRB, and ethics ====
==== Human subjects research, IRB, and ethics ====
Line 328: Line 329:
* Read Diez, Çetinkaya-Rundel, and Barr: §1.1-1.3 (Introduction to data).  
* Read Diez, Çetinkaya-Rundel, and Barr: §1.1-1.3 (Introduction to data).  
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM6pZ76FD3NoCvvgkj_p-dE8 Lecture materials for §1.1-3 (Videos 1-4 in the playlist)].
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM6pZ76FD3NoCvvgkj_p-dE8 Lecture materials for §1.1-3 (Videos 1-4 in the playlist)].
* Complete '''exercises from OpenIntro §1:''' 1.6, 1.9, 1.10, 1.16, 1.21, 1.40, 1.42, 1.43 (and remember that solutions to odd-numbered problems are in the book!)
* Submit, review, and respond to questions or requests for discussion via Discord or some other means.
* Submit, review, and respond to questions or requests for discussion via Discord or some other means.


Line 364: Line 366:


=== Week 4 (10/6, 10/8) ===
=== Week 4 (10/6, 10/8) ===
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w04_session_plan|Session plans]]
==== October 6: Emotional contagion and more advanced R fundamentals: import, tidy, transform, and simulate data; write functions ====
==== October 6: Emotional contagion and more advanced R fundamentals: import, tidy, transform, and simulate data; write functions ====
'''Required'''
'''Required'''
Line 388: Line 388:


=== Week 5 (10/13, 10/15) ===
=== Week 5 (10/13, 10/15) ===
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w05_session_plan|Session plans]]
==== October 13: Descriptive analysis of policing data ====
==== October 13: Descriptive analysis and visualization of data ====
'''Required'''
'''Required'''
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset3|problem set #3]] (due Monday, October 12 at 1pm CT)
* Complete problem set #3


'''Recommended'''
'''Recommended'''
* [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w05-R_tutorial.html Week 5 R tutorial] and [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w05a-R_tutorial.html Week 5 R tutorial supplement] (both, as usual, also available as .rmd or .pdf).
* Week 5 R tutorial.  


==== October 15: Foundations for (frequentist) inference ====
==== October 15: Foundations for (frequentist) inference ====
Line 401: Line 400:
* Watch [https://www.youtube.com/watch?v=oLW_uzkPZGA&list=PLkIselvEzpM4SHQojH116fYAQJLaN_4Xo foundations for inference] (videos 1-3 in the playlist) OpenIntro lectures.
* Watch [https://www.youtube.com/watch?v=oLW_uzkPZGA&list=PLkIselvEzpM4SHQojH116fYAQJLaN_4Xo foundations for inference] (videos 1-3 in the playlist) OpenIntro lectures.
* Complete [https://www.openintro.org/book/stat/why05/ Why .05?] OpenIntro video/exercise.
* Complete [https://www.openintro.org/book/stat/why05/ Why .05?] OpenIntro video/exercise.
* Complete '''exercises from OpenIntro §5:''' 5.4, 5.8, 5.10, 5.17, 5.30, 5.35, 5.36
* Complete '''exercises from OpenIntro §5:'''


'''Resources'''
'''Resources'''
Line 408: Line 407:


=== Week 6 (10/20, 10/22) ===
=== Week 6 (10/20, 10/22) ===
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w06_session_plan|Session plans]]
==== October 20: <Topic> ====
==== October 20: Reinforced foundations for inference ====
'''Required'''
'''Required'''
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset4|problem set #4]] 
* Complete problem set #4
* Read Reinhart, §1.
* Revisit the Kramer et al. (2014) paper we read a few weeks ago:
* Revisit the Kramer et al. (2014) paper we read a few weeks ago:
:Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. 2014. “Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks.” ''Proceedings of the National Academy of Sciences'' 111(24):8788–90. [[http://www.pnas.org/content/111/24/8788.full Open access]]
:Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. 2014. “Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks.” ''Proceedings of the National Academy of Sciences'' 111(24):8788–90. [[http://www.pnas.org/content/111/24/8788.full Open access]]
'''Resources'''


==== October 22: Inference for categorical data ====
==== October 22: Inference for categorical data ====
Line 420: Line 418:
* Read Diez, Çetinkaya-Rundel, and Barr: §6 (Inference for categorical data).  
* Read Diez, Çetinkaya-Rundel, and Barr: §6 (Inference for categorical data).  
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5Gn-sHTw1NF0e8IvMxwHDW&v=_iFAZgpWsx0 inference for categorical data] (videos 1-3 in the playlist) OpenIntro lectures.
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5Gn-sHTw1NF0e8IvMxwHDW&v=_iFAZgpWsx0 inference for categorical data] (videos 1-3 in the playlist) OpenIntro lectures.
* Complete '''exercises from OpenIntro §6:''' 6.10, 6.16, 6.22, 6.30, 6.40 (just parts a and b; part c gets tedious)
* Complete '''exercises from OpenIntro §6:'''


'''Resources'''
'''Resources'''
Line 426: Line 424:


=== Week 7 (10/27, 10/29) ===
=== Week 7 (10/27, 10/29) ===
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w07_session_plan|Session plans]]
==== October 27: <Topics> ====
==== October 27: Applied inference for categorical data ====
'''Required'''
'''Required'''
* Read Reinhart, §4 and §5 (both are quite short).
* Complete problem set #5
* Skim the following (all are referenced in the problem set)
**  Aronow PM, Karlan D, Pinson LE. (2018). The effect of images of Michelle Obama’s face on trick-or-treaters’ dietary choices: A randomized control trial. PLoS ONE 13(1): e0189693. [https://doi.org/10.1371/journal.pone.0189693 https://doi.org/10.1371/journal.pone.0189693]
** Buechley, Leah and Benjamin Mako Hill. 2010. “LilyPad in the Wild: How Hardware’s Long Tail Is Supporting New Engineering and Design Communities.” Pp. 199–207 in ''Proceedings of the 8th ACM Conference on Designing Interactive Systems.'' Aarhus, Denmark: ACM. [[https://mako.cc/academic/buechley_hill_DIS_10.pdf PDF available on Hill's personal website]]
** Shaw, Aaron and Yochai Benkler. 2012. A tale of two blogospheres: Discursive practices on the left and right. ''American Behavioral Scientist''. 56(4): 459-487. [[https://doi.org/10.1177%2F0002764211433793 available via NU libraries]]
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset5|problem set #5]]
'''Resources'''
'''Resources'''
* [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w06-R_tutorial.html Week 06 R tutorial] (it's very short!)


==== October 29: Inference for numerical data (part 1) ====
==== October 29: Inference for numerical data (part 1) ====
Line 442: Line 433:
* Read Diez, Çetinkaya-Rundel, and Barr: §7.1-3 (Inference for numerical data: differences of means).  
* Read Diez, Çetinkaya-Rundel, and Barr: §7.1-3 (Inference for numerical data: differences of means).  
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5G3IO1tzQ-DUThsJKQzQCD&v=uVEj2uBJfq0 inference for numerical data] (videos 1-4 in the playlist) OpenIntro lectures (and featuring one of the textbook authors!).
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5G3IO1tzQ-DUThsJKQzQCD&v=uVEj2uBJfq0 inference for numerical data] (videos 1-4 in the playlist) OpenIntro lectures (and featuring one of the textbook authors!).
* Complete '''exercises from OpenIntro §7:''' 7.12, 7.24, 7.26
* Complete '''exercises from OpenIntro §7:'''


'''Resources'''
'''Resources'''
* [https://gallery.shinyapps.io/CLT_mean/ OpenIntro Central limit theorem for means demo].
* [https://gallery.shinyapps.io/CLT_mean/ OpenIntro Central limit theorem for means demo].


==== October 30: [[#Research project planning document|Research project planning document]] due 5pm CT====
==== October 30: [[#Research project planning document|Research project planning document]] due 5pm CT====
* Submit via [https://canvas.northwestern.edu/courses/122522/assignments/787297 Canvas] (due by 5pm CT)
* Submit via [https://canvas.northwestern.edu/courses/122522/assignments Canvas] (due by 5pm CT)


=== Week 8 (11/3, 11/5) ===
=== Week 8 (11/3, 11/5) ===
==== November 3: U.S. election day (no class meeting) ====
==== November 3: Self-assessment exercise (no class meeting) ====
 
'''Election Day (U.S.): No class meeting today'''
==== November 4: Interactive self-assessment due ====
* Please submit results [https://canvas.northwestern.edu/courses/122522/assignments/799630 (via Canvas)] from the [https://communitydata.science/~ads/teaching/2020/stats/assessment/interactive_assessment.rmd interactive self-assessment] by 5pm CT.


==== November 5: Inference for numerical data (part 2) ====
==== November 5: Inference for numerical data (part 2) ====
Line 460: Line 449:
* Read Diez, Çetinkaya-Rundel, and Barr: §7.4-5 (Inference for numerical data: power calculations, ANOVA, and multiple comparisons).  
* Read Diez, Çetinkaya-Rundel, and Barr: §7.4-5 (Inference for numerical data: power calculations, ANOVA, and multiple comparisons).  
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5G3IO1tzQ-DUThsJKQzQCD&v=uVEj2uBJfq0 inference for numerical data] (videos 4-8 in the playlist) OpenIntro lectures (and featuring one of the textbook authors!).
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5G3IO1tzQ-DUThsJKQzQCD&v=uVEj2uBJfq0 inference for numerical data] (videos 4-8 in the playlist) OpenIntro lectures (and featuring one of the textbook authors!).
* Complete '''exercises from OpenIntro §7:''' 7.42, 7.44, 7.46
* Complete '''exercises from OpenIntro §7:'''


'''Resources'''
'''Resources'''
Line 466: Line 455:


=== Week 9 (11/10, 11/12) ===
=== Week 9 (11/10, 11/12) ===
==== November 10: Applied inference for numerical data (t-tests, power analysis, ANOVA) ====
==== November 10: <Topic> ====
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w09_session_plan|Session plans]]
 
'''Required'''
'''Required'''
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset6|problem set #6]]
* Complete problem set #6


'''Resources'''
'''Resources'''
* [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w09-R_tutorial.html Week 09 R tutorial]


==== November 12: Linear regression ====
==== November 12: Linear regression ====
Line 480: Line 466:
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM63ikRfN41DNIhSgzboELOM linear regression] (videos 1-4 in the playlist) OpenIntro lectures.
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM63ikRfN41DNIhSgzboELOM linear regression] (videos 1-4 in the playlist) OpenIntro lectures.
* Read [https://www.openintro.org/go/?id=stat_more_inference_for_linear_regression&referrer=/book/os/index.php More inference for linear regression] (OpenIntro supplement).
* Read [https://www.openintro.org/go/?id=stat_more_inference_for_linear_regression&referrer=/book/os/index.php More inference for linear regression] (OpenIntro supplement).
* Complete '''exercises from OpenIntro §8:''' 8.6, 8.36, 8.40, 8.44
* Complete '''exercises from OpenIntro §8:'''
* Complete '''exercises from OpenIntro supplement:''' 4 and 5 (answers provided in the supplement).
* Complete '''exercises from OpenIntro supplement:'''
   
   
'''Resources'''
'''Resources'''
Line 487: Line 473:


=== Week 10 (11/17, 11/19) ===
=== Week 10 (11/17, 11/19) ===
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w10_session_plan|Session plans]]
==== November 17: <Topic> ====
==== November 17: Applied linear regression ====
'''Required'''
'''Required'''
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset7|Problem set #7]]
* Complete Problem set #7


'''Resources'''
'''Resources'''
* [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w10-R_tutorial.html Week 10 R tutorial]
 
==== November 19: Multiple and logistic regression ====
==== November 19: Multiple and logistic regression ====
'''Required'''
'''Required'''
Line 501: Line 486:
* Read [https://www.openintro.org/go/?id=stat_interaction_terms&referrer=/book/os/index.php Interaction terms] (OpenIntro supplement).
* Read [https://www.openintro.org/go/?id=stat_interaction_terms&referrer=/book/os/index.php Interaction terms] (OpenIntro supplement).
* Read [https://www.openintro.org/go/?id=stat_nonlinear_relationships&referrer=/book/os/index.php Fitting models for non-linear trends] (OpenIntro supplement).
* Read [https://www.openintro.org/go/?id=stat_nonlinear_relationships&referrer=/book/os/index.php Fitting models for non-linear trends] (OpenIntro supplement).
* Complete '''exercises from OpenIntro §9:''' 9.4, 9.13, 9.16, 9.18
* Complete '''exercises from OpenIntro §9:'''
* Complete '''exercises from OpenIntro supplements:'''


'''Resources'''
'''Resources'''


=== Week 11 (11/24) ===
=== Week 11 (11/24) ===
==== November 24: Applied multiple and logistic regression ====
==== November 24: <Topic> and assessment ====
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w11_session_plan|Session plans]]
'''Required'''
'''Required'''
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset8|Problem set #8]]
* Complete Problem set #8
* Complete [https://apps3.cehd.umn.edu/artist/user/scale_select.html post-course assessment of statistical concepts] (access code TBA via email). '''Submission deadline: December 1, 11:00pm Chicago time'''
'''Resources'''
'''Resources'''
* Mako Hill created (and Aaron updated) a brief tutorial on [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/logistic_regression_interpretation.html interpreting logistic regression coefficients with examples in R]
* Mako Hill created an example of [https://communitydata.science/~mako/2017-COM521/logistic_regression_interpretation.html interpreting logistic regression coefficients with examples in R]


=== Week 12+ ===
=== Week 12+ ===
==== December 3: [[#Research project presentation|Research project presentation]] due by 5pm CT ====
==== December 3: [[#Research project presentation|Research project presentation]] due by 5pm CT ====
'''[https://canvas.northwestern.edu/courses/122522/discussion_topics/856868 Post your video via this "Discussion" on Canvas]'''. Please view and provide constructive feedback on others' videos!
* '''Post videos directly to the "Discussion."''' The Canvas text editor has an option to upload/record a video. That's what you want.
* '''Please remember not to over-work/think this.''' I mentioned this in class, but just to reiterate, the focus of this assignment should not be your video editing skills. Please do what you can to record and convey your ideas clearly without devoting insane hours to creating the perfect video.
* '''Some resources for recording presentations:''' There are a bunch of ways you might record/share your video. Some ideas include using the embedded media recorder in Canvas (!) that can record with your webcam (maybe attach a few visuals to accompany this?); recording a "meeting" with yourself in Zoom; and "Panopto," a piece of high-end video recording, sharing, and editing software that NU licenses for campus use. Here are some pointers:
** NU has a "digital learning resource hub" which provides some [https://digitallearning.northwestern.edu/resource-hub#for-students resources for students]. The first item in that list has pointers for recording yourself and posting to Canvas and includes info about the Canvas media recorder and Panopto.
** You should be able to use your NU Zoom account to create a Zoom meeting, record your meeting (in which you deliver your presentation and share your screen with any visuals), and then share a link to the recording via the "Recordings" item in the left-hand menu of your [https://northwestern.zoom.us/ https://northwestern.zoom.us/] account page.
** If nothing works, please get in touch.
==== December 4: Post-course assessment of statistical concepts due by 11pm CT ====
Complete [https://apps3.cehd.umn.edu/artist/user/scale_select.html post-course assessment] (access code TBA via email). Submission deadline: December 4, 11:00pm Chicago time.


==== December 10: [[#Research project paper|Research project paper]] due by 5pm CT ====
==== December 10: [[#Research project paper|Research project paper]] due by 5pm CT ====
'''[https://canvas.northwestern.edu/courses/122522/assignments/812317 Submit your paper, data, and code via Canvas].'''


== Credit and Notes ==
== Credit and Notes ==


This syllabus has, in ways that should be obvious, borrowed and built on the [https://www.openintro.org/stat/index.php OpenIntro Statistics curriculum]. Most aspects of this course design extend Benjamin Mako Hill's [[Statistics_and_Statistical_Programming_(Winter_2017)|COM 521 class]] from the University of Washington as well as a [[Statistics_and_Statistical_Programming_(Spring_2019)|prior iteration of the same course]] offered at Northwestern in Spring 2019.
This syllabus has, in ways that should be obvious, borrowed and built on the [https://www.openintro.org/stat/index.php OpenIntro Statistics curriculum]. Most aspects of this course design extend Benjamin Mako Hill's [[Statistics_and_Statistical_Programming_(Winter_2017)|COM 521 class]] from the University of Washington as well as a [[Statistics_and_Statistical_Programming_(Spring_2019)|prior iteration of the same course]] offered at Northwestern in Spring 2019.