Editing Statistics and Statistical Programming (Winter 2021)
From CommunityData
Latest revision | Your text | ||
Line 12: | Line 12: | ||
;Instructor: [[Benjamin Mako Hill]] ([mailto:makohill@uw.edu makohill@uw.edu]) | ;Instructor: [[Benjamin Mako Hill]] ([mailto:makohill@uw.edu makohill@uw.edu]) | ||
:Office Hours: By appointment (I'm usually available via chat during "business hours.") You can view [https://mako.cc/calendar/ my calendar] and/or [https://harmonizely.com/mako put yourself on it] | :Office Hours: By appointment (I'm usually available via chat during "business hours.") You can view [https://mako.cc/calendar/ my calendar] and/or [https://harmonizely.com/mako put yourself on it]. | ||
<br clear=all> | <br clear=all> | ||
Line 129: | Line 129: | ||
* [https://depts.washington.edu/acelab/proj/Rstats/index.html Statistical Analysis and Reporting in R] — A set of resources created and distributed by Jacob Wobbrock (University of Washington, School of Information) in conjunction with a MOOC he teaches. Contains cheatsheets, code snippets, and data to help execute commonly encountered statistical procedures in R. | * [https://depts.washington.edu/acelab/proj/Rstats/index.html Statistical Analysis and Reporting in R] — A set of resources created and distributed by Jacob Wobbrock (University of Washington, School of Information) in conjunction with a MOOC he teaches. Contains cheatsheets, code snippets, and data to help execute commonly encountered statistical procedures in R. | ||
* [https://www.datacamp.com DataCamp] offers introductory R courses. Northwestern usually has some free accounts that get passed out via Research Data Services each quarter. Apparently, instructors teaching relevant coursework can [https://www.datacamp.com/groups/education request] free DataCamp access for their courses. If folks are interested in this, I can reach out. | * [https://www.datacamp.com DataCamp] offers introductory R courses. Northwestern usually has some free accounts that get passed out via Research Data Services each quarter. Apparently, instructors teaching relevant coursework can [https://www.datacamp.com/groups/education request] free DataCamp access for their courses. If folks are interested in this, I can reach out. | ||
== Assignments == | == Assignments == | ||
Line 202: | Line 201: | ||
I will also provide example planning documents via our Canvas site: | I will also provide example planning documents via our Canvas site: | ||
* [https://canvas.northwestern.edu/files/9439380/download?download_frd=1 One by public health researcher Mika Matsuzaki]. The first planning document I ever saw and still one of the best. It's missing a measures section. It's also focused on a research context that is probably very different from yours, but try not to get bogged down by that and imagine how you might map the structure of the document to your own work. | * [https://canvas.northwestern.edu/files/9439380/download?download_frd=1 One by public health researcher Mika Matsuzaki]. The first planning document I ever saw and still one of the best. It's missing a measures section. It's also focused on a research context that is probably very different from yours, but try not to get bogged down by that and imagine how you might map the structure of the document to your own work. | ||
* [One provided as an appendix to Gerber and Green's excellent textbook, ''Field Experiments: Design, Analysis, and Interpretation'' (FEDAI)]. It's over-detailed and over-long for the purposes of this assignment, but nevertheless an exemplary approach to planning empirical quantitative research in a careful, intentional way that is worthy of imitation. | * [One by Jim Maddock] created as part of a qualifying exam early in 2019. Jim doesn't provide dummy tables or anticipated findings/contributions, but he has an especially phenomenal explanation of the conceptual relationships and processes he wants to test. {{forthcoming}} | ||
* [One provided as an appendix to Gerber and Green's excellent textbook, ''Field Experiments: Design, Analysis, and Interpretation'' (FEDAI)]. It's over-detailed and over-long for the purposes of this assignment, but nevertheless an exemplary approach to planning empirical quantitative research in a careful, intentional way that is worthy of imitation. {{forthcoming}} | |||
==== Research project presentation ==== | ==== Research project presentation ==== | ||
Line 212: | Line 212: | ||
[[Statistics_and_Statistical_Programming_(Spring_2019)/Final_project_presentations]] | [[Statistics_and_Statistical_Programming_(Spring_2019)/Final_project_presentations]] | ||
---> | ---> | ||
You will also create and record a short presentation of your final project. The presentation will provide an opportunity to share a brief overview of your project and findings with the other members of the class. Since you will all give other research presentations throughout your career, I strongly encourage you to take the opportunity to refine your academic presentation skills. The document [ | You will also create and record a short presentation of your final project. The presentation will provide an opportunity to share a brief overview of your project and findings with the other members of the class. Since you will all give other research presentations throughout your career, I strongly encourage you to take the opportunity to refine your academic presentation skills. The document [Creating a Successful Scholarly Presentation] (file posted to Canvas) may be useful. {{forthcoming}} | ||
Additional details about the presentation goals, format suggestions, resources, and more will be provided later in the quarter. | |||
==== Research project paper ==== | ==== Research project paper ==== | ||
Line 282: | Line 284: | ||
'''Homework:''' | '''Homework:''' | ||
* Complete '''Problem set 2''': exercises from OpenIntro §1: (1.6, 1.9, 1.10, 1.16, 1.21, 1.40, 1.42, 1.43). Remember that solutions to odd-numbered problems are in the book! | * Complete '''Problem set 2''': exercises from OpenIntro §1: (1.6, 1.9, 1.10, 1.16, 1.21, 1.40, 1.42, 1.43). Remember that solutions to odd-numbered problems are in the book! | ||
* | * Worked solutions [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_02.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_02.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_02.pdf PDF]] | ||
=== Day 3: Monday January 11: Numerical and categorical data === | === Day 3: Monday January 11: Numerical and categorical data === | ||
Line 310: | Line 312: | ||
* Complete [[/Problem set 3]] (OpenIntro questions & programming challenges) | * Complete [[/Problem set 3]] (OpenIntro questions & programming challenges) | ||
* | * Worked solutions [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_03.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_03.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_03.pdf PDF]] | ||
=== Day 4: Wednesday January 13: Applied data manipulation === | === Day 4: Wednesday January 13: Applied data manipulation === | ||
Line 327: | Line 329: | ||
* Complete [[/Problem set 4]] (programming challenges and statistical questions) | * Complete [[/Problem set 4]] (programming challenges and statistical questions) | ||
* | * Worked solutions [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_04.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_04.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_04.pdf PDF]] | ||
<!--- | <!--- | ||
'''Resources''' | '''Resources''' | ||
Line 351: | Line 353: | ||
'''Homework:''' | '''Homework:''' | ||
* Complete [[/Problem set 5]] (OpenIntro exercises & programming challenges) | * Complete [[/Problem set 5]] (OpenIntro exercises & programming challenges) | ||
* | * Worked solutions [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_05-pt1.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_05-pt1.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_05-pt1.pdf PDF]] | ||
=== Day 6: Monday January 25: Distributions === | === Day 6: Monday January 25: Distributions === | ||
Line 367: | Line 369: | ||
'''Homework:''' | '''Homework:''' | ||
* Complete '''Problem set 6''': exercises from OpenIntro §4: 4.4, 4.6, 4.15, 4.22 | * Complete '''Problem set 6''': exercises from OpenIntro §4: 4.4, 4.6, 4.15, 4.22 | ||
=== Day 7: Wednesday January 27: Descriptive analysis and visualization === | === Day 7: Wednesday January 27: Descriptive analysis and visualization === | ||
'''Class material:''' | |||
* [[/Day 7 session plan]] | * [[/Day 7 session plan]] | ||
'''Required tasks:''' | '''Required tasks:''' | ||
* COM520 R Tutorial #5 | |||
* COM520 R Tutorial #5 {{forthcoming}} <!-- w05-R_tutorial.html w05a-R_tutorial.html --> | |||
'''Homework:''' | '''Homework:''' | ||
* Complete [[/Problem set 7]] | * Complete [[/Problem set 7]] <!-- Statistics_and_Statistical_Programming_(Winter_2020)/pset3 --> | ||
=== Day 8: Monday February 1: Foundations for inference === | === Day 8: Monday February 1: Foundations for inference === | ||
Line 400: | Line 402: | ||
=== Day 9: Wednesday February 3: Reinforced foundations for inference === | === Day 9: Wednesday February 3: Reinforced foundations for inference === | ||
''' | '''Class material:''' | ||
* | * [[/Day 9 session plan]] | ||
''' | '''Required:''' | ||
* | * Read Reinhart, §1. | ||
* Complete [[/Problem set 9]] <!-- Statistics_and_Statistical_Programming_(Winter_2020)/pset4 --> | |||
=== Day 10: Monday February 8: Inference for categorical data === | |||
'''Class material:''' | |||
* [[/Day 10 session plan]] | |||
'''Required tasks:''' | '''Required tasks:''' | ||
* Read Diez, Çetinkaya-Rundel, and Barr: §6 (Inference for categorical data). | * Read Diez, Çetinkaya-Rundel, and Barr: §6 (Inference for categorical data). | ||
* Complete [[/Problem set 10]]: '''exercises from OpenIntro §6:''' 6.10, 6.16, 6.22, 6.30, 6.40 (just parts a and b; part c gets tedious) | |||
'''Recommended tasks:''' | '''Recommended tasks:''' | ||
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5Gn-sHTw1NF0e8IvMxwHDW&v=_iFAZgpWsx0 inference for categorical data] (videos 1-3 in the playlist) OpenIntro lectures. | * Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5Gn-sHTw1NF0e8IvMxwHDW&v=_iFAZgpWsx0 inference for categorical data] (videos 1-3 in the playlist) OpenIntro lectures. | ||
* [https://gallery.shinyapps.io/CLT_prop/ OpenIntro Central limit theorem for proportions demo]. | * [https://gallery.shinyapps.io/CLT_prop/ OpenIntro Central limit theorem for proportions demo]. | ||
=== Day 11: Wednesday February 10: Applied inference for categorical data === | |||
'''Class material:''' | |||
* [[/Day 11 session plan]] | |||
'''Required tasks:''' | '''Required tasks:''' | ||
* Read Reinhart, §4 and §5 (both are quite short). | * Read Reinhart, §4 and §5 (both are quite short). | ||
* Skim the following (all are referenced in the problem set) | * Skim the following (all are referenced in the problem set) | ||
** Aronow PM, Karlan D, Pinson LE. (2018). The effect of images of Michelle Obama’s face on trick-or-treaters’ dietary choices: A randomized control trial. | ** Aronow PM, Karlan D, Pinson LE. (2018). The effect of images of Michelle Obama’s face on trick-or-treaters’ dietary choices: A randomized control trial. PLoS ONE 13(1): e0189693. [https://doi.org/10.1371/journal.pone.0189693 https://doi.org/10.1371/journal.pone.0189693] | ||
** Buechley, Leah and Benjamin Mako Hill. 2010. “LilyPad in the Wild: How Hardware’s Long Tail Is Supporting New Engineering and Design Communities.” Pp. 199–207 in ''Proceedings of the 8th ACM Conference on Designing Interactive Systems.'' Aarhus, Denmark: ACM. | ** Buechley, Leah and Benjamin Mako Hill. 2010. “LilyPad in the Wild: How Hardware’s Long Tail Is Supporting New Engineering and Design Communities.” Pp. 199–207 in ''Proceedings of the 8th ACM Conference on Designing Interactive Systems.'' Aarhus, Denmark: ACM. [[https://mako.cc/academic/buechley_hill_DIS_10.pdf PDF available on Hill's personal website]] | ||
* Complete [[/Problem set 11]] <!-- Statistics_and_Statistical_Programming_(Winter_2020)/pset5 --> | |||
* COM520 R Tutorial #6 (it's very short!) {{forthcoming}} <!-- w06-R_tutorial.html --> | |||
''' | '''Resources''' | ||
=== NO CLASS: Monday February 15: Presidents' Day === | |||
=== Day 12: Wednesday February 17: Inference for numerical data === | |||
'''Class material:''' | |||
* [[/Day 12 session plan]] | |||
* [https://www.openintro.org/go/?id=stat_better_understand_anova&referrer=/book/os/index.php OpenIntro supplement on ANOVA calculations] (particularly useful if you think you'll be doing more ANOVAs). | |||
'''Required tasks:''' | '''Required tasks:''' | ||
* Read Diez, Çetinkaya-Rundel, and Barr: §7.1-5 (Inference for numerical data: differences of means; power calculations, ANOVA, and multiple comparisons). | * Read Diez, Çetinkaya-Rundel, and Barr: §7.1-5 (Inference for numerical data: differences of means; power calculations, ANOVA, and multiple comparisons). | ||
* Complete '''Problem set 12''': exercises from OpenIntro §7: 7.12, 7.24, 7.26, 7.42, 7.44, 7.46 | |||
'''Recommended tasks:''' | '''Recommended tasks:''' | ||
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5G3IO1tzQ-DUThsJKQzQCD&v=uVEj2uBJfq0 inference for numerical data] (videos 1-8 in the playlist) OpenIntro lectures (and featuring one of the textbook authors!). | * Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5G3IO1tzQ-DUThsJKQzQCD&v=uVEj2uBJfq0 inference for numerical data] (videos 1-8 in the playlist) OpenIntro lectures (and featuring one of the textbook authors!). | ||
* | * Check out [https://gallery.shinyapps.io/CLT_mean/ OpenIntro Central limit theorem for means demo]. | ||
=== Day 13: Monday February 22: t-tests, power analysis, ANOVA === | |||
'''Class material:''' | |||
* [[/Day 13 session plan]] | |||
''' | '''Required tasks:''' | ||
* Complete [[/Problem set | * Complete [[/Problem set 13]] (programming challenges) <!-- Statistics_and_Statistical_Programming_(Winter_2020)/pset6 --> | ||
* COM520 R Tutorial #7 {{forthcoming}} <!-- w09-R_tutorial.html --> | |||
=== Day 14: Wednesday February 24: Linear regression === | |||
'''Class material:''' | |||
* [[/Day 14 session plan]] | |||
'''Required tasks:''' | '''Required tasks:''' | ||
* Read Diez, Çetinkaya-Rundel, and Barr: §8 (Linear regression). | * Read Diez, Çetinkaya-Rundel, and Barr: §8 (Linear regression). | ||
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM63ikRfN41DNIhSgzboELOM linear regression] (videos 1-4 in the playlist) OpenIntro lectures. | |||
* Read [https://www.openintro.org/go/?id=stat_more_inference_for_linear_regression&referrer=/book/os/index.php More inference for linear regression] (OpenIntro supplement). | * Read [https://www.openintro.org/go/?id=stat_more_inference_for_linear_regression&referrer=/book/os/index.php More inference for linear regression] (OpenIntro supplement). | ||
* Complete '''Problem set 14''': exercises from OpenIntro §8 (8.6, 8.36, 8.40, 8.44) and OpenIntro supplement (4 and 5). Answers to the latter are provided in the supplement. | |||
'''Recommended tasks:''' | '''Recommended tasks:''' | ||
* Read [https://seeing-theory.brown.edu/index.html#secondPage/chapter6 Seeing Theory §6 (Regression analysis)] | * Read [https://seeing-theory.brown.edu/index.html#secondPage/chapter6 Seeing Theory §6 (Regression analysis)] | ||
=== Day 15: Monday March 1: Applied linear regression === | |||
'''Class material:''' | '''Class material:''' | ||
* [[/Day 15 session plan]] | |||
'''Required tasks:''' | '''Required tasks:''' | ||
* COM520 R Tutorial #8 {{forthcoming}} <!-- w10-R_tutorial.html --> | |||
* Complete [[/Problem set 15]] <!-- Statistics_and_Statistical_Programming_(Winter_2020)/pset7 --> | |||
=== Day 16: Wednesday March 3: Multiple and logistic regression === | |||
'''Class material:''' | |||
* [[/Day 16 session plan]] | |||
'''Required tasks:''' | '''Required tasks:''' | ||
Line 493: | Line 497: | ||
* Read [https://www.openintro.org/go/?id=stat_interaction_terms&referrer=/book/os/index.php Interaction terms] (OpenIntro supplement). | * Read [https://www.openintro.org/go/?id=stat_interaction_terms&referrer=/book/os/index.php Interaction terms] (OpenIntro supplement). | ||
* Read [https://www.openintro.org/go/?id=stat_nonlinear_relationships&referrer=/book/os/index.php Fitting models for non-linear trends] (OpenIntro supplement). | * Read [https://www.openintro.org/go/?id=stat_nonlinear_relationships&referrer=/book/os/index.php Fitting models for non-linear trends] (OpenIntro supplement). | ||
* Complete '''Problem set 16''': exercises from OpenIntro §9: 9.4, 9.13, 9.16, 9.18, | |||
'''Recommended tasks:''' | '''Recommended tasks:''' | ||
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM5f1HYzIjFt52SD4izsJ2_I multiple and logistic regression] (videos 1-4 in the playlist) OpenIntro lectures. | * Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM5f1HYzIjFt52SD4izsJ2_I multiple and logistic regression] (videos 1-4 in the playlist) OpenIntro lectures. | ||
=== Day 17: Monday March 8: Applied multiple and logistic regression === | |||
'''Class material:''' | |||
* [[/Day 17 session plan]] | |||
'''Required tasks:''' | '''Required tasks:''' | ||
* COM520 R Tutorial # | * COM520 R Tutorial #9: Tutorial on interpreting logistic regression in R {{forthcoming}} <!-- logistic_regression_interpretation.html --> | ||
* Complete [[/Problem set 17]] <!-- Statistics_and_Statistical_Programming_(Winter_2020)/pset8 --> | |||
=== Day 18: Wednesday March 10: Final Presentations === | |||
'''Class material:''' | '''Class material:''' | ||
* [[/Day 18 session plan]] | * [[/Day 18 session plan]] | ||
'''Post your video via this "Discussion" on Canvas''' {{forthcoming}}. Please view and provide constructive feedback on others' videos! | |||
* '''Post videos directly to the "Discussion."''' The Canvas text editor has an option to upload/record a video. That's what you want. | * '''Post videos directly to the "Discussion."''' The Canvas text editor has an option to upload/record a video. That's what you want. | ||
* '''Please remember not to over-work/think this.''' I mentioned this in class, but just to reiterate, the focus of this assignment should not be your video editing skills. Please do what you can to record and convey your ideas clearly without devoting insane hours to creating the perfect video. | * '''Please remember not to over-work/think this.''' I mentioned this in class, but just to reiterate, the focus of this assignment should not be your video editing skills. Please do what you can to record and convey your ideas clearly without devoting insane hours to creating the perfect video. | ||
* '''Some resources for recording presentations:''' There are a bunch of ways you might record/share your video. Some ideas include using the embedded media recorder in Canvas (!) that can record with your webcam (maybe attach a few visuals to accompany this?); recording a "meeting" with yourself in Zoom; and "Panopto," a piece of high-end video recording, sharing, and editing software that UW licenses for campus use. Here are some pointers: | * '''Some resources for recording presentations:''' There are a bunch of ways you might record/share your video. Some ideas include using the embedded media recorder in Canvas (!) that can record with your webcam (maybe attach a few visuals to accompany this?); recording a "meeting" with yourself in Zoom; and "Panopto," a piece of high-end video recording, sharing, and editing software that UW licenses for campus use. Here are some pointers: | ||
** You should be able to use your UW Zoom account to create a Zoom meeting, record your meeting (in which you deliver your presentation and share your screen with any visuals), and then share a link to the recording via the "Recordings" item in the left-hand menu of your | ** You should be able to use your UW Zoom account to create a Zoom meeting, record your meeting (in which you deliver your presentation and share your screen with any visuals), and then share a link to the recording via the "Recordings" item in the left-hand menu of your [https://northwestern.zoom.us/ https://northwestern.zoom.us/] account page. | ||
** If nothing works, please get in touch. | ** If nothing works, please get in touch. | ||
== Special Notes == | == Special Notes == |