Editing Statistics and Statistical Programming (Winter 2021)

From CommunityData

Latest revision Your text
Line 9: Line 9:
:* [https://discord.com Discord] — for synchronous course meetings as well as asynchronous discussion and chat.
:* [https://discord.com Discord] — for synchronous course meetings as well as asynchronous discussion and chat.
:* [https://wiki.communitydata.science/Statistics_and_Statistical_Programming_(Winter_2021) This syllabus wiki page] — for nearly everything else.
:* [https://wiki.communitydata.science/Statistics_and_Statistical_Programming_(Winter_2021) This syllabus wiki page] — for nearly everything else.
:* [https://www.dropbox.com/home/COM520-shared_files-UW-2021-Q1 Dropbox] — for file sharing.


;Instructor: [[Benjamin Mako Hill]] ([mailto:makohill@uw.edu makohill@uw.edu])
;Instructor: [[Benjamin Mako Hill]] ([mailto:makohill@uw.edu makohill@uw.edu])
:Office Hours: By appointment (I'm usually available via chat during "business hours"). You can view [https://mako.cc/calendar/ my calendar] and/or [https://harmonizely.com/mako put yourself on it]. If you schedule a meeting, we'll meet in the Jitsi link you'll get through the scheduling app.
:Office Hours: {{tbd}} and by appointment (I'm usually available via chat during "business hours.")


<br clear=all>
<br clear=all>
Line 54: Line 53:
You should expect this syllabus to be a dynamic document. Although the core expectations for this class are fixed, the details of readings and assignments ''will'' shift based on how the class goes, guest speakers that I might arrange, my own readings in this area, etc. As a result, there are three important things to keep in mind:
You should expect this syllabus to be a dynamic document. Although the core expectations for this class are fixed, the details of readings and assignments ''will'' shift based on how the class goes, guest speakers that I might arrange, my own readings in this area, etc. As a result, there are three important things to keep in mind:


* Although details on this syllabus will change, I will try to ensure that I never change readings less than six days before they are due. We will send an announcement '''no later than before we go to sleep each Tuesday evening''' that fixes the schedule for the next week. This means that if I don't fill in a reading marked "{{tbd}}" or "{{forthcoming}}" six days before it's due, it is dropped. If we don't change something marked "{{tentative}}" before the deadline, then it is assigned. This also means that if you plan to read more than six days ahead, contact the teaching team first.
* Although details on this syllabus will change, I will try to ensure that I never change readings less than six days before they are due. We will send an announcement '''no later than before we go to sleep each Tuesday evening''' that fixes the schedule for the next week. This means that if I don't fill in a reading marked "{{tbd}}" six days before it's due, it is dropped. If we don't change something marked "{{tentative}}" before the deadline, then it is assigned. This also means that if you plan to read more than six days ahead, contact the teaching team first.
* Because this syllabus is a wiki, you will be able to track every change by clicking the history button on this page. I will summarize these changes in a weekly [https://canvas.uw.edu/courses/1369415/announcements announcement on Canvas] that will be emailed to everybody in the class. Closely monitor your email or the announcements section on the [https://canvas.uw.edu/courses/1369415/announcements course website on Canvas] to make sure you don't miss these announcements.
* Because this syllabus is a wiki, you will be able to track every change by clicking the history button on this page. I will summarize these changes in a weekly [https://canvas.uw.edu/courses/1369415/announcements announcement on Canvas] that will be emailed to everybody in the class. Closely monitor your email or the announcements section on the [https://canvas.uw.edu/courses/1369415/announcements course website on Canvas] to make sure you don't miss these announcements.
* I will ask the class for voluntary anonymous feedback frequently — especially toward the beginning of the quarter. Please let me know what is working and what can be improved. In the past, I have made many adjustments to courses that I teach while the quarter progressed based on this feedback.
* I will ask the class for voluntary anonymous feedback frequently — especially toward the beginning of the quarter. Please let me know what is working and what can be improved. In the past, I have made many adjustments to courses that I teach while the quarter progressed based on this feedback.
Line 102: Line 101:
This class will use a freely-licensed textbook:
This class will use a freely-licensed textbook:


* Diez, David M., Christopher D. Barr, and Mine Çetinkaya-Rundel. 2019. ''OpenIntro Statistics''. 4th edition. OpenIntro, Inc. {{avail-free|https://www.openintro.org/book/os/}}
* Diez, David M., Christopher D. Barr, and Mine Çetinkaya-Rundel. 2019. [https://www.openintro.org/book/os/ ''OpenIntro Statistics'']. 4th edition. OpenIntro, Inc.


The textbook (in any format) is required for the course. You can [https://www.openintro.org/go?id=os4&referrer=/book/os/index.php download it] at no cost and purchase hard copy versions in either [https://www.openintro.org/go?id=os4_color_pb&referrer=/book/os/index.php full color ($60)] or in [https://www.openintro.org/go?id=os4_bw_pb&referrer=/book/os/index.php black and white ($20)]. The B&W version is very affordable and I strongly recommend buying a hard copy for the purposes of the course and subsequent reference use. The book is excellent and has been adopted widely. It has also developed a large online community of students and teachers who have shared other resources. Lecture slides, videos, notes, and more are all freely licensed (many through the website and others elsewhere).
The textbook (in any format) is required for the course. You can [https://www.openintro.org/go?id=os4&referrer=/book/os/index.php download it] at no cost and purchase hard copy versions in either [https://www.openintro.org/go?id=os4_color_pb&referrer=/book/os/index.php full color ($60)] or in [https://www.openintro.org/go?id=os4_bw_pb&referrer=/book/os/index.php black and white ($20)]. The B&W version is very affordable and I strongly recommend buying a hard copy for the purposes of the course and subsequent reference use. The book is excellent and has been adopted widely. It has also developed a large online community of students and teachers who have shared other resources. Lecture slides, videos, notes, and more are all freely licensed (many through the website and others elsewhere).
Line 108: Line 107:
I will also be assigning several chapters from the following:
I will also be assigning several chapters from the following:


* Reinhart, Alex. 2015. ''Statistics Done Wrong: The Woefully Complete Guide''. SF, CA: No Starch Press. {{avail-uw|1=https://alliance-primo.hosted.exlibrisgroup.com/primo-explore/fulldisplay?docid=CP71226818410001451&context=L&vid=UW&lang=en_US&search_scope=all&adaptor=Local%20Search%20Engine&tab=default_tab&query=any,contains,statistics%20done%20wrong}}
* Reinhart, Alex. 2015. ''Statistics Done Wrong: The Woefully Complete Guide''. SF, CA: No Starch Press. ([https://alliance-primo.hosted.exlibrisgroup.com/primo-explore/fulldisplay?docid=CP71226818410001451&context=L&vid=UW&lang=en_US&search_scope=all&adaptor=Local%20Search%20Engine&tab=default_tab&query=any,contains,statistics%20done%20wrong Safari online via UW libraries])


This book provides a readable conceptual introduction to some common failures in statistical analysis that you should learn to recognize and avoid. It was also written by a Ph.D. student. You have access to an electronic copy via the UW libraries (you'll need to sign-in and/or use the [[#VPN Notice|UW VPN]] to access it), but you may find it helpful to purchase as well.
This book provides a readable conceptual introduction to some common failures in statistical analysis that you should learn to recognize and avoid. It was also written by a Ph.D. student. You have access to an electronic copy via the NU library (you'll need to sign-in and/or use the NU VPN to access it), but you may find it helpful to purchase as well.


A few other books may be useful resources while you're learning to analyze, visualize, and interpret statistical data with R. I will share some advice about these during the first class meeting:
A few other books may be useful resources while you're learning to analyze, visualize, and interpret statistical data with R. I will share some advice about these during the first class meeting:


* Babbie, Earl R. 2015. '''The Practice of Social Research''', 14th edition. Boston, MA: Cengage Learning. [''Chapters will be made available in Canvas'']
* Babbie, Earl R. 2015. '''The Practice of Social Research''', 14th edition. Boston, MA: Cengage Learning. [Chapters will be made available in Canvas]
* Healy, Kieran. 2019. ''Data Visualization: A Practical Introduction''. Princeton, NJ: Princeton University Press.{{avail-free|https://kieranhealy.org/publications/dataviz/}}
* Healy, Kieran. 2019. ''Data Visualization: A Practical Introduction''. Princeton, NJ: Princeton University Press. [[https://kieranhealy.org/publications/dataviz/ Available free online]]
* Teetor, Paul. 2011. ''R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics''. 1 edition. Sebastopol, CA: O’Reilly Media. {{avail-uw|http://proquest.safaribooksonline.com/9780596809287}}; [''[https://en.wikipedia.org/wiki/Special:BookSources/978-0-596-80915-7 Available for purchase through various sources]'']; [''[https://www.amazon.com/Cookbook-Analysis-Statistics-Graphics-Cookbooks/dp/0596809158/ref=sr_1_1?ie=UTF8&qid=1482802812&sr=8-1&keywords=r+cookbook Available for purchase through Amazon]'']
* Teetor, Paul. 2011. ''R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics''. 1 edition. Sebastopol, CA: O’Reilly Media. [[http://proquest.safaribooksonline.com/9780596809287 Available through UW libraries]; [https://en.wikipedia.org/wiki/Special:BookSources/978-0-596-80915-7 Available for purchase through various sources]; [https://www.amazon.com/Cookbook-Analysis-Statistics-Graphics-Cookbooks/dp/0596809158/ref=sr_1_1?ie=UTF8&qid=1482802812&sr=8-1&keywords=r+cookbook Available for purchase through Amazon]]
* Verzani, John. 2014. ''Using R for Introductory Statistics, Second Edition''. 2 edition. Boca Raton: Chapman and Hall/CRC. [''[https://en.wikipedia.org/wiki/Special:BookSources/978-1-4665-9073-1 Available for purchase through various sources]'']; [''[https://www.amazon.com/Using-Introductory-Statistics-Second-Chapman/dp/1466590734/ref=mt_hardcover?_encoding=UTF8&me= Available for purchase through Amazon]'']
* Verzani, John. 2014. ''Using R for Introductory Statistics, Second Edition''. 2 edition. Boca Raton: Chapman and Hall/CRC. [[https://en.wikipedia.org/wiki/Special:BookSources/978-1-4665-9073-1 Available for purchase through various sources]; [https://www.amazon.com/Using-Introductory-Statistics-Second-Chapman/dp/1466590734/ref=mt_hardcover?_encoding=UTF8&me= Available for purchase through Amazon]]
* Wickham, Hadley. 2010. ''ggplot2: Elegant Graphics for Data Analysis''. 1st ed. 2009. Corr. 3rd printing 2010 edition. New York: Springer. {{avail-uw|https://link.springer.com/book/10.1007%2F978-3-319-24277-4}}; [''[https://en.wikipedia.org/wiki/Special:BookSources/978-0-596-80915-7 Available for purchase through various sources]'']
* Wickham, Hadley. 2010. ''ggplot2: Elegant Graphics for Data Analysis''. 1st ed. 2009. Corr. 3rd printing 2010 edition. New York: Springer. ([https://link.springer.com/book/10.1007%2F978-3-319-24277-4 Available through UW libraries]; [https://en.wikipedia.org/wiki/Special:BookSources/978-0-596-80915-7 Available for purchase through various sources])
* Wickham, Hadley and Grolemund, Garrett. 2017. ''R for Data Science''. Sebastopol, CA: O'Reilly. {{avail-free|https://r4ds.had.co.nz/}}
* Wickham, Hadley and Grolemund, Garrett. 2017. ''R for Data Science''. Sebastopol, CA: O'Reilly. [[https://r4ds.had.co.nz/ Available free online]]


There are also some invaluable non-textbook resources:
There are also some invaluable non-textbook resources:
Line 129: Line 128:
* [https://depts.washington.edu/acelab/proj/Rstats/index.html Statistical Analysis and Reporting in R] — A set of resources created and distributed by Jacob Wobbrock (University of Washington, School of Information) in conjunction with a MOOC he teaches. Contains cheatsheets, code snippets, and data to help execute commonly encountered statistical procedures in R.
* [https://depts.washington.edu/acelab/proj/Rstats/index.html Statistical Analysis and Reporting in R] — A set of resources created and distributed by Jacob Wobbrock (University of Washington, School of Information) in conjunction with a MOOC he teaches. Contains cheatsheets, code snippets, and data to help execute commonly encountered statistical procedures in R.
* [https://www.datacamp.com DataCamp] offers introductory R courses. Northwestern usually has some free accounts that get passed out via Research Data Services each quarter. Apparently, if you are taking or teaching relevant coursework, instructors can [https://www.datacamp.com/groups/education request] free access to DataCamp for their courses from DataCamp. If folks are interested in this, I can reach out.
* [https://www.datacamp.com DataCamp] offers introductory R courses. Northwestern usually has some free accounts that get passed out via Research Data Services each quarter. Apparently, if you are taking or teaching relevant coursework, instructors can [https://www.datacamp.com/groups/education request] free access to DataCamp for their courses from DataCamp. If folks are interested in this, I can reach out.
* [https://brownmath.com/swt/symbol.htm Statistics symbols you need to know] which is just what it says on the tin. Thanks Kate Rich!


== Assignments ==
== Assignments ==
=== Weekly Assignments ===


There are two types of assignments in the course: (a) problem sets that we will discuss during each class session; and (b) a large course project.
In order to support continuous progress towards the learning goals for the course, I have assigned some textbook exercises or a problem set ahead of every class. These assignments will provide the basis on which I will assess and provide feedback on your participation and engagement with the course material but I will not grade you on whether you get these answers correct or incorrect. In general, we will cover the problem sets in the first session of the week and the textbook materials in the second session.  


=== Problem Sets===
==== Textbook exercises ====
Each week I will assign textbook exercises. The focus is on self-assessment of your understanding of the textbook material and you will not hand in your answers. I expect that you will work on the exercises, review and discuss solutions, and submit any questions ahead of or during class. Please note that solutions to odd-numbered problems appear in the back of the book. I will distribute solutions to even-numbered problems as well.


In order to support continuous progress towards the learning goals for the course, I have assigned problem sets for each class. These problem sets include some textbook exercises, some programming challenges, and some other questions.
==== Problem sets ====
 
The course will include problem sets and these may incorporate several kinds of questions:
Problem sets may incorporate several kinds of questions:


* '''Statistics questions''' about statistical concepts and principles.
* '''Statistics questions''' about statistical concepts and principles.
* '''Programming challenges''' that you should solve using R.
* '''Programming challenges''' that you should solve using R.
* '''Empirical paper questions''' about other assigned readings.  
* '''Empirical paper questions''' about other assigned readings.  
<!-- For the problem sets, I ask that you submit your work [https://canvas.uw.edu/courses/1434003/assignments via Canvas 24 hours before class] (i.e., Monday afternoon for our Tuesday class sessions). Details of exactly how this will work will be elaborated during the first class. -->


Although you will never hand these in to be graded, I will ''randomly'' call on students to share your answers to these questions and I will assess your preparedness after every single class meeting. I will ''not'' grade you on whether you get these answers correct or incorrect. Although the problem sets will not be assigned a letter grade, they are the central focus of the course and completing them will support your mastery of the material in multiple ways. These assignments will provide the basis on which I will assess and provide feedback on your participation and engagement with the course material.
For the problem sets, I ask that you submit your work [https://canvas.uw.edu/courses/1434003/assignments via Canvas 24 hours before class] (i.e., Monday afternoon for our Tuesday class sessions). Details of exactly how this will work will be elaborated during the first class. For the programming challenges, you should submit code and text for your solutions (again, more on how later). If you get completely stuck on a problem, that's okay, but please provide whatever you have.


For the programming challenges, be ready to share code and text for your solutions via screen share. If you get completely stuck on a problem, that's okay, but be ready to provide whatever you have and describe what tripped you up. In general, we will cover the problem sets in the first session of the week and the textbook materials in the second session.
Problem sets will be evaluated on a complete/incomplete basis. Although the problem sets will not be assigned a letter grade, they are a central focus of the course and completing them will support your mastery of the material in multiple ways. Working through them on schedule will also make it possible for you to participate in the synchronous course meetings and online discussions of course material effectively. Your ability to do so will figure prominently in your participation grade for the course (see the section on grading and assessment below).


=== Research project ===
=== Research project (major) assignments ===


==== Overview ====
As a demonstration of your learning in this course, you will design and carry out a quantitative research project, start to finish. This means you will all:
As a demonstration of your learning in this course, you will design and carry out a quantitative research project, start to finish. This means you will all:


Line 161: Line 160:
* '''Ensure that your work is replicable''' — You will need to provide code and data for your analysis in a way that makes your work replicable by other researchers.
* '''Ensure that your work is replicable''' — You will need to provide code and data for your analysis in a way that makes your work replicable by other researchers.


''I strongly urge you'' to produce a project that will further your academic career outside of the class. There are many ways that this can happen. Some obvious options are to prepare a project that you can submit for publication, use as pilot analysis that you can report in a grant or thesis proposal, and/or use to fulfill a degree requirement. The last time I taught a statistical course, a majority of students in the class used their course projects either to satisfy a general examination requirement, as a published paper, or both.
''I strongly urge you'' to produce a project that will further your academic career outside of the class. There are many ways that this can happen. Some obvious options are to prepare a project that you can submit for publication, use as pilot analysis that you can report in a grant or thesis proposal, and/or use to fulfill a degree requirement.
 
There are several intermediate milestones, deliverables, and deadlines to help you accomplish a successful research project. Unless otherwise noted, all deliverables should be submitted via Canvas by 5pm CT on the day they are due.


There are several intermediate milestones, deliverables, and deadlines to help you accomplish a successful research project. Unless otherwise noted, all deliverables should be submitted via Canvas by 11:59pm Seattle time on the day they are due.


==== Research project plan and dataset identification ====
==== Research project plan and dataset identification ====


;Due date: Friday January 15, 2021
;Due date: October 9, 2020, 5pm CT
;Maximum length: 500 words (~1-2 pages)
;Maximum length: 500 words (~1-2 pages)


Very early on, I want you to identify and describe your final project. Your description should be short and can be either paragraphs or bullets. It should include the following:
Early on, I want you to identify and describe your final project. Your description should be short and can be either paragraphs or bullets. It should include the following:


* An abstract of the proposed study including the topic, research question, theoretical motivation, object(s) of study, and anticipated research contribution.
* An abstract of the proposed study including the topic, research question, theoretical motivation, object(s) of study, and anticipated research contribution.
* An identification of the dataset you will use and a description of the rows and columns or type(s) of data it will include. If you do not currently have access to these data, explain why and when you will.
* An identification of the dataset you will use and a description of the rows and columns or type(s) of data it will include. If you do not currently have access to these data, explain why and when you will.
* A short (several sentences?) description of how the project will fit into your career trajectory.
* A short (several sentences?) description of how the project will fit into your career trajectory.


===== Notes on finding a dataset =====
===== Notes on finding a dataset =====
Line 184: Line 185:
* Do some Google Scholar and normal internet searching for datasets in your research area. You'll probably be surprised at what's available.
* Do some Google Scholar and normal internet searching for datasets in your research area. You'll probably be surprised at what's available.
* Take a look at datasets available in the [https://dataverse.harvard.edu/ Harvard Dataverse] (a very large collection of social science research data) or one of the other members of the [http://dataverse.org/ Dataverse network].
* Take a look at datasets available in the [https://dataverse.harvard.edu/ Harvard Dataverse] (a very large collection of social science research data) or one of the other members of the [http://dataverse.org/ Dataverse network].
* Look at the collection of social scientific datasets at [https://www.icpsr.umich.edu/icpsrweb/ICPSR/ ICPSR at the University of Michigan] (UW is a member). There are an enormous number of very rich datasets.
* Look at the collection of social scientific datasets at [https://www.icpsr.umich.edu/icpsrweb/ICPSR/ ICPSR at the University of Michigan] (NU is a member). There are an enormous number of very rich datasets.
* Use the [http://scientificdata.isa-explorer.org/index.html ISA Explorer] to find datasets. Keep in mind the large majority of datasets it will search are drawn from the natural sciences.
* Use the [http://scientificdata.isa-explorer.org/index.html ISA Explorer] to find datasets. Keep in mind the large majority of datasets it will search are drawn from the natural sciences.
* The City of Seattle has one of the best [https://data.seattle.gov/ data portal sites] of any municipality in the U.S. (and better than many federal agencies). There are also numerous administrative datasets released by other public entities (try searching!) that you might find inspiring.
* The City of Chicago has one of the best [https://data.cityofchicago.org/ data portal sites] of any municipality in the U.S. (and better than many federal agencies). There are also numerous administrative datasets released by other public entities (try searching!) that you might find inspiring.
* [http://fivethirtyeight.com FiveThirtyEight.com] has published a [https://cran.r-project.org/web/packages/fivethirtyeight/vignettes/fivethirtyeight.html GitHub repository and an R package] with pre-processed and cleaned versions of many of the datasets they use for articles published on their website.
* [http://fivethirtyeight.com FiveThirtyEight.com] has published a [https://cran.r-project.org/web/packages/fivethirtyeight/vignettes/fivethirtyeight.html GitHub repository and an R package] with pre-processed and cleaned versions of many of the datasets they use for articles published on their website.
* If you are interested in studying online communities, there are some great resources for accessing data from Reddit, Wikipedia, and Stack Exchange. See [https://files.pushshift.io/reddit/ pushshift] for dumps of Reddit data, [https://meta.wikimedia.org/wiki/Research:Data here] for an overview of Wikipedia's data resources, and [https://data.stackexchange.com/ Stack Exchange's data portal].
* If you are interested in studying online communities, there are some great resources for accessing data from Reddit, Wikipedia, and Stack Exchange. See [https://files.pushshift.io/reddit/ pushshift] for dumps of Reddit data, [https://meta.wikimedia.org/wiki/Research:Data here] for an overview of Wikipedia's data resources, and [https://data.stackexchange.com/ Stack Exchange's data portal].
Line 195: Line 196:
==== Research project planning document ====
==== Research project planning document ====


;Due date: February 12, 2021
;Due date: October 30, 2020, 5pm CT
;Suggested length: ~5 pages
;Suggested length: ~5 pages


The project planning document is a shell/outline of an empirical quantitative research paper. Your planning document should have the following sections: (a) Rationale; (b) Objectives; (b.1) General objectives; (b.2) Specific objectives; (c) (Null) hypotheses; (d) Conceptual diagram and explanation of the relationship(s) you plan to test; (e) Measures; (f) Dummy tables/figures; (g) Anticipated finding(s) and research contribution(s). Longer descriptions of each of these planning document sections (as well as a few others) can be found [[CommunityData:Planning document|on this wiki page]].
The project planning document is a shell/outline of an empirical quantitative research paper. Your planning document should have the following sections: (a) Rationale; (b) Objectives; (b.1) General objectives; (b.2) Specific objectives; (c) (Null) hypotheses; (d) Conceptual diagram and explanation of the relationship(s) you plan to test; (e) Measures; (f) Dummy tables/figures; (g) Anticipated finding(s) and research contribution(s). Longer descriptions of each of these planning document sections (as well as a few others) can be found [[CommunityData:Planning document|on this wiki page]].


I will also provide example planning documents via our Canvas site:
I will also provide three example planning documents via our Canvas site (links to-be-updated for 2020 edition of the course):
* [https://canvas.northwestern.edu/files/9439380/download?download_frd=1 One by public health researcher Mika Matsuzaki]. The first planning document I ever saw and still one of the best. It's missing a measures section. It's also focused on a research context that is probably very different from yours, but try not to get bogged down by that and imagine how you might map the structure of the document to your own work.
* [https://canvas.northwestern.edu/files/9439380/download?download_frd=1 One by public health researcher Mika Matsuzaki]. The first planning document I ever saw and still one of the best. It's missing a measures section. It's also focused on a research context that is probably very different from yours, but try not to get bogged down by that and imagine how you might map the structure of the document to your own work.
* One provided as an appendix to Gerber and Green's excellent textbook, ''Field Experiments: Design, Analysis, and Interpretation'' (FEDAI). It's over-detailed and over-long for the purposes of this assignment, but nevertheless an exemplary approach to planning empirical quantitative research in a careful, intentional way that is worthy of imitation.
* [https://canvas.northwestern.edu/files/9421229/download?download_frd=1 One by Jim Maddock] created as part of a qualifying exam early in 2019. Jim doesn't provide dummy tables or anticipated findings/contributions, but he has an especially phenomenal explanation of the conceptual relationships and processes he wants to test.
* [https://canvas.northwestern.edu/files/9439379/download?download_frd=1 One provided as an appendix to Gerber and Green's excellent textbook, ''Field Experiments: Design, Analysis, and Interpretation'' (FEDAI)]. It's over-detailed and over-long for the purposes of this assignment, but nevertheless an exemplary approach to planning empirical quantitative research in a careful, intentional way that is worthy of imitation.


==== Research project presentation ====
==== Research project presentation ====


;Presentation due date: March 10, 2021
;Presentation due date: December 3, 2020, 5pm CT
;Maximum length: 15 minutes
;Maximum length: 10 minutes


<!-- TODO revisit old presentations page to update/adapt  
<!-- TODO revisit old presentations page to update/adapt  
[[Statistics_and_Statistical_Programming_(Spring_2019)/Final_project_presentations]]
[[Statistics_and_Statistical_Programming_(Spring_2019)/Final_project_presentations]]
--->
--->
You will also create and record a short presentation of your final project. The presentation will provide an opportunity to share a brief overview of your project and findings with the other members of the class. Since you will all give other research presentations throughout your career, I strongly encourage you to take the opportunity to refine your academic presentation skills. The document [https://canvas.uw.edu/files/74392679/download?download_frd=1 Creating a Successful Scholarly Presentation] (file posted to Canvas) may be useful.
You will also create and record a short presentation of your final project. The presentation will provide an opportunity to share a brief overview of your project and findings with the other members of the class. Since you will all give other research presentations throughout your career, I strongly encourage you to take the opportunity to refine your academic presentation skills. The document [https://canvas.northwestern.edu/files/9439377/download?download_frd=1 Creating a Successful Scholarly Presentation] (file posted to Canvas) may be useful.
 
Additional details about the presentation goals, format suggestions, resources, and more will be provided later in the quarter.


==== Research project paper ====
==== Research project paper ====


;Paper due date: March 19, 2021
;Paper due date: December 10, 2020, 5pm CT
;Maximum length: 6000 words (~20 pages)
;Maximum length: 6000 words (~20 pages)


I expect you to produce a short, high-quality research paper that you might revise, extend, and submit for publication and/or a dissertation milestone like a methods general examination. I do not expect the paper to be ready for publication, but it should contain polished drafts of all the necessary components of a scholarly quantitative empirical research study. In terms of the structure, please see the page on the [[structure of a quantitative empirical research paper]].
I expect you to produce a short, high-quality research paper that you might revise, extend, and submit for publication and/or a dissertation milestone. I do not expect the paper to be ready for publication, but it should contain polished drafts of all the necessary components of a scholarly quantitative empirical research study. In terms of the structure, please see the page on the [[structure of a quantitative empirical research paper]].


As noted above, you should also provide data, code, and any documentation sufficient to enable the replication of all analysis and visualizations. If that is not possible/appropriate for some reason, please talk to me so that we can find another solution.
I have a strong preference for you to write the paper individually, but I'm open to the idea that you may want to work with others in the class. Please contact me ''before'' you attempt to pursue a collaborative final paper.


I do not have strong preferences about the style or formatting guidelines you follow for the paper and its bibliography. However, ''your paper must follow a standard format'' (e.g., [https://www.apastyle.org/index APA 6th edition] ([https://templates.office.com/en-us/APA-style-report-6th-edition-TM03982351 Word] and [https://www.overleaf.com/latex/templates/sample-apa-paper/fswjbwygndyq LaTeX] templates) or [https://cscw.acm.org/2019/submit-papers.html ACM SIGCHI CSCW format]) that is applicable for a peer-reviewed journal or conference proceedings in which you might aim to publish the work (they all have formatting or submission guidelines published online and you should follow them). This includes the references. I also strongly recommend that you use reference management software like Zotero to handle your bibliographic sources.
I do not have strong preferences about the style or formatting guidelines you follow for the paper and its bibliography. However, ''your paper must follow a standard format'' (e.g., [https://cscw.acm.org/2019/submit-papers.html ACM SIGCHI CSCW format] or [https://www.apastyle.org/index APA 6th edition] ([https://templates.office.com/en-us/APA-style-report-6th-edition-TM03982351 Word] and [https://www.overleaf.com/latex/templates/sample-apa-paper/fswjbwygndyq LaTeX] templates)) that is applicable for a peer-reviewed journal or conference proceedings in which you might aim to publish the work (they all have formatting or submission guidelines published online and you should follow them). This includes the references. I also strongly recommend that you use reference management software like Zotero to handle your bibliographic sources.


==== Human subjects research, IRB, and ethics ====
Class projects generally do not need IRB approval, but research for publications, dissertations, and sometimes even pilot studies do fall under IRB purview. You should ''not'' plan to seek IRB approval/determination retroactively. If your study may involve human subjects and you may ever publish it in any form, you will need IRB oversight of some sort.


Secondary analysis of anonymized data is generally not considered human subjects research, but I strongly suggest that you get a determination from [https://www.washington.edu/research/hsd/ Human Subjects Division] (the UW IRB) before you start. For work that is not considered human subjects research, this can often happen in a few hours or days. If you need to list a faculty sponsor or Principal Investigator, that should ideally be your advisor. If that doesn't make sense for some reason, please talk to me.
Secondary analysis of anonymized data is generally not considered human subjects research, but I strongly suggest that you get a determination from [https://irb.northwestern.edu/ the Northwestern IRB] before you start. For work that is not considered human subjects research, this can often happen in a few hours or days. If you need to list a faculty sponsor or Principal Investigator, that should ideally be your advisor. If that doesn't make sense for some reason, please talk to me.


Research ethics are a broad and complex topic. We'll talk about issues related to ethics and quantitative empirical research a bit more during class, but will likely only scratch the surface. I strongly encourage you to pursue further reading, conversation, coursework, and reflection as you consider how to understand and apply ethical principles in the context of your own research and teaching.
=== Grading and assessment ===


I will assign grades (typically on the UW 4.0 grade scale) for each of the following aspects of your performance. The percentage values in parentheses are weights that will be applied to calculate your overall grade for the course.
I will assign grades (usually a numeric value ranging from 0-10) for each of the following aspects of your performance. The percentage values in parentheses are weights that will be applied to calculate your overall grade for the course.


* Problem set discussion: 40%
* Weekly participation: 40%
* Project identification: 5%
* Proposal identification: 5%
* Final project planning document: 5%
* Final project presentation: 15%
* Final project presentation: 10%
* Final project paper: 35%
* Final project paper: 40%


I will jointly and holistically evaluate your participation in problem set discussions along four dimensions: participation, preparation, engagement, and contribution. These are quite similar to the dimensions described in the "Participation Rubric" section of [[User:Benjamin Mako Hill/Assessment|my assessment page]]. Exceptional participation means excelling along all four dimensions. Please note that participation ≠ talking/typing more and I encourage all of us to seek balance in our discussions.
The teaching team will jointly and holistically evaluate your participation along four dimensions: attendance, preparation, engagement, and contribution. These are quite similar to the dimensions described in the "Participation Rubric" section of [https://mako.cc/teaching/assessment.html Benjamin Mako Hill's assessment page] and [https://reagle.org/joseph/zwiki/Teaching/Assessment/Participation.html Joseph Reagle's participation assessment rubric]. Exceptional participation means excelling along all four dimensions. Please note that participation ≠ talking/typing more and I encourage all of us to seek [https://reagle.org/joseph/zwiki/Teaching/Best_Practices/Learning/Balance_in_Discussion.html balance in our discussions].


My assessment of your final project proposal, planning document, presentation, and paper will reflect the clarity of the work, the effective execution and presentation of quantitative empirical analysis, as well as the quality and originality of the analysis. Throughout the quarter, we will talk about the qualities of exemplary quantitative research. In general, I expect your final project to embody these exemplary qualities.
The teaching team's assessment of your final project proposal, planning document, presentation, and paper will reflect the clarity of the work, the effective execution and presentation of quantitative empirical analysis, as well as the quality and originality of the analysis. A more detailed assessment rubric will be provided. Throughout the quarter, we will talk about the qualities of exemplary quantitative research. In general, I expect your final project to embody these exemplary qualities.


== Schedule ==
=== Policies ===


When reading the schedule below, the following key might help resolve ambiguity: §n denotes chapter n; §n.x denotes section x of chapter n; §n.x-y denotes sections x through y (inclusive) of chapter n.
==== General course policies ====


The required and recommended tasks are meant to be completed '''before class''' and will typically be necessary to complete the problem sets for each day.
[[User:Aaronshaw/Classroom_policies|General policies]] on a wide variety of topics including classroom equity, attendance, academic integrity, accommodations, late assignments, and more are provided [[User:Aaronshaw/Classroom_policies|on Aaron's class policies page]]. Below are some policy statements specific to this course and quarter.


=== Day 1: Monday January 4: Intro and setup ===
==== Teaching and learning in a pandemic ====


'''Class material:'''
The Covid-19 pandemic will impact this course in various ways, some of them obvious and tangible and others harder to pin down. On the obvious and tangible front, we have things like a mix of remote and (a)synchronous instruction, the fact that many of us will not be anywhere near campus or each other this year, and the unusual academic calendar. These will reshape our collective "classroom" experience in major ways.


* [[/Day 1 session plan]]
On the "harder to pin down" side, many of us may experience elevated levels of exhaustion, stress, uncertainty and/or distraction. We may need to provide unexpected support to family, friends, or others in our communities. I have personally experienced all of these things at various times over the past six months and I expect that some of you have too. It is a difficult time.


'''Required tasks:'''
I believe it is important to acknowledge these realities of the situation and create the space to discuss and process them in the context of our class throughout the quarter. As your instructor and colleague, I commit to do my best to approach the course in an adaptive, generous, and empathetic way. I will try to be transparent and direct with you throughout—both with respect to the course material as well as the pandemic and the university's evolving response to it. I ask that you try to extend a similar attitude towards everyone in the course. When you have questions, feedback, or concerns, please try to share them in an appropriate way. If you require accommodations of any kind at any time (directly related to the pandemic or not), please contact the teaching team.


* Read this syllabus, discuss any questions/concerns with the teaching team.
==== Expectations for synchronous remote sessions ====
* Confirm course registration and access to [https://www.openintro.org/book/os/ the textbook] (PDF download available for $0 and b&w paperbacks for $20) as well as any software and web services you'll need for the course (Discord, Canvas, this wiki, R, RStudio). Discord invites will be sent via email.


=== Day 2: Wednesday January 6: Data and R ===
The following are some baseline expectations for our synchronous remote class sessions. I expect that these can and will evolve. Please feel free to ask questions, suggest changes, or raise concerns during the quarter. I welcome all input.
* All members of the class are expected to create a supportive and welcoming environment that is respectful of the conditions under which we are participating in this class.
* All members of the class are expected to take reasonable steps to create an effective teaching/learning environment for themselves and others.


'''Class material:'''
And here are suggested protocols for any video/audio portions of our class:
* [[/Day 2 session plan]]
* Please mute your microphone whenever you're not speaking and learn to use [https://en.wikipedia.org/wiki/Push-to-talk "push-to-talk"] if/when possible.
* Video is optional for all students at all times, although if you're willing/able to keep the instructor company in the video channel that would be nice.
* If you need to excuse yourself at any time and for any reason you may do so.
* Children, family, pets, roommates, and others with whom you may share your workspace are welcome to join our class as needed.


'''Required readings and resources:'''
==== Syllabus revisions ====
* Read Diez, Çetinkaya-Rundel, and Barr: §1.1-1.3 (Introduction to data)


'''Recommended readings and resources:'''
This syllabus will be a dynamic document that will evolve throughout the quarter. Although the core expectations are fixed, the details will shift. As a result, please keep in mind the following:
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM6pZ76FD3NoCvvgkj_p-dE8 lecture materials for §1.1-3 (Videos 1-4 in the playlist)].


'''Homework:'''
# '''Assignments and readings are ''frozen'' 1 week before they are due.''' I will not add readings or assignments less than one week before they are due. If I forget to add something or fill in a "To Be Determined" less than one week before it's due, it is dropped. If you plan to read or work more than one week ahead, contact me first.
* Complete '''Problem set 2''': exercises from OpenIntro §1: (1.6, 1.9, 1.10, 1.16, 1.21, 1.40, 1.42, 1.43). Remember that solutions to odd-numbered problems are in the book!
# '''Substantial changes to the syllabus or course materials will be announced.''' Please closely monitor your email and/or [https://canvas.northwestern.edu the announcements section on the course website on Canvas]. When I make changes, these changes will be recorded in [https://wiki.communitydata.science/index.php?title=Statistics_and_Statistical_Programming_(Fall_2020)&action=history  the edit history of this page] so that you can track what has changed. I will also do my best to summarize these changes in an announcement on Canvas that will be emailed to everybody in the class.
* Problem set 2 worked solutions [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_02.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_02.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_02.pdf PDF]]
# '''The course design may adapt throughout the quarter.''' As this is a new format for this course, I may iterate and prototype course design elements rapidly along the way. To this end, I will ask you for voluntary anonymous feedback — especially toward the beginning of the quarter. Please let me know what is working and what can be improved. In the past, I have made many adjustments based on this feedback and I expect to do so again.


=== Day 3: Monday January 11: Numerical and categorical data  ===
==== Statistics and power ====


'''Class material:'''
The subject matter of this course—statistics and statistical programming—has historical and present-day affinities with a variety of oppressive ideologies and projects, including white supremacy, discrimination on the basis of gender and sexuality, state violence, genocide, and colonialism. It has also been used to challenge and undermine these projects in various ways. I will work throughout the quarter to acknowledge and represent these legacies accurately, at the same time as I also strive to advance equity, inclusion, and justice through my teaching practice, the selection of curricular materials, and the cultivation of an inclusive classroom environment. Please see my [[User:Aaronshaw/Classroom_policies|general classroom policies]] for more on some of these topics.


* [[/Day 3 session plan]]
== Schedule (with all the details) ==


'''Required tasks:'''
When reading the schedule below, the following key might help resolve ambiguity: §n denotes chapter n; §n.x denotes section x of chapter n; §n.x-y denotes sections x through y (inclusive) of chapter n.
* Read Diez, Çetinkaya-Rundel, and Barr: §2.1-2 (Numerical and categorical data).  
* The R tutorial webcast and RMarkdown tutorial that I've put together including:
** COM520 R Tutorial #1: What is R, RStudio, etc? [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-01.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-01.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-01.pdf PDF]]
** COM520 R Tutorial #2: Intro to R [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-02.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-02.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-02.pdf PDF]]


'''Recommended tasks:'''
=== Week 1 (9/17) ===
==== September 17: Intro and setup ====


* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM6pZ76FD3NoCvvgkj_p-dE8 Lecture materials for §2.1 and §2.2 (Videos 6-7 in the playlist)].
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w01_session_plan|Session plan]]
* Watch COM520 [https://uw.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=62fcbfcf-0b7e-4c6c-bf1a-aca500828992 R Tutorial #1 Screencast] on Panopto
* Watch COM520 [https://uw.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=4d383a06-3df5-4607-9b16-aca5008289be R Tutorial #2 Screencast] on Panopto
* If you want additional material that provides an introduction to R, these are great resources:
** Modern Dive [https://moderndive.netlify.app/index.html Statistical inference via data science] Chapter 1: [https://moderndive.netlify.app/1-getting-started.html Getting started with R].
** [https://rladiessydney.org/courses/ryouwithme/ RYouWithMe] course [https://rladiessydney.org/courses/ryouwithme/01-basicbasics-0/ "Basic basics" 1 & 2] (and maybe 3 if you're feeling ambitious)
** Verzani §1 (Getting started)
** Healy §2 (Get started)


'''Homework:'''
<blockquote>''Note: Aaron doesn't actually expect you to complete these before class on September 17''</blockquote>


* Complete [[/Problem set 3]] (OpenIntro questions & programming challenges)
'''Required'''
* Problem set 3 worked solutions [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_03.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_03.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_03.pdf PDF]]
* Read this syllabus, discuss any questions/concerns with the teaching team.
* Complete [https://apps3.cehd.umn.edu/artist/user/scale_select.html pre-course assessment of statistical concepts] (access code TBA via email). Estimated time to do this is 30-40 minutes. '''Submission deadline: September 18, 11:00pm Chicago time'''
* Confirm course registration and access to [https://www.openintro.org/book/os/ the textbook] (PDF download available for $0 and b&w paperbacks for $20) as well as any software and web services you'll need for the course (Zoom, Discord, Canvas, this wiki, R, RStudio). Discord invites will be sent via email.
* Complete [https://wiki.communitydata.science/Statistics_and_Statistical_Programming_(Fall_2020)/pset0 problem set #0]  


=== Day 4: Wednesday January 13: Applied data manipulation ===
'''Recommended'''
* Work through one (or more) introduction(s) to R and RStudio so that you can complete problem set 0. Here are several suggestions:
** '''From Aaron:''' The [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w01-R_tutorial.html Week 01 R tutorial] (you should also download the [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w01-R_tutorial.rmd .rmd version of the tutorial] that you can open and read/edit in RStudio). These are accompanied by the R and RStudio intro screencasts ([https://communitydata.cc/~ads/teaching/2019/stats/screencasts/w01-s01-intro.webm Part 1] and [https://communitydata.cc/~ads/teaching/2019/stats/screencasts/w01-s02-intro.webm Part 2]) Aaron created for the 2019 version of the course.
** Modern Dive [https://moderndive.netlify.app/index.html Statistical inference via data science] Chapter 1: [https://moderndive.netlify.app/1-getting-started.html Getting started with R].
** [https://rladiessydney.org/courses/ryouwithme/ RYouWithMe] course [https://rladiessydney.org/courses/ryouwithme/01-basicbasics-0/ "Basic basics" 1 & 2] (and maybe 3 if you're feeling ambitious).
** Verzani §1 (Getting started).
** Healy §2 (Get started).


'''Class material:'''
=== Week 2 (9/22, 9/24) ===
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w02_session_plan|Session plans]]
==== September 22: Data and variables ====
'''Required'''
* Read Diez, Çetinkaya-Rundel, and Barr: §1.1-1.3 (Introduction to data).
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM6pZ76FD3NoCvvgkj_p-dE8 Lecture materials for §1.1-3 (Videos 1-4 in the playlist)].
* Complete '''exercises from OpenIntro §1:''' 1.6, 1.9, 1.10, 1.16, 1.21, 1.40, 1.42, 1.43 (and remember that solutions to odd-numbered problems are in the book!)
* Submit, review, and respond to questions or requests for discussion via Discord or some other means.


* [[/Day 4 session plan]]
==== September 24: Numerical and categorical data ====
'''Required'''
* Read Diez, Çetinkaya-Rundel, and Barr: §2.1-2 (Numerical and categorical data).
* Review [https://www.youtube.com/playlist?list=PLkIselvEzpM6pZ76FD3NoCvvgkj_p-dE8 Lecture materials for §2.1 and §2.2 (Videos 6-7 in the playlist)].
* Complete '''exercises from OpenIntro §2:''' 2.12, 2.13, 2.16, 2.20, 2.23, 2.30 (and remember that solutions to odd-numbered problems are in the book!)
* Submit, review, and respond to questions or requests for discussion via Discord or some other means.


'''Required tasks:'''
=== Week 3 (9/29, 10/1) ===
* COM520 R Tutorial #3 Intro to R [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-03.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-03.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-03.pdf PDF], [https://uw.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=c9ce270f-9192-4034-bcf7-acae0073b049 Screencast]]


'''Recommended tasks:'''
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w03_session_plan|Session plans]]
* Additional material from any of the recommended R learning resources suggested last week or elsewhere in the syllabus. In particular, you may find the ModernDive, RYouWithMe, Healy, and/or Wickham and Grolemund resources valuable.


'''Homework:'''
==== September 29: R fundamentals: Import, transform, tidy, and describe data ====
'''Required'''
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset1|problem set #1]] (due Monday, September 28 at 1pm Central)


* Complete [[/Problem set 4]] (programming challenges and statistical questions)
'''Recommended'''
* Problem set 4 worked solutions [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_04.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_04.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_04.pdf PDF]]
* [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w03-R_tutorial.html Week 3 R tutorial] (note that you can access .rmd or .pdf versions by replacing the suffix of the URL accordingly).
* Additional material from any of the recommended R learning resources suggested last week or elsewhere in the syllabus. In particular, you may find the ModernDive, RYouWithMe, Healy, and/or Wickham and Grolemund resources valuable.
<!---
'''Resources'''
--->


=== NO CLASS: Monday January 18: Martin Luther King Jr Day  ===
==== October 1: Probability ====
=== Day 5: Wednesday January 20: Probability and R fundamentals ===
'''Required'''
 
'''Required tasks:'''
* Read Diez, Çetinkaya-Rundel, and Barr: §3 (Probability).  
* COM520 R Tutorial #4: Additional R fundamentals [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-04.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-04.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-04.pdf PDF]]
'''Recommended tasks:'''
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5EgoOajhw83Ax_FktnlD6n&v=rG-SLQ2uF8U Probability introduction] and [https://www.youtube.com/watch?v=HxEz4ZHUY5Y&list=PLkIselvEzpM5EgoOajhw83Ax_FktnlD6n&index=2 Probability trees] OpenIntro lectures (just videos 1 and 2 in the playlist).
* Watch COM520 [https://uw.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=25a9cdb2-0dcd-493e-b031-acb3004215b5 R Tutorial #4.1 Screencast] on Panopto
* Complete '''exercises from OpenIntro §3:''' 3.12, 3.15, 3.22, 3.28, 3.34, 3.38
* Watch COM520 [https://uw.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=02bae034-911a-4ddf-85b3-acb300886150 R Tutorial #4.2 Screencast] on Panopto


'''Resources'''
* [https://seeing-theory.brown.edu/index.html#secondPage Seeing Theory §1-2 (Basic Probability and Compound Probability)]


'''Homework:'''
=== Week 4 (10/6, 10/8) ===
* Complete [[/Problem set 5]] (OpenIntro exercises & programming challenges)
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w04_session_plan|Session plans]]
* Problem set 5 worked solutions [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_05-pt1.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_05-pt1.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/worked_solutions/worked_solutions-pset_05-pt1.pdf PDF]]


=== Day 6: Monday January 25: Distributions ===
==== October 6: Emotional contagion and more advanced R fundamentals: import, tidy, transform, and simulate data; write functions ====
'''Required'''
* Read the paper below as well as the attendant [https://www.pnas.org/content/111/29/10779.1 "Expression of editorial concern"] and [https://www.pnas.org/content/111/29/10779.2 "Correction"] that were subsequently appended to it.
:Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. 2014. “Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks.” ''Proceedings of the National Academy of Sciences'' 111(24):8788–90. [[http://www.pnas.org/content/111/24/8788.full Open access]]
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset2|problem set #2]] (due Monday, October 5 at 1pm CT)


<!-- '''Class material:'''
'''Recommended'''
* [[/Day 6 session plan]] -->
* [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w04-R_tutorial.html Week 4 R tutorial] (as usual, also available as .rmd or .pdf)


'''Required tasks:'''
==== October 8: Distributions ====
'''Required'''
* Read Diez, Çetinkaya-Rundel, and Barr: §4.1-3 (Normal and binomial distributions).  
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM6V9h55s0l9Kzivih9BUWeW&v=S_p5D-YXLS4 normal and binomial distributions] OpenIntro lectures (videos 1-3 in the playlist).
* Complete '''exercises from OpenIntro §4:''' 4.4, 4.6, 4.15, 4.22


'''Recommended tasks:'''
'''Resources'''
 
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM6V9h55s0l9Kzivih9BUWeW&v=S_p5D-YXLS4 normal and binomial distributions] OpenIntro lectures (videos 1-3 in the playlist)
* [https://seeing-theory.brown.edu/index.html#secondPage/chapter3 Seeing Theory §3 (Probability distributions)]


'''Homework:'''
==== October 9: [[#Research project plan and dataset identification|Research project plan and dataset identification]] due by 5pm CT ====
* Go back and complete any questions from [[/Problem set 5]] that you were not able to get last time.
*'''Submit via [https://canvas.uw.edu/courses/1434003/assignments Canvas]''' (due by 5pm CT)
* Complete '''Problem set 6''': exercises from OpenIntro §4: 4.4, 4.6, 4.15, 4.22
 
=== Day 7: Wednesday January 27: Descriptive analysis and visualization ===
 
<!-- '''Class material:'''
* [[/Day 7 session plan]]
-->
'''Required tasks:'''
* COM520 R Tutorial #5: Visualization using ''ggplot2'' [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-05.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-05.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-05.pdf PDF]]


'''Homework:'''
=== Week 5 (10/13, 10/15) ===
* Complete [[/Problem set 7]]
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w05_session_plan|Session plans]]
==== October 13: Descriptive analysis and visualization of data ====
'''Required'''
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset3|problem set #3]] (due Monday, October 12 at 1pm CT)


=== Day 8: Monday February 1: Foundations for inference ===
'''Recommended'''
* [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w05-R_tutorial.html Week 5 R tutorial] and [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w05a-R_tutorial.html Week 5 R tutorial supplement] (both, as usual, also available as .rmd or .pdf).


<!--'''Class material:'''
==== October 15: Foundations for (frequentist) inference ====
* [[/Day 8 session plan]]
'''Required'''
-->
'''Required tasks:'''
* Read Diez, Çetinkaya-Rundel, and Barr: §5 (Foundations for inference).  
* Watch [https://www.youtube.com/watch?v=oLW_uzkPZGA&list=PLkIselvEzpM4SHQojH116fYAQJLaN_4Xo foundations for inference] (videos 1-3 in the playlist) OpenIntro lectures.
* Complete [https://www.openintro.org/book/stat/why05/ Why .05?] OpenIntro video/exercise.
* Complete '''exercises from OpenIntro §5:''' 5.4, 5.8, 5.10, 5.17, 5.30, 5.35, 5.36


'''Recommended tasks:'''
'''Resources'''
* Read Kelly M., [https://rss.onlinelibrary.wiley.com/doi/pdf/10.1111/j.1740-9713.2013.00693.x Emily Dickinson and monkeys on the stair Or: What is the significance of the 5% significance level?] ''Significance'' 10:5. 2013.
* Kelly M., [https://rss.onlinelibrary.wiley.com/doi/pdf/10.1111/j.1740-9713.2013.00693.x Emily Dickinson and monkeys on the stair Or: What is the significance of the 5% significance level?] ''Significance'' 10:5. 2013.
* [https://seeing-theory.brown.edu/index.html#secondPage/chapter4 Seeing Theory §4 (Frequentist Inference)]
* Watch [https://www.youtube.com/watch?v=oLW_uzkPZGA&list=PLkIselvEzpM4SHQojH116fYAQJLaN_4Xo foundations for inference] (videos 1-3 in the playlist) OpenIntro lectures.
'''Homework:'''
* Complete '''Problem set 8''': exercises from OpenIntro §5: 5.4, 5.8, 5.10, 5.17, 5.30, 5.35, 5.36
=== Day 9: Wednesday February 3: Reinforced foundations for inference ===
'''Required tasks:'''
* COM520 R Tutorial #6: Distributions in R and more [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-06.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-06.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-06.pdf PDF], [https://uw.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=4d5cdd94-0e2a-4c21-b912-acc301383f9b Screencast]]
* Read Reinhart, §1. {{avail-uw|1=https://alliance-primo.hosted.exlibrisgroup.com/primo-explore/fulldisplay?docid=CP71226818410001451&context=L&vid=UW&lang=en_US&search_scope=all&adaptor=Local%20Search%20Engine&tab=default_tab&query=any,contains,statistics%20done%20wrong}}
* Read the following paper (it will be familiar to those of you in COM501): Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. 2014. “Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks.” ''Proceedings of the National Academy of Sciences'' 111 (24): 8788–90. https://doi.org/10.1073/pnas.1320040111. {{avail-uw|https://doi.org/10.1073/pnas.1320040111}}


'''Recommended tasks:'''
=== Week 6 (10/20, 10/22) ===
* Check out [https://gallery.shinyapps.io/CLT_mean/ OpenIntro Central limit theorem for means demo].
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w06_session_plan|Session plans]]
==== October 20: Reinforced foundations for inference ====
'''Required'''
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset4|problem set #4]] 
* Read Reinhart, §1.
* Revisit the Kramer et al. (2014) paper we read a few weeks ago:
:Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. 2014. “Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks.” ''Proceedings of the National Academy of Sciences'' 111(24):8788–90. [[http://www.pnas.org/content/111/24/8788.full Open access]]


'''Homework:'''
==== October 22: Inference for categorical data ====
* Complete [[/Problem set 9]]
'''Required'''
 
=== Day 10: Monday February 8: Inference for categorical data ===
 
'''Required tasks:'''
* Read Diez, Çetinkaya-Rundel, and Barr: §6 (Inference for categorical data).  
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5Gn-sHTw1NF0e8IvMxwHDW&v=_iFAZgpWsx0 inference for categorical data] (videos 1-3 in the playlist) OpenIntro lectures.
* Complete '''exercises from OpenIntro §6:''' 6.10, 6.16, 6.22, 6.30, 6.40 (just parts a and b; part c gets tedious)


'''Recommended tasks:'''
'''Resources'''
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5Gn-sHTw1NF0e8IvMxwHDW&v=_iFAZgpWsx0 inference for categorical data] (videos 1-3 in the playlist) OpenIntro lectures.
* [https://gallery.shinyapps.io/CLT_prop/ OpenIntro Central limit theorem for proportions demo].


'''Homework:'''
=== Week 7 (10/27, 10/29) ===
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w07_session_plan|Session plans]]
==== October 27: Applied inference for categorical data ====
'''Required'''
* Read Reinhart, §4 and §5 (both are quite short).
* Skim the following (all are referenced in the problem set)
**  Aronow PM, Karlan D, Pinson LE. (2018). The effect of images of Michelle Obama’s face on trick-or-treaters’ dietary choices: A randomized control trial. PLoS ONE 13(1): e0189693. [https://doi.org/10.1371/journal.pone.0189693 https://doi.org/10.1371/journal.pone.0189693]
** Buechley, Leah and Benjamin Mako Hill. 2010. “LilyPad in the Wild: How Hardware’s Long Tail Is Supporting New Engineering and Design Communities.” Pp. 199–207 in ''Proceedings of the 8th ACM Conference on Designing Interactive Systems.'' Aarhus, Denmark: ACM. [[https://mako.cc/academic/buechley_hill_DIS_10.pdf PDF available on Hill's personal website]]
** Shaw, Aaron and Yochai Benkler. 2012. A tale of two blogospheres: Discursive practices on the left and right. ''American Behavioral Scientist''. 56(4): 459-487. [[https://doi.org/10.1177%2F0002764211433793 available via NU libraries]]
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset5|problem set #5]]
'''Resources'''
* [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w06-R_tutorial.html Week 06 R tutorial] (it's very short!)


* Complete '''Problem set 10''': exercises from OpenIntro §6: 6.10, 6.16, 6.22, 6.30, 6.40 (just parts a and b; part c gets tedious)
==== October 29: Inference for numerical data (part 1) ====
'''Required'''
* Read Diez, Çetinkaya-Rundel, and Barr: §7.1-3 (Inference for numerical data: differences of means).
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5G3IO1tzQ-DUThsJKQzQCD&v=uVEj2uBJfq0 inference for numerical data] (videos 1-4 in the playlist) OpenIntro lectures (and featuring one of the textbook authors!).
* Complete '''exercises from OpenIntro §7:''' 7.12, 7.24, 7.26


=== Day 11: Wednesday February 10: Applied inference for categorical data ===
'''Resources'''
* [https://gallery.shinyapps.io/CLT_mean/ OpenIntro Central limit theorem for means demo].


'''Required tasks:'''
==== October 30: [[#Research project planning document|Research project planning document]] due 5pm CT====
* Submit via [https://canvas.uw.edu/courses/1434003/assignments/ Canvas] (due by 5pm CT)


* COM520 R Tutorial #7: Categorical data [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-07.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-07.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-07.pdf PDF]]
=== Week 8 (11/3, 11/5) ===
* Read Reinhart, §4 and §5 (both are quite short).
==== November 3: U.S. election day (no class meeting) ====
* Skim the following (all are referenced in the problem set)
**  Aronow PM, Karlan D, Pinson LE. (2018). The effect of images of Michelle Obama’s face on trick-or-treaters’ dietary choices: A randomized control trial. ''PLoS ONE'' 13(1): e0189693. https://doi.org/10.1371/journal.pone.0189693. {{avail-free|https://doi.org/10.1371/journal.pone.0189693}}
** Buechley, Leah and Benjamin Mako Hill. 2010. “LilyPad in the Wild: How Hardware’s Long Tail Is Supporting New Engineering and Design Communities.” Pp. 199–207 in ''Proceedings of the 8th ACM Conference on Designing Interactive Systems.'' Aarhus, Denmark: ACM. {{avail-free|https://mako.cc/academic/buechley_hill_DIS_10.pdf}}


'''Homework:'''
==== November 4: Interactive self-assessment due ====
* Please submit results [https://canvas.uw.edu/courses/1434003/assignments/799630 (via Canvas)] [FIXME] from the [https://communitydata.science/~ads/teaching/2020/stats/assessment/interactive_assessment.rmd interactive self-assessment] by 5pm CT.


* Complete [[/Problem set 11]]
==== November 5: Inference for numerical data (part 2) ====
'''Required'''
* Read Diez, Çetinkaya-Rundel, and Barr: §7.4-5 (Inference for numerical data: power calculations, ANOVA, and multiple comparisons).
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5G3IO1tzQ-DUThsJKQzQCD&v=uVEj2uBJfq0 inference for numerical data] (videos 4-8 in the playlist) OpenIntro lectures (and featuring one of the textbook authors!).
* Complete '''exercises from OpenIntro §7:''' 7.42, 7.44, 7.46


=== NO CLASS: Monday February 15: Presidents' Day ===
'''Resources'''
=== Day 12: Wednesday February 17: Inference for numerical data (t-tests and ANOVA) ===
* [https://www.openintro.org/go/?id=stat_better_understand_anova&referrer=/book/os/index.php OpenIntro supplement on ANOVA calculations] (useful if you think you'll be doing more ANOVAs).
<!--'''Class material:'''
This is a combo of two days, basically...
* [[/Day 12 session plan]] -->


'''Required tasks:'''
=== Week 9 (11/10, 11/12) ===
* Read Diez, Çetinkaya-Rundel, and Barr: §7.1-5 (Inference for numerical data: differences of means; power calculations, ANOVA, and multiple comparisons).
==== November 10: Applied inference for numerical data (t-tests, power analysis, ANOVA) ====
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w09_session_plan|Session plans]]


'''Recommended tasks:'''
'''Required'''
* [https://www.openintro.org/go/?id=stat_better_understand_anova&referrer=/book/os/index.php OpenIntro supplement on ANOVA calculations] (particularly useful if you think you'll be doing more ANOVAs).
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset6|problem set #6]]
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5G3IO1tzQ-DUThsJKQzQCD&v=uVEj2uBJfq0 inference for numerical data] (videos 1-8 in the playlist) OpenIntro lectures (and featuring one of the textbook authors!).
* COM520 R Tutorial #8: t-tests and ANOVA [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-08.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-08.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-08.pdf PDF], [https://uw.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=70b906b1-88fc-475c-9541-acd100247d13 Screencast]]


'''Homework:'''
'''Resources'''
* Complete [[/Problem set 12]]
* [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w09-R_tutorial.html Week 09 R tutorial]
 
=== Day 13: Monday February 22: Linear regression ===
<!-- '''Class material:'''
* [[/Day 13 session plan]] -->


'''Required tasks:'''
==== November 12: Linear regression ====
'''Required'''
* Read Diez, Çetinkaya-Rundel, and Barr: §8 (Linear regression).
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM63ikRfN41DNIhSgzboELOM linear regression] (videos 1-4 in the playlist) OpenIntro lectures.
* Read [https://www.openintro.org/go/?id=stat_more_inference_for_linear_regression&referrer=/book/os/index.php More inference for linear regression] (OpenIntro supplement).
* Complete '''exercises from OpenIntro §8:''' 8.6, 8.36, 8.40, 8.44
* Complete '''exercises from OpenIntro supplement:''' 4 and 5 (answers provided in the supplement).
'''Resources'''
* [https://seeing-theory.brown.edu/index.html#secondPage/chapter6 Seeing Theory §6 (Regression analysis)]


'''Recommended tasks:'''
=== Week 10 (11/17, 11/19) ===
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM63ikRfN41DNIhSgzboELOM linear regression] (videos 1-4 in the playlist) OpenIntro lectures.
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w10_session_plan|Session plans]]
* Read [https://seeing-theory.brown.edu/index.html#secondPage/chapter6 Seeing Theory §6 (Regression analysis)]
==== November 17: Applied linear regression ====
'''Required'''
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset7|Problem set #7]]


'''Homework:'''
'''Resources'''
* Complete [[/Problem set 13]]
* [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/w10-R_tutorial.html Week 10 R tutorial]
 
==== November 19: Multiple and logistic regression ====
=== Day 14: Wednesday February 24: Applied linear regression ===
'''Required'''
<!--
'''Class material:'''
* [[/Day 14 session plan]]
-->
 
'''Required tasks:'''
 
* COM520 R Tutorial #9: Linear regression [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-09.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-09.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-09.pdf PDF], [https://uw.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=3fd1ff6d-ba80-46fd-b7e1-acd100247d42 Screencast]]
 
'''Homework:'''
* Complete [[/Problem set 14]]
 
=== Day 15: Monday March 1: Multiple and logistic regression ===
<!-- '''Class material:'''
 
* [[/Day 15 session plan]]-->
 
'''Required tasks:'''
* Read Diez, Çetinkaya-Rundel, and Barr: §9 (Multiple and logistic regression). (Skim §9.2-9.4)  
** '''Disclaimer:''' Mako doesn't like §9.2-9.3, but it should be useful to understand and discuss them, so we'll do that.  
** '''Disclaimer:''' Aaron doesn't like §9.2-9.3, but it should be useful to understand and discuss them, so we'll do that.
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM5f1HYzIjFt52SD4izsJ2_I multiple and logistic regression] (videos 1-4 in the playlist) OpenIntro lectures.
* Read [https://www.openintro.org/go/?id=stat_interaction_terms&referrer=/book/os/index.php Interaction terms] (OpenIntro supplement).
* Read [https://www.openintro.org/go/?id=stat_nonlinear_relationships&referrer=/book/os/index.php Fitting models for non-linear trends] (OpenIntro supplement).
* Complete '''exercises from OpenIntro §9:''' 9.4, 9.13, 9.16, 9.18


'''Recommended tasks:'''
'''Resources'''
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM5f1HYzIjFt52SD4izsJ2_I multiple and logistic regression] (videos 1-4 in the playlist) OpenIntro lectures.


'''Homework:'''
=== Week 11 (11/24) ===
* Complete '''Problem set 16''': exercises from OpenIntro §9: 9.4, 9.13, 9.16, 9.18
==== November 24: Applied multiple and logistic regression ====
 
;[[Statistics_and_Statistical_Programming_(Fall_2020)/w11_session_plan|Session plans]]
'''Required tasks:'''
'''Required'''
 
* Complete [[Statistics_and_Statistical_Programming_(Fall_2020)/pset8|Problem set #8]]
=== Day 16: Wednesday March 3: Applied multiple and logistic regression ===
'''Resources'''
<!-- '''Class material:'''
* Mako Hill created (and Aaron updated) a brief tutorial on [https://communitydata.science/~ads/teaching/2020/stats/r_tutorials/logistic_regression_interpretation.html interpreting logistic regression coefficients with examples in R]
* [[/Day 16 session plan]] -->
 
'''Required tasks:'''
* COM520 R Tutorial #10: Tutorial on interpreting logistic regression in R [[https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-10.html HTML], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-10.rmd RMarkdown], [https://www.dropbox.com/preview/COM520-shared_files-UW-2021-Q1/r_tutorials/com520-r_tutorial-10.pdf PDF]]
 
'''Homework:'''
* Complete [[/Problem set 16]]
 
=== Day 17: Monday March 8:  Consulting Day ===


We'll forgo meeting as a group. Instead, I will meet one-on-one with each of you to work through challenges you're having with your own projects.
=== Week 12+ ===


* COM520 R Tutorial #11: Bonus material {{forthcoming}} <!-- logistic_regression_interpretation.html -->
==== December 3: [[#Research project presentation|Research project presentation]] due by 5pm CT ====
 
'''[https://canvas.uw.edu/courses/1434003/discussion_topics/856868 Post your video via this "Discussion" on Canvas]'''. Please view and provide constructive feedback on others' videos!
=== Day 18: Wednesday March 10: Final Presentations ===
<!--
'''Class material:'''
* [[/Day 18 session plan]]
-->
 
<strike>'''Post your video via this "Discussion" on Canvas]''' {{forthcoming}} — Please view and provide constructive feedback on other's videos!


* '''Post videos directly to the "Discussion."''' The Canvas text editor has an option to upload/record a video. That's what you want.
* '''Please remember not to over-work/think this.''' I mentioned this in class, but just to reiterate, the focus of this assignment should not be your video editing skills. Please do what you can to record and convey your ideas clearly without devoting insane hours to creating the perfect video.  
* '''Some resources for recording presentations:''' There are a bunch of ways you might record/share your video. Some ideas include using the embedded media recorder in Canvas (!) that can record with your webcam (maybe attach a few visuals to accompany this?); recording a "meeting" with yourself in Zoom; and "Panopto," a piece of high-end video recording, sharing, and editing software that UW licenses for campus use. Here are some pointers:
* '''Some resources for recording presentations:''' There are a bunch of ways you might record/share your video. Some ideas include using the embedded media recorder in Canvas (!) that can record with your webcam (maybe attach a few visuals to accompany this?); recording a "meeting" with yourself in Zoom; and "Panopto," a piece of high-end video recording, sharing, and editing software that NU licenses for campus use. Here are some pointers:
** You should be able to use your UW zoom account to create a zoom meeting, record your meeting (in which you deliver your presentation and share your screen with any visuals), and then share a link to the recording via the "Recordings" item in the left-hand menu of your Zoom account page.
** NU has a "digital learning resource hub" which provides some [https://digitallearning.northwestern.edu/resource-hub#for-students resources for students]. The first item in that list has pointers for recording yourself and posting to Canvas and includes info about the Canvas media recorder and Panopto.
** You should be able to use your NU zoom account to create a zoom meeting, record your meeting (in which you deliver your presentation and share your screen with any visuals), and then share a link to the recording via the "Recordings" item in the left-hand menu of your [https://northwestern.zoom.us/ https://northwestern.zoom.us/] account page.
** If nothing works, please get in touch.
</strike>
Since the class is small, we'll meet and give presentations during class. Everyone should plan to give a ~15-minute conference-style presentation.
== Special Notes ==
== Teaching and learning in a pandemic ==
The COVID-19 pandemic will impact this course in various ways, some of them obvious and tangible and others harder to pin down. On the obvious and tangible front, we have things like a mix of remote, synchronous, and asynchronous instruction and the fact that many of us will not be anywhere near campus or each other this year. These will reshape our collective "classroom" experience in major ways.
On the "harder to pin down" side, many of us may experience elevated levels of exhaustion, stress, uncertainty and distraction. We may need to provide unexpected support to family, friends, or others in our communities. I have personally experienced all of these things at various times over the past six months and I expect that some of you have too. It is a difficult time.
I believe it is important to acknowledge these realities of the situation and create the space to discuss and process them in the context of our class throughout the quarter. As your instructor and colleague, I commit to do my best to approach the course in an adaptive, generous, and empathetic way. I will try to be transparent and direct with you throughout—both with respect to the course material as well as the pandemic and the university's evolving response to it. I ask that you try to extend a similar attitude towards everyone in the course. When you have questions, feedback, or concerns, please try to share them in an appropriate way. If you require accommodations of any kind at any time (directly related to the pandemic or not), please contact the teaching team.
:<div style="font-size: 80%; font-style: italic">This text is borrowed and adapted from [[Statistics and Statistical Programming (Fall 2020)|Aaron Shaw's statistics course]].</div>
== Expectations for synchronous remote sessions ==
The following are some baseline expectations for our synchronous remote class sessions. I expect that these can and will evolve. Please feel free to ask questions, suggest changes, or raise concerns during the quarter. I welcome all input:
* All members of the class are expected to create a supportive and welcoming environment that is respectful of the conditions under which we are participating in this class.
* All members of the class are expected to take reasonable steps to create an effective teaching/learning environment for themselves and others.
And here are suggested protocols for any video/audio portions of our class:
* Please mute your microphone whenever you're not speaking and learn to use [https://en.wikipedia.org/wiki/Push-to-talk "push-to-talk"] if/when possible ([https://www.howtogeek.com/662101/how-to-enable-push-to-talk-in-discord/ Discord supports the feature]).
* Video is optional for all students at all times, although if you're willing/able to keep the instructor company in the video channel, it would be nice.
* If you need to excuse yourself at any time and for any reason you may do so.
* Children, family, pets, roommates, and others with whom you may share your workspace are welcome to join our class as needed.
== Statistics and power ==
The subject matter of this course—statistics and statistical programming—has historical and present-day affinities with a variety of oppressive ideologies and projects, including white supremacy, discrimination on the basis of gender and sexuality, state violence, genocide, and colonialism. It has also been used to challenge and undermine these projects in various ways. I will work throughout the quarter to acknowledge and represent these legacies accurately, at the same time as I also strive to advance equity, inclusion, and justice through my teaching practice, the selection of curricular materials, and the cultivation of an inclusive classroom environment.
== Administrative Notes ==
=== Your Presence in Class ===
As detailed in the [[#Assignments|section on assignments]] and in [[User:Benjamin Mako Hill/Assessment|my detailed page on assessment]], your homework in the class is to prepare for discussion of problem sets, which means that presence is an important way that I will assess learning. Obviously, you must be in class in order to participate. In the event of an absence, you are responsible for obtaining class notes, handouts, assignments, etc.
<!-- === Devices in Class ===
Electronic devices (e.g., phones, tablets, laptops) are '''not''' going to be permitted in class. If you have a documented need to use a device, please contact me ahead of time to let me know. If you do get permission to use a device, I will ask you to sit in the very back of the classroom.
The goal of this policy is to help you stay focused and avoid distractions for yourself and your peers in the classroom. This is really important and turns out to be much more difficult in the presence of powerful computing devices with brightly glowing screens and fast connections to the Internet. For more on the rationale behind this policy, please read [https://medium.com/@cshirky/why-i-just-asked-my-students-to-put-their-laptops-away-7f5f7c50f368 Clay Shirky’s thoughtful discussion of his approach to this issue].
Of course, we will discuss assignments and topics that involve referring to things online. Toward that end, you might find it convenient to bring a laptop or tablet to class. If you want to look something up on your device outside of a time I have clearly pointed out as device-allowed, please ask me. I will always point out explicitly in class if it's OK to use devices.
'''Except during these parts of class — which  — I ask that you refrain from using your laptops, tablets, phones, and pretty much any (digital) device with a screen.'''
-->
=== Religious Accommodations ===
Washington state law requires that UW develop a policy for accommodation of student absences or significant hardship due to reasons of faith or conscience, or for organized religious activities. The UW’s policy, including more information about how to request an accommodation, is available at [https://registrar.washington.edu/staffandfaculty/religious-accommodations-policy/ Religious Accommodations Policy]. Accommodations must be requested within the first two weeks of this course using the [https://registrar.washington.edu/students/religious-accommodations-request/ Religious Accommodations Request form].
=== Student Conduct ===
The University of Washington Student Conduct Code (WAC 478-121) defines prohibited academic and behavioral conduct and describes how the University holds students accountable as they pursue their academic goals. Allegations of misconduct by students may be referred to the appropriate campus office for investigation and resolution. More information can be found online at https://www.washington.edu/studentconduct/
=== Safety ===
Call SafeCampus at 206-685-7233 anytime–no matter where you work or study–to anonymously discuss safety and well-being concerns for yourself or others. SafeCampus’s team of caring professionals will provide individualized support, while discussing short- and long-term solutions and connecting you with additional resources when requested.
=== Academic Dishonesty ===
The University takes academic integrity very seriously. Behaving with integrity is part of our responsibility to our shared learning community. If you’re uncertain about whether something is academic misconduct, ask us. We are willing to discuss questions you might have.
Acts of academic misconduct may include but are not limited to:
* Cheating (working collaboratively on quizzes/exams and discussion submissions, sharing answers and previewing quizzes/exams)
* Plagiarism (representing the work of others as your own without giving appropriate credit to the original author(s))
* Unauthorized collaboration (working with each other on assignments)
Concerns about these or other behaviors prohibited by the Student Conduct Code will be referred for investigation and adjudication by the College’s Director of Community Standards and Student Conduct.
=== Disability Resources ===
If you have already established accommodations with Disability Resources for Students (DRS), please communicate your approved accommodations to us at your earliest convenience so we can discuss your needs in this course.
If you have not yet established services through DRS, but have a temporary health condition or permanent disability that requires accommodations (conditions include, but are not limited to, mental health, attention-related, learning, vision, hearing, physical or health impacts), you are welcome to contact DRS at 206-543-8924 or uwdrs@uw.edu or disability.uw.edu. DRS offers resources and coordinates reasonable accommodations for students with disabilities and/or temporary health conditions. Reasonable accommodations are established through an interactive process between you, your instructor(s) and DRS. It is the policy and practice of the University of Washington to create inclusive and accessible learning environments consistent with federal and state law.
=== Other Student Support ===
Any student who has difficulty affording groceries or accessing sufficient food to eat every day, or who lacks a safe and stable place to live, and believes this may affect their performance in the course, is urged to contact the graduate program advisor for support. Furthermore, please notify the professors if you are comfortable in doing so. This will enable us to provide any resources that we may possess (adapted from Sara Goldrick-Rab). Please also note the student food pantry, Any Hungry Husky at the ECC.
=== VPN Notice ===


Students should ensure that they can access all Internet resources required for this course reliably and safely before registering for this course. Participation in this course requires students to access Internet resources that may not be accessible directly in some places outside of the UW campus. Specifically, students in this course will need to access UW resources, including Canvas and UW Libraries, which require users to log in with a UW NetID, and some external resources such as Zoom, Google Docs, YouTube, and/or eBook websites. For students who are off-campus and are in a situation where direct access to these required resources is not possible, UW IT recommends that students use the official UW VPN, called Husky OnNet VPN (see instructions below). However, students who are outside the US while taking this course should be aware that they may be subject to laws, policies and/or technological systems which restrict the use of any VPNs. UW does not guarantee students’ access to UW resources when students are off-campus, and [https://itconnect.uw.edu/work/appropriate-use/ students are responsible for their own compliance with all laws] regarding the use of Husky OnNet and all other UW resources.
==== December 4: Post-course assessment of statistical concepts due by 11pm CT ====
Complete the [https://apps3.cehd.umn.edu/artist/user/scale_select.html post-course assessment] (access code TBA via email). Submission deadline: December 4, 11:00pm Chicago time.


UW-IT provides the Husky OnNet VPN free for UW students [https://itconnect.uw.edu/connect/uw-networks/about-husky-onnet/use-husky-onnet/ via this link], and advises students to use it with the “All Internet Traffic” option enabled (see the [https://www.lib.washington.edu/help/connect/husky-onnet UW Libraries instructions] and UW-IT’s [https://itconnect.uw.edu/connect/uw-networks/about-husky-onnet/faqs/ FAQs regarding the Husky OnNet VPN]). Doing so will route all incoming and outgoing Internet traffic through UW servers while it is enabled.
==== December 10: [[#Research project paper|Research project paper]] due by 5pm CT ====
'''[https://canvas.uw.edu/courses/1434003/assignments/812317 Submit your paper, data, and code via Canvas].''' [FIXME]


== Credit and Notes ==


This syllabus has, in ways that should be obvious, borrowed and built on the [https://www.openintro.org/stat/index.php OpenIntro Statistics curriculum]. Many aspects of this course design extend from a version of [[Statistics_and_Statistical_Programming_(Winter_2017)|COM 521]] I taught in 2017 as well as versions of this course taught at Northwestern University in [[Statistics and Statistical Programming (Spring 2019)|Spring 2019]] and [[Statistics and Statistical Programming (Fall 2020)|Fall 2020]] by [[User:Aaronshaw|Aaron Shaw]].
This syllabus has, in ways that should be obvious, borrowed and built on the [https://www.openintro.org/stat/index.php OpenIntro Statistics curriculum]. Most aspects of this course design extend Benjamin Mako Hill's [[Statistics_and_Statistical_Programming_(Winter_2017)|COM 521 class]] from the University of Washington as well as a [[Statistics_and_Statistical_Programming_(Spring_2019)|prior iteration of the same course]] offered at Northwestern in Spring 2019.