Editing Statistics and Statistical Programming (Fall 2020)
From CommunityData
Latest revision | Your text | ||
Line 15: | Line 15: | ||
;'''Instructor:''' [http://aaronshaw.org Aaron Shaw] ([mailto:aaronshaw@northwestern.edu aaronshaw@northwestern.edu]) | ;'''Instructor:''' [http://aaronshaw.org Aaron Shaw] ([mailto:aaronshaw@northwestern.edu aaronshaw@northwestern.edu]) | ||
:Office Hours: Thursday 10am-12pm and by appointment | :Office Hours: Thursday 10am-12pm and by appointment | ||
: | :[[User:Aaronshaw/OH|Office hours signups and location information]] | ||
:'''Teaching Assistant:''' [http://nickmvincent.com Nick Vincent] ([mailto:nickvincent@u.northwestern.edu nickvincent@u.northwestern.edu]) | |||
:Office Hours: | ::Office Hours: I'll try to respond to any asynchronous questions in a timely fashion during "business hours" (9a-5p Central Time), and will also have OH by appointment. I'll also try to schedule some fixed time during which I'll hang out on a video call, hours TBA. | ||
:I am happy to try out alternative communication software for OH! | ::I'll likely use whatever conferencing platform we use for class sessions, but am happy to try out alternative communication software for OH! | ||
<br> | <br> | ||
Line 58: | Line 57: | ||
---> | ---> | ||
This course will proceed in a '''remote''' format that includes ''asynchronous'' and ''synchronous'' elements (more on those below). In general, the organization of the course adopts a "flipped" approach where | This course will proceed in a '''remote''' format that includes ''asynchronous'' and ''synchronous'' elements (more on those below). In general, the organization of the course adopts a "flipped" approach where you consume instructional materials on your own or in groups and we use synchronous meetings to answer questions, address challenges or concerns, work through solutions, and hold semi-structured discussions. | ||
The course introduces ''both'' basic statistical concepts and applications of those concepts through statistical programming. As a result, we will | The course introduces ''both'' basic statistical concepts and basic applications of those concepts through statistical programming. As a result, we will dedicate part of each week to a particular set of concepts and part of each week to applied data analysis and/or interpretation. A brief description of how I expect it all to work follows. We'll talk about it more during the first class session. | ||
====Asynchronous elements of the course==== | ====Asynchronous elements of the course==== | ||
These include all readings, recorded lectures/slides, tutorials | These include all readings, recorded lectures/slides, tutorials, problem sets, and other assignments. I expect you to complete (or at least attempt to complete!) these outside of our class meeting times. For nearly all of the "instructional" material introducing particular statistical concepts and techniques, you are expected to use the OpenIntro textbook and lecture materials created by the textbook authors. Please note that this means I will not deliver lectures during our class meetings. Please also note that this means you are responsible for coordinating your problem set groups and any collaborative work with other members of the class outside of our class meeting times. | ||
For nearly all of the "instructional" material introducing particular statistical concepts and techniques, you are | |||
====Synchronous elements of the course==== | ====Synchronous elements of the course==== | ||
The synchronous elements of the course will be the two weekly class meetings that will happen via video conference ( | The synchronous elements of the course will be the two weekly class meetings that will happen via video conference (platform TBD). These are scheduled to run for a maximum of 110 minutes. Each session will include multiple short breaks. | ||
We will use the class meetings to discuss and work through any questions or challenges you encounter in the materials assigned for that day. This means that I will ask you to submit completed problem sets and questions '''the night before each class meeting'''. Doing so will give the teaching team time to sift, sort, and organize your submissions into a hopefully-cohesive plan for each class session that is tailored to the specific questions and concerns you encounter in the material. | |||
<!--- | <!--- | ||
Line 117: | Line 99: | ||
* Verzani, John. 2014. ''Using R for Introductory Statistics, Second Edition''. 2 edition. Boca Raton: Chapman and Hall/CRC. ([https://en.wikipedia.org/wiki/Special:BookSources/978-1-4665-9073-1 Various Sources]; [https://www.amazon.com/Using-Introductory-Statistics-Second-Chapman/dp/1466590734/ref=mt_hardcover?_encoding=UTF8&me= Amazon]) | * Verzani, John. 2014. ''Using R for Introductory Statistics, Second Edition''. 2 edition. Boca Raton: Chapman and Hall/CRC. ([https://en.wikipedia.org/wiki/Special:BookSources/978-1-4665-9073-1 Various Sources]; [https://www.amazon.com/Using-Introductory-Statistics-Second-Chapman/dp/1466590734/ref=mt_hardcover?_encoding=UTF8&me= Amazon]) | ||
* Wickham, Hadley. 2010. ''ggplot2: Elegant Graphics for Data Analysis''. 1st ed. 2009. Corr. 3rd printing 2010 edition. New York: Springer. ([https://link.springer.com/book/10.1007%2F978-3-319-24277-4 Springer/NU Libraries]; [https://en.wikipedia.org/wiki/Special:BookSources/978-0-596-80915-7 Various Sources]) | * Wickham, Hadley. 2010. ''ggplot2: Elegant Graphics for Data Analysis''. 1st ed. 2009. Corr. 3rd printing 2010 edition. New York: Springer. ([https://link.springer.com/book/10.1007%2F978-3-319-24277-4 Springer/NU Libraries]; [https://en.wikipedia.org/wiki/Special:BookSources/978-0-596-80915-7 Various Sources]) | ||
There are also some invaluable non-textbook resources: | There are also some invaluable non-textbook resources: | ||
Line 131: | Line 112: | ||
* If you are planning to analyze large-scale data (i.e., data that won't fit in memory on your laptop) then you will want to sign up for a research allocation on Quest, which is Northwestern's high-performance computing cluster. Instructions on how to do that are [[Statistics_and_Statistical_Programming_(Spring_2019)/Quest_at_Northwestern|here]]. | * If you are planning to analyze large-scale data (i.e., data that won't fit in memory on your laptop) then you will want to sign up for a research allocation on Quest, which is Northwestern's high-performance computing cluster. Instructions on how to do that are [[Statistics_and_Statistical_Programming_(Spring_2019)/Quest_at_Northwestern|here]]. | ||
=== | === Assignments === | ||
The | The assignments in this class focus on applied statistical concepts, analysis, and interpretation. Unless otherwise noted, all assignments are due at the end of the day (i.e., 11:59pm on the day they are due). | ||
==== | ==== Weekly problem sets and participation ==== | ||
Each week I will post problem sets incorporating three kinds of questions: | |||
* '''Statistics questions''' about statistical concepts and | * '''Statistics questions''' about statistical concepts, principles, and interpretation. | ||
* '''Programming challenges''' that you should solve using R. | * '''Programming challenges''' that you should solve using R. | ||
* '''Empirical paper questions''' about other assigned readings. | * '''Empirical paper questions''' about other assigned readings. | ||
Some of these (usually just the statistics questions) will be taken from the textbooks and some will not. In general, we will cover the programming material and empirical papers in the first session of the week and the statistical concepts and principles in the second session. You will need to submit your solutions to the relevant questions ahead of the relevant class session. Details of exactly how this will work will be provided in the course schedule and we'll go over them during the first class. | |||
At the start of the course you will be assigned to a working group. This will be a group of 2-3 students (exact numbers will depend on the final enrollment) with whom you will meet outside of class time to discuss, complete, and/or review your problem sets as well as other assignments. The groups will rotate roughly every two weeks during the quarter to ensure that you get to work with different members of the class. The main idea is to support collaborative learning, peer teaching, and accountability. While the specifics of exactly when and how you work with your working group will largely be up to you, the teaching team will provide suggestions and a template that you can use as a starting point. | |||
Because randomness is extremely important in statistics, I may occasionally use a small R program to '''randomly assign''' different working groups to share and discuss their solutions to select questions during class sessions. These assignments will be announced at least a few days ahead of time so that the group has an opportunity to prepare. The idea here is to structure some participation in the synchronous sessions to ensure an equitable distribution of the responsibility to discuss questions, answers, points of confusion, and alternatives. | |||
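A minimal sketch of what such a random assignment could look like follows. It is written in Python purely for illustration (the course itself uses R), and the roster, function names, group size, question labels, and seed are all hypothetical rather than the teaching team's actual program.

```python
import math
import random


def assign_groups(students, group_size=3, seed=None):
    """Shuffle the roster and split it into groups whose sizes differ by at most one.

    All names and defaults here are illustrative assumptions.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    shuffled = list(students)
    if not shuffled:
        return []
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    rng.shuffle(shuffled)
    n_groups = math.ceil(len(shuffled) / group_size)
    # Striding through the shuffled roster keeps group sizes balanced.
    return [shuffled[i::n_groups] for i in range(n_groups)]


def pick_presenters(groups, questions, seed=None):
    """Map each question to a randomly chosen group, cycling if questions outnumber groups."""
    if not groups:
        raise ValueError("at least one group is required")
    rng = random.Random(seed)
    order = rng.sample(range(len(groups)), len(groups))  # random group order, no repeats
    return {q: groups[order[i % len(order)]] for i, q in enumerate(questions)}


# Hypothetical roster of seven students and two hypothetical question labels.
roster = [f"student_{i}" for i in range(1, 8)]
groups = assign_groups(roster, group_size=3, seed=2020)
for question, group in pick_presenters(groups, ["SQ1", "PC2"], seed=2020).items():
    print(question, "->", group)
```

Passing a seed is a design choice for transparency: anyone can rerun the draw and confirm that the same groups come out.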
For the programming challenges, you should submit code for your solutions (more on how in a moment) so we can walk through the material together. If you get completely stuck on a problem, that's okay, but please share whatever code you have so that you can tell us what you did and what you were thinking. | |||
=== Research project | Attendance in the synchronous portion of the class will be important to supporting your mastery of the material. Although the problem sets will not be assigned a letter grade, it is critical that you be present and able to discuss your answers to each of the questions. Your ability to do so will figure prominently in your participation grade for the course (see the section on grading and assessment below). | ||
==== Research project ==== | |||
As a demonstration of your learning in this course, you will design and carry out a quantitative research project, start to finish. This means you will all: | As a demonstration of your learning in this course, you will design and carry out a quantitative research project, start to finish. This means you will all: | ||
Line 165: | Line 147: | ||
''I strongly urge you'' to produce a project that will further your academic career outside of the class. There are many ways that this can happen. Some obvious options are to prepare a project that you can submit for publication, use as a pilot analysis that you can report in a grant or thesis proposal, and/or use to fulfill a degree requirement. | ''I strongly urge you'' to produce a project that will further your academic career outside of the class. There are many ways that this can happen. Some obvious options are to prepare a project that you can submit for publication, use as a pilot analysis that you can report in a grant or thesis proposal, and/or use to fulfill a degree requirement. | ||
There are several intermediate milestones | There are several intermediate milestones and deadlines to help you accomplish a successful research project. Unless otherwise noted, all deliverables should be submitted via Canvas. | ||
==== | ===== Project plan and dataset identification ===== | ||
;Due date: | ;Due date: TBA | ||
;Maximum length: 500 words (~1-2 pages) | ;Maximum length: 500 words (~1-2 pages) | ||
Line 180: | Line 162: | ||
''' Notes on finding a dataset ''' | |||
In order to complete your final project, you will each need a dataset. If you already have a dataset for the project you plan to conduct, great! If not, fear not! There are many datasets to draw from. Some ideas are below (please suggest others, provide updated links, or report problems). The teaching team will also be available to help you brainstorm/find resources if needed: | In order to complete your final project, you will each need a dataset. If you already have a dataset for the project you plan to conduct, great! If not, fear not! There are many datasets to draw from. Some ideas are below (please suggest others, provide updated links, or report problems). The teaching team will also be available to help you brainstorm/find resources if needed: | ||
Line 191: | Line 173: | ||
* Use the [http://scientificdata.isa-explorer.org/index.html ISA Explorer] to find datasets. Keep in mind the large majority of datasets it will search are drawn from the natural sciences. | * Use the [http://scientificdata.isa-explorer.org/index.html ISA Explorer] to find datasets. Keep in mind the large majority of datasets it will search are drawn from the natural sciences. | ||
* The City of Chicago has one of the best [https://data.cityofchicago.org/ data portal sites] of any municipality in the U.S. (and better than many federal agencies). There are also numerous administrative datasets released by other public entities (try searching!) that you might find inspiring. | * The City of Chicago has one of the best [https://data.cityofchicago.org/ data portal sites] of any municipality in the U.S. (and better than many federal agencies). There are also numerous administrative datasets released by other public entities (try searching!) that you might find inspiring. | ||
<!--- | |||
* <TODO fix/update accordingly> Set up a meeting with Jennifer Muilenburg, the Data Curriculum and Communications Librarian who runs [https://www.lib.washington.edu/digitalscholarship/services/data research data services at the UW libraries]. Her email is libdata@uw.edu. I have talked to her about this course and she is excited about meeting with you to help. | |||
--> | |||
* [http://fivethirtyeight.com FiveThirtyEight.com] has published a [https://cran.r-project.org/web/packages/fivethirtyeight/vignettes/fivethirtyeight.html GitHub repository and an R package] with pre-processed and cleaned versions of many of the datasets they use for articles published on their website. | * [http://fivethirtyeight.com FiveThirtyEight.com] has published a [https://cran.r-project.org/web/packages/fivethirtyeight/vignettes/fivethirtyeight.html GitHub repository and an R package] with pre-processed and cleaned versions of many of the datasets they use for articles published on their website. | ||
* If you are interested in studying online communities, there are some great resources for accessing data from Reddit, Wikipedia, and Stack Exchange. See [https://files.pushshift.io/reddit/ pushshift] for dumps of Reddit data, [https://meta.wikimedia.org/wiki/Research:Data here] for an overview of Wikipedia's data resources, and [https://data.stackexchange.com/ Stack Exchange's data portal]. | * If you are interested in studying online communities, there are some great resources for accessing data from Reddit, Wikipedia, and Stack Exchange. See [https://files.pushshift.io/reddit/ pushshift] for dumps of Reddit data, [https://meta.wikimedia.org/wiki/Research:Data here] for an overview of Wikipedia's data resources, and [https://data.stackexchange.com/ Stack Exchange's data portal]. | ||
==== | ===== Project planning document ===== | ||
;Due date: | ;Due date: TBA | ||
; | ;Maximum length: ~5 pages | ||
The project planning document is a shell/outline of an empirical quantitative research paper. Your planning document should have the following sections: (a) Rationale, (b) Objectives; (b.1) General objectives; (b.2) Specific objectives; (c) (Null) hypotheses; (d) Conceptual diagram and explanation of the relationship(s) you plan to test; (e) Measures; (f) Dummy tables/figures; (g) anticipated finding(s) and research contribution(s). Longer descriptions of each of these planning document sections (as well as a few others) can be found [[CommunityData:Planning document|on this wiki page]]. | The project planning document is a basic shell/outline of an empirical quantitative research paper. Your planning document should have the following sections: (a) Rationale, (b) Objectives; (b.1) General objectives; (b.2) Specific objectives; (c) (Null) hypotheses; (d) Conceptual diagram and explanation of the relationship(s) you plan to test; (e) Measures; (f) Dummy tables/figures; (g) anticipated finding(s) and research contribution(s). Longer descriptions of each of these planning document sections (as well as a few others) can be found [[CommunityData:Planning document|on this wiki page]]. | ||
I will also provide three example planning documents via our Canvas site (links to be updated for the 2020 edition of the course): | I will also provide three example planning documents via our Canvas site (links to be updated for the 2020 edition of the course): | ||
* [https://canvas.northwestern.edu/files/ | * [https://canvas.northwestern.edu/files/6908602/download?download_frd=1 One by public health researcher Mika Matsuzaki]. The first planning document I ever saw and still one of the best. It's missing a measures section. It's also focused on a research context that is probably very different from yours, but try not to get bogged down by that and imagine how you might map the structure of the document to your own work. | ||
* [https://canvas.northwestern.edu/files/ | * [https://canvas.northwestern.edu/files/6919735/download?download_frd=1 One by Jim Maddock] created as part of a qualifying exam early in 2019. Jim doesn't provide dummy tables or anticipated findings/contributions, but he has an especially phenomenal explanation of the conceptual relationships and processes he wants to test. | ||
* [https://canvas.northwestern.edu/files/ | * [https://canvas.northwestern.edu/files/6908606/download?download_frd=1 One provided as an appendix to Gerber and Green's excellent textbook, ''Field Experiments: Design, Analysis, and Interpretation'' (FEDAI)]. It's over-detailed and incredibly long for our purposes, but nevertheless an exemplary approach to planning empirical quantitative research in a careful, intentional way that is worthy of imitation. | ||
==== Research | ===== Research paper ===== | ||
;Paper due date: TBA | |||
;Paper due date: | |||
;Maximum length: 6000 words (~20 pages) | ;Maximum length: 6000 words (~20 pages) | ||
Line 236: | Line 206: | ||
I do not have strong preferences about the style or formatting guidelines you follow for the paper and its bibliography. However, ''your paper must follow a standard format'' (e.g., [https://cscw.acm.org/2019/submit-papers.html ACM SIGCHI CSCW format] or [https://www.apastyle.org/index APA 6th edition] ([https://templates.office.com/en-us/APA-style-report-6th-edition-TM03982351 Word] and [https://www.overleaf.com/latex/templates/sample-apa-paper/fswjbwygndyq LaTeX] templates)) that is applicable for a peer-reviewed journal or conference proceedings in which you might aim to publish the work (they all have formatting or submission guidelines published online and you should follow them). This includes the references. I also strongly recommend that you use reference management software like Zotero to handle your bibliographic sources. | I do not have strong preferences about the style or formatting guidelines you follow for the paper and its bibliography. However, ''your paper must follow a standard format'' (e.g., [https://cscw.acm.org/2019/submit-papers.html ACM SIGCHI CSCW format] or [https://www.apastyle.org/index APA 6th edition] ([https://templates.office.com/en-us/APA-style-report-6th-edition-TM03982351 Word] and [https://www.overleaf.com/latex/templates/sample-apa-paper/fswjbwygndyq LaTeX] templates)) that is applicable for a peer-reviewed journal or conference proceedings in which you might aim to publish the work (they all have formatting or submission guidelines published online and you should follow them). This includes the references. I also strongly recommend that you use reference management software like Zotero to handle your bibliographic sources. | ||
==== Human subjects research, IRB, and ethics ==== | ===== Project presentation ===== | ||
;Presentation due date: TBA | |||
;Maximum length: 7 minutes | |||
<!-- TODO revisit old presentations page to update/adapt | |||
[[Statistics_and_Statistical_Programming_(Spring_2019)/Final_project_presentations]] | |||
---> | |||
You will also create and record a short (7-8 minute) presentation of your final project. The presentation will provide an opportunity to share a brief summary of your project and at least preliminary findings with the other members of the class. Since you will all give other research presentations throughout your career, I strongly encourage you to take the opportunity to refine your academic presentation skills. The document [https://canvas.northwestern.edu Creating a Successful Scholarly Presentation] (file will be posted to Canvas) may be useful. | |||
More details about the presentation goals, format suggestions, and more will be provided later in the quarter. | |||
===== Human subjects research, IRB, and ethics ===== | |||
In general, you are responsible for making sure that you're on the right side of the IRB requirements and that your work meets applicable ethical norms and standards. | In general, you are responsible for making sure that you're on the right side of the IRB requirements and that your work meets applicable ethical norms and standards. | ||
Line 249: | Line 231: | ||
I will assign grades (usually a numeric value ranging from 0-10) for each of the following aspects of your performance. The percentage values in parentheses are weights that will be applied to calculate your overall grade for the course. | I will assign grades (usually a numeric value ranging from 0-10) for each of the following aspects of your performance. The percentage values in parentheses are weights that will be applied to calculate your overall grade for the course. | ||
* Weekly participation: 40% | * Weekly participation (includes problem sets): 40% | ||
* Proposal identification: 5% | * Proposal identification: 5% | ||
* Final project planning document: 5% | * Final project planning document: 5% | ||
Line 255: | Line 237: | ||
* Final project paper: 40% | * Final project paper: 40% | ||
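As a rough illustration of how such weights combine into an overall grade, here is a hedged sketch in Python (used only for illustration; the course itself uses R). It includes only the four weights visible in this excerpt, which sum to 90%, so it normalizes by the total of the weights supplied; any component not shown here is left out rather than guessed. The function name, dictionary keys, and sample scores are hypothetical.

```python
# Weights as listed in this excerpt; components not visible here are omitted.
WEIGHTS = {
    "participation": 0.40,
    "proposal_identification": 0.05,
    "planning_document": 0.05,
    "final_paper": 0.40,
}


def weighted_grade(scores, weights=WEIGHTS):
    """Return the weighted mean of 0-10 scores, normalized by the weights supplied."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


# Hypothetical scores on the 0-10 scale described above.
example_scores = {
    "participation": 8,
    "proposal_identification": 10,
    "planning_document": 10,
    "final_paper": 9,
}
print(round(weighted_grade(example_scores), 2))
```

Normalizing by the supplied weights keeps the result on the same 0-10 scale even though the listed weights do not sum to 100%.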
The teaching team will jointly | The teaching team will jointly evaluate your participation along four dimensions: attendance, preparation, engagement, and contribution. These are quite similar to the dimensions described in the "Participation Rubric" section of [https://mako.cc/teaching/assessment.html Benjamin Mako Hill's assessment page] and [https://reagle.org/joseph/zwiki/Teaching/Assessment/Participation.html Joseph Reagle's participation assessment rubric]. Exceptional participation means excelling along all four dimensions. Please note that participation ≠ talking more and I encourage all of us to seek [https://reagle.org/joseph/zwiki/Teaching/Best_Practices/Learning/Balance_in_Discussion.html balance in our discussions]. | ||
The teaching team's assessment of your final project proposal, planning document, presentation, and paper will reflect the clarity of the work, the effective execution and presentation of quantitative empirical analysis, as well as the quality and originality of the analysis. A more detailed assessment rubric will be provided. Throughout the quarter, we will talk about the qualities of exemplary quantitative research. In general, I expect your final project to embody these exemplary qualities. | The teaching team's assessment of your final project proposal, planning document, presentation, and paper will reflect the clarity of the work, the effective execution and presentation of quantitative empirical analysis, as well as the quality and originality of the analysis. A more detailed assessment rubric will be provided. Throughout the quarter, we will talk about the qualities of exemplary quantitative research. In general, I expect your final project to embody these exemplary qualities. | ||
=== Policies === | === Policies === | ||
Line 303: | Line 286: | ||
=== Week 1 (9/17) === | === Week 1 (9/17) === | ||
==== September 17: Intro and setup ==== | ==== September 17: Intro and setup ==== | ||
'''Required''' | '''Required''' | ||
* Complete [https://apps3.cehd.umn.edu/artist/user/scale_select.html pre-course assessment of statistical concepts] (access code TBA via email). '''Submission deadline: September 18, 11:00pm Chicago time''' | |||
* Complete [https://apps3.cehd.umn.edu/artist/user/scale_select.html pre-course assessment of statistical concepts] (access code TBA via email) | * Confirm access to software and web services for the course (Zoom, Discord, Canvas, this wiki, R, RStudio). | ||
* Confirm | * Complete problem set #0 | ||
* Complete | ** <TODO> Read and work through Introduction to R and RStudio | ||
** Install R and RStudio, getting help, creating/saving .Rmd, weaving code and text, knitting output into html or pdf. | |||
** <TODO> Refactor 2019 PS1 PC1-4. Update R and RStudio installation. Add a calculator problem. Clarify task to write code and text and knit output. | |||
''' | '''Resources''' | ||
* Verzani §1 (Getting started) and Healy §2 (Get started) provide helpful background for working with R and RStudio. | |||
=== Week 2 (9/22, 9/24) === | === Week 2 (9/22, 9/24) === | ||
==== September 22: Data and variables ==== | ==== September 22: Data and variables ==== | ||
'''Required''' | '''Required''' | ||
* Read Diez, Çetinkaya-Rundel, and Barr: §1.1-1.3 (Introduction to data). | * Read Diez, Çetinkaya-Rundel, and Barr: §1.1-1.3 (Introduction to data). | ||
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM6pZ76FD3NoCvvgkj_p-dE8 Lecture materials for §1.1-3 (Videos 1-4 in the playlist)]. | * Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM6pZ76FD3NoCvvgkj_p-dE8 Lecture materials for §1.1-3 (Videos 1-4 in the playlist)]. | ||
* | * Complete problem set #1 | ||
** SQ from OpenIntro §1: 1.6, ''1.9'', 1.10, 1.16, ''1.21'', 1.40, 1.42, ''1.43'' | |||
==== September 24: Numerical and categorical data ==== | ==== September 24: Numerical and categorical data ==== | ||
Line 334: | Line 309: | ||
* Read Diez, Çetinkaya-Rundel, and Barr: §2.1-2 (Numerical and categorical data). | * Read Diez, Çetinkaya-Rundel, and Barr: §2.1-2 (Numerical and categorical data). | ||
* Review [https://www.youtube.com/playlist?list=PLkIselvEzpM6pZ76FD3NoCvvgkj_p-dE8 Lecture materials for §2.1 and §2.2 (Videos 6-7 in the playlist)]. | * Review [https://www.youtube.com/playlist?list=PLkIselvEzpM6pZ76FD3NoCvvgkj_p-dE8 Lecture materials for §2.1 and §2.2 (Videos 6-7 in the playlist)]. | ||
* Complete ''' | * Complete problem set #2 | ||
** OpenIntro questions: | |||
'''Resources''' | |||
=== Week 3 (9/29, 10/1) === | === Week 3 (9/29, 10/1) === | ||
==== September 29: Working with data and variables in R ==== | |||
==== September 29: R | |||
'''Required''' | '''Required''' | ||
* Complete | * R lecture materials from 2019 W02 | ||
* Complete problem set #3 | |||
** Empirical paper/data (UCB admissions. Police stops in IL.) | |||
** See PS2 Programming challenges from 2019 | |||
'''Resources''' | '''Resources''' | ||
* [https://science.sciencemag.org/content/187/4175/398 UCB admissions paper] | * [https://science.sciencemag.org/content/187/4175/398 UCB admissions paper] | ||
* [https://openpolicing.stanford.edu Stanford OpenPolicing Project] | * [https://openpolicing.stanford.edu Stanford OpenPolicing Project] | ||
==== October 1: Probability ==== | ==== October 1: Probability ==== | ||
'''Required''' | '''Required''' | ||
* Read Diez, Çetinkaya-Rundel, and Barr: §3 (Probability). | * Read Diez, Çetinkaya-Rundel, and Barr: §3.1-3; §3.4-5 (Probability). | ||
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5EgoOajhw83Ax_FktnlD6n&v=rG-SLQ2uF8U Probability introduction] and [https://www.youtube.com/watch?v=HxEz4ZHUY5Y&list=PLkIselvEzpM5EgoOajhw83Ax_FktnlD6n&index=2 Probability trees] OpenIntro lectures (just videos 1 and 2 in the playlist). | * Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5EgoOajhw83Ax_FktnlD6n&v=rG-SLQ2uF8U Probability introduction] and [https://www.youtube.com/watch?v=HxEz4ZHUY5Y&list=PLkIselvEzpM5EgoOajhw83Ax_FktnlD6n&index=2 Probability trees] OpenIntro lectures (just videos 1 and 2 in the playlist). | ||
* Complete | * Complete problem set #4 | ||
** OpenIntro questions: | |||
'''Resources''' | '''Resources''' | ||
Line 364: | Line 337: | ||
=== Week 4 (10/6, 10/8) === | === Week 4 (10/6, 10/8) === | ||
==== October 6: <Topic> ==== | |||
==== October 6: | |||
'''Required''' | '''Required''' | ||
* Complete problem set #5 | |||
* Complete | |||
'''Resources''' | |||
==== October 8: Distributions ==== | ==== October 8: Distributions ==== | ||
'''Required''' | '''Required''' | ||
* Read Diez, Çetinkaya-Rundel, and Barr: §4.1-3 (Normal and binomial distributions). | * Read Diez, Çetinkaya-Rundel, and Barr: §4.1-3 (Normal and binomial distributions). | ||
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM6V9h55s0l9Kzivih9BUWeW&v=S_p5D-YXLS4 normal and binomial distributions] OpenIntro lectures (videos 1-3 in the playlist). | * Watch [https://www.youtube.com/watch?list=PLkIselvEzpM6V9h55s0l9Kzivih9BUWeW&v=S_p5D-YXLS4 normal and binomial distributions] OpenIntro lectures (videos 1-3 in the playlist). | ||
* Complete | * Complete problem set #6 | ||
** OpenIntro questions: | |||
'''Resources''' | '''Resources''' | ||
* [https://seeing-theory.brown.edu/index.html#secondPage/chapter3 Seeing Theory §3 (Probability distributions)] | * [https://seeing-theory.brown.edu/index.html#secondPage/chapter3 Seeing Theory §3 (Probability distributions)] | ||
=== Week 5 (10/13, 10/15) === | === Week 5 (10/13, 10/15) === | ||
==== October 13: <Topic> ==== | |||
==== October 13: | |||
'''Required''' | '''Required''' | ||
* Complete | * Complete problem set #7 | ||
''' | '''Resources''' | ||
==== October 15: Foundations for (frequentist) inference ==== | ==== October 15: Foundations for (frequentist) inference ==== | ||
Line 401: | Line 365: | ||
* Watch [https://www.youtube.com/watch?v=oLW_uzkPZGA&list=PLkIselvEzpM4SHQojH116fYAQJLaN_4Xo foundations for inference] (videos 1-3 in the playlist) OpenIntro lectures. | * Watch [https://www.youtube.com/watch?v=oLW_uzkPZGA&list=PLkIselvEzpM4SHQojH116fYAQJLaN_4Xo foundations for inference] (videos 1-3 in the playlist) OpenIntro lectures. | ||
* Complete [https://www.openintro.org/book/stat/why05/ Why .05?] OpenIntro video/exercise. | * Complete [https://www.openintro.org/book/stat/why05/ Why .05?] OpenIntro video/exercise. | ||
* Complete | * Complete problem set #8 | ||
** OpenIntro questions: | |||
'''Resources''' | '''Resources''' | ||
Line 408: | Line 373: | ||
=== Week 6 (10/20, 10/22) === | === Week 6 (10/20, 10/22) === | ||
==== October 20: <Topic> ==== | |||
==== October 20: | |||
'''Required''' | '''Required''' | ||
* Complete | * Complete problem set #9 | ||
'''Resources''' | |||
==== October 22: Inference for categorical data ==== | ==== October 22: Inference for categorical data ==== | ||
Line 420: | Line 383: | ||
* Read Diez, Çetinkaya-Rundel, and Barr: §6 (Inference for categorical data). | * Read Diez, Çetinkaya-Rundel, and Barr: §6 (Inference for categorical data). | ||
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5Gn-sHTw1NF0e8IvMxwHDW&v=_iFAZgpWsx0 inference for categorical data] (videos 1-3 in the playlist) OpenIntro lectures. | * Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5Gn-sHTw1NF0e8IvMxwHDW&v=_iFAZgpWsx0 inference for categorical data] (videos 1-3 in the playlist) OpenIntro lectures. | ||
* Complete | * Complete problem set #10 | ||
** OpenIntro questions: | |||
'''Resources''' | '''Resources''' | ||
Line 426: | Line 390: | ||
=== Week 7 (10/27, 10/29) === | === Week 7 (10/27, 10/29) === | ||
==== October 27: <Topics> ==== | |||
==== October 27: | |||
'''Required''' | '''Required''' | ||
* Complete problem set #11 | |||
* Complete | |||
'''Resources''' | '''Resources''' | ||
==== October 29: Inference for numerical data (part 1) ==== | ==== October 29: Inference for numerical data (part 1) ==== | ||
Line 442: | Line 399: | ||
* Read Diez, Çetinkaya-Rundel, and Barr: §7.1-3 (Inference for numerical data: differences of means). | * Read Diez, Çetinkaya-Rundel, and Barr: §7.1-3 (Inference for numerical data: differences of means). | ||
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5G3IO1tzQ-DUThsJKQzQCD&v=uVEj2uBJfq0 inference for numerical data] (videos 1-4 in the playlist) OpenIntro lectures (and featuring one of the textbook authors!). | * Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5G3IO1tzQ-DUThsJKQzQCD&v=uVEj2uBJfq0 inference for numerical data] (videos 1-4 in the playlist) OpenIntro lectures (and featuring one of the textbook authors!). | ||
* Complete | * Complete problem set #11 | ||
** OpenIntro questions: | |||
'''Resources''' | '''Resources''' | ||
* [https://gallery.shinyapps.io/CLT_mean/ OpenIntro Central | * [https://gallery.shinyapps.io/CLT_mean/ OpenIntro Central limit theorem for means demo]. | ||
=== Week 8 (11/3, 11/5) === | === Week 8 (11/3, 11/5) === | ||
==== November 3: | ==== November 3: Self-assessment exercise (no class meeting) ==== | ||
'''Election Day (U.S.): No class meeting today''' | |||
==== November 5: Inference for numerical data (part 2) ==== | ==== November 5: Inference for numerical data (part 2) ==== | ||
Line 460: | Line 413: | ||
* Read Diez, Çetinkaya-Rundel, and Barr: §7.4-5 (Inference for numerical data: power calculations, ANOVA, and multiple comparisons). | * Read Diez, Çetinkaya-Rundel, and Barr: §7.4-5 (Inference for numerical data: power calculations, ANOVA, and multiple comparisons). | ||
* Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5G3IO1tzQ-DUThsJKQzQCD&v=uVEj2uBJfq0 inference for numerical data] (videos 4-8 in the playlist) OpenIntro lectures (and featuring one of the textbook authors!). | * Watch [https://www.youtube.com/watch?list=PLkIselvEzpM5G3IO1tzQ-DUThsJKQzQCD&v=uVEj2uBJfq0 inference for numerical data] (videos 4-8 in the playlist) OpenIntro lectures (and featuring one of the textbook authors!). | ||
* Complete | * Complete problem set #12 | ||
** OpenIntro questions: | |||
'''Resources''' | '''Resources''' | ||
Line 466: | Line 420: | ||
=== Week 9 (11/10, 11/12) === | === Week 9 (11/10, 11/12) === | ||
==== November 10: | ==== November 10: <Topic> ==== | ||
'''Required''' | '''Required''' | ||
* Complete | * Complete problem set #13 | ||
'''Resources''' | '''Resources''' | ||
==== November 12: Linear regression ==== | ==== November 12: Linear regression ==== | ||
Line 480: | Line 431: | ||
* Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM63ikRfN41DNIhSgzboELOM linear regression] (videos 1-4 in the playlist) OpenIntro lectures. | * Watch [https://www.youtube.com/playlist?list=PLkIselvEzpM63ikRfN41DNIhSgzboELOM linear regression] (videos 1-4 in the playlist) OpenIntro lectures. | ||
* Read [https://www.openintro.org/go/?id=stat_more_inference_for_linear_regression&referrer=/book/os/index.php More inference for linear regression] (OpenIntro supplement). | * Read [https://www.openintro.org/go/?id=stat_more_inference_for_linear_regression&referrer=/book/os/index.php More inference for linear regression] (OpenIntro supplement). | ||
* Complete | * Complete problem set #14 | ||
* | ** OpenIntro questions: | ||
** OpenIntro supplement questions: | |||
'''Resources''' | '''Resources''' | ||
Line 487: | Line 439: | ||
=== Week 10 (11/17, 11/19) === | === Week 10 (11/17, 11/19) === | ||
==== November 17: <Topic> ==== | |||
==== November 17: | |||
'''Required''' | '''Required''' | ||
* Complete | * Complete problem set #15 | ||
'''Resources''' | '''Resources''' | ||
==== November 19: Multiple and logistic regression ==== | ==== November 19: Multiple and logistic regression ==== | ||
'''Required''' | '''Required''' | ||
Line 501: | Line 452: | ||
* Read [https://www.openintro.org/go/?id=stat_interaction_terms&referrer=/book/os/index.php Interaction terms] (OpenIntro supplement). | * Read [https://www.openintro.org/go/?id=stat_interaction_terms&referrer=/book/os/index.php Interaction terms] (OpenIntro supplement). | ||
* Read [https://www.openintro.org/go/?id=stat_nonlinear_relationships&referrer=/book/os/index.php Fitting models for non-linear trends] (OpenIntro supplement). | * Read [https://www.openintro.org/go/?id=stat_nonlinear_relationships&referrer=/book/os/index.php Fitting models for non-linear trends] (OpenIntro supplement). | ||
* Complete | * Complete problem set #16 | ||
** OpenIntro questions: | |||
'''Resources''' | '''Resources''' | ||
=== Week 11 (11/24) === | === Week 11 (11/24) === | ||
==== November 24: | ==== November 24: <Topic> and assessment ==== | ||
'''Required''' | '''Required''' | ||
* Complete | * Complete problem set #16 | ||
* Complete [https://apps3.cehd.umn.edu/artist/user/scale_select.html post-course assessment of statistical concepts] (access code TBA via email). '''Submission deadline: December 1, 11:00pm Chicago time''' | |||
'''Resources''' | '''Resources''' | ||
* Mako Hill created | * Mako Hill created an example of [https://communitydata.science/~mako/2017-COM521/logistic_regression_interpretation.html interpreting logistic regression coefficients with examples in R] | ||
== Credit and Notes == | == Credit and Notes == | ||
This syllabus has, in ways that should be obvious, borrowed and built on the [https://www.openintro.org/stat/index.php OpenIntro Statistics curriculum]. Most aspects of this course design extend Benjamin Mako Hill's [[Statistics_and_Statistical_Programming_(Winter_2017)|COM 521 class]] from the University of Washington as well as a [[Statistics_and_Statistical_Programming_(Spring_2019)|prior iteration of the same course]] offered at Northwestern in Spring 2019. | This syllabus has, in ways that should be obvious, borrowed and built on the [https://www.openintro.org/stat/index.php OpenIntro Statistics curriculum]. Most aspects of this course design extend Benjamin Mako Hill's [[Statistics_and_Statistical_Programming_(Winter_2017)|COM 521 class]] from the University of Washington as well as a [[Statistics_and_Statistical_Programming_(Spring_2019)|prior iteration of the same course]] offered at Northwestern in Spring 2019.