Editing User:Aaronshaw/Stats course

From CommunityData
Latest revision Your text
Line 2: Line 2:


:'''Statistics and Statistical Programming'''
:'''MTS 525''' Media, Technology & Society
:'''MTS 525''' Media, Technology & Society, Northwestern University
:'''Northwestern University''' Spring 2019
:'''Instructor:''' [http://aaronshaw.org Aaron Shaw] ([https://communication.northwestern.edu/faculty/AaronShaw Northwestern University])
:'''Course Websites''':
:* We will use [https://canvas.northwestern.edu/courses/90927 Canvas] for [https://canvas.northwestern.edu/courses/90927/announcements announcements], [https://canvas.northwestern.edu/courses/90927/assignments turning in some assignments], and [https://canvas.northwestern.edu/courses/90927/discussion_topics discussions].
:* We will use Canvas for [https://canvas.northwestern.edu announcements], [https://canvas.northwestern.edu/ turning in assignments], and [https://canvas.northwestern.edu discussion] (if you choose to use them)
:* Everything else will be linked on this page.
:* List of student git repositories (will be a link)




== Overview and learning objectives ==
== Overview and Learning Objectives ==


This course provides a get-your-hands-dirty introduction to statistics and statistical programming mostly for applications in the social sciences and social computing. My main objectives are for all participants to acquire the conceptual, technical, and practical skills to conduct your own statistical analyses and become more sophisticated consumers of quantitative research in communication, HCI, and adjacent disciplines.


I will consider the course a complete success if every student is able to do all of the following things at the end of the quarter:
* Design and execute a complete quantitative research project, start to finish.
* Design and carry out a complete analysis of a quantitative research project, start to finish.
* Read, modify, and create short programs in the R statistical programming language.
* Feel comfortable reading and interpreting papers that use basic statistical techniques.
* Feel comfortable and prepared to enroll in more specialized and advanced statistics courses.


The course will cover the following techniques: t-tests; chi-squared tests; ANOVA, MANOVA, and related methods; linear regression; and logistic regression. We will also consider salient issues in quantitative research such as reproducibility and "the statistical crisis in science." We may cover other topics as time and interest allow.
The course will cover the following statistical techniques: t-tests; chi-squared tests; ANOVA, MANOVA, and related methods; linear regression; and logistic regression. We will also consider salient issues in quantitative research such as reproducibility and "the statistical crisis in science." We may cover other topics as time and interest allow.


The course materials will consist of readings, problem sets, and recorded lectures and screencasts (some created by me, some created by other people). The course requirements will emphasize active participation, self-evaluation, and will include a final project focused on the design and execution of an original piece of quantitative research. We will use the R programming language for all examples and assignments.
Line 27: Line 26:
You are not required to know much about statistics or statistical programming to take this class. I will assume some (very little!) knowledge of the basics of empirical research methods and design, basic algebra and arithmetic, and a willingness to work to learn the rest. In general we are not going to cover the math behind the techniques we'll be learning. Although we may do some math, this is not a math class. This course will also not require knowledge of calculus or matrix algebra. I will *not* do proofs on the board. Instead, the class is unapologetically focused on the application of statistical methods. Likewise, while some exposure to R, other programming languages, or other statistical computing resources will be helpful, it is absolutely not assumed.


== Why this course? Why statistical programming? Why R? ==
== Why this course? Why Statistical Programming? Why R? ==


Many comparable courses in statistics and quantitative methods do not focus on statistical programming and use easier-to-learn statistical software than R. So why bother? By learning statistical programming you will gain a deeper understanding of both the principles behind your analysis techniques as well as the tools you use to apply those techniques. In addition, a solid grasp of statistical programming will prepare you to create reproducible research, avoid common errors, and enable both greater durability and validity of your work.  


Other programming languages are also well suited to statistics, including Stata and Python. Ultimately, I teach (and use) R for a few reasons:
Other programming languages are also well suited to statistics, including Stata and Python. Ultimately, I teach with R for a few reasons:
* R is freely available and open source.
* R is becoming the most widely used package in statistics and many social science fields.
* R (along with Stata) will be used in most other advanced stats classes I hope you will take after this course.
* R is the system (along with Stata) that will be used in most other advanced stats classes I hope you will take after this course.
* R is a better general-purpose programming language than software like Stata, which means that R programming skills will let you solve non-statistical problems and will make it easier to learn other programming languages like Python.
* R is a better general-purpose programming language than software like Stata, which means that R programming skills will let you solve non-statistical problems, like collecting data from the web, and will make it easier to learn other programming languages like Python.
* R is what I use for all of my research, so it's the language I am best equipped to teach.


== A note about this syllabus ==
== Note About This Syllabus ==


This syllabus will be a dynamic document that will evolve throughout the quarter. Although the core expectations are fixed, the details will shift. As a result, please keep in mind the following:
You should expect this syllabus to be a dynamic document and you will notice that there are a few places marked "To Be Determined." Although the core expectations for this class are fixed, the details of readings and assignments will shift. As a result, there are three important things to keep in mind:


# I will not add readings or assignments less than one week before they are due. If I don't fill in a "To Be Determined" one week before it's due, it is dropped. If you plan to read more than one week ahead, contact me first.
# Although details on this syllabus will change, I will not change readings or assignments less than one week before they are due. If I don't fill in a "To Be Determined" one week before it's due, it is dropped. If you plan to read more than one week ahead, contact me first.
# Closely monitor your email and/or [https://canvas.northwestern.edu the announcements section on the course website on Canvas]. When I make changes, these changes will be recorded in [http://wiki.communitydata.cc/ the history of this page] so that you can track what has changed. I will also do my best to summarize these changes in an announcement on Canvas that will be emailed to everybody in the class.
# Closely monitor your email or [https://canvas.uw.edu/courses/1098035/announcements the announcements section on the course website on Canvas]. When I make changes, these changes will be recorded in [http://wiki.communitydata.cc/index.php?title=Statistics_and_Statistical_Programming_(Winter_2017)&action=history the history of this page] so that you can track what has changed and I will summarize these changes in an announcement on Canvas that will be emailed to everybody in the class.
# I will ask the class for voluntary anonymous feedback — especially toward the beginning of the quarter. Please let me know what is working and what can be improved. In the past, I have made many adjustments based on this feedback.
# I will ask the class for voluntary anonymous feedback frequently — especially toward the beginning of the quarter. Please let me know what is working and what can be improved. In the past, I have made many adjustments based on this feedback.


== Books and resources ==
== Books and Resources ==


This class will use a freely-licensed textbook:
Although I've never taught with a textbook in a proper sense, statistics is very well covered terrain and, as a result, there is an enormous amount of excellent curricular material out there I think we would be wise to build from. As a result, this class is going to use two textbooks:


* Diez, David M., Christopher D. Barr, and Mine Çetinkaya-Rundel. 2015. ''OpenIntro Statistics''. 3rd edition. OpenIntro, Inc. ([https://www.openintro.org/download.php?file=os3&referrer=/stat/textbook.php PDF]; [https://www.openintro.org/download.php?file=os3_tablet&referrer=/stat/textbook.php Tablet-friendly PDF]; [https://www.openintro.org/stat/textbook.php Other])
* Verzani, John. 2014. ''Using R for Introductory Statistics, Second Edition''. 2 edition. Boca Raton: Chapman and Hall/CRC. ([https://en.wikipedia.org/wiki/Special:BookSources/978-1-4665-9073-1 Various Sources]; [https://www.amazon.com/Using-Introductory-Statistics-Second-Chapman/dp/1466590734/ref=mt_hardcover?_encoding=UTF8&me= Amazon])


The textbook (in any format) is required material for the course. You can download it at no cost and/or buy (affordable!) hard copy versions in either [https://www.openintro.org/redirect.php?go=amazon_os3_hardcover&referrer=/stat/textbook.php full color hardcover] or in [https://www.openintro.org/redirect.php?go=createspace_os3&referrer=/stat/textbook.php black and white paperback]. The book is excellent and has been adopted widely. It has also developed a large online community of students and teachers who have shared other resources. Lecture slides, videos, notes, and more are all freely licensed (many through the website and others elsewhere).
Diez, Barr, and Çetinkaya-Rundel's book is a free, and freely-licensed, online statistics textbook. Over the last seven years, the book has also developed a large online community of students and teachers who have shared other resources. The book, lecture notes, and more are all freely licensed, which has allowed the text to be adapted in a series of different fields. The book is excellent and it has been adopted extraordinarily widely. You can buy versions from Amazon in either [https://www.openintro.org/redirect.php?go=amazon_os3_hardcover&referrer=/stat/textbook.php full color hardcover] ($19.99) or in [https://www.openintro.org/redirect.php?go=createspace_os3&referrer=/stat/textbook.php black and white paperback] ($7.60). I haven't purchased a paper copy so I can't speak to the quality of either.
 
I will also assign several chapters from the following:
 
* Reinhart, Alex. 2015. ''Statistics Done Wrong: The Woefully Complete Guide''. SF, CA: No Starch Press. ([https://www.safaribooksonline.com/library/view/statistics-done-wrong/9781457189845/ Safari online via NU libraries])


This book provides a conceptual introduction to some common failures in statistical analysis that you should learn to recognize and avoid. It was also written by a Ph.D. student. You have access to an electronic copy via the NU library, but you may find it helpful to purchase.
Verzani's book is an introduction to the R programming language. It's designed to be used as a companion to a basic introductory statistics textbook (like OpenIntro). It's a poor stand-alone text but it will provide good resources for the material we're covering in the course and it should act as a good reference going forward. The book is available online for about $50.


A few other books may be useful resources while you're learning to analyze, visualize, and interpret statistical data with R. I will share some advice about these during the first class meeting:
Although it's not required for the course, I want to point you to these two books. When I was learning R, these both were very useful references:


* Healy, Kieran. 2019. ''Data Visualization: A Practical Introduction''. Princeton, NJ: Princeton UP. ([https://kieranhealy.org/publications/dataviz/ via Healy's website])
* Teetor, Paul. 2011. ''R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics''. 1 edition. Sebastopol, CA: O’Reilly Media. ([http://proquest.safaribooksonline.com/9780596809287 Safari Proquest/UW Libraries]; [https://en.wikipedia.org/wiki/Special:BookSources/978-0-596-80915-7 Various Sources]; [https://www.amazon.com/Cookbook-Analysis-Statistics-Graphics-Cookbooks/dp/0596809158/ref=sr_1_1?ie=UTF8&qid=1482802812&sr=8-1&keywords=r+cookbook Amazon])
* Teetor, Paul. 2011. ''R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics''. 1 edition. Sebastopol, CA: O’Reilly Media. ([http://proquest.safaribooksonline.com/9780596809287 Safari Proquest/NU Libraries]; [https://en.wikipedia.org/wiki/Special:BookSources/978-0-596-80915-7 Various Sources]; [https://www.amazon.com/Cookbook-Analysis-Statistics-Graphics-Cookbooks/dp/0596809158/ref=sr_1_1?ie=UTF8&qid=1482802812&sr=8-1&keywords=r+cookbook Amazon])
* Wickham, Hadley. 2010. ''ggplot2: Elegant Graphics for Data Analysis''. 1st ed. 2009. Corr. 3rd printing 2010 edition. New York: Springer. ([https://link.springer.com/book/10.1007%2F978-3-319-24277-4 Springer/UW Libraries]; [https://en.wikipedia.org/wiki/Special:BookSources/978-0-596-80915-7 Various Sources])
* Verzani, John. 2014. ''Using R for Introductory Statistics, Second Edition''. 2 edition. Boca Raton: Chapman and Hall/CRC. ([https://en.wikipedia.org/wiki/Special:BookSources/978-1-4665-9073-1 Various Sources]; [https://www.amazon.com/Using-Introductory-Statistics-Second-Chapman/dp/1466590734/ref=mt_hardcover?_encoding=UTF8&me= Amazon])
* Wickham, Hadley. 2010. ''ggplot2: Elegant Graphics for Data Analysis''. 1st ed. 2009. Corr. 3rd printing 2010 edition. New York: Springer. ([https://link.springer.com/book/10.1007%2F978-3-319-24277-4 Springer/NU Libraries]; [https://en.wikipedia.org/wiki/Special:BookSources/978-0-596-80915-7 Various Sources])


There are also some invaluable non-textbook resources:
There are also two non-textbook resources I wanted to point you to that are invaluable:


* [ftp://cran.r-project.org/pub/R/doc/contrib/Baggott-refcard-v2.pdf Baggott's R Reference Card v2] — Print this out. Take it with you everywhere and look at it dozens of times a day. You will learn the language faster!
* [ftp://cran.r-project.org/pub/R/doc/contrib/Baggott-refcard-v2.pdf Baggott's R Reference Card v2] — When I was learning R, I ''literally'' took a similar reference card with me everywhere and looked at it dozens of times a day.
* [https://stackoverflow.com/questions/tagged/r StackOverflow R Tag] — Somebody already had your question about how to do ''X'' in R. They asked it, and several people have answered it, on StackOverflow. Learning to read this effectively will take time but as you build up some basic familiarity with R and with StackOverflow, it will get easier. I promise.
* [http://rseek.org/ Rseek] — Rseek is a modified version of Google that searches only R websites. Sometimes, R is hard to search for because R is a common letter. This has become much easier over time as R has become more popular, but it can still be a problem, and Rseek is a good solution.
* [https://ggplot2.tidyverse.org/ ggplot2 documentation] — ggplot2 is a powerful data visualization package for R that I recommend highly. The documentation is indispensable for learning how to use it.


== Assignments ==


The assignments in this class focus on applied statistical research design, analysis, and interpretation. There will be no graded exams or quizzes. Unless otherwise noted, all assignments are due at the end of the day (i.e., 11:59pm on the day they are due).
The assignments in this class are designed to give you an opportunity to try your hand at using the conceptual material taught in the class. There will be no exams or quizzes. Unless otherwise noted, all assignments are due at the end of the day (i.e., 11:59pm on the day they are due).


=== Weekly problem sets and participation ===
=== Weekly Problem Sets and Participation ===


Each week I will post a problem set. Some of these will be taken from the textbooks and some will not. They will include:
Each week I will post a problem set with a list of questions. Some of these will be drawn from the textbooks and some will be ones I design or write. The questions will cover:


* '''Statistics questions''' about statistical concepts, principles, and interpretation.
* '''Statistics questions''' — These will be questions about statistics from the OpenIntro sections as well as any empirical papers that are listed as required for that day.
* '''Programming challenges''' that you must solve using R.
* '''Programming challenges''' — These will be R programming problems that cover material from the Verzani text that was listed as required from the previous session.
* '''Empirical paper questions''' about other assigned readings.  


You should submit your solutions to the programming challenges ahead of each class session. While I will not grade them, we will spend a good chunk of class going through the answers to the assignment due on that day.
I won't be grading these assignments and I won't be asking you to turn in anything for the ''statistics questions'' portion of the weekly assignment. That said, we will spend a good chunk of class each day going through the answers to the questions due on that day.


Because randomness is extremely important in statistics, I will use a small R program to '''randomly call on''' students to walk through your answer to statistics questions and empirical paper questions in class. We'll then discuss the answers, address points of confusion, and consider alternative approaches as a group.
Because randomness is an extremely important concept in statistics, I will use a small R program to '''randomly cold call''' on students in the class to walk through your "answer" to each question and explain your reasoning to the class. We'll then have an opportunity to discuss the different approaches as a group. I don't promise to ask all of these questions in class (especially if it's clear that folks get the point). Although I might ask them, I won't cold call for questions that are not on the list.
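
As an illustration only, here is a minimal sketch of what a random cold-calling program could look like in R. It is not the program used in class: the <code>roster</code> names and the <code>cold_call</code> function are hypothetical, and the sketch assumes each student should be called once before anyone is called a second time.

```r
# Hypothetical roster; these names are placeholders, not actual students.
roster <- c("Ada", "Grace", "Florence", "Karl")

# Return one student per question. Each pass through the loop appends a
# fresh random ordering of the whole roster, so nobody is called twice
# until everyone has been called once.
cold_call <- function(students, n_questions) {
  draws <- character(0)
  while (length(draws) < n_questions) {
    draws <- c(draws, sample(students))
  }
  draws[seq_len(n_questions)]
}

set.seed(525)  # fixing the seed makes the draw reproducible
calls <- cold_call(roster, 6)
print(calls)
```

Shuffling the full roster, rather than drawing each name independently, spreads the questions evenly across the class while keeping the order unpredictable.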


For the programming challenges, you should submit code for your solutions before class (more on how in a moment) so we can walk through the material together. If you get completely stuck on a problem, that's okay, but please share whatever code you have so that you can tell us what you did and what you were thinking.
For the programming challenges, I will ask that everybody share code for any solutions to programming problems before class so we can walk through them in class. If you get completely stuck on a problem and cannot "solve" it, that's OK, but share the code that you do have so that you can walk us through what you did and what you were thinking.


Coming to class will be profoundly important to learning the material and to your final grade. Although the problem sets will not be graded, it is critical that you be present and able to discuss your answers to each of the questions. Your ability to do so will figure prominently in your participation grade for the course (40% of your final grade). More on
Although the problem sets are not going to be graded, it is critical that you be at class and that you be able to discuss your answers to each of the questions. Your ability to do these latter two things will be reflected in your participation grade for the course, which makes up a full 40% of your grade. I can't emphasize enough how important it will be to be in class.


I strongly encourage you to form groups to work on the problem sets if you find that helpful; however, you must still submit your work individually and respond to my cold-call prompts in class individually to help ensure that you learn and understand the material.
I'm not going to form groups for you but it's totally fine with me if you want to work on these problem sets in small groups.


I evaluate participation along four dimensions: attendance, preparation, engagement, and contribution. These are quite similar to the dimensions described in the "Participation Rubric" section of [https://mako.cc/teaching/assessment.html Benjamin Mako Hill's assessment page] and [https://reagle.org/joseph/zwiki/Teaching/Assessment/Participation.html Joseph Reagle's participation assessment rubric]. Exceptional participation means excelling along all four dimensions. Please note that participation ≠ talking more and I encourage all of us to seek [https://reagle.org/joseph/zwiki/Teaching/Best_Practices/Learning/Balance_in_Discussion.html balance in our classroom discussions].
The "Participation Rubric" section of [https://mako.cc/teaching/assessment.html my page on assessment] gives the details on how I evaluate participation in my classes. If you sense a conflict between material in this section and material on that page, you can safely assume that the syllabus takes precedence.


=== Research project ===
=== Research Project ===


As a demonstration of your learning in this course, you will design and carry out a quantitative research project, start to finish. This means you will all:


* '''Design and describe a plan for a study''' — The study you design should involve quantitative analysis and should be something you can complete at least a first pass on during this quarter.
* '''Design and describe a social scientific study''' — You should all have experience doing this at least once in COM520. The study you design should involve quantitative analysis and should be something you can complete at least a first pass at over the course of this quarter.
* '''Find a dataset''' — Very quickly, you should identify a dataset you will use to complete this project. For most of you, I suspect you will be engaging in secondary data analysis or an analysis of a previously collected dataset.
* '''Find a dataset''' — Very quickly, you should identify a dataset you will use to complete this project. For most of you, I suspect you will be engaging in secondary data analysis or a re-analysis of a previously collected dataset.
* '''Engage in descriptive data analysis''' — Use R to calculate descriptive statistics and visualizations to describe your data.
* '''Engage in descriptive data analysis''' — Use R to create descriptive statistics and visualizations to describe your data.
* '''Motivate and test at least one hypothesis about relationships between two or more variables'''
* '''Test a hypothesis about relationships between two or more variables'''
* '''Report and interpret your findings''' — You will do this in both a short paper and a short presentation.
* '''Report your findings''' — I'll expect you all to report your findings in both a short paper and a short presentation.
* '''Ensure that your work is replicable''' — You will need to provide code and data for your analysis in a way that makes your work replicable by other researchers.
* '''Ensure replicability''' — I'll expect you all to provide code and data for your analysis in a way that makes your work replicable by other researchers.
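
To make the descriptive-analysis and hypothesis-testing steps above concrete, here is a minimal sketch in R. It is an illustration under stated assumptions, not a template for your project: it uses <code>mtcars</code>, a small dataset that ships with base R, as a stand-in for your own data, and the variable choices (<code>mpg</code> and <code>am</code>) are arbitrary.

```r
# mtcars ships with base R; it stands in for your own dataset here.
data(mtcars)

# Descriptive statistics for one outcome variable.
mpg_summary <- summary(mtcars$mpg)
print(mpg_summary)

# A simple visualization of the same variable.
hist(mtcars$mpg, main = "Fuel efficiency", xlab = "Miles per gallon")

# Test one hypothesis: does mean mpg differ between cars with automatic
# (am = 0) and manual (am = 1) transmissions? By default, t.test() runs
# Welch's two-sample t-test, which does not assume equal variances.
result <- t.test(mpg ~ am, data = mtcars)
print(result)
```

Keeping the whole analysis in one script like this, from loading the data through the test, is also the simplest route to replicable work: anyone with the script and the data can rerun it.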


''I strongly urge you'' to produce a project that will further your academic career outside of the class. There are many ways that this can happen. Some obvious options are to prepare a project that you can submit for publication, use as pilot analysis that you can report in a grant or thesis proposal, and/or that fulfills a degree requirement.
Although it's not required, I ''strongly urge each of you'' to take this opportunity to produce a document that will further your academic career outside of the class. There are many ways that this can happen, but the obvious ones are that the paper is something you can submit for publication to a journal or conference, that provides preliminary analysis for or acts as a pilot analysis that you can report in a grant proposal or thesis proposal, and/or that serves as part of your master's thesis or dissertation.


There are several intermediate milestones and deadlines to help you accomplish a successful research project. Unless otherwise noted, all deliverables should be submitted via Canvas.
==== Project and Dataset Identification ====


==== Project plan and dataset identification ====
;Due Date: January 17
;Maximum paper length: 500 words (~1-2 pages)
;Deliverables: Turn in on Canvas


;Due date: Thursday, April 18, 2019
Early on, I want you to identify and describe your final project. Your proposal should be short and can be either paragraphs or bullets. It should include the following things:
;Maximum length: 500 words (~1-2 pages)


Early on, I want you to identify and describe your final project. Your description should be short and can be either paragraphs or bullets. It should include the following:
* A one paragraph abstract of the proposed study and research question, theory, community, and/or groups you plan to study.
* A short description of how the project will fit into your career trajectory.
* An identification of the dataset you will use and a description of the columns or type of data it will include. If you do not currently have access to these data, explain when you will have access to the data.


* An abstract of the proposed study including the topic, research question, theoretical motivation, object(s) of study, and anticipated research contribution.
==== Final Project Outline ====
* An identification of the dataset you will use and a description of the columns or type of data it will include. If you do not currently have access to these data, explain why and when you will.
* A short (several sentences?) description of how the project will fit into your career trajectory.


==== Project planning document ====
;Outline Due Date: February 21
;Maximum outline length: 5 pages
;Deliverables: Turn in on Canvas


;Due date: Thursday, May 16, 2019
The outline should have the following sections: (a) Rationale, (b) Objectives; (b.1) General Objectives; (b.2) Specific Objectives; (c) Null hypotheses; (d) Conceptual Diagram; (e) Measures; (f) Dummy Tables.
;Maximum length: 5 pages


The project planning document is a basic shell/outline of an empirical quantitative research paper. Your planning document should have the following sections: (a) Rationale, (b) Objectives; (b.1) General objectives; (b.2) Specific objectives; (c) Null hypotheses; (d) Conceptual diagram and/or explanation of the relationship you plan to test; (e) Measures; (f) Dummy tables. Descriptions of each of these planning document sections are available [[TODO-planningdoc|on this wiki page]].
An excellent example from my partner Mika Matsuzaki is [https://canvas.uw.edu/courses/1098035/files/40388318/download?wrap=1 online in Canvas]. Your diagram will likely be much less complicated than Matsuzaki's. Also, please don't be distracted by the fact that Mika does public health. It's the basic form I want you all to emulate, not the content. You can read [http://ajcn.nutrition.org/content/99/6/1450.full the published paper] to compare.


An exemplary planning document from public health researcher Mika Matsuzaki is [https://canvas.northwestern.edu online in Canvas]. Your diagram will likely be much less complicated than Matsuzaki's. Also, please don't be distracted by the fact that Matsuzaki does public health research. You can (and should!) emulate the form rather than the content. You can also check out [http://ajcn.nutrition.org/content/99/6/1450.full the published paper] to see how the project wound up.
The example includes everything except a "Measures" section. Your Measures section only needs to include a two-column table where column 1 is the name of each variable in your analysis and column 2 is the specific operationalization of this measure and a description of how you will create it.


Please note that the Matsuzaki planning document includes everything except a "Measures" section. Your Measures section should include a two-column table where column 1 is the name of each variable in your analysis and column 2 describes the operationalization of each measure and (if necessary) how you will create it.
==== Final Project ====


==== Project presentation and paper ====
;Paper Due Date: March 19
 
;Paper due date: Monday, June 10, 2019
;Maximum length: 6000 words (~20 pages)
;Maximum length: 6000 words (~20 pages)
;Presentation Date: March 14
;All Deliverables: Turn in on Canvas


;Presentation due date: Thursday, June 6, 2019
I'm expecting you to produce a draft of a short research paper that, after some additional work, you could consider submitting for publication. I'm also very open to the project being a part of a dissertation. I don't expect the papers to be ''publication ready'' but I do expect them to have well-considered drafts of all the necessary pieces in terms of quantitative methodology.
;Maximum length: 12 minutes
 
 
''The paper:'' Ideally, I expect you to produce a high quality short research paper that you might revise and submit for publication and/or a dissertation milestone. I do not expect the paper to be ready for publication, but it should contain polished drafts of all the necessary components of a scholarly quantitative empirical research study. In terms of the structure, please see the page on the [[structure of a quantitative empirical research paper]].
 
As noted above, you should also provide data, code, and any documentation sufficient to enable the replication of all analysis and visualizations. This can happen through GitHub. If that is not possible/appropriate for some reason, please talk to me so that we can find another solution.


Because the emphasis in this class is on statistics and methods and because I'm not an expert in each of your fields, I'm happy to assume that your paper, proposal, or thesis chapter has already established the relevance and significance of your study and has a comprehensive literature review, well-grounded conceptual approach, and compelling reason why this research is important. As a result, you need not focus on these elements of the work in your written submission. Instead, feel free to start with a brief summary of the purpose and importance of this research followed by an introduction of your research questions or hypotheses. If you provide more detail, that's fine, but I won't give you detailed feedback on these parts and they will not figure prominently in my assessment of the work.
Because the emphasis in this class is on statistics and methodology and because I'm not an expert in each of your areas or fields, I'm happy to assume that your paper, proposal, or thesis chapter has already established the relevance and significance of your study and has a comprehensive literature review, well-grounded conceptual approach, and compelling reason why this research is so important. Instead of providing all of these details, feel free to start with a brief summary of the purpose and importance of this research, and an introduction of your research questions or hypotheses. If you provide more detail, that's fine, but I won't give you detailed feedback on these parts.


I have a strong preference for you to write the paper individually, but I'm open to the idea that you may want to work with others in the class. Please contact me ''before'' you attempt to pursue a collaborative final paper.
I have a strong preference for you to write this paper individually but I'm open to the idea that you may want to work with others in the class.


I do not have strong preferences about the style or formatting guidelines you follow for the paper and its bibliography. However, ''your paper must follow a standard format'' (e.g., <TODO link> ACM SIGCHI CSCW format or <TODO link> APA 6th edition) that is applicable for a peer-reviewed journal or conference proceedings in which you aim to publish the work (they all have formatting or submission guidelines published online and you should follow them). This includes the references. I also strongly recommend that you use reference management software to handle your bibliographic sources.
In terms of content:


'' The presentation:'' The presentation will provide an opportunity to share a brief summary of your project and findings with the other members of the class. Since you will all give other research presentations throughout your career, I strongly encourage you to take the opportunity to refine your academic presentation skills. The document [https://canvas.northwestern.edu Creating a Successful Scholarly Presentation] (link is in Canvas) may be useful.
* In terms of the structure of the paper, please see the page that I've written on the [[structure of a quantitative empirical research paper]].
* In terms of the structure of your presentation, you've got some latitude but this document on [https://canvas.uw.edu/files/40848246/download?download_frd=1 Creating a Successful Scholarly Presentation] (link is in Canvas) will likely be useful.


=== Grading ===
=== Grading ===


I will assign grades (usually a numeric value ranging from 0-10) for each of the following aspects of your performance. The percentage values listed with each item are weights that will be applied to calculate your overall grade for the course.
I have put together a very detailed page that describes the [https://mako.cc/teaching/assessment.html grading rubric] I will be using in this course. Please read it carefully. I will assign grades for each of the following items on the UW 4.0 grade scale according to the weights below:


* Participation: 40%
* Participation: 40%
* Proposal identification: 5%
* Proposal identification: 5%
* Final project planning document: 5%
* Final paper outline: 5%
* Final project presentation: 10%
* Final Presentation: 10%
* Final project paper: 40%
* Final Paper: 40%
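
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is scored on the 0-10 scale and that the overall grade is a plain weighted average; the function name, the component keys, and the example scores are all hypothetical and not part of the course materials.

```python
# Weights copied from the list above (they sum to 100%).
WEIGHTS = {
    "participation": 0.40,
    "proposal_identification": 0.05,
    "planning_document": 0.05,
    "presentation": 0.10,
    "paper": 0.40,
}


def overall_grade(scores, weights=WEIGHTS):
    """Return the weighted average of component scores (assumed 0-10 scale)."""
    if set(scores) != set(weights):
        raise ValueError("scores must cover exactly the weighted components")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(scores[name] * weights[name] for name in weights)


# Hypothetical scores, for illustration only.
example_scores = {
    "participation": 9,
    "proposal_identification": 8,
    "planning_document": 7,
    "presentation": 8,
    "paper": 9,
}

print(round(overall_grade(example_scores), 2))
```

Because participation and the final paper each carry 40%, those two components dominate the result; a one-point change in either moves the overall grade by 0.4 points, while a one-point change in the planning document moves it by only 0.05.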
 
My assessment of your paper will reflect the clarity of the written work, the effective execution and presentation of quantitative empirical analysis, as well as the quality and originality of the analysis. Throughout the quarter, we will talk a lot about the qualities of exemplary quantitative research. I expect your final project to embody these exemplary qualities.


== Note on finding a dataset ==
== Finding a Dataset ==


In order to complete your project, you will each need a dataset. If you already have a dataset for the project you plan to conduct, great! If not, there are many datasets to draw from. Here are some ideas:
In order to complete your project, you will each need a dataset. If you are at the stage of your career where you already have a dataset, great! If not, there are many datasets to draw from. Here are some ideas:


* Ask your advisor for a dataset they have collected and used in previous papers. Are there other variables you could use? Other relationships you could analyze?
* Ask your advisor for a dataset they have collected and used in previous papers. Are there other variables you could use?
* If there's an important study you loved, you can send a polite email to the author(s) asking if they are willing and able to share an archival or replication version of the dataset used in their paper. Be very polite and make it clear that this is starting as a class project, but that it might turn into a paper for publication. Make your timeline clear. In Communication and HCI, replication datasets are still very rare, so be prepared for a negative answer and/or questions about your motives in conducting the analysis.
* If there's an author of a study you loved, you can send a polite email asking if they are able or willing to share an archival or replication version of the dataset used in their paper. Be very polite and make it clear that this is starting as a class project but that it might turn into a paper for publication. Make your timeline clear. In communication, replication datasets are still very rare, so be prepared for a negative answer.
* Do some Google Scholar and normal internet searching for datasets in your research area. You'll probably be surprised at what's available.
* Do some Google Scholar and normal Google searching for datasets in your research area. You'd be surprised at what's available.
* Take a look at datasets available in the [https://dataverse.harvard.edu/ Harvard Dataverse] (a very large collection of social science research data) or one of the other members of the [http://dataverse.org/ Dataverse network].
* Take a look at datasets available in the [https://dataverse.harvard.edu/ Harvard Dataverse] (the largest collection of social science research data) or one of the other members of the [http://dataverse.org/ Dataverse network].
* Look at the collection of social scientific datasets at [https://www.icpsr.umich.edu/icpsrweb/ICPSR/ ICPSR at the University of Michigan] (NU is a member). There are an enormous number of very rich datasets.
* Look at the collection of social scientific datasets at [https://www.icpsr.umich.edu/icpsrweb/ICPSR/ ICPSR] (UW is a member). There are an enormous number of very rich datasets.
* Use the [http://scientificdata.isa-explorer.org/index.html ISA Explorer] to find datasets. Keep in mind the large majority of datasets it will search are drawn from the natural sciences.
* Use the [http://scientificdata.isa-explorer.org/index.html ISA Explorer] to find datasets. Keep in mind the large majority of datasets it will search are drawn from the natural sciences.
* <TODO fix/update accordingly> Set up a meeting with Jennifer Muilenburg, the Data Curriculum and Communications Librarian who runs [https://www.lib.washington.edu/digitalscholarship/services/data research data services at the UW libraries]. Her email is libdata@uw.edu. I have talked to her about this course and she is excited about meeting with you to help.
* Set up a meeting with Jennifer Muilenburg, the Data Curriculum and Communications Librarian who runs [https://www.lib.washington.edu/digitalscholarship/services/data research data services at the UW libraries]. Her email is libdata@uw.edu. I have talked to her about this course and she is excited about meeting with you to help.
* [http://fivethirtyeight.com FiveThirtyEight.com] has published a [https://cran.r-project.org/web/packages/fivethirtyeight/vignettes/fivethirtyeight.html GitHub repository and an R package] with pre-processed and cleaned versions of many of the datasets they use for articles published on their website.  
* [http://fivethirtyeight.com FiveThirtyEight.com] has published a [https://cran.r-project.org/web/packages/fivethirtyeight/vignettes/fivethirtyeight.html GitHub repository and an R package] with pre-processed and cleaned versions of many of the datasets they use for articles published on their website.  


=== Human subjects research, IRB, and ethics ===
In general, you're responsible for making sure that you're on the right side of the human subjects rules and that your work is ethical. Class projects generally do not need IRB approval but I hope that each of your projects will turn into something more. If your study involves human subjects research, ''that'' work will need IRB oversight of some sort. In general, you can't do a class project without IRB approval and then get approval retroactively. Secondary analysis of anonymized data is generally not considered human subjects research but I strongly suggest that you get a determination from [https://www.washington.edu/research/hsd UW's Human Subject Division] before you start. For work that is not considered human subjects research, this can often happen in a few hours or days. If you need a faculty sponsor, that should ideally be your advisor. If that doesn't make sense for any of you, I'm happy to talk about serving as the faculty supervisor for the work.
In general, you are responsible for making sure that you're on the right side of the IRB requirements and that your work meets applicable ethical norms and standards.


Class projects generally do not need IRB approval, but research for publications, dissertations, and sometimes even pilot studies generally fall under IRB purview. You should ''not'' plan to seek IRB approval/determination retroactively. If your study may involve human subjects and you may ever publish it in any form, you will need IRB oversight of some sort.
== Structure of Class ==


Secondary analysis of anonymized data is generally not considered human subjects research, but I strongly suggest that you get a determination from [https://irb.northwestern.edu/ the Northwestern IRB] before you start. For work that is not considered human subjects research, this can often happen in a few hours or days. If you need to list a faculty sponsor or Principal Investigator, that should ideally be your advisor. If that doesn't make sense for some reason, please talk to me.
I expect everybody to come to class, every week, with their laptop and a power cord, ready to answer any question on the problem set and having uploaded and shared code for the code-related questions. The class is listed as nearly 4 hours long and, with the exception of a few short breaks, I intend to use the entire period. Be in class on time and be plugged in and ready to go.


== Structure of Class ==
When it comes to the statistics part of this material, this will be a primarily "flipped" classroom. What this means is that we'll be relying on the textbook and other resources to introduce the material and we'll be using the class to discuss it and answer questions that come up.


I expect everybody to come to class, every week, with a laptop and a power cord, ready to answer any question on the problem set and having uploaded code related to the programming questions. The class is listed as nearly 3 hours long and, with the exception of short breaks, I intend to use the entire period. Please be in class on time, plugged in, and ready to go.
Although the structure of class will vary, it will generally include the following parts.


When it comes to the statistics material, this will mostly be a so-called "flipped" classroom. This means we will rely on the textbook and other resources to introduce the material and we will use the class sessions to discuss questions as they come up.
# Quick updates about assignments, projects, and a meta-discussion about the class.
 
# Discussion of '''programming challenges''' due that day.
Although the day-to-day routine will vary, each class session will generally include the following:
# [''Possibly/Sometimes''] Short lecture and/or Q&A about new material in Diez, Barr, and Çetinkaya-Rundel
* Quick updates about assignments, projects, and meta-discussion about the class.
# Discussion of '''statistics questions''' related to new material in Diez, Barr, and Çetinkaya-Rundel and any exemplary empirical paper we have read to discuss.
* Discussion of '''programming challenges''' due that day.
# Interactive lecture introducing new statistical programming concepts.
* [''Sometimes''] Short lecture and/or Q&A about new material in Diez, Barr, and Çetinkaya-Rundel.
# [''Possibly/Sometimes''] Time to begin work on next week's programming assignments.
* Discussion of '''statistics questions''' related to new material in Diez, Barr, and Çetinkaya-Rundel.
* Discussion of any exemplary empirical paper we have read.
* [''Sometimes''] Interactive lecture introducing new statistical programming concepts.


== Schedule ==
== Schedule ==
Line 205: Line 189:
When reading the schedule below, the following key might help resolve ambiguity: §n denotes chapter n; §n.x denotes section x of chapter n; §n.x-y denotes sections x through y of chapter n.
When reading the schedule below, the following key might help resolve ambiguity: §n denotes chapter n; §n.x denotes section x of chapter n; §n.x-y denotes sections x through y of chapter n.


=== Week 1: Thursday April 4: Introduction, Setup, and Data and Variables ===
=== Week 1: Tuesday January 3: Introduction, Setup, and Data and Variables ===


Please complete the readings prior to class so that we can discuss them and start talking through some of the examples in R together.
Hopefully, the material in OpenIntro feels very familiar from COM520. The programming material will be new but I want you to read it before you come to class so we can work through the examples as a group.


'''Required Readings:'''
'''Required Readings:'''


* Diez, Barr, and Çetinkaya-Rundel: §1 (Introduction to data)
* Diez, Barr, and Çetinkaya-Rundel: §1 (Introduction to data)
* Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. 2014. “Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks.” ''Proceedings of the National Academy of Sciences'' 111(24):8788–90. [[http://www.pnas.org/content/111/24/8788.full Available through NU libraries]]
* Verzani: §1 (Getting Started), §2 (Univariate data) [[https://faculty.washington.edu/makohill/com521/verzani-usingr-ch1_ch2.pdf Available with UWNetID]]
* Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. 2014. “Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks.” ''Proceedings of the National Academy of Sciences'' 111(24):8788–90. [[http://www.pnas.org/content/111/24/8788.full Available through UW libraries]]


'''Recommended Readings:'''
'''Optional Readings:'''


* Verzani: §1 (Getting Started), §2 (Univariate data) [[https://canvas.northwestern.edu/verzani_ch1-ch2.pdf Available via Canvas]]
* Verzani: §A (Programming)
* Verzani: §A (Programming)
* Healy: Chapter 2 (and skim the prefatory material as well as Chapter 1)
 
'''Assignment (Complete before class):'''
'''Assignment (Complete Before Class):'''


* [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 1]]
* [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 1]]


'''R screencasts:'''
'''Lectures:'''
* [https://communitydata.cc/~ads/teaching/2019/stats/r_lectures/w01-introduction.zip Week 1 R lecture materials] (.zip file)
 
* [https://communitydata.cc/~mako/2017-COM521/com521-week_01-r_programming_intro-20170103.ogv Week 1 R lecture screencast (Part I): Introduction to R and univariate statistics] (~1 hour 47 minutes)
* [https://communitydata.cc/~mako/2017-COM521/com521-week_01-r_programming_intro-20170103.ogv Week 1 R lecture screencast (Part I): Introduction to R and univariate statistics] (~1 hour 47 minutes)
* [https://communitydata.cc/~mako/2017-COM521/com521-week_01-github_rscripts-20170104.ogv Week 1 R lecture screencast (Part II): Setting up git/GitHub and saving files in RStudio] (~40 minutes)
* [https://communitydata.cc/~mako/2017-COM521/com521-week_01-github_rscripts-20170104.ogv Week 1 R lecture screencast (Part II): Setting up git/GitHub and saving files in RStudio] (~40 minutes)
* [[Statistics and Statistical Programming (Spring 2019)/R lecture outline: Week 1]]
* [[Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 1]]


'''Resources:'''
'''Resources:'''
Line 234: Line 218:
* [[Statistics and Statistical Programming (Winter 2017)/Session plan: Week 1]]
* [[Statistics and Statistical Programming (Winter 2017)/Session plan: Week 1]]


=== Week 2: Thursday April 11: Probability and Visualization ===
=== Week 2: Tuesday January 10: Probability and Visualization ===


'''Required Readings:'''
'''Required Readings:'''
Line 257: Line 241:
* [[Statistics and Statistical Programming (Winter 2017)/Session plan: Week 2]]
* [[Statistics and Statistical Programming (Winter 2017)/Session plan: Week 2]]


=== Week 3: Thursday April 18: Distributions ===
=== Week 3: Tuesday January 17: Distributions ===


'''Required Readings:'''
'''Required Readings:'''
Line 281: Line 265:
* [[Statistics and Statistical Programming (Winter 2017)/Session plan: Week 3]]
* [[Statistics and Statistical Programming (Winter 2017)/Session plan: Week 3]]


=== Week 4: Thursday April 25: Statistical significance and hypothesis testing ===
=== Week 4: Tuesday January 24: Statistical significance and hypothesis testing ===


'''Required Readings:'''
'''Required Readings:'''
Line 303: Line 287:
* [[Statistics and Statistical Programming (Winter 2017)/Session plan: Week 4]]
* [[Statistics and Statistical Programming (Winter 2017)/Session plan: Week 4]]


=== Week 5: Thursday May 2: Continuous Numeric Data & ANOVA ===
=== Week 5: Tuesday January 31: Continuous Numeric Data & ANOVA ===


'''Required Readings:'''
'''Required Readings:'''
Line 327: Line 311:
* [https://www.openintro.org/download.php?file=os3_slides_05&referrer=/stat/slides/slides_0x.php Mine Çetinkaya-Rundel's OpenIntro §5 Lecture Notes]
* [https://www.openintro.org/download.php?file=os3_slides_05&referrer=/stat/slides/slides_0x.php Mine Çetinkaya-Rundel's OpenIntro §5 Lecture Notes]


=== Week 6: Thursday May 9: Categorical data ===
=== Week 6: Tuesday February 7: Categorical data ===


'''Required Readings:'''
'''Required Readings:'''
Line 350: Line 334:
* [https://www.openintro.org/stat/videos.php OpenIntro Video Lectures] including 4 videos for §7
* [https://www.openintro.org/stat/videos.php OpenIntro Video Lectures] including 4 videos for §7


=== Week 7: Thursday May 16: Linear Regression ===
=== Week 7: Tuesday February 14: Linear Regression ===


'''Required Readings:'''
'''Required Readings:'''
Line 374: Line 358:
* [https://www.openintro.org/stat/videos.php OpenIntro Video Lectures] including 4 videos for §7 and 3 videos on the sections §8.1-8.3
* [https://www.openintro.org/stat/videos.php OpenIntro Video Lectures] including 4 videos for §7 and 3 videos on the sections §8.1-8.3


=== Week 8: Thursday May 23: Polynomial Terms, Interactions, and Logistic Regression ===
=== Week 8: Tuesday February 21: Polynomial Terms, Interactions, and Logistic Regression ===


'''Required Readings:'''
'''Required Readings:'''
Line 403: Line 387:
* I've written this document which will likely be useful for many of you: [https://communitydata.cc/~mako/2017-COM521/logistic_regression_interpretation.html Interpreting Logistic Regression Coefficients with Examples in R]
* I've written this document which will likely be useful for many of you: [https://communitydata.cc/~mako/2017-COM521/logistic_regression_interpretation.html Interpreting Logistic Regression Coefficients with Examples in R]


=== Week 9: Thursday May 30: TBA ===
=== Week 9: Tuesday February 28: Consulting Meetings ===


Reserved for catch-up, supplementary topics, and maybe some final presentations.
We won't meet as a group. Instead, you will each meet one-on-one with me to work through challenges and issues with your analysis.


=== Week 10: Thursday June 6: Final Presentations ===
=== Week 10: Tuesday March 7: Consulting Meetings ===


Followed by much rejoicing!
We won't meet as a group. Instead, you will each meet one-on-one with me to work through challenges and issues with your analysis.


== Policies ==
=== Week 11: March 14: Final Presentations ===


== Administrative Notes ==
=== Attendance ===
=== Attendance ===


Attendance in class is expected of all participants. If you need to miss class for any reason, please contact me ahead of time (email is best). Multiple unexplained absences will likely result in a lower grade or (in extreme circumstances) a failing grade. In the event of an absence, you are responsible for obtaining class notes, handouts, assignments, etc. You are also still responsible for turning in any assignments on time unless you make prior arrangements with me.
As detailed in [https://mako.cc/teaching/assessment.html my page on assessment], attendance in class is expected of all participants. If you need to miss class for any reason, please contact me ahead of time (email is best). Multiple unexplained absences will likely result in a lower grade or (in extreme circumstances) a failing grade. In the event of an absence, you are responsible for obtaining class notes, handouts, assignments, etc.


=== In-class device usage ===
=== Office Hours ===
 
Please refrain from any uses of digitally networked devices or other distraction machines that do not directly contribute to your engagement with the course material. If you struggle to comply with this policy, I may recommend you temporarily put away your device(s) or leave the classroom.
 
=== Peers’ Work and In-Class Discussions ===
 
Throughout the course, you may receive, read, collaborate, and/or comment on classmates’ work. These assignments are for class use only. You may not share them with anybody outside of class without explicit written permission from the document’s author and pertaining to the specific piece.
 
It is essential to the success of this class that all participants feel comfortable discussing questions, thoughts, ideas, fears, reservations, apprehensions and confusion about works-in-progress, statistical concepts, independent research, and more. Therefore, you may not create any audio or video recordings during class time, share verbatim comments with those not in class, or share by any other means (e.g., social media) any comments linked to people’s identities unless you get clear and explicit permission. If you want to share general impressions or specifics of in-class discussions with those not in class, please do so without disclosing personal identities or details.
 
=== Academic Integrity ===
 
You are responsible for reading and abiding by the Northwestern University [https://www.northwestern.edu/provost/policies/academic-integrity/principles.html Principles Regarding Academic Integrity]. Personally, I expect you to exceed the minimal standards elaborated in those principles and to strive for admirable, extraordinary conduct in every aspect of your academic career. Feel free to ask me (the instructor) for clarification about this or related matters.


=== Deadlines ===
I will not hold regular office hours. In general, I will be available to meet after class. Please contact me by email to arrange a meeting then or at another time.
 
Emergencies happen. Unanticipated obstacles arise. If you cannot make a deadline, please contact me to figure out a schedule that will work. The more proactive and responsible you are, the more receptive I am likely to be.
 
A word about extensions and incompletes: I strongly discourage them. In principle, I have no problem with extensions or incompletes. In practice, they tend to be a pain for everybody involved. If you absolutely must submit an assignment late, assume that I may require up to 1 month (4 weeks) to grade it. Please take this into account if you will need me to submit a grade in order to receive your fellowship/diploma/visa/etc. by a particular date.


=== Accommodations ===
=== Accommodations ===


I am totally happy to provide accommodations for religious observance, physical needs, or other circumstances as needed. Any student requesting accommodations related to a disability or other condition is required to register with AccessibleNU (847-467-5530) and provide professors with an accommodation notification from AccessibleNU, preferably within the first two weeks of class. All information will remain confidential. For more information, visit [https://www.northwestern.edu/accessiblenu/ AccessibleNU].
In general, if you have an issue, such as needing an accommodation for a religious obligation or learning disability, speak with me before it affects your performance; afterward it is too late. Do not ask for favors; instead, offer proposals that show initiative and a willingness to work.


=== Sexual Misconduct ===
To request academic accommodations due to a disability please contact Disability Resources for Students, 448 Schmitz, 206-543-8924/V, 206-5430-8925/TTY. If you have a letter from Disability Resources for Students indicating that you have a disability that requires academic accommodations, please present the letter to me so we can discuss the accommodations that you might need for the class. I am happy to work with you to maximize your learning experience.


All participants in this class are bound by the [https://www.northwestern.edu/sexual-misconduct/title-IX/university-policies/policy-on-sexual-misconduct.html Northwestern University sexual misconduct policy]. Please note that the core of the policy states, "Northwestern is committed to fostering an environment in which all members of our community are safe, secure, and free from sexual misconduct of any form, including, but not limited to, sexual assault, sexual exploitation, stalking, and dating and domestic violence." I take this very seriously. Please review the policy and speak to me if you have any questions or concerns.
=== Academic Misconduct ===


=== Email protocol ===
I am committed to upholding the academic standards of the University of Washington’s Student Conduct Code. If I suspect a student violation of that code, I will first engage in a conversation with that student about my concerns.


I receive too much email and I sometimes fail to keep up. If, for some reason, I do not respond to a message related to this course within 48 hours, please do not take it personally and feel free to re-send the message with a polite reminder. This will help me and I will not resent you for it.
If we cannot successfully resolve a suspected case of academic misconduct through our conversations, I will refer the situation to the Department of Communication advising office, which can then work with the COM Chair to seek further input and, if necessary, move the case up through the College.
 
=== Office Hours ===


TBA.
While evidence of academic misconduct may result in a lower grade, I will not unilaterally lower a grade without addressing the issue with you first through the process outlined above.


=== Credit and Notes ===
=== Credit and Notes ===


This syllabus has, in ways that should be obvious, borrowed and built on the [https://www.openintro.org/stat/index.php OpenIntro Statistics curriculum]. I also based nearly every aspect of the course design on Benjamin Mako Hill's [[Statistics_and_Statistical_Programming_(Winter_2017)|COM 521 class]].
This syllabus has, in ways that should be obvious, borrowed and built on the [https://www.openintro.org/stat/index.php OpenIntro Statistics curriculum]. In the sense that he used the same two textbooks, I also drew some inspiration and confidence from Tom S. Clark's [http://www.tomclarkphd.com/teaching/POLS508F14.pdf syllabus for POLS 508: Data Analysis in Fall 2014].
