Editing Statistics and Statistical Programming (Winter 2017)
Latest revision | Your text | ||
Line 7: | Line 7: | ||
:* We will use Canvas for [https://canvas.uw.edu/courses/1098035/announcements announcements], [https://canvas.uw.edu/courses/1098035/assignments turning in assignments], and [https://canvas.uw.edu/courses/1098035/discussion_topics discussion] (if you choose to use them) | :* We will use Canvas for [https://canvas.uw.edu/courses/1098035/announcements announcements], [https://canvas.uw.edu/courses/1098035/assignments turning in assignments], and [https://canvas.uw.edu/courses/1098035/discussion_topics discussion] (if you choose to use them) | ||
:* Everything else will be linked on this page. | :* Everything else will be linked on this page. | ||
:'''Course Catalog Description:[https://www.washington.edu/students/crscat/com.html#com521]''' | :'''Course Catalog Description:[https://www.washington.edu/students/crscat/com.html#com521]''' | ||
Line 30: | Line 29: | ||
* Feel comfortable reading papers that use basic statistical techniques. | * Feel comfortable reading papers that use basic statistical techniques. | ||
* Feel comfortable and prepared enrolling in future statistics courses in CSSS. | * Feel comfortable and prepared enrolling in future statistics courses in CSSS. | ||
== Note About This Syllabus == | == Note About This Syllabus == | ||
Line 70: | Line 47: | ||
Diez, Barr, and Çetinkaya-Rundel's book is a free and freely licensed online statistics textbook. Over the last seven years, the book has also developed a large online community of students and teachers who have shared other resources. The book, lecture notes, and more are all freely licensed, which has allowed the text to be adapted for a range of different fields. The book is excellent and it has been adopted extraordinarily widely. You can buy versions from Amazon in either [https://www.openintro.org/redirect.php?go=amazon_os3_hardcover&referrer=/stat/textbook.php full color hardcover] ($19.99) or in [https://www.openintro.org/redirect.php?go=createspace_os3&referrer=/stat/textbook.php black and white paperback] ($7.60). I haven't purchased a paper copy so I can't speak to the quality of either. | Diez, Barr, and Çetinkaya-Rundel's book is a free and freely licensed online statistics textbook. Over the last seven years, the book has also developed a large online community of students and teachers who have shared other resources. The book, lecture notes, and more are all freely licensed, which has allowed the text to be adapted for a range of different fields. The book is excellent and it has been adopted extraordinarily widely. You can buy versions from Amazon in either [https://www.openintro.org/redirect.php?go=amazon_os3_hardcover&referrer=/stat/textbook.php full color hardcover] ($19.99) or in [https://www.openintro.org/redirect.php?go=createspace_os3&referrer=/stat/textbook.php black and white paperback] ($7.60). I haven't purchased a paper copy so I can't speak to the quality of either. | ||
Verzani's book is an introduction to the R programming language. It's designed to be used as a companion to a basic introductory statistics textbook (like OpenIntro). It's a poor stand-alone text but it will provide good resources for the material we're covering in the course and it should act as a good reference going forward. The book is available online for about $50. | Verzani's book is an introduction to the R programming language. It's designed to be used as a companion to a basic introductory statistics textbook (like OpenIntro). It's a poor stand-alone text but it will provide good resources for the material we're covering in the course and it should act as a good reference going forward. The book is available online for about $50. ''I'd recommend holding off on purchasing the book until after the first class.'' | ||
Although it's not required for the course, I want to point you to these two books. When I was learning R, these both were very useful references: | Although it's not required for the course, I want to point you to these two books. When I was learning R, these both were very useful references: | ||
Line 80: | Line 57: | ||
* [ftp://cran.r-project.org/pub/R/doc/contrib/Baggott-refcard-v2.pdf Baggott's R Reference Card v2] — When I was learning R, I ''literally'' took a similar reference card with me everywhere and looked at it dozens of times a day. | * [ftp://cran.r-project.org/pub/R/doc/contrib/Baggott-refcard-v2.pdf Baggott's R Reference Card v2] — When I was learning R, I ''literally'' took a similar reference card with me everywhere and looked at it dozens of times a day. | ||
* [https://stackoverflow.com/questions/tagged/r StackOverflow R Tag] — Somebody already had your question about how to do ''X'' in R. They asked it, and several people have answered it, on StackOverflow | * [https://stackoverflow.com/questions/tagged/r StackOverflow R Tag] — Somebody already had your question about how to do ''X'' in R. They asked it, and several people have answered it, on StackOverflow. | ||
== Assignments == | == Assignments == | ||
Line 117: | Line 93: | ||
* '''Ensure replicability''' — I'll expect you all to provide code and data for your analysis in a way that makes your work replicable by other researchers. | * '''Ensure replicability''' — I'll expect you all to provide code and data for your analysis in a way that makes your work replicable by other researchers. | ||
Although it's not required, I ''strongly urge each of you'' to take this opportunity to produce a document that will further your academic career outside of the class. There are many ways that this can happen but the obvious ones are that the paper is something you can submit for publication to a journal or conference, that provides preliminary analysis for or acts as a pilot analysis that you can report in a grant proposal or thesis proposal, and/or that serves as part of your master's thesis or dissertation. | Although it's not required, I ''strongly urge each of you'' to take this opportunity to produce a document that will further your to academic career outside of the class. There are many ways that this can happen but the obvious ones are that the paper is something you can submit for publication to a journal or conference, that provides preliminary analysis for or acts as a pilot analysis that you can report in a grant proposal or thesis proposal, and/or that serves as part of your master's thesis or dissertation. | ||
==== Project and Dataset Identification ==== | ==== Project and Dataset Identification ==== | ||
Line 131: | Line 107: | ||
* An identification of the dataset you will use and a description of the columns or type of data it will include. If you do not currently have access to these data, explain when you will have access to the data. | * An identification of the dataset you will use and a description of the columns or type of data it will include. If you do not currently have access to these data, explain when you will have access to the data. | ||
==== Final Project | ==== Final Project ==== | ||
;Outline Due Date: February 21 | ;Outline Due Date: February 21 | ||
;Maximum outline length: 5 pages | ;Maximum outline length: 5 pages | ||
;Paper Due Date: March 19 | ;Paper Due Date: March 19 | ||
;Maximum length: 6000 words (~20 pages) | ;Maximum outline length: 6000 words (~20 pages) | ||
;Presentation Date: March | ;Presentation Date: March 7 | ||
;All Deliverables: Turn in on Canvas | ;All Deliverables: Turn in on Canvas | ||
Line 156: | Line 122: | ||
I have a strong preference for you to write this paper individually but I'm open to the idea that you may want to work with others in the class. | I have a strong preference for you to write this paper individually but I'm open to the idea that you may want to work with others in the class. | ||
'''''Details Forthcoming:''''' ''Although this material is still somewhat thin, I'll be posting many additional details about the expectations for the final paper as we move forward through the quarter.'' | |||
=== Grading === | === Grading === | ||
Line 180: | Line 143: | ||
* Take a look at datasets available in the [https://dataverse.harvard.edu/ Harvard Dataverse] (the largest collection of social science research data) or one of the other members of the [http://dataverse.org/ Dataverse network]. | * Take a look at datasets available in the [https://dataverse.harvard.edu/ Harvard Dataverse] (the largest collection of social science research data) or one of the other members of the [http://dataverse.org/ Dataverse network]. | ||
* Look at the collection of social scientific datasets at [https://www.icpsr.umich.edu/icpsrweb/ICPSR/ ICPSR] (UW is a member). There are an enormous number of very rich datasets. | * Look at the collection of social scientific datasets at [https://www.icpsr.umich.edu/icpsrweb/ICPSR/ ICPSR] (UW is a member). There are an enormous number of very rich datasets. | ||
* | * The [http://scientificdata.isa-explorer.org/index.html ISA Explorer] to find datasets. Keep in mind the large majority of datasets it will search are drawn from the natural sciences. | ||
* Set up a meeting with Jennifer Muilenburg — Data Curriculum and Communications Librarian who runs [https://www.lib.washington.edu/digitalscholarship/services/data research data services at the UW libraries]. Her email is: libdata@uw.edu | * Set up a meeting with Jennifer Muilenburg — Data Curriculum and Communications Librarian who runs [https://www.lib.washington.edu/digitalscholarship/services/data research data services at the UW libraries]. Her email is: libdata@uw.edu | ||
In general, you're responsible for making sure that you're on the right side of the human subjects rules and that your work is ethical. Class projects generally do not need IRB approval but I hope that each of your projects will turn into something more. If your study involves human subjects research, ''that'' work will need IRB oversight of some sort | In general, you're responsible for making sure that you're on the right side of the human subjects rules and that your work is ethical. Class projects generally do not need IRB approval but I hope that each of your projects will turn into something more. If your study involves human subjects research, ''that'' work will need IRB oversight of some sort. | ||
== Structure of Class == | == Structure of Class == | ||
I expect everybody to come to class, every week, with their laptop and a power cord, ready to answer any question on the problem set and having uploaded and shared code for the code-related questions. The class is listed as nearly 4 hours long and, with the exception of a few short breaks, I intend to use the entire period | I expect everybody to come to class, every week, with their laptop and a power cord, ready to answer any question on the problem set and having uploaded and shared code for the code-related questions. The class is listed as nearly 4 hours long and, with the exception of a few short breaks, I intend to use the entire period. | ||
When it comes to the statistics part of this material, this will be a primarily "flipped" classroom. What this means is that we'll be relying on the textbook and other resources to introduce the material and we'll be using the class to discuss it and answer questions that come up. | When it comes to the statistics part of this material, this will be a primarily "flipped" classroom. What this means is that we'll be relying on the textbook and other resources to introduce the material and we'll be using the class to discuss it and answer questions that come up. | ||
Line 194: | Line 156: | ||
Although the structure of class will vary, it will generally include the following parts. | Although the structure of class will vary, it will generally include the following parts. | ||
# Quick updates about assignments | # Quick updates about assignments. | ||
# Discussion of '''programming challenges''' due that day. | # Discussion of '''programming challenges''' due that day. | ||
# [''Possibly/Sometimes''] Short lecture and/or Q&A about new material in Diez, Barr, and Çetinkaya-Rundel | # [''Possibly/Sometimes''] Short lecture and/or Q&A about new material in Diez, Barr, and Çetinkaya-Rundel | ||
Line 214: | Line 176: | ||
* Verzani: §1 (Getting Started), §2 (Univariate data) [[https://faculty.washington.edu/makohill/com521/verzani-usingr-ch1_ch2.pdf Available with UWNetID]] | * Verzani: §1 (Getting Started), §2 (Univariate data) [[https://faculty.washington.edu/makohill/com521/verzani-usingr-ch1_ch2.pdf Available with UWNetID]] | ||
* Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. 2014. “Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks.” ''Proceedings of the National Academy of Sciences'' 111(24):8788–90. [[http://www.pnas.org/content/111/24/8788.full Available through UW libraries]] | * Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. 2014. “Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks.” ''Proceedings of the National Academy of Sciences'' 111(24):8788–90. [[http://www.pnas.org/content/111/24/8788.full Available through UW libraries]] | ||
'''Assignment (Complete Before Class):''' | '''Assignment (Complete Before Class):''' | ||
Line 223: | Line 181: | ||
* [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 1]] | * [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 1]] | ||
''' | '''Optional Readings/Resources:''' | ||
* | * Verzani: §A (Programming) | ||
* [https://www.openintro.org/download.php?file=os3_slides_01&referrer=/stat/slides/slides_0x.php Mine Çetinkaya-Rundel's OpenIntro §1 Lecture Notes] | * [https://www.openintro.org/download.php?file=os3_slides_01&referrer=/stat/slides/slides_0x.php Mine Çetinkaya-Rundel's OpenIntro §1 Lecture Notes] | ||
* [https://www.openintro.org/stat/videos.php OpenIntro Video Lectures] including some for §1 | * [https://www.openintro.org/stat/videos.php OpenIntro Video Lectures] including some for §1 | ||
=== Week 2: Tuesday January 10: Probability and Visualization === | === Week 2: Tuesday January 10: Probability and Visualization === | ||
Line 239: | Line 192: | ||
* Diez, Barr, and Çetinkaya-Rundel: §2 (Probability) | * Diez, Barr, and Çetinkaya-Rundel: §2 (Probability) | ||
* Verzani: §3.1-2 (Bivariate data), §4 (Multivariate data), §5 (Multivariate graphics) | * Verzani: §3.1-2 (Bivariate data), §4 (Multivariate data), §5 (Multivariate graphics) | ||
* | * ''Empirical Paper TBD'' | ||
=== Week 3: Tuesday January 17: Distributions === | |||
'''Required Readings:''' | '''Required Readings:''' | ||
* Diez, Barr, and Çetinkaya-Rundel: §3.1-3.2, §3.4 | * Diez, Barr, and Çetinkaya-Rundel: §3.1-3.2, §3.4 | ||
* Verzani: §6 (Populations) | * Verzani: §6 (Populations) | ||
* ''Empirical Paper TBD'' | |||
* | |||
=== Week 4: Tuesday January 24: Statistical significance and hypothesis testing === | === Week 4: Tuesday January 24: Statistical significance and hypothesis testing === | ||
Line 287: | Line 210: | ||
* Diez, Barr, and Çetinkaya-Rundel: §4 (Foundations for inference) | * Diez, Barr, and Çetinkaya-Rundel: §4 (Foundations for inference) | ||
* Verzani: §7 (Statistical inference), §8 (Confidence intervals) | * Verzani: §7 (Statistical inference), §8 (Confidence intervals) | ||
* ''Empirical Paper TBD'' | |||
* | |||
=== Week 5: Tuesday January 31: Continuous Numeric Data & ANOVA === | === Week 5: Tuesday January 31: Continuous Numeric Data & ANOVA === | ||
Line 309: | Line 218: | ||
* Diez, Barr, and Çetinkaya-Rundel: §5 (Inference for numerical data) | * Diez, Barr, and Çetinkaya-Rundel: §5 (Inference for numerical data) | ||
* Verzani: §9 (significance tests), §12 (Analysis of variance) | * Verzani: §9 (significance tests), §12 (Analysis of variance) | ||
* | * ''Empirical Paper TBD'' | ||
=== Week 6: Tuesday February 7: Categorical data === | === Week 6: Tuesday February 7: Categorical data === | ||
Line 333: | Line 226: | ||
* Diez, Barr, and Çetinkaya-Rundel: §6 (Inference for categorical data) | * Diez, Barr, and Çetinkaya-Rundel: §6 (Inference for categorical data) | ||
* Verzani: §3.4 (Bivariate categorical data); §10.1-10.2 (Goodness of fit) | * Verzani: §3.4 (Bivariate categorical data); §10.1-10.2 (Goodness of fit) | ||
* | * ''Empirical Paper TBD'' | ||
=== Week 7: Tuesday February 14: Linear Regression === | === Week 7: Tuesday February 14: Simple Linear Regression === | ||
'''Required Readings:''' | '''Required Readings:''' | ||
* Diez, Barr, and Çetinkaya-Rundel: §7 (Introduction to linear regression) | * Diez, Barr, and Çetinkaya-Rundel: §7 (Introduction to linear regression) | ||
* Verzani: §11.1-2 (Linear regression) | * Verzani: §11.1-2 (Linear regression) | ||
* | * ''Empirical Paper TBD'' | ||
=== Week 8: Tuesday February 21: Multiple and Logistic Regression === | |||
=== Week 8: Tuesday February 21: | |||
'''Required Readings:''' | '''Required Readings:''' | ||
* Diez, Barr, and Çetinkaya-Rundel: §8 (Multiple and logistic regression) | |||
* Diez, Barr, and Çetinkaya-Rundel: §8 | |||
* Verzani: §11.3 (Linear regression), §13.1 (Logistic regression) | * Verzani: §11.3 (Linear regression), §13.1 (Logistic regression) | ||
* | * ''Empirical Paper TBD'' | ||
=== Week 9: Tuesday February 28: Consulting Meetings === | === Week 9: Tuesday February 28: Consulting Meetings === | ||
Line 407: | Line 248: | ||
We won't meet as a group. Instead, you will each meet one-on-one with me to work through challenges and issues with your analysis. | We won't meet as a group. Instead, you will each meet one-on-one with me to work through challenges and issues with your analysis. | ||
=== Week 10: Tuesday March 7 | === Week 10: Tuesday March 7: Final Presentations === | ||
== Administrative Notes == | == Administrative Notes == |