User:Aaronshaw/Stats course
== Schedule ==

When reading the schedule below, the following key might help resolve ambiguity: §n denotes chapter n; §n.x denotes section x of chapter n; §n.x-y denotes sections x through y of chapter n.

=== Week 1: Thursday April 4: Introduction, Setup, and Data and Variables ===

Please complete the readings prior to class so that we can discuss them and start talking through some of the examples in R together.

'''Required Readings:'''
* Diez, Barr, and Çetinkaya-Rundel: §1 (Introduction to data)
* Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. 2014. “Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks.” ''Proceedings of the National Academy of Sciences'' 111(24):8788–90. [[http://www.pnas.org/content/111/24/8788.full Available through NU libraries]]

'''Recommended Readings:'''
* Verzani: §1 (Getting Started), §2 (Univariate data) [[https://canvas.northwestern.edu/verzani_ch1-ch2.pdf Available via Canvas]]
* Verzani: §A (Programming)
* Healy: Chapter 2 (and skim the prefatory material as well as Chapter 1)

'''Assignment (Complete Before Class):'''
* [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 1]]

'''R screencasts:'''
* [https://communitydata.cc/~ads/teaching/2019/stats/r_lectures/w01-introduction.zip Week 1 R lecture materials] (.zip file)
* [https://communitydata.cc/~mako/2017-COM521/com521-week_01-r_programming_intro-20170103.ogv Week 1 R lecture screencast (Part I): Introduction to R and univariate statistics] (~1 hour 47 minutes)
* [https://communitydata.cc/~mako/2017-COM521/com521-week_01-github_rscripts-20170104.ogv Week 1 R lecture screencast (Part II): Setting up git/GitHub and saving files in RStudio] (~40 minutes)
* [[Statistics and Statistical Programming (Spring 2019)/R lecture outline: Week 1]]

'''Resources:'''
* [https://www.openintro.org/download.php?file=os3_slides_01&referrer=/stat/slides/slides_0x.php Mine Çetinkaya-Rundel's OpenIntro §1 Lecture Notes]
* [https://www.openintro.org/stat/videos.php OpenIntro Video Lectures] including some for §1
* [[Statistics and Statistical Programming (Winter 2017)/Session plan: Week 1]]

=== Week 2: Thursday April 11: Probability and Visualization ===

'''Required Readings:'''
* Diez, Barr, and Çetinkaya-Rundel: §2 (Probability)
* Verzani: §3.1-2 (Bivariate data), §4 (Multivariate data), §5 (Multivariate graphics) [[https://faculty.washington.edu/makohill/com521/verzani-usingr-ch3.1-2_ch4_ch5.pdf Available with UW NetID]]
* Buechley, Leah and Benjamin Mako Hill. 2010. “LilyPad in the Wild: How Hardware’s Long Tail Is Supporting New Engineering and Design Communities.” Pp. 199–207 in ''Proceedings of the 8th ACM Conference on Designing Interactive Systems.'' Aarhus, Denmark: ACM. [[https://mako.cc/academic/buechley_hill_DIS_10.pdf PDF available on my personal website]]

'''Assignment (Complete Before Class):'''
* [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 2]]

'''Lectures:'''
* [[Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 2]]
* [https://communitydata.cc/~mako/2017-COM521/com521-week_02-lists_dataframes_graphing-20170111.ogv Week 2 R lecture screencast: lists, matrices, data frames, and beginning graphing] (~1 hour 8 minutes)

'''Resources:'''
* [https://www.openintro.org/download.php?file=os3_slides_02&referrer=/stat/slides/slides_0x.php Mine Çetinkaya-Rundel's OpenIntro §2 Lecture Notes]
* [https://www.openintro.org/stat/videos.php OpenIntro Video Lectures] including 2 short videos for §2
* [[Statistics and Statistical Programming (Winter 2017)/Session plan: Week 2]]

=== Week 3: Thursday April 18: Distributions ===

'''Required Readings:'''
* Diez, Barr, and Çetinkaya-Rundel: §3.1-3.2, §3.4. You should also read the rest of the chapter (§3.3 and §3.5). I won't assign problem set questions about it, but it's still important to be familiar with.
* Verzani: §6 (Populations)

'''Assignment (Complete Before Class):'''
* [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 3]]

'''Lectures:'''
* [[Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 3]]
* [https://communitydata.cc/~mako/2017-COM521/com521-week_03-loading_data_functions_apply_misc.ogv Week 3 R lecture screencast: Loading data, functions; apply(), lapply(), sapply(); several miscellaneous functions] (~34 minutes). This is the same material I covered in class. If you followed it, there's no need to go back to this.
* [https://communitydata.cc/~mako/2017-COM521/com521-week_03-dates_tapply_merge.ogv Week 3 R lecture screencast: Dates; tapply(); and merge()] (~38 minutes) [The audio seems to be broken for the last 10 minutes. Sorry about that! I've rerecorded that portion in the next video.]
* [https://communitydata.cc/~mako/2017-COM521/com521-week_03-merge.ogv Week 3 R lecture screencast: merge()] (~13 minutes) [Rerecording of the last few minutes of the previous video.]

'''Resources:'''
* [https://www.openintro.org/download.php?file=os3_slides_03&referrer=/stat/slides/slides_0x.php Mine Çetinkaya-Rundel's OpenIntro §3 Lecture Notes]
* [https://www.openintro.org/stat/videos.php OpenIntro Video Lectures] including 2 videos for §3.1 and §3.2
* [[Statistics and Statistical Programming (Winter 2017)/Session plan: Week 3]]

=== Week 4: Thursday April 25: Statistical significance and hypothesis testing ===

'''Required Readings:'''
* Diez, Barr, and Çetinkaya-Rundel: §4 (Foundations for inference)
* Verzani: §7 (Statistical inference), §8 (Confidence intervals)

'''Assignment (Complete Before Class):'''
* [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 4]]

'''Lectures:'''
* [[Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 4]]
* [https://communitydata.cc/~mako/2017-COM521/com521-week_04-misc_confint_simulation-20170125.ogv Week 4 R lecture screencast: order(); confidence intervals; simulations drawn from repeated random samples] (~27 minutes)

'''Resources:'''
* [https://www.openintro.org/download.php?file=os3_slides_04&referrer=/stat/slides/slides_0x.php Mine Çetinkaya-Rundel's OpenIntro §4 Lecture Notes]
* [https://www.openintro.org/stat/videos.php OpenIntro Video Lectures] including 7 videos for nearly all of §4
* [[Statistics and Statistical Programming (Winter 2017)/Session plan: Week 4]]

=== Week 5: Thursday May 2: Continuous Numeric Data & ANOVA ===

'''Required Readings:'''
* Diez, Barr, and Çetinkaya-Rundel: §5 (Inference for numerical data)
* Verzani: §9 (Significance tests), §12 (Analysis of variance)
* Gelman, Andrew and Hal Stern. 2006. “The Difference Between ‘Significant’ and ‘Not Significant’ Is Not Itself Statistically Significant.” ''The American Statistician'' 60(4):328–31. [[http://dx.doi.org/10.1198/000313006X152649 Available through UW Libraries]]
* Sweetser, K. D., & Metzgar, E. (2007). Communicating during crisis: Use of blogs as a relationship management tool. ''Public Relations Review'', 33(3), 340–342. [[https://doi.org/10.1016/j.pubrev.2007.05.016 Available through UW Libraries]]
* Buechley, Leah and Benjamin Mako Hill. 2010. “LilyPad in the Wild: How Hardware’s Long Tail Is Supporting New Engineering and Design Communities.” Pp. 199–207 in ''Proceedings of the 8th ACM Conference on Designing Interactive Systems.'' Aarhus, Denmark: ACM. [[https://mako.cc/academic/buechley_hill_DIS_10.pdf PDF available on my personal website]]

'''Assignment (Complete Before Class):'''
* [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 5]]

'''Lectures:'''
* [[Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 5]]
* [https://communitydata.cc/~mako/2017-COM521/com521-week_05-ttests_and_anova.ogv Week 5 R lecture screencast: t-tests] (~22 minutes)
* [https://communitydata.cc/~mako/2017-COM521/com521-week_05-for_if.ogv Week 5 R lecture screencast: for loops and if statements] (~12 minutes)

'''Resources:'''
* [https://www.openintro.org/download.php?file=os3_slides_05&referrer=/stat/slides/slides_0x.php Mine Çetinkaya-Rundel's OpenIntro §5 Lecture Notes]

=== Week 6: Thursday May 9: Categorical data ===

'''Required Readings:'''
* Diez, Barr, and Çetinkaya-Rundel: §6 (Inference for categorical data)
* Verzani: §3.4 (Bivariate categorical data); §10.1-10.2 (Goodness of fit)
* Gelman, Andrew and Eric Loken. 2014. “The Statistical Crisis in Science: Data-Dependent Analysis—a ‘Garden of Forking Paths’—Explains Why Many Statistically Significant Comparisons Don’t Hold Up.” ''American Scientist'' 102(6):460. [[https://www.americanscientist.org/issues/pub/2014/6/the-statistical-crisis-in-science/1 Available through UW Libraries]] (This is a reworked version of [http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf this unpublished manuscript], which provides more detailed examples.)
* Buechley, Leah and Benjamin Mako Hill. 2010. “LilyPad in the Wild: How Hardware’s Long Tail Is Supporting New Engineering and Design Communities.” Pp. 199–207 in ''Proceedings of the 8th ACM Conference on Designing Interactive Systems.'' Aarhus, Denmark: ACM. [[https://mako.cc/academic/buechley_hill_DIS_10.pdf PDF available on my personal website]]

'''Assignment (Complete Before Class):'''
* [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 6]]

'''Lectures:'''
* [[Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 6]]
* [https://communitydata.cc/~mako/2017-COM521/com521-week_06-tables_chisq_debugging.ogv Week 6 R lecture screencast: Tables, <math>\chi^2</math>-tests, and debugging.] (~40 minutes)

'''Resources:'''
* [https://www.openintro.org/download.php?file=os3_slides_06&referrer=/stat/slides/slides_0x.php Mine Çetinkaya-Rundel's OpenIntro §6 Lecture Notes]
* [https://www.openintro.org/stat/videos.php OpenIntro Video Lectures] including 4 videos for §7

=== Week 7: Thursday May 16: Linear Regression ===

'''Required Readings:'''
* Diez, Barr, and Çetinkaya-Rundel: §7 (Introduction to linear regression); §8.1-8.3 (Multiple regression)
* OpenIntro eschews a mathematical introduction to correlation. Please look over [https://en.wikipedia.org/wiki/Correlation_and_dependence the Wikipedia article on correlation and dependence] and pay attention to the formulas. Correlation is tedious to compute by hand, but I'd like you to at least see what goes into it.
* Verzani: §11.1-2 (Linear regression)
* Lampe, Cliff, and Paul Resnick. 2004. “Slash(Dot) and Burn: Distributed Moderation in a Large Online Conversation Space.” In ''Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '04)'', 543–550. New York, NY, USA: ACM. doi:10.1145/985692.985761. [[http://dx.doi.org/10.1145/985692.985761 Available in UW libraries]]

'''Assignment (Complete Before Class):'''
* [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 7]]

'''Lectures:'''
* [[Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 7]]
* [https://communitydata.cc/~mako/2017-COM521/com521-week_07-linear_regression.ogv Week 7 R lecture screencast: linear regression] (~42 minutes)

'''Resources:'''
* [https://www.openintro.org/download.php?file=os3_slides_07&referrer=/stat/slides/slides_0x.php Mine Çetinkaya-Rundel's OpenIntro §7 Lecture Notes]
* [https://www.openintro.org/download.php?file=os3_slides_08&referrer=/stat/slides/slides_0x.php Mine Çetinkaya-Rundel's OpenIntro §8 Lecture Notes]
* [https://www.openintro.org/stat/videos.php OpenIntro Video Lectures] including 4 videos for §7 and 3 videos on the sections §8.1-8.3

=== Week 8: Thursday May 23: Polynomial Terms, Interactions, and Logistic Regression ===

'''Required Readings:'''
* [https://onlinecourses.science.psu.edu/stat501/node/301 Lesson 8: Categorical Predictors] and [https://onlinecourses.science.psu.edu/stat501/node/318 Lesson 9: Data Transformations] from the Penn State Eberly College of Science STAT 501 Regression Methods Course. There are several subparts (many quite short); please read them all carefully.
* Diez, Barr, and Çetinkaya-Rundel: §8.4 (Multiple and logistic regression)
* Verzani: §11.3 (Linear regression), §13.1 (Logistic regression)
* Ioannidis, John P. A. 2005. “Why Most Published Research Findings Are False.” ''PLoS Medicine'' 2(8):e124. [[http://dx.doi.org/10.1371%2Fjournal.pmed.0020124 Open Access]]
* Lampe, Cliff, and Paul Resnick. 2004. “Slash(Dot) and Burn: Distributed Moderation in a Large Online Conversation Space.” In ''Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '04)'', 543–550. New York, NY, USA: ACM. doi:10.1145/985692.985761. [[http://dx.doi.org/10.1145/985692.985761 Available in UW libraries]]

'''Optional Readings:'''
* Head, Megan L., Luke Holman, Rob Lanfear, Andrew T. Kahn, and Michael D. Jennions. 2015. “The Extent and Consequences of P-Hacking in Science.” ''PLOS Biology'' 13(3):e1002106. [[http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002106 Open Access]]

'''Assignment (Complete Before Class):'''
* [[Statistics and Statistical Programming (Winter 2017)/Problem Set: Week 8]]

'''Lectures:'''
* [[Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 8]]
* [https://communitydata.cc/~mako/2017-COM521/com521-week_08-more_regression_anova_redux.ogv Week 8 R lecture screencast: more on linear regression, including interactions, polynomials, log transformations; anova] (~28 minutes)

'''Resources:'''
* [https://www.openintro.org/download.php?file=os3_slides_08&referrer=/stat/slides/slides_0x.php Mine Çetinkaya-Rundel's OpenIntro §8 Lecture Notes]
* [https://www.openintro.org/stat/videos.php OpenIntro Video Lectures] including a video on §8.4
* I've written this document, which will likely be useful for many of you: [https://communitydata.cc/~mako/2017-COM521/logistic_regression_interpretation.html Interpreting Logistic Regression Coefficients with Examples in R]

=== Week 9: Thursday May 30: TBA ===

Reserved for catch-up, supplementary topics, and maybe some final presentations.

=== Week 10: Thursday June 6: Final Presentations ===

Followed by much rejoicing!