Editing UW Statistics Courses

From CommunityData

== First Year Introduction Sequences ==
A sequence is typically a 2-3 quarter group of classes that will give you a solid basis in statistics. These will all cover probability, introductory statistics, and statistical programming in R, Stata, SPSS, SAS, etc. They should each cover hypothesis testing and statistical inference, descriptive statistics, some visualization, and linear regression. They might go further or touch on other things as well. Many of these sequences will also cover more basic features of quantitative social scientific research like operationalization and measure construction, experiment design, etc. A sequence like this is the foundation for quantitative analysis and statistics but it is ''not'' a complete training. In almost all cases, it will need to be supplemented with additional classes. All CDSC students should take a sequence during their first year.


These are listed more or less in the order of recommendation although different courses will make sense for different students:


'''Biostatistics (BIOST):''' Biostats. Applied Biostats II (518) is an excellent course, well-taught, with a tight relationship between theory, method, and application. As of Summer '18, no one in the [[CDSC]] has taken Applied Biostats I (517), but it's this year's recommendation for a first class and we'll update this page when possible.
'''BIOST:''' Biostats. Applied Biostats II is an excellent course, well-taught, with a tight relationship between theory, method, and application. As of Summer '18, no one in the [[CDSC]] has taken Applied Biostats I, but it's this year's recommendation for a first class and we'll update this page when possible.


'''Sociology (SOC):''' SOC504, SOC505, and SOC506. These are good courses but are quite applied. For [[CDSC]] members, this would be the easiest minimum option and would be slightly discouraged.
'''Sociology:''' SOC504, SOC505, and SOC506. These are good courses but are quite applied. For [[CDSC]] members, this would be the easiest minimum option and would be slightly discouraged.


'''Communication (COM):''' COM520 + COM521. The sequence is some combination of quantitative research design and basic social scientific epistemology. COM520 or COM521 is taught by Mako but it is a truly introductory stats class with a strong emphasis on application in GNU R. Like the SOC sequence, this would be discouraged for CDSC folks, who would be encouraged to take a more technical course.
'''Communication:''' COM520 + COM521 offered in 2016-2017 and then, if all goes to plan, every other year after. Due to Mako's trip to CASBS in '18-19, the sequence will likely be taught again in 2019-2020. COM520 is more about quantitative research design and basic social scientific epistemology and design. COM521 is taught by Mako but it is a truly introductory stats class with a strong emphasis on application in GNU R. Like the SOC sequence, this would be discouraged for CDSC folks who would be encouraged to take a more technical course.


'''Political Science (POLS):''' This sequence begins with POLS500 in the autumn which is similar to COM520/COM521 and is an introduction to quantitative research in the social sciences.  
'''CSSS503, Advanced Quantitative Political Methodology''' might be a good choice for a 2nd or 3rd quarter in statistics. It is a slightly mathematical applied statistics class which introduces regression and multi-variable techniques for developing causal arguments using statistics -- the course stuck fairly closely to the two textbooks Real Stats and Mastering Metrics in Spr 2018 and the course sites from the last two years are published on GitHub [LINK?], so take a look there if you want a preview of what will be covered. The class is sponsored by Political Science, so some of the content is influenced by their disciplinary norms.


POLS501/CS&SS501 focuses on "testing theories with empirical evidence. Examines current topics in research methods and statistical analysis in political science. Content varies according to recent developments in the field and with interests of instructor."


POLS503/CS&SS503 is Advanced Quantitative Political Methodology and might be a good choice for a 2nd or 3rd quarter in statistics. It is a slightly mathematical applied statistics class which introduces regression and multi-variable techniques for developing causal arguments using statistics. In Spr 2018 the course stuck fairly closely to two textbooks, Real Stats and Mastering Metrics (an undergrad textbook). The course sites from the last two years [https://uw-pols503.github.io/2017/] [https://uw-pols503.github.io/2018/] are published on GitHub, with instructor notes at [https://jrnold.github.io/intro-methods-notes/], so take a look there if you want a preview of what will be covered. The class is sponsored by Political Science, so some of the content is influenced by their disciplinary norms.
'''Education Psychology:''' EDPSY490, EDPSY491 have a strong focus on psychometric techniques drawn from psychology. Should be relatively easier and very applied but will not provide a good training for research using non-experimental settings. Due to the rigor and the focus on experiments, ANOVA, and SPSS, this will be discouraged for CDSC members.


'''Economics/Statistics (ECON/STAT):''' If you already have good linear algebra and multivariate calculus, taking ECON 580 and 581 is a good short-cut to getting a lot of methods covered in other classes. You could take SOC 504, 505, 506, CS&SS 503, 504, 560, and 564 or you could take ECON 580, 581 and read a few books. This is the ideal class for any CDSC folks although it will likely be a poor choice until you have a relatively strong mathematics background. Details on these classes are provided below.
If you already have good linear algebra and multivariate calculus, taking ECON 580 and 581 is a good short-cut to getting a lot of methods covered in other classes. You could take SOC 504, 505, 506, CS&SS 503, 504, 560, and 564 or you could take ECON 580, 581 and read a few books :)


'''Education Psychology (EDPSY):''' EDPSY490, EDPSY491 have a strong focus on psychometric techniques drawn from psychology. Should be relatively easier and very applied but will not provide a good training for research using non-experimental settings. Due to the rigor and the focus on experiments, ANOVA, and SPSS, this is discouraged for CDSC members, who will typically be dealing with observational data or should, at the very least, build the skills necessary to do so.
== Other Courses ==


== Other First-Year Courses ==
'''CS&SS505, Review Of Mathematics For Social Scientists''' is an option for those who need to brush up on their mathematical foundations.


Although they will not be taught in the sequences above, CDSC members should be comfortable with the material taught in these very short courses and camps by the end of their first year. Taking these courses is a good way to make sure that happens!
== Advanced Statistics Courses ==
 
'''Math Camp:''' Math Camp is an intensive one-week introductory course offered during the summer. 
'''Review of Mathematics for Social Scientists (CS&SS 505):''' A 1-credit course that covers the same material as Math Camp but at a slower pace.
 
Math Camp/CSSS 505 are recommended for incoming students and students who are entering their 2nd year and plan to take an advanced statistics course. It will assume basic math skills through high school algebra but nothing else. Topics reviewed are algebra, functions and limits, differentiation, maximization of functions, integration, matrix algebra, linear equations and least squares, and probability. Typically offered during winter and spring quarters.
 
'''Introduction to R (CS&SS 508):''' Another 1-credit class that will familiarize students with the R environment for statistical computing.


== Advanced Statistics Courses ==  
=== Center for Statistics in the Social Sciences (CSSS)===
There are many useful courses in the CSSS. Most CSSS classes are applied and will give you a chance to apply the methods that you learn to your own projects. Try to take advantage of these opportunities to make progress on your research. CSSS 509 is an exception, and will be discussed below under econometrics.


There are many useful courses offered by the Center for Statistics in the Social Sciences (CSSS). Most CSSS classes are applied and will give you a chance to apply the methods that you learn to your own projects. Try to take advantage of these opportunities to make progress on your research. CSSS 509 is an exception, and will be discussed below under econometrics.
'''CSSS566, Causal Inference''' CS&SS 566 is good. It covers experimental, instrumental variable, and quasi-experimental designs, structural equation modeling, and DAGs. It takes a relatively philosophical and theoretical approach to causality and shows that common assumptions about causal identification can be wrong (e.g., adding a variable to your model ''can'' introduce bias, in theory).


'''CS&SS504, Applied Regression''' is an applied, but still technical course on regression. It may vary based on who is teaching the class. This will be the default option for [[CDSC]] students.
'''CS&SS 560, hierarchical modeling''' is important. Hierarchical models are the bread and butter for working with datasets that have community level variables and individual level variables, or that have longitudinal data.


'''CSSS564, Bayesian Statistics for the Social Sciences''' CS&SS 564 is very good. This may vary by the instructor/text, but in 2023 it was taught using R/Jags/Stan with a project and no tests; the content is a nice blend of mathematical and applied perspectives. There are a lot of online resources that accompany the text so you can learn/re-learn the material a few different ways. It's a fair amount of work because you are building familiarity with doing a lot of simulation and digging your hands into how models are working, but the pre-requisites are low; it's not brain-breaking, just some solid grinding and that takes time. Probably easier than 560 because you will review basics of probability, binomial model, etc. from the intro-sequence but in a Bayesian way. That said, the R is a bit more intense in 564 than it is in 560. Taught using mostly base R -- not tidyverse!
=== Mathematical Statistics and Econometrics ===
 
'''CSSS566, Causal Inference''' CS&SS 566 is good. It covers experimental, instrumental variable, and quasi-experimental designs, structural equation modeling, and DAGs. It takes a relatively philosophical and theoretical approach to causality and shows that common assumptions about causal identification can be wrong (e.g., adding a variable to your model ''can'' introduce bias, in theory).
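
To make the point about added variables concrete, here is a minimal simulation sketch of one way a control variable can introduce bias. It is not taken from the course materials. It assumes a made-up setup in which <code>x</code> and <code>y</code> are independent and a third variable <code>c</code> is caused by both (a "collider"); all names, the sample size, and the seed are illustrative assumptions. It is written in Python with only the standard library.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)

def slope_simple(y, x):
    """OLS slope from regressing y on x (with an intercept)."""
    return cov(x, y) / cov(x, x)

def slope_adjusted(y, x, c):
    """OLS coefficient on x from regressing y on x and c (with an intercept)."""
    sxx, scc, sxc = cov(x, x), cov(c, c), cov(x, c)
    sxy, scy = cov(x, y), cov(c, y)
    return (sxy * scc - scy * sxc) / (sxx * scc - sxc ** 2)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

# Hypothetical data: x has NO effect on y; c is caused by both x and y.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]
c = [xi + yi + rng.gauss(0, 1) for xi, yi in zip(x, y)]

b_simple = slope_simple(y, x)         # close to the true effect, 0
b_adjusted = slope_adjusted(y, x, c)  # pulled away from 0 by controlling for c

print(f"y ~ x      : slope on x = {b_simple:.3f}")
print(f"y ~ x + c  : slope on x = {b_adjusted:.3f}")
```

Under these assumptions the population coefficient on <code>x</code> after adjusting for <code>c</code> is -0.5, even though <code>x</code> has no effect on <code>y</code>, so the "controlled" estimate is the biased one.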
 
== Mathematical Statistics and Econometrics ==


Do you want to get serious about Statistics? Do you want to learn ''why'' statistical methods depend on assumptions and not just ''how'' to apply them? Do you already have a strong math or statistics background? Then these classes are for you.
'''ECON 580/CS&SS 509, Mathematical Statistics'''  This class is great. It is a big class with quantitative methods folks from all over the social sciences. It is  a "meta-methods class."  The goal of the course is for you to understand in mathematical terms and notation how to derive statistical methods from probability theory. You work through proofs of various statistical methods. It covers probability theory, statistical tests, OLS, MLE, and Bayesian inference. There is some R programming where you use simulations to demonstrate theorems or analytical results.  This class is very much not applied. There are pretty hard tests and homework assignments where you prove theorems and derive corollaries. To enjoy this class you should have at least 2 quarters of college calculus and an introductory stats sequence under your belt, or a strong math background (e.g. you were a math or physics major).  


You can brush up on your calculus and stats to prepare for this class. Kaylea recommends Khan Academy for calculus and [https://onlinecourses.science.psu.edu/stat414 PSU (414 and 415)] for statistics.
 


'''ECON 581, Econometrics''' The first few weeks of ECON581 generalize 580 into the multivariate case. The second part provides regression methods (instrumental variables, two stage least squares, GMM) for when OLS assumptions are violated.  
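
As a rough illustration of why methods like instrumental variables and two stage least squares matter, the sketch below simulates a case where an OLS assumption fails because an unobserved confounder drives both the regressor and the outcome. It is not drawn from the ECON 581 syllabus. The data-generating process, variable names (<code>z</code>, <code>u</code>, <code>x</code>, <code>y</code>), true effect of 1.0, and seed are all assumptions chosen for illustration. It is written in Python with only the standard library.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)

def ols(y, x):
    """Return (intercept, slope) from regressing y on x."""
    slope = cov(x, y) / cov(x, x)
    intercept = sum(y) / len(y) - slope * sum(x) / len(x)
    return intercept, slope

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20000
true_effect = 1.0

# Hypothetical data: u is an unobserved confounder, z is an instrument
# that moves x but affects y only through x.
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_effect * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

# Naive OLS of y on x is biased because x is correlated with u.
_, b_ols = ols(y, x)

# Stage 1: predict x from the instrument z.
a1, b1 = ols(x, z)
x_hat = [a1 + b1 * zi for zi in z]

# Stage 2: regress y on the predicted values of x.
_, b_2sls = ols(y, x_hat)

print(f"true effect : {true_effect:.3f}")
print(f"OLS slope   : {b_ols:.3f}")
print(f"2SLS slope  : {b_2sls:.3f}")
```

Under these assumptions the OLS slope converges to about 1.33 rather than 1.0, while the two stage estimate recovers the true effect. Note that the standard errors from a hand-run second stage are not valid; dedicated IV routines correct for that.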


There are also 400 level introduction to machine learning classes in CSE and STAT, but STAT 588 looks better than either of these.
== More Courses ==
IMT 573, Data Science I, is focused on the theoretical foundations of data science and provides a nontechnical overview of the key concepts and skills required for data science. It introduces common data science pipelines, data collection and storage, basic analytics, machine learning, and data visualization with industry-standard statistical packages.
IMT 574, Data Science II, is the second course in the sequence and offers a theoretical and practical introduction to techniques for the analysis of large-scale data. The course does have prerequisites but depending on where you are in the program it can be a good choice.
Data 512 is Human-Centered Data Science. It introduces the fundamental principles of data science and its human implications, including data ethics, privacy, algorithmic bias, legal frameworks, intellectual property, and more.
CSSS 594 is a 1-credit special topics course. Have a peek to see if whatever is being offered in the current quarter is something you're interested in.
CSE 160 is a 3-credit introduction to data manipulation in Python. It is an undergraduate course, but if you're coming in unfamiliar with how to manipulate your dataset this course can be helpful. ''It is intended for students without prior programming experience.''