UW Statistics Courses

First Year Introduction Sequences
These are listed more or less in the order of recommendation although different courses will make sense for different students:

CS&SS505, Review Of Mathematics For Social Scientists is an option for those who need to brush up on their mathematical foundations.

BIOST: Biostats. Applied Biostats II is an excellent course, well-taught, with a tight relationship between theory, method, and application. No one in our group has taken Applied Biostats I, but it's this year's recommendation for a first class and we'll update this page when possible.

Sociology: SOC504, SOC505, and SOC506. These are great courses and quite applied. For CDSC members, this would be the easiest minimum option and would be slightly discouraged.

Communication: COM520 + COM521, offered in 2016-2017 and then, if all goes to plan, every other year after (e.g., 2018-2019, 2020-2021, etc.). COM520 is more about quantitative research design and basic social scientific epistemology. COM521 is taught by Mako, but it is a truly introductory stats class with a strong emphasis on application in GNU R. Like the SOC sequence, this would be discouraged for CDSC folks, who would be encouraged to take a more technical course.

CSSS503, Advanced Quantitative Political Methodology might be a good choice for a 2nd or 3rd quarter in statistics. It is a slightly mathematical applied statistics class that introduces regression and multi-variable techniques for developing causal arguments using statistics. In Spring 2018 the course stuck fairly closely to the two textbooks Real Stats and Mastering Metrics, and the course sites from the last two years are published on GitHub [LINK?], so take a look there if you want a preview of what will be covered. The class is sponsored by Political Science, so some of the content is influenced by their disciplinary norms.

Education Psychology: EDPSY490, EDPSY491. These have a strong focus on psychometric techniques drawn from psychology. They should be relatively easier and very applied but will not provide good training for research in non-experimental settings. Due to the lower rigor and the focus on experiments, ANOVA, and SPSS, this will be discouraged for CDSC members.

If you already have good linear algebra and multivariate calculus, taking ECON 580 and 581 is a good shortcut to a lot of the methods covered in other classes. You could take SOC 504, 505, 506 and CS&SS 503, 504, 560, and 564, or you could take ECON 580 and 581 and read a few books :)

Center for Statistics in the Social Sciences (CSSS)
There are many useful courses in the CSSS. Most CSSS classes are applied and will give you a chance to apply the methods that you learn to your own projects. Try to take advantage of these opportunities to make progress on your research. CSSS 509 is an exception and will be discussed below under econometrics.

CSSS566, Causal Inference is good. It covers experimental, instrumental variable, and quasi-experimental designs, structural equation modeling, and DAGs. It takes a relatively philosophical and theoretical approach to causality and shows that common assumptions about causal identification can be wrong (e.g., adding a variable to your model can, in theory, introduce bias).
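One way to see how adding a variable can introduce bias is a small simulation of a "collider," a variable caused by both the predictor and the outcome. The sketch below is only an illustration, not course material: the data-generating process, the variable names, and the helper functions are all made up for this example, and it uses plain Python rather than R. Here x has no effect on y, yet controlling for the collider c produces a clearly negative coefficient on x.

```python
import random

def slope(x, y):
    """OLS slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals of y after regressing it on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 20_000

# Hypothetical data: x and y are independent, so the true effect of x on y is 0.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]
# c is a collider: it is caused by both x and y.
c = [xi + yi + rng.gauss(0, 1) for xi, yi in zip(x, y)]

# Model 1: y ~ x. The slope should be close to the true value of 0.
naive = slope(x, y)

# Model 2: y ~ x + c. By the Frisch-Waugh-Lovell theorem, the coefficient on x
# equals the slope of (y residualized on c) on (x residualized on c).
adjusted = slope(residuals(c, x), residuals(c, y))

print(f"coefficient on x without c: {naive:.3f}")
print(f"coefficient on x with c:    {adjusted:.3f}")
```

Under these assumed variances, the population coefficient on x in the second model works out to -0.5, even though x and y are unrelated by construction.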

CS&SS504, Applied Regression is an applied, but still technical course on regression. It may vary based on who is teaching the class. This will be the default option for CDSC students.

CS&SS 560, Hierarchical Modeling is important. Hierarchical models are the bread and butter for working with datasets that have both community-level and individual-level variables, or that have longitudinal data.

Mathematical Statistics and Econometrics
Do you want to get serious about Statistics? Do you want to learn *why* statistical methods depend on assumptions and not just *how* to apply them? Do you already have a strong math or statistics background? Then these classes are for you.

ECON 580/CS&SS 509, Mathematical Statistics. This class is great. It is a big class with quantitative methods folks from all over the social sciences. It is a "meta-methods class." The goal of the course is for you to understand in mathematical terms and notation how to derive statistical methods from probability theory. You work through proofs of various statistical methods. It covers probability theory, statistical tests, OLS, MLE, and Bayesian inference. There is some R programming where you use simulations to demonstrate theorems or analytical results. This class is very much not applied. There are pretty hard tests and homework assignments where you prove theorems and derive corollaries. To enjoy this class you should have at least 2 quarters of college calculus and an introductory stats sequence under your belt, or a strong math background (e.g., you were a math or physics major).
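To give a flavor of what "using simulations to demonstrate theorems" can look like, here is a minimal sketch that checks the central limit theorem numerically. It is not taken from the course: the class uses R, while this example is written in Python, and the choice of distribution, sample size, and number of replications are assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is reproducible
n = 50         # observations in each simulated sample
reps = 5_000   # number of simulated samples

# Each draw is Exponential(rate=1): mean 1, variance 1, and strongly skewed.
means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

sim_mean = statistics.fmean(means)   # theory: 1
sim_sd = statistics.stdev(means)     # theory: 1 / sqrt(n)
theory_sd = 1 / n ** 0.5

# If the sample means are roughly normal, about 95% of them should fall
# within 1.96 theoretical standard errors of the true mean.
coverage = sum(abs(m - 1.0) <= 1.96 * theory_sd for m in means) / reps

print(f"mean of sample means: {sim_mean:.3f} (theory 1.000)")
print(f"sd of sample means:   {sim_sd:.3f} (theory {theory_sd:.3f})")
print(f"share within 1.96 se: {coverage:.3f} (normal approximation 0.95)")
```

The same pattern (simulate many samples, compute a statistic on each, compare its distribution to the analytical result) carries over to other theorems and estimators.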

You can brush up on your calculus and stats to prepare for this class. Kaylea recommends Khan Academy for calculus and [https://onlinecourses.science.psu.edu/stat414 PSU (414 and 415)] for statistics.

ECON 581, Econometrics. The first few weeks of ECON 581 generalize 580 to the multivariate case. The second part provides regression methods (instrumental variables, two-stage least squares, GMM) for when OLS assumptions are violated. In addition to the calculus you used in CS&SS 509, ECON 581 uses multivariate calculus (partial derivatives, gradients) and linear algebra. Most students in 581 are PhD students in Stats, ECON, or finance.
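As a rough illustration of why these methods matter, the sketch below simulates a case where an unobserved confounder violates the OLS exogeneity assumption and then applies two-stage least squares with a single instrument. Everything here is hypothetical (the data-generating process, the variable names, the `ols` helper) and it is written in plain Python rather than anything used in the course.

```python
import random

def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    return my - b1 * mx, b1

rng = random.Random(1)  # fixed seed so the run is reproducible
n = 20_000
beta = 2.0  # assumed true causal effect of x on y

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument: affects y only through x
u = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [beta * xi + 2.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

# Plain OLS of y on x is biased because x is correlated with the omitted u.
_, ols_slope = ols(x, y)

# Stage 1: regress x on the instrument z and keep the fitted values.
a0, a1 = ols(z, x)
x_hat = [a0 + a1 * zi for zi in z]

# Stage 2: regress y on the fitted values from stage 1.
_, tsls_slope = ols(x_hat, y)

print(f"true effect: {beta:.2f}")
print(f"OLS slope:   {ols_slope:.3f}")
print(f"2SLS slope:  {tsls_slope:.3f}")
```

With one instrument and one regressor, the second-stage slope is algebraically the same as the ratio cov(z, y) / cov(z, x). Note that the naive second-stage standard errors are not valid, which is one reason to use a proper econometrics package in real work.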

CS&SS and 581 cover pretty much all the econometrics useful for applied empirical research. Applied courses in CSSS will be more useful for learning about time series, longitudinal, count data and so on. ECON582 is on nonparametric models and ECON583 and ECON 584 are "Econometric Theory I and II" and will be excellent but only really for folks building new econometric theories. Most students taking 583 and 584 will be PhD students in the ECON department specializing in methods. 580 is great. 581 is good, but CS&SS 503 and 510 should cover the most useful stuff in 581 except you won't do the proofs yourself. If you are seriously considering taking 583 or 584 you might also consider switching to a PhD in economics. :)

Machine Learning
Sometimes statistical inference is very hard. Prediction is often easier, and sometimes predicting an outcome can be a useful contribution. Prediction is also useful for constructing variables (e.g., content analysis). Supervised machine learning is essentially giving up on inference and focusing on prediction. "Unsupervised machine learning" (i.e., clustering) can be very useful for operationalization.

If you do not have a computer science background, STAT 588 looks like a good place to get some quick and dirty machine learning. Fitting machine learning models can be difficult when your data is very big (as ours often is). STAT 548 is a good class to learn how to solve these problems. It mainly focuses on stochastic optimization. It isn't very difficult, but you will get more out of it if you are good at linear algebra and multivariate calculus.
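For readers who have not seen stochastic optimization before, here is a minimal sketch of the basic idea: stochastic gradient descent (SGD) fits a model by updating the parameters from one observation at a time, so the full dataset never has to be processed in a single step. This is not drawn from STAT 548. The toy data, learning rate, and number of passes are assumptions chosen only to make the example run quickly.

```python
import random

rng = random.Random(7)  # fixed seed so the run is reproducible

# Hypothetical data from a known line: y = 3x - 1 plus a little noise.
true_w, true_b = 3.0, -1.0
data = []
for _ in range(5_000):
    xi = rng.uniform(-1, 1)
    data.append((xi, true_w * xi + true_b + rng.gauss(0, 0.1)))

w, b = 0.0, 0.0   # start from an uninformed guess
lr = 0.01         # learning rate (step size), an assumed value

for epoch in range(5):
    rng.shuffle(data)  # visit observations in a new random order each pass
    for xi, yi in data:
        err = (w * xi + b) - yi  # prediction error on this one observation
        # Gradient of 0.5 * err**2 with respect to w is err * xi, and with
        # respect to b is err; step a small distance against the gradient.
        w -= lr * err * xi
        b -= lr * err

print(f"estimated slope:     {w:.3f} (true {true_w})")
print(f"estimated intercept: {b:.3f} (true {true_b})")
```

Because each update touches a single observation, the same loop works when the data is streamed from disk or is too large to fit in memory.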

There are also 400 level introduction to machine learning classes in CSE and STAT, but STAT 588 looks better than either of these.