UW Statistics Courses

From CommunityData
Revision as of 20:43, 10 June 2018 by Kaylea (some updates from Kaylea who forgot to log in first)

First Year Introduction Sequences

These are listed roughly in order of recommendation, although different courses will make sense for different students:

BIOST: Biostats. Applied Biostats II is an excellent course, well-taught, with a tight relationship between theory, method, and application.

CSSS: CS&SS 503 is a slightly mathematical applied statistics class which introduces regression; CS&SS 504 is more applied, but still technical; CS&SS 505 (Review Of Mathematics For Social Scientists) is important. These classes may vary based on who is teaching them. This will be the default option for CDSC students. CS&SS 566 is also good. It takes a more philosophical and theoretical approach to causality which corrects assumptions about causal identification that are commonly held by econometricians (e.g., adding a variable to your model can, in theory, introduce bias). You might also consider CS&SS 560 (hierarchical modeling), but you could also just read Andrew Gelman's book. You might also take CS&SS 564 (Bayesian Statistics), but if you take ECON 580 you could probably learn the material in this class on your own.

Economics: If you are doing this sequence, ECON 580/CS&SS 509 is essential. (Kaylea recommends: If you do not have both a second-year college calculus sequence and a 400-level college stats sequence under your belt, you will need them before you will be happy in this class. If you have a gap in your preparation or it's been a while since you took math, I recommend you work through all of the calculus in Khan Academy and both of the PSU stat classes listed here (414 and 415): https://onlinecourses.science.psu.edu/stat414 as preparation for this course.) It is a "meta-methods class" that is essentially a more rigorous (two-variable calculus, many proofs) introduction to statistics. "Meta-methods" in this case means that you work through proofs of various statistical methods and do some R programming that models the behaviors of ideal functions; this class is not applied. It covers MLE, Bayesian inference, and bivariate OLS. It is also a prereq to the most interesting and advanced CS&SS classes. The first few weeks of ECON 581 generalize 580 into the multivariate case. The second part provides regression methods for when OLS assumptions are violated. It is good to take if you (a) are good at linear algebra and multivariate calculus and (b) want to learn how to derive MLE and GLS, and prove consistency and asymptotics of MLE, IV, and GMM models. ECON 580 and 581 are probably all the econometrics useful for applied empirical research. ECON 582 covers nonparametric models, and ECON 583 and ECON 584 are "Econometric Theory I and II"; they will be excellent but only really for folks building new econometric theories. 580 is a big class with quantitative methods folks from all over the social sciences. Most students in 581 are PhD students in Stats, ECON, or finance. Most students taking 583 and 584 will be PhD students in the ECON department specializing in methods. 580 is great. 581 is good, but CS&SS 503 and 510 should cover the most useful stuff in 581 except you won't do the proofs yourself.
If you are seriously considering taking 583 or 584 you might also consider switching to a PhD in economics. :)

Sociology: SOC 504, SOC 505, and SOC 506. These are great courses and quite applied. For CDSC members, this would be the easiest minimum option and would be slightly discouraged.

Communication: COM 520 + COM 521 were offered in 2016-2017 and then, if all goes to plan, will be offered every other year after (e.g., 2018-2019, 2020-2021, etc.). COM 520 is more about quantitative research design and basic social scientific epistemology. COM 521 is taught by Mako, but it is a truly introductory stats class with a strong emphasis on application in GNU R. Like the SOC sequence, this would be discouraged for CDSC folks, who would be encouraged to take a more technical course.

Education Psychology: EDPSY 490 and EDPSY 491 have a strong focus on psychometric techniques drawn from psychology. They should be relatively easier and very applied but will not provide good training for research using non-experimental settings. Because of the level of rigor and the focus on experiments, ANOVA, and SPSS, this will be discouraged for CDSC members.

If you already have good linear algebra and multivariate calculus, taking ECON 580 and 581 is a good shortcut to getting a lot of the methods covered in other classes. You could take SOC 504, 505, and 506 and CS&SS 503, 504, 560, and 564, or you could take ECON 580 and 581 and read a few books :)
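As a concrete illustration of the point made under CS&SS 566 (adding a variable to your model can introduce bias), here is a minimal simulation sketch in Python. It is not taken from any of the courses above. The data-generating process, the variable names (`x`, `y`, `c`), and the helper functions are all assumptions made for the example: `x` and `y` are generated independently, and `c` is a "collider" caused by both.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20000

# Assumed data-generating process: x has NO effect on y,
# but both x and y cause c (a collider).
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]
c = [xi + yi + rng.gauss(0, 1) for xi, yi in zip(x, y)]


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def slope_simple(y, x):
    """OLS coefficient on x in the regression y ~ x."""
    return cov(x, y) / cov(x, x)


def slope_controlling(y, x, c):
    """OLS coefficient on x in y ~ x + c, via the 2x2 normal equations."""
    sxx, scc, sxc = cov(x, x), cov(c, c), cov(x, c)
    sxy, scy = cov(x, y), cov(c, y)
    return (sxy * scc - sxc * scy) / (sxx * scc - sxc ** 2)


naive = slope_simple(y, x)               # close to 0, the true effect
controlled = slope_controlling(y, x, c)  # close to -0.5, a spurious effect

print(f"y ~ x:     coefficient on x = {naive:.3f}")
print(f"y ~ x + c: coefficient on x = {controlled:.3f}")
```

Under these assumptions, the regression without `c` recovers the true (zero) effect, while "controlling for" `c` manufactures a negative association between `x` and `y`.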

Other Topics

Machine Learning

Sometimes statistical inference is very hard. Prediction is often easier, and sometimes predicting an outcome can be a useful contribution. Prediction is also useful for constructing variables (e.g., content analysis). Supervised machine learning is essentially giving up on inference and focusing on prediction. "Unsupervised machine learning" (i.e., clustering) can be very useful for operationalization.
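To make the idea of using prediction to construct variables concrete, here is a minimal sketch of supervised content analysis in Python: a tiny multinomial naive Bayes classifier trained on a few hand-coded texts and then used to code new ones. The example texts, the labels ("revert" and "content"), and the function names are all invented for illustration. A real project would use far more labeled data and probably an established library.

```python
import math
from collections import Counter, defaultdict


def train(labeled):
    """Fit word counts per label from a list of (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in labeled:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the most probable label for text under naive Bayes."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        n_words = sum(word_counts[label].values())
        # log prior for the label
        score = math.log(label_counts[label] / total_docs)
        for w in text.lower().split():
            if w in vocab:  # words never seen in training are skipped
                # add-one (Laplace) smoothed log likelihood
                score += math.log(
                    (word_counts[label][w] + 1) / (n_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical hand-coded training data (invented edit summaries).
hand_coded = [
    ("revert vandalism", "revert"),
    ("revert spam link", "revert"),
    ("undo vandalism by anonymous user", "revert"),
    ("fix typo", "content"),
    ("add citation", "content"),
    ("expand section with new source", "content"),
]

model = train(hand_coded)

# The predicted label becomes a constructed variable for uncoded texts.
uncoded = ["revert obvious vandalism", "add new section"]
constructed = [predict(model, text) for text in uncoded]
print(list(zip(uncoded, constructed)))
```

Note that nothing here estimates a causal or inferential quantity. The classifier is judged only on whether its predicted labels match what a human coder would assign.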

If you do not have a computer science background, STAT 588 looks like a good place to get some quick and dirty machine learning. Fitting machine learning models can be difficult when your data is very big (as ours often is). STAT 548 is a good class to learn how to solve these problems. It mainly focuses on stochastic optimization. It isn't very difficult, but you will get more out of it if you are good at linear algebra and multivariate calculus.

There are also 400-level introductory machine learning classes in CSE and STAT, but STAT 588 looks better than either of these.