# Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 7

• correlations
• cor(): works with two variables, or with more!
• cor(method="spearman") is useful if you have non-normally distributed data, because it is simply a correlation of the ranks
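A minimal sketch of the correlation bullets above, assuming the built-in mtcars data (which the later bullets use) and the columns mpg, hp, and wt as illustrative choices:

```r
# mtcars ships with base R, so no packages are needed here
data(mtcars)

# Pearson correlation between two variables (the default method)
r_pearson <- cor(mtcars$mpg, mtcars$hp)

# Spearman is rank-based: it matches Pearson computed on the ranks
r_spearman <- cor(mtcars$mpg, mtcars$hp, method = "spearman")
r_ranks <- cor(rank(mtcars$mpg), rank(mtcars$hp))

# given several columns, cor() returns a full correlation matrix
cor_matrix <- cor(mtcars[, c("mpg", "hp", "wt")])

print(r_pearson)
print(r_spearman)
print(cor_matrix)
```

Comparing r_spearman with r_ranks is a quick way to convince yourself that Spearman really is just a correlation of ranks.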
• fitting a linear model with one variable: lm()
• model formulae, which we've already seen!
• looking at model objects: summary(); m$<tab> or names(m)
• m$fitted.values; m$residuals
• also functions: coefficients(m) (or coef), predict(m), residuals(m) (or resid); confint(m)
• we can also do these by hand:
• residuals: mtcars$mpg - m$fitted.values
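• confint: est + 1.96 * c(-1, 1) * se (an approximation: confint() uses the t distribution, so use qt(0.975, df.residual(m)) in place of 1.96 to match it exactly)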
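Here is a rough sketch of fitting a model and redoing the residuals and confidence interval by hand. The outline never shows the call that creates m, so lm(mpg ~ hp, data = mtcars) is an assumption based on the variables used in the bullets:

```r
data(mtcars)

# assumed model: the outline does not show how m was created
m <- lm(mpg ~ hp, data = mtcars)

print(summary(m))
print(names(m))

# residuals by hand: observed outcome minus fitted values
resid_by_hand <- mtcars$mpg - m$fitted.values

# pull the estimate and standard error for hp out of the summary table
coefs <- summary(m)$coefficients
est <- coefs["hp", "Estimate"]
se <- coefs["hp", "Std. Error"]

# normal approximation to the 95% interval
ci_normal <- est + 1.96 * c(-1, 1) * se

# confint() uses the t distribution with the residual degrees of freedom,
# so swapping 1.96 for the t quantile reproduces its answer
t_crit <- qt(0.975, df = df.residual(m))
ci_t <- est + t_crit * c(-1, 1) * se

print(ci_normal)
print(ci_t)
print(confint(m))
```

With only 32 cars the t quantile is a bit larger than 1.96, so the interval from confint() is slightly wider than the normal approximation.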
• plotting residuals:
• hist(residuals(m))
• plot against our x: plot(mtcars$hp, residuals(m))
• QQ-plots with qqnorm(residuals(m))
• doing a plot with ggplot just involves making a dataset: d.fig <- data.frame(hp=mtcars$hp, resids=residuals(m))
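The residual plots above might look something like this. The model is again assumed to be mpg ~ hp on mtcars, and the ggplot2 part only runs if that package happens to be installed:

```r
data(mtcars)

# assumed model: the outline does not show how m was created
m <- lm(mpg ~ hp, data = mtcars)

# histogram of the residuals
hist(residuals(m))

# residuals against our x, with a dashed reference line at zero
plot(mtcars$hp, residuals(m))
abline(h = 0, lty = 2)

# QQ-plot; qqline() adds a reference line for comparison
qqnorm(residuals(m))
qqline(residuals(m))

# for ggplot, put everything we want to plot into one data frame
d.fig <- data.frame(hp = mtcars$hp, resids = residuals(m))

# only attempt the ggplot version when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(d.fig, aes(x = hp, y = resids)) + geom_point()
  print(p)
}
```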
• adding controls: just make our formula more complex
• update.formula()
• or just write a new one
• adding logical variables: no problem!
• adding categorical variables: no problem! (I'll explain interpretation later, but I want you to see that this works!)
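A sketch of adding controls, under some assumptions: the base model is mpg ~ hp, wt is an arbitrary choice of control, and the manual and cyl.f columns are hypothetical names created here just for illustration:

```r
# work on a copy so the built-in dataset is left alone
d <- mtcars

m1 <- lm(mpg ~ hp, data = d)

# update.formula(): "." means "whatever was already on this side"
f2 <- update.formula(formula(m1), . ~ . + wt)
m2 <- lm(f2, data = d)

# or just write the new formula out in full
m2b <- lm(mpg ~ hp + wt, data = d)

# a logical variable (hypothetical name): TRUE for manual transmission
d$manual <- d$am == 1
m3 <- lm(mpg ~ hp + wt + manual, data = d)

# a categorical variable (hypothetical name): cylinders as a factor
d$cyl.f <- factor(d$cyl)
m4 <- lm(mpg ~ hp + wt + cyl.f, data = d)

print(coef(m3))
print(coef(m4))
```

In the output, the logical shows up as a single manualTRUE coefficient, and the factor shows up as one coefficient per level other than the first.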
• generating nice regression tables:
• one of many options: stargazer(m1, m2, type="text") or type="html"
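As a small sketch, here is the stargazer call with two assumed models (mpg ~ hp, then with wt added). stargazer is a separate package, so this example checks that it is installed before using it:

```r
data(mtcars)

# two assumed models to put side by side in one table
m1 <- lm(mpg ~ hp, data = mtcars)
m2 <- lm(mpg ~ hp + wt, data = mtcars)

if (requireNamespace("stargazer", quietly = TRUE)) {
  library(stargazer)
  # type = "text" prints to the console; type = "html" emits HTML instead
  stargazer(m1, m2, type = "text")
} else {
  message("stargazer is not installed; install.packages(\"stargazer\") would add it")
}
```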
• interpreting linear models with anova(): I'm not going to walk through the details, but the important thing to keep in mind is that although the statistics are different (F from anova(), t from summary()), the p-values are identical, at least for a model with a single predictor!
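To see the point about anova() for yourself, here is a quick check on an assumed one-predictor model (mpg ~ hp). With a single predictor, the F statistic is the square of the t statistic, which is why the p-values agree:

```r
data(mtcars)

# assumed model: one outcome, one predictor
m <- lm(mpg ~ hp, data = mtcars)

# t statistic and p-value for hp from the coefficient table
coefs <- summary(m)$coefficients
t_stat <- coefs["hp", "t value"]
p_t <- coefs["hp", "Pr(>|t|)"]

# F statistic and p-value for hp from the ANOVA table
a <- anova(m)
f_stat <- a["hp", "F value"]
p_f <- a["hp", "Pr(>F)"]

print(c(t = t_stat, t_squared = t_stat^2, F = f_stat))
print(c(p_from_t = p_t, p_from_F = p_f))
```

Note that anova() on a single model tests terms sequentially, so with several predictors only the last term is guaranteed to match its t-test in summary().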