Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 7
From CommunityData
Latest revision as of 06:40, 16 February 2017
- correlations
  - cor(): works with two variables, or with more!
  - cor(x, y, method="spearman") is useful if you have non-normally distributed data, because it is simply a correlation of the ranks
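
A minimal sketch of the correlation calls above. The choice of mtcars and of the mpg, hp, and wt columns is an assumption made for illustration; the outline only names mtcars later on.

```r
# Pearson correlation (the default) between two variables
r.pearson <- cor(mtcars$mpg, mtcars$hp)

# Spearman correlation: the same thing as Pearson computed on the ranks
r.spearman <- cor(mtcars$mpg, mtcars$hp, method="spearman")
r.ranks <- cor(rank(mtcars$mpg), rank(mtcars$hp))

# Passing more than two columns returns a correlation matrix
cor.matrix <- cor(mtcars[, c("mpg", "hp", "wt")])

print(r.pearson)
print(r.spearman)
print(cor.matrix)
```

Comparing r.spearman with r.ranks is a quick way to convince yourself that the Spearman method really is just a rank correlation.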
- fitting a linear model with one variable: lm()
  - model formulae, which we've already seen!
  - looking at model objects: summary(); m$<tab> or names(m)
    - m$fitted.values; m$residuals
    - also functions: coefficients(m) (or coef), predict(m), residuals(m) (or resid); confint(m)
    - we can also do these by hand:
      - residuals: mtcars$mpg - m$fitted.values
      - confint: est + 1.96 * c(-1, 1) * se (an approximation: confint() uses a t quantile in place of 1.96)
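
A sketch of the steps above, assuming the model is mpg regressed on hp (the outline uses mtcars$mpg and mtcars$hp but never writes out the formula). The names est, se, ci.normal, and ci.t are illustrative.

```r
# Fit a one-variable linear model (formula is an assumption)
m <- lm(mpg ~ hp, data=mtcars)
print(summary(m))
print(names(m))

# Residuals by hand: observed outcome minus fitted values
resid.by.hand <- mtcars$mpg - m$fitted.values

# Confidence interval for the hp slope by hand
coefs <- summary(m)$coefficients
est <- coefs["hp", "Estimate"]
se <- coefs["hp", "Std. Error"]

# Normal approximation, as written in the outline
ci.normal <- est + 1.96 * c(-1, 1) * se

# Using the t quantile on the residual degrees of freedom,
# which is what confint() does for lm objects
ci.t <- est + qt(0.975, df=m$df.residual) * c(-1, 1) * se

print(ci.normal)
print(ci.t)
print(confint(m)["hp", ])
```

With 30 residual degrees of freedom the t quantile is a little larger than 1.96, so the by-hand normal interval comes out slightly narrower than what confint(m) reports.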
- plotting residuals:
  - hist(residuals(m))
  - plot against our x: plot(mtcars$hp, residuals(m))
  - QQ-plots with qqnorm(residuals(m))
  - doing a plot with ggplot just involves making a dataset: d.fig <- data.frame(hp=mtcars$hp, resids=residuals(m))
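
The residual plots above might be put together like this. The model formula is an assumption, qqline() is an addition that is not in the outline, and the ggplot2 part only runs if that package is installed.

```r
# Model assumed for illustration: mpg regressed on hp
m <- lm(mpg ~ hp, data=mtcars)

# Distribution of the residuals
hist(residuals(m))

# Residuals against the predictor
plot(mtcars$hp, residuals(m))

# QQ-plot of the residuals; qqline() adds a reference line
qqnorm(residuals(m))
qqline(residuals(m))

# For ggplot2, first collect what we want to plot in a data frame
d.fig <- data.frame(hp=mtcars$hp, resids=residuals(m))

if (requireNamespace("ggplot2", quietly=TRUE)) {
  library(ggplot2)
  p <- ggplot(d.fig) + aes(x=hp, y=resids) + geom_point()
  print(p)
}
```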
- adding controls: just make our formula more complex
  - update.formula()
  - or just write a new one
- adding logical variables: no problem!
- adding categorical variables: no problem! (I'll explain interpretation later, but I want you to see that this works!)
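
A sketch of adding controls, assuming wt as the extra control, am as the source of a logical variable, and cyl as the categorical variable. These column choices, and the names d, manual, and cyl.f, are illustrative and not from the outline.

```r
# Starting model (formula is an assumption)
m1 <- lm(mpg ~ hp, data=mtcars)

# Option 1: extend the existing formula with update.formula()
f2 <- update.formula(formula(m1), . ~ . + wt)
m2 <- lm(f2, data=mtcars)

# Option 2: just write the new formula out
m2.alt <- lm(mpg ~ hp + wt, data=mtcars)

# Work on a copy so the built-in dataset is left alone
d <- mtcars

# A logical variable: TRUE for manual transmission
d$manual <- d$am == 1
m3 <- lm(mpg ~ hp + wt + manual, data=d)

# A categorical variable: number of cylinders as a factor
d$cyl.f <- factor(d$cyl)
m4 <- lm(mpg ~ hp + wt + cyl.f, data=d)

print(coef(m3))
print(coef(m4))
```

The logical variable shows up as a single coefficient (manualTRUE), while the factor gets one coefficient per level other than the first.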
- generating nice regression tables:
  - one of many options: stargazer(m1, m2, type="text") or type="html"
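
A short sketch of the stargazer call above. The two models are assumptions made for illustration, and the table is only produced if the stargazer package is installed.

```r
# Two nested models to show side by side (formulas are assumptions)
m1 <- lm(mpg ~ hp, data=mtcars)
m2 <- lm(mpg ~ hp + wt, data=mtcars)

if (requireNamespace("stargazer", quietly=TRUE)) {
  library(stargazer)
  # type="text" prints a plain-text table to the console;
  # type="html" produces HTML output instead
  stargazer(m1, m2, type="text")
}
```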
- interpreting linear models with anova(): I'm not going to walk through the details, but the important thing to keep in mind is that although the statistics are different (F rather than t), the p-values are identical!
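
A quick check of that claim, assuming the one-variable model of mpg on hp. With a single predictor the F statistic from anova() is the square of the t statistic from summary(), which is why the p-values agree. With several predictors anova() tests terms sequentially, so the match is not guaranteed for every term.

```r
# One-variable model (formula is an assumption)
m <- lm(mpg ~ hp, data=mtcars)

# t statistic and p-value for hp from the coefficient table
coefs <- summary(m)$coefficients
t.stat <- coefs["hp", "t value"]
p.t <- coefs["hp", "Pr(>|t|)"]

# F statistic and p-value for hp from the ANOVA table
a <- anova(m)
f.stat <- a["hp", "F value"]
p.f <- a["hp", "Pr(>F)"]

print(c(t.squared=t.stat^2, f=f.stat))
print(c(p.from.t=p.t, p.from.f=p.f))
```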