Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 5

From CommunityData
Revision as of 23:02, 2 February 2017 by Benjamin Mako Hill

as promised, we'll be adding much less each week.

first, let's make two datasets:

  1. let's work with rivers. but i'll add some random noise: new.rivers <- rivers + rnorm(n=length(rivers), mean=3, sd=3)
  2. let's also download this file: http://www.openintro.org/stat/data/nc.RData (documentation is here: https://htmlpreview.github.io/?https://github.com/andrewpbray/oiLabs-base-R/blob/master/inf_for_numerical_data/inf_for_numerical_data.html)
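
a minimal sketch of those two setup steps. the set.seed() call is my own addition so the noise comes out the same every time, and the try() wrapper is also an assumption so the script keeps going if you're offline. as far as i can tell from the linked documentation, the file creates an object called nc:

```r
# rivers ships with R (datasets package): lengths of major North American rivers
set.seed(2017)  # assumption: any seed works, this just makes the noise reproducible

# add normally distributed noise (mean 3, sd 3) to every river length
new.rivers <- rivers + rnorm(n=length(rivers), mean=3, sd=3)

# the two vectors line up one-to-one, which a paired test needs
length(rivers) == length(new.rivers)
summary(new.rivers - rivers)

# load the nc data straight from the URL
# try() is an assumption: it keeps a failed download from stopping the script
nc.url <- "http://www.openintro.org/stat/data/nc.RData"
nc.status <- try(load(url(nc.url)), silent=TRUE)

# TRUE if the load worked and created an object called nc
exists("nc")
```

if you'd rather keep a local copy, download.file() followed by load("nc.RData") should get you to the same place.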
  • paired t-test:
    • i'm not going to walk through doing it by hand this week. i trust you can translate the equations in the book into R at this point
    • compare our two rivers datasets using t.test() with paired=TRUE
  • unpaired t-test with two vectors
    • works with the rivers examples in the same way
    • we can also do it with the birthweights of boys and girls in the nc dataset by splitting them into two vectors
  • unpaired t-test with the formula notation: t.test(mpg ~ am, data=mtcars) # am: 0 = automatic, 1 = manual transmission
  • anova: aov(). we'll be talking about anova() later!
    • returns an aov object. we can save that and then use the summary() function to give us more useful information
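
the tests in the list above can be sketched roughly like this. a few things here are my assumptions and not part of the outline: the seed, the mtcars split by am as a stand-in for splitting a data frame into two vectors, the mpg-by-cylinders anova example, and the nc column names weight and gender (taken from the linked documentation). the nc part only runs if nc is already loaded:

```r
# rebuild the noisy rivers data so this block stands on its own
set.seed(2017)  # assumption: seed only for reproducibility
new.rivers <- rivers + rnorm(n=length(rivers), mean=3, sd=3)

# paired t-test: each river is compared with its own noisy copy
paired.result <- t.test(rivers, new.rivers, paired=TRUE)
paired.result

# unpaired t-test with two vectors: same call without paired=TRUE
# (t.test() defaults to the Welch version, which doesn't assume equal variances)
unpaired.result <- t.test(rivers, new.rivers)
unpaired.result

# unpaired t-test after splitting a data frame column into two vectors
auto.mpg <- mtcars$mpg[mtcars$am == 0]
manual.mpg <- mtcars$mpg[mtcars$am == 1]
split.result <- t.test(auto.mpg, manual.mpg)
split.result

# the same comparison using formula notation
formula.result <- t.test(mpg ~ am, data=mtcars)
formula.result

# same idea with the nc birthweights, if nc has been loaded
# assumption: the columns are called weight and gender
if (exists("nc")) {
  boys <- nc$weight[nc$gender == "male"]
  girls <- nc$weight[nc$gender == "female"]
  print(t.test(boys, girls))
}

# anova: does mpg differ across 4, 6 and 8 cylinder cars?
# (hypothetical example, not from the outline)
mpg.aov <- aov(mpg ~ factor(cyl), data=mtcars)
class(mpg.aov)
summary(mpg.aov)
```

the paired test should pick up the shift of about 3 easily, while the unpaired one most likely won't, because river lengths vary so much more than the noise does. also note that cyl is wrapped in factor(): left as a number, aov() would treat it as a single slope instead of three groups.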