Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 5

From CommunityData





Latest revision as of 02:35, 3 February 2017

as promised, we'll be adding much less each week.

first, let's make two datasets:

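  1. Let's download this dataset of births in North Carolina: http://www.openintro.org/stat/data/nc.RData (documentation is here: https://htmlpreview.github.io/?https://github.com/andrewpbray/oiLabs-base-R/blob/master/inf_for_numerical_data/inf_for_numerical_data.html)
  2. Let's also continue to work with rivers. I want to first add some random noise: new.rivers <- rivers + rnorm(n=length(rivers), mean=100, sd=100)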
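
A minimal sketch of the two setup steps above. The seed and the temporary file path are my own choices for illustration, and the code assumes the .RData file defines a data frame called `nc` (this is what the linked documentation suggests, but check it after loading). The download is wrapped in tryCatch() only so that the rivers part still runs without a network connection.

```r
# rivers ships with base R, so there is nothing to download for it
set.seed(2017)  # arbitrary seed, only so the noise is reproducible
new.rivers <- rivers + rnorm(n = length(rivers), mean = 100, sd = 100)
summary(rivers)
summary(new.rivers)

# download and load the NC births data (needs network access)
# assumption: the .RData file defines a data frame named `nc`
nc.path <- file.path(tempdir(), "nc.RData")  # hypothetical local path
nc.ok <- tryCatch({
  download.file("http://www.openintro.org/stat/data/nc.RData", nc.path, mode = "wb")
  load(nc.path, envir = globalenv())
  TRUE
}, error = function(e) FALSE)

# only look at the data if the download and load worked
if (nc.ok && exists("nc")) str(nc)
```

Because the noise is one random draw per river, new.rivers has the same length as rivers, which matters later for the paired t-test.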
  • unpaired t-test with two vectors: just t.test()
    • works with the rivers examples in the same way
    • we can also do it with the birth weights of boys and girls in the nc dataset by splitting them into two vectors
  • paired t-tests with t.test(paired=TRUE):
    • i'm not going to walk through doing it by hand this week. i trust you can translate the equations in the book into R at this point
  • unpaired t-test with the formula notation: t.test(mpg ~ am, data=mtcars) # manual versus automatic transmission
  • anova: aov() (we'll be talking about anova() later!)
    • returns an aov object. we can save that and then use the summary() function to give us more useful information
    • we can see that the results are very similar to the t-test in the two-group example!
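
The list above can be sketched roughly as follows. This is one way to run it, not necessarily how it was done in lecture. The seed is arbitrary, and the nc column names in the commented lines (weight, gender) are assumptions taken from the dataset documentation, so they are left commented out to keep the block runnable without the download.

```r
# recreate the noisy rivers vector (arbitrary seed for reproducibility)
set.seed(2017)
new.rivers <- rivers + rnorm(n = length(rivers), mean = 100, sd = 100)

# unpaired t-test with two vectors (Welch's test is the default)
t.test(rivers, new.rivers)

# splitting a data frame column into two vectors works the same way.
# assumed column names in nc: weight, gender
#   boys  <- nc$weight[nc$gender == "male"]
#   girls <- nc$weight[nc$gender == "female"]
#   t.test(boys, girls)

# paired t-test: each river is paired with its own noisy version
t.test(rivers, new.rivers, paired = TRUE)

# unpaired t-test with the formula notation
tt <- t.test(mpg ~ am, data = mtcars)  # manual versus automatic transmission
tt

# anova on the same two-group comparison: save the aov object, then summarize
fit <- aov(mpg ~ am, data = mtcars)
summary(fit)

# compare the p-values from the two approaches
tt$p.value
summary(fit)[[1]][["Pr(>F)"]][1]

# with equal variances assumed, the squared t statistic equals the F statistic
tt.eq <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)
unname(tt.eq$statistic)^2
summary(fit)[[1]][["F value"]][1]
```

The default t.test() does not assume equal variances, which is why its p-value is only close to the one from aov(). Setting var.equal=TRUE makes the two agree.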

extra good things to know:

  • if statements: i use them often in a function
    • let's make a version of my river modification code above that only adds positive numbers
  • for loops: for (name in list) {}
    • next is useful for skipping to the next iteration
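
Here is a rough sketch of both ideas. The function name add.positive.noise is hypothetical, and "only adds positive numbers" is interpreted here as replacing any negative noise with zero, which is just one reasonable reading. The second half is an unrelated toy loop that only shows what next does.

```r
# hypothetical helper: add noise to x, but never let the noise be negative
add.positive.noise <- function(x, mean = 100, sd = 100) {
  noise <- rnorm(n = length(x), mean = mean, sd = sd)
  for (i in seq_along(noise)) {
    if (noise[i] < 0) {
      noise[i] <- 0  # throw away negative noise instead of subtracting
    }
  }
  x + noise
}

set.seed(2017)  # arbitrary seed for reproducibility
new.rivers <- add.positive.noise(rivers)

# for loop with next: count rivers of at least 1000 miles, skipping the rest
long.count <- 0
for (len in rivers) {
  if (len < 1000) {
    next  # jump straight to the next value of len
  }
  long.count <- long.count + 1
}
long.count
```

The loop inside the function is there to show if in action. In everyday R the same thing is usually written without a loop, for example with pmax(noise, 0).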