Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 1
From CommunityData
Revision as of 19:31, 3 January 2017
Lecture Outline
Intro to R and basic variable types:
- using R as a calculator:
- addition: 2 + 2
- subtraction: 2 - 3
- multiplication: 5 * 4
- division: 5/2
- more complicated stuff: use parentheses!
- powers: 2^2; 2^3
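As a quick sketch of the calculator examples above, these lines can be typed at the R prompt one at a time (the specific numbers are just illustrations):

```r
2 + 2         # addition
2 - 3         # subtraction
5 * 4         # multiplication
5 / 2         # division: gives 2.5
(2 + 3) * 4   # parentheses control the order of operations: 20
2^2           # powers: 4
2^3           # 8
```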
- variables
- the basic concept and how they work
- syntax for assignment: use <- (although = works too, it's not idiomatic R)
- what makes a valid variable name: starts with a letter, contains letters and numbers; case is important; instead of spaces, use "." (not _ as in Python, although _ will usually work too)
- saving numbers to variables: cups.of.flour <- 2
- special variables built in: pi (we'll see many more)
- variables can be set to anything!
- there's also one special thing: NA (no quotes!) which means missing
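A minimal sketch of the assignment ideas above. Apart from cups.of.flour and pi, the variable names are made up for illustration, and is.na() is a base R function not mentioned in the outline:

```r
cups.of.flour <- 2            # assign with <-
cups.of.sugar = 1.5           # = also works, but <- is the idiomatic choice
total.cups <- cups.of.flour + cups.of.sugar
total.cups                    # typing a name prints its value: 3.5

circle.area <- pi * 3^2       # pi is a built-in variable
cups.of.flour <- "two"        # a variable can be reassigned to anything
missing.value <- NA           # NA (no quotes!) marks a missing value
is.na(missing.value)          # TRUE
```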
- types of variables
- numeric: we've already seen, with or without the decimal point
- character: name <- "mako" (uses single or double quotes)
- logical: TRUE or FALSE (all caps)
- functions: called with parens right after the function name, like sqrt(4)
- there are many built in functions including:
- sqrt()
- log()
- log1p() — super useful!
- class() — tells you what type of variable you have
- ls()
- check your reference card for many, many more
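To make the types and built-in functions above concrete, here is a short sketch; the input values are arbitrary:

```r
name <- "mako"
class(name)    # "character"
class(2.5)     # "numeric"
class(TRUE)    # "logical"

sqrt(16)       # 4
log(1)         # 0 (natural log by default)
log1p(0)       # log(1 + x), so 0 here; stays accurate when x is very small
ls()           # names of the variables you have defined so far
```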
- vectors: you can think of a vector as a list of things that are all the same type (lists, which will come later, actually refer to lists of things that might be of different types!)
- in R, all variables are vectors, although many have just one thing in them! that's why R prints [1] next to every number
- you can make vectors with a special function: c(), like ages <- c(36, 4, 35)
- vectors can be of any type but they have to be one type: c("mako", "mika")
- if you mix types in one vector, they will be "coerced"(!) to a single type
- slicing or indexing:
- basic syntax: ages[1]; ages[2]
- more complex: ages[1:2]
- assignment through indexing: ages[1] <- 20
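A small sketch of creating, coercing, and indexing vectors. The vector named mixed is a made-up example:

```r
ages <- c(36, 4, 35)            # a numeric vector
length(5)                       # 1: even a single number is a vector of length one

mixed <- c(36, "mako", TRUE)    # mixing types in one vector
class(mixed)                    # "character": everything was coerced to one type

ages[1]                         # first element (R counts from 1)
ages[1:2]                       # first two elements
ages[1] <- 20                   # assignment through indexing
ages                            # 20 4 35
```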
- vectors can have names for elements! we can set those with names():
- names(ages)
- names(ages) <- c("mako", "atom", "mika")
- once we do that, we can index with names: ages["mako"]
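Putting the naming steps together, as a sketch that starts from a fresh copy of ages:

```r
ages <- c(36, 4, 35)
names(ages)                                # NULL: no names yet
names(ages) <- c("mako", "atom", "mika")   # one name per element
ages["mako"]                               # index by name
ages[c("mako", "mika")]                    # several names at once
```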
- many functions are particularly useful on vectors with multiple elements:
- sum()
- mean(); sd(); median(); var(); IQR(); etc.
- length()
- head()
- table()
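Here is a sketch of these summary functions on a small vector, so the results are easy to check by hand. rivers is a dataset that ships with R; the character vector passed to table() is made up:

```r
ages <- c(36, 4, 35)
sum(ages)       # 75
mean(ages)      # 25
median(ages)    # 35
var(ages)       # 331
sd(ages)        # the square root of the variance
IQR(ages)       # 16
length(ages)    # 3

head(rivers)    # just the first six values of a longer vector
table(c("mako", "mika", "mako"))   # counts of each value: mako 2, mika 1
```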
- more advanced variable types:
- factors: for categorical data
- make with factor(c("mako", "mika", "mako"))
- you can create factors from characters with as.factor()
- also think about: dates with as.POSIXct(); ordered(), which is really just a type of factor for ordinal data
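A hedged sketch of these types in action. The names friends, sizes, and when, the size categories, and the timestamp are all invented for illustration:

```r
friends <- factor(c("mako", "mika", "mako"))   # factor() takes a single vector
levels(friends)                                # "mako" "mika"
table(friends)                                 # counts per category

letters.factor <- as.factor(c("a", "b", "a"))  # convert characters to a factor
class(letters.factor)                          # "factor"

# ordered(): a factor whose levels have a meaningful order
sizes <- ordered(c("small", "large", "medium"),
                 levels = c("small", "medium", "large"))
sizes < "large"                                # TRUE FALSE TRUE

# dates and times: as.POSIXct() parses a string into a date-time
when <- as.POSIXct("2017-01-03 19:31:00", tz = "UTC")
class(when)                                    # "POSIXct" "POSIXt"
```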
- using logical vectors to index and recode data:
- comparison operators will return logical variables: rivers > 300; rivers < 300; rivers <= 320; rivers == 210; rivers != 210
- indexing with logicals: rivers[rivers > 300]
- recoding data: my.rivers <- rivers; my.rivers[my.rivers < 300] <- NA
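A sketch of logical indexing and recoding with the built-in rivers data. The recode is done on a copy so the original values stay available; the name long and the na.rm argument are additions not in the outline:

```r
long <- rivers > 300               # a logical vector, one TRUE/FALSE per river
sum(long)                          # TRUE counts as 1, so this counts long rivers
rivers[rivers > 300]               # keep only the rivers longer than 300

my.rivers <- rivers                # work on a copy
my.rivers[my.rivers < 300] <- NA   # recode short rivers as missing
sum(is.na(my.rivers))              # how many values were recoded
mean(my.rivers, na.rm = TRUE)      # na.rm = TRUE skips the missing values
```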
- basic plotting and visualization:
- boxplot() — boxplots
- hist() — draw histograms
- density(): computes a density estimate; draw it with plot(density(x))
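A minimal sketch of the three plots using the built-in rivers data. The main title and breaks value are arbitrary choices:

```r
boxplot(rivers, main = "River lengths")   # boxplot
hist(rivers, breaks = 20)                 # histogram with roughly 20 bins

river.density <- density(rivers)          # density() only computes the estimate
plot(river.density)                       # plot() is what actually draws it
```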
- installing new packages and loading new datasets:
- install.packages("UsingR")
- install.packages("openintro")
- library(UsingR) (no quotes needed!)