*** make with factor(c("mako", "mika", "mako"))
*** you can create factors from characters with as.factor()
** also think about: dates with as.POSIXct(); ordered(), which is really just a type of factor for ordinal data
* lists: like vectors but can contain objects of any kind
** let's say we have two vectors: short.rivers (rivers * 0.5) and normal.rivers (rivers)
** construct lists: rivers.list <- list(normal.rivers, short.rivers)
** named lists: list(foo=foo, bar=bar), or add names with names()
** index into lists: use double square brackets like rivers.list[[1]]; otherwise they work like vectors
** index recursively: rivers.list$short.rivers[1] (this needs a named list)
** some functions work on lists: boxplot(rivers.list); some don't: hist(rivers.list)
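A minimal sketch of the list steps above, using the built-in rivers vector. It assumes the list is built with names (normal.rivers and short.rivers) so that the $ form of indexing works; an unnamed list would need [[1]] and [[2]] instead.

```r
# rivers is a numeric vector that ships with R
normal.rivers <- rivers
short.rivers <- rivers * 0.5

# build a named list so elements can be reached with $ as well as [[ ]]
rivers.list <- list(normal.rivers = normal.rivers, short.rivers = short.rivers)

names(rivers.list)             # the element names
rivers.list[[1]][1]            # first element of the list, then its first value
rivers.list$short.rivers[1]    # the same kind of lookup, by name

class(rivers.list[1])          # single brackets give back a (shorter) list
class(rivers.list[[1]])        # double brackets give back the vector itself

boxplot(rivers.list)           # works on a list: one box per element
```

The single versus double bracket difference is the thing that trips people up most often, which is why both class() calls are included.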
* matrix: let's create the table from the homework as a matrix
** create from vectors: start with 1:9, then add real numbers: matrix(x, ncol=3)
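A small sketch of building a matrix from a vector. The homework table itself is not reproduced here, so 1:9 stands in for it; the byrow variant and the multiplication by 1.5 are illustrative additions, not part of the homework.

```r
x <- 1:9

# matrix() fills column by column unless told otherwise
m <- matrix(x, ncol = 3)

# byrow = TRUE fills row by row instead
m.byrow <- matrix(x, ncol = 3, byrow = TRUE)

dim(m)        # number of rows, number of columns
m[2, 3]       # row 2, column 3

# arithmetic on the whole matrix turns the integers into real numbers
m.real <- m * 1.5
```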
* data.frames: ''the'' most important data structure in R. we will be using them '''constantly'''
** let's explore the faithful data.frame first
*** head(faithful); colnames(faithful); nrow(faithful); ncol(faithful)
*** work with the columns faithful$eruptions and faithful$waiting (mean, boxplot, hist)
*** but the real power is doing bivariate analysis: plot()
*** data.frames can have more than two columns: mtcars
** indexing by numbers: faithful[1,]; faithful[,2]; faithful[1,2]; faithful[1:2, 1:2]; etc.
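The exploration and indexing steps above, sketched on the built-in faithful and mtcars datasets. The variable names first.row, waiting.col and corner are just placeholders for this example.

```r
# get a feel for the data.frame
head(faithful)
colnames(faithful)
nrow(faithful)
ncol(faithful)

# columns come out as ordinary vectors
mean(faithful$eruptions)
mean(faithful$waiting)

# indexing is always [row, column]; leaving one blank means "all of them"
first.row <- faithful[1, ]       # a one-row data.frame
waiting.col <- faithful[, 2]     # a plain numeric vector
faithful[1, 2]                   # a single value
corner <- faithful[1:2, 1:2]     # first two rows, first two columns

# mtcars has many more columns than faithful
ncol(mtcars)
```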
* how do we plot things in that space? we use the formula "~" symbol
** plot(var1 ~ var2, data=dataframe); boxplot works too
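A short sketch of the formula interface, read as "left side as a function of right side", so the left-hand variable goes on the y axis. The choice of columns here is only for illustration.

```r
# a formula is an object in its own right
f <- mpg ~ hp
class(f)

# scatterplot: mpg on the y axis, hp on the x axis
plot(mpg ~ hp, data = mtcars)

# the same idea with faithful
plot(waiting ~ eruptions, data = faithful)

# with boxplot, the right-hand side is treated as a grouping variable
boxplot(mpg ~ gear, data = mtcars)
```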
* making/modifying data.frames: let's work on a copy of mtcars (call it mako.cars)
** several ways to make one: data.frame() is the basic one
** modification/building up: cbind(); rbind(); as.data.frame()
** modifying values: d[1,2] <- NA
** removing rows and columns: d <- d[-1,]; d$column <- NULL
** recoding/transforming data: let's log a column
** changing types: let's turn a number into a factor (e.g., gear)
** creating subsets of data.frames using logical vectors
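One possible walk through the modification steps above, working on a copy so that mtcars itself stays untouched. Which row, column and cutoff to use is an arbitrary choice for the example, and the small data.frame at the end is made up purely to show data.frame(), cbind() and rbind().

```r
mako.cars <- mtcars                           # work on a copy

mako.cars[1, 2] <- NA                         # overwrite one value
mako.cars <- mako.cars[-2, ]                  # drop row 2 with a negative index
mako.cars$qsec <- NULL                        # drop a column by assigning NULL

mako.cars$log.hp <- log(mako.cars$hp)         # transform: add a logged column
mako.cars$gear <- as.factor(mako.cars$gear)   # change type: number to factor

# subset with a logical vector: keep only the rows where the test is TRUE
heavy <- mako.cars[mako.cars$wt > 3.5, ]

# building a data.frame from scratch and growing it
small <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))
small <- cbind(small, passed = small$score > 2.8)                        # add a column
small <- rbind(small, data.frame(id = 4, score = 1.0, passed = FALSE))   # add a row
```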
* useful functions with data.frames:
** is.na()
** complete.cases()
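A quick sketch of the two functions on a small slice of mtcars, with a couple of missing values planted by hand so there is something to find.

```r
d <- mtcars[1:5, c("mpg", "cyl", "hp")]
d[1, 2] <- NA      # plant a missing value in row 1
d[3, 1] <- NA      # and another in row 3

is.na(d$cyl)       # TRUE wherever a value is missing
sum(is.na(d))      # total count of missing values in the data.frame

complete.cases(d)  # TRUE for rows with no missing values at all

# keep only the complete rows
d.complete <- d[complete.cases(d), ]

# many functions return NA unless told to skip missing values
mean(d$mpg)
mean(d$mpg, na.rm = TRUE)
```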
* apply functions: super useful!
** sapply, lapply: let's work on the rivers dataset
** apply: more complicated, but can be very useful with matrices
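A sketch of the three apply functions. It reuses the idea of a named rivers list and a 1:9 matrix; both are rebuilt here so the example stands alone.

```r
rivers.list <- list(normal.rivers = rivers, short.rivers = rivers * 0.5)

# lapply runs the function on each element and always returns a list
means.list <- lapply(rivers.list, mean)

# sapply does the same but simplifies the result to a vector when it can
means.vec <- sapply(rivers.list, mean)

# apply works over the rows (1) or columns (2) of a matrix
m <- matrix(1:9, ncol = 3)
row.sums <- apply(m, 1, sum)
col.sums <- apply(m, 2, sum)
```

For plain sums, rowSums() and colSums() do the same job; apply() earns its keep when the function is something of your own.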
* graphing with ggplot2: this is what I use so it's what we'll use moving forward
** first, install the package with install.packages() and load it with library()
** let's just play around with examples from mtcars
** philosophy: a graphics grammar. you start out by calling ggplot()
** ggplot(data=mtcars) + aes(x=hp, y=mpg, color=gear, size=carb) + geom_point()
* read data from a CSV file: read.csv(); read.delim() can be useful as well! the options can be helpful!
** the foreign package can be very helpful: read.dta(); read.spss(); etc.
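A self-contained sketch of reading delimited files. Since there is no class data file to point at here, it first writes a few rows of faithful to temporary files and then reads them back; csv.path and tsv.path are placeholders for wherever your own files live.

```r
# write a small CSV to a temporary file so there is something to read
csv.path <- tempfile(fileext = ".csv")
write.csv(faithful[1:5, ], csv.path, row.names = FALSE)

# read.csv expects comma-separated values with a header row
d <- read.csv(csv.path)
str(d)

# read.delim is the same idea for tab-separated files
tsv.path <- tempfile(fileext = ".tsv")
write.table(faithful[1:5, ], tsv.path, sep = "\t", row.names = FALSE, quote = FALSE)
d2 <- read.delim(tsv.path)

# a few of the options: is there a header row, and which strings count as missing
d3 <- read.csv(csv.path, header = TRUE, na.strings = c("NA", ""))
```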