Statistics and Statistical Programming (Winter 2017)/R lecture outline: Week 1
From CommunityData
Latest revision as of 04:02, 5 January 2017
Lecture Outline
Intro to R and basic variable types:
- using R as a calculator:
  - addition: 2 + 2
  - subtraction: 2 - 3
  - multiplication: 5 * 4
  - division: 5/2
  - more complicated stuff: use parentheses!
  - powers: 2^2; 2^3
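The calculator examples above can be typed straight into the console. A small sketch (the specific numbers in the last two lines are just for illustration):

```r
# Each line is typed at the R prompt; R prints the result.
2 + 2        # addition: 4
2 - 3        # subtraction: -1
5 * 4        # multiplication: 20
5 / 2        # division: 2.5
2^3          # powers: 8
(2 + 3) * 4  # parentheses go first: 20
2 + 3 * 4    # without them, multiplication goes first: 14
```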
- variables
  - the basic concept and how they work
  - syntax for assignment: use <- (= works too, but it's not idiomatic R)
  - what makes a valid variable name: starts with a letter, contains letters and numbers; case is important; instead of spaces, use "." (not _ as in Python, although _ will usually work too)
  - saving numbers to variables: cups.of.flour <- 2
  - special variables built in: pi (we'll see many more)
  - variables can be set to anything!
  - there's also one special thing: NA (no quotes!) which means missing
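A short sketch of how the variable rules above play out. Only cups.of.flour and pi come from the outline; Cups.of.flour and cups.of.sugar are made-up names for illustration.

```r
cups.of.flour <- 2        # assignment with <-
cups.of.flour * 3         # use the variable in a calculation: 6
Cups.of.flour <- 5        # case matters: this is a different variable
cups.of.flour             # still 2
pi                        # a special variable that is built in

cups.of.sugar <- NA       # NA (no quotes!) marks a missing value
is.na(cups.of.sugar)      # TRUE
cups.of.sugar + 1         # arithmetic on a missing value is still NA
```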
- types of variables
  - numeric: we've already seen, with or without the decimal point
  - character: name <- "mako" (uses single or double quotes)
  - logical: TRUE or FALSE (all caps)
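One way to check the three types is with class(), which appears later in this outline. This is only a sketch; flour and has.eggs are made-up variable names.

```r
flour <- 2.5          # numeric
name <- "mako"        # character
has.eggs <- TRUE      # logical (all caps)

class(flour)          # "numeric"
class(2)              # "numeric", even without the decimal point
class(name)           # "character"
class('mako')         # single quotes work too: "character"
class(has.eggs)       # "logical"
```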
- functions: a function call has parentheses right after the function name
  - functions take some input (called an argument) and provide some output (called the output or sometimes the return value); both are optional!
    - some arguments are named (meaning that they have "foo=" or similar before them; mostly, names are optional)
  - the most important function: help()
  - another useful function to clean up our messes: rm() or remove()
  - there are many built in functions including:
    - sqrt()
    - log()
    - log1p(): super useful!
    - class(): tells you what type of variable you have
    - ls()
    - check your reference card for many, many more
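A rough sketch of calling some of the functions listed above. The variable x is made up for the example, and the interactive() check is an addition here so that the help page only opens when R is being used interactively.

```r
sqrt(16)                # one argument, one return value: 4
log(100, base = 10)     # base is a named argument: 2
log(100, 10)            # the same call with the name left off
log1p(0)                # log(1 + x), so this is 0

x <- 5
ls()                    # no arguments needed; lists the variables we have made
rm(x)                   # cleans up x; we don't use any return value here
ls()                    # x is gone

# help() opens the documentation page for a function.
if (interactive()) help(sqrt)
```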
- vectors: you can think of a vector as like a list of things that are all the same type (lists, which will come later, actually refer to lists of things that might be of different types!)
  - in R, all variables are vectors! although many have just one thing in them! that's why it prints out [1] next to every number
  - you can make vectors with a special function: c(), like ages <- c(36, 4, 35)
  - vectors can be of any type but they have to be one type: c("mako", "mika")
  - if you mix vectors of different types together, they will be "coerced"(!) to a single type
  - slicing or indexing:
    - basic syntax: ages[1]; ages[2]
    - more complex: ages[1:2]
    - assignment through indexing: ages[1] <- 20
  - most math operators operate on vectors with recycling: ages * 2; ages - 3
  - vectors can have names for elements! we can set those with names():
    - names(ages)
    - names(ages) <- c("mako", "atom", "mika")
    - once we do that, we can index with names: ages["mako"]
  - many functions are particularly useful on vectors with multiple elements:
    - some functions return a single item: sum(); mean(); sd(); median(); var(); length()
    - some return vectors: sort(); head(); range()
    - some functions return other things: table(); summary()
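The vector steps above can be strung together roughly like this. It is a sketch that reuses the ages example from the outline; the mixed-type vector and the letters passed to table() are made up for illustration.

```r
ages <- c(36, 4, 35)           # a numeric vector made with c()
class(c("mako", "mika"))       # "character"
c(36, "mika", TRUE)            # mixed types get coerced: "36" "mika" "TRUE"

ages[1]                        # first element (R counts from 1): 36
ages[1:2]                      # first two elements: 36 4
ages[1] <- 20                  # assignment through indexing
ages * 2                       # the 2 is recycled: 40 8 70
ages - 3                       # 17 1 32

names(ages)                    # NULL, because no names are set yet
names(ages) <- c("mako", "atom", "mika")
ages["mako"]                   # index by name: 20

sum(ages)                      # a single item: 59
length(ages)                   # a single item: 3
sort(ages)                     # a vector, smallest first (names come along)
range(ages)                    # a vector: 4 35
table(c("a", "b", "a"))        # something else: counts of each value
summary(ages)                  # something else: a summary of the distribution
```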
- using logical vectors to index and recode data:
  - comparison operators will return logical vectors: rivers > 300; rivers < 300; rivers <= 320; rivers == 210; rivers != 210
  - indexing with logicals: rivers[rivers > 300]
  - recoding data: my.rivers <- rivers; my.rivers[my.rivers < 300] <- NA
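A sketch of logical indexing and recoding using the built-in rivers dataset (lengths of major North American rivers, in miles). The variable long is a made-up name, and recoding the copy rather than rivers itself is a choice made here so that the original data stay intact.

```r
long <- rivers > 300              # a logical vector: one TRUE/FALSE per river
head(long)                        # look at the first few
sum(long)                         # TRUE counts as 1, so this counts the long rivers
rivers[rivers > 300]              # keep only the elements where the test is TRUE

my.rivers <- rivers               # work on a copy
my.rivers[my.rivers < 300] <- NA  # recode the short rivers as missing
sum(is.na(my.rivers))             # how many values were recoded
mean(my.rivers, na.rm = TRUE)     # na.rm = TRUE skips the missing values
```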
- basic plotting and visualization:
  - boxplot(): draw boxplots
  - hist(): draw histograms
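Both plotting functions can be tried on the built-in rivers data. A minimal sketch; the titles and axis labels are made up for illustration, and saving the histogram in h is optional.

```r
# A boxplot of river lengths
boxplot(rivers, main = "Lengths of major rivers", ylab = "Length (miles)")

# A histogram of the same data; hist() also returns the bins it used
h <- hist(rivers, main = "Lengths of major rivers", xlab = "Length (miles)")
h$counts    # how many rivers fell in each bin
```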
- creating/saving files
  - running things in the console
- installing new packages and loading new datasets:
  - the simplest way is with load()
  - install.packages("UsingR")
  - install.packages("openintro")
  - library(UsingR) (no quotes needed!)
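A sketch of load() in action, assuming the data file was written with save(). The file name ages.RData and the use of a temporary directory are made up for the example. The install lines are left commented out because they need an internet connection.

```r
# load() reads back objects that were written to a file with save()
ages <- c(36, 4, 35)
data.file <- file.path(tempdir(), "ages.RData")   # made-up file name
save(ages, file = data.file)
rm(ages)                 # ages is gone...
load(data.file)          # ...and now it is back
ages

# Installing a package only has to be done once per computer:
# install.packages("UsingR")
# Loading it has to be done in every new R session:
# library(UsingR)
```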
Second lecture on GitHub and saving files:
- creating/saving files
  - creating and saving R scripts in RStudio
  - running things in the console (Ctrl-Enter)
  - copying things from the console (and vice versa)
- GitHub
  - how version control, Git, and GitHub work
    - working directories, the role of the GitHub desktop client, and the GitHub website!
    - the desktop client is just an interface between your working directory and the website
  - walk through example of saving something and publishing it on GitHub
- other sources of help:
  - built in documentation
  - StackOverflow
  - R reference card