About this page
This page collects tables of commonly used commands from several statistical programming languages. The tables follow the stages of Hadley Wickham’s model for a data science program: import, tidy, transform, visualize, model, and communicate.
This is a working document that will be updated throughout the course. If you are unfamiliar with any of the functions listed here, read the documentation for the corresponding language.
Importing Data
| Task | Base R | R packages | Stata |
| --- | --- | --- | --- |
| import data in native formats | load() (.RData), data() | - | use (.dta), webuse, sysuse |
| save data in native formats | save() | - | save |
| import delimited data (e.g. csv, tsv) | read.csv(), read.table() | readr: read_delim() | import delimited |
| write/export delimited data | write.csv(), write.table() | readr: write_delim() | export delimited |
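As a rough illustration of the base R entries above, the sketch below round-trips a small data frame through a native .RData file and a csv file. The data frame `df`, its columns, and the temporary file paths are made up for this example; the Stata equivalents are not shown.

```r
# Hypothetical example data; not part of the course materials.
df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

# Native format: save() writes an .RData file and load() restores
# the saved object under its original name.
rdata_path <- tempfile(fileext = ".RData")
save(df, file = rdata_path)
rm(df)
load(rdata_path)  # recreates `df` in the current environment

# Delimited format: write.csv() and read.csv().
# row.names = FALSE avoids writing an extra column of row labels.
csv_path <- tempfile(fileext = ".csv")
write.csv(df, csv_path, row.names = FALSE)
df_csv <- read.csv(csv_path)
```

Note that load() returns the names of the restored objects rather than the data itself, so the result is not assigned here.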
Tidy
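| Task | Base R | R packages | Stata |
| --- | --- | --- | --- |
| reshape data between wide and long | reshape() | tidyr: gather() (wide to long), spread() (long to wide) | reshape |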
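The base R reshape() interface takes some getting used to, so here is a minimal sketch of a wide-to-long conversion. The data frame `wide` and the column names `score_1`, `score_2`, and `wave` are invented for illustration.

```r
# Hypothetical wide data: one row per subject, one column per wave.
wide <- data.frame(
  id      = 1:2,
  score_1 = c(10, 20),
  score_2 = c(11, 21)
)

# Stack the two score columns into a single `score` column,
# with `wave` recording which original column each value came from.
long <- reshape(
  wide,
  direction = "long",
  varying   = c("score_1", "score_2"),
  v.names   = "score",
  timevar   = "wave",
  times     = 1:2,
  idvar     = "id"
)

# Sort so each subject's waves appear together.
long <- long[order(long$id, long$wave), ]
```

The result has one row per subject and wave, which is the layout most modeling and plotting functions expect.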
Model
| Task | Base R | R packages | Stata |
| --- | --- | --- | --- |
| linear models | lm() | - | regress |
| generalized linear models | glm(), family() | - | logit/logistic, poisson, etc. |
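A short sketch of the R entries above, using the built-in `mtcars` data set. The choice of variables is arbitrary and only meant to show the calling pattern; in glm() the `family` argument plays the role that choosing logit or poisson plays in Stata.

```r
# Linear model: fuel economy as a function of weight.
fit_lm <- lm(mpg ~ wt, data = mtcars)
coef(fit_lm)

# Logistic regression: transmission type (0/1) as a function of weight.
fit_logit <- glm(am ~ wt, data = mtcars, family = binomial())
coef(fit_logit)

# Poisson regression: number of carburetors treated as a count outcome.
fit_pois <- glm(carb ~ wt, data = mtcars, family = poisson())
coef(fit_pois)

# family() reports which family and link a fitted glm used.
family(fit_logit)
```

Calling summary() on any of these fitted objects prints standard errors and test statistics.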
Visualize
| Task | Base R | R packages | Stata |
| --- | --- | --- | --- |
| scatter plots | plot(x,y) | ggplot2: ggplot(aes(x=x,y=y)) + geom_point() | twoway scatter |
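The two R approaches can be compared side by side with the built-in `mtcars` data. This is only a sketch: the axis labels and the temporary output file are choices made for this example, and the ggplot2 part runs only if that package is installed.

```r
# Draw to a temporary PDF so the example also runs non-interactively.
plot_path <- tempfile(fileext = ".pdf")
pdf(plot_path)

# Base R scatter plot.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")

# ggplot2 version of the same plot, skipped if ggplot2 is unavailable.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
  print(p)  # explicit print() is needed outside interactive sessions
}

dev.off()
```

In an interactive session the pdf() and dev.off() calls can be dropped and the plots appear in the active graphics device.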