Check out the etherpad for today: https://etherpad.wikimedia.org/p/cIVbE45bGm
Install/load the following packages:
library(tidyr)
library(dplyr)
Slides from today
Download the following datasets:
Time to read these data files into R.
# We'll have to make several tweaks here.
counts <- read.csv('datasets/mammal_count_data.csv')
sites <- read.csv('datasets/mammal_site_data.csv')
phys <- read.csv('datasets/mammal_physiology_data.csv')
Post reading data warm-up option 1:
Post reading data warm-up option 2:
You will often collect data that ends up spread across multiple tables/data.frames, which you then need to join together.
Talk about unique keys.
Make unique key for lookup!
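A minimal sketch of building and checking a unique key. The toy table and its column names (site, year, n) are made up for illustration and are not the workshop datasets; the idea is that a single column may repeat while a combination of columns identifies each row.

```r
library(dplyr)

# Hypothetical survey table: `site` alone repeats, so it cannot be the key
toy_surveys <- data.frame(
  site = c("a", "a", "b"),
  year = c(2014, 2015, 2014),
  n    = c(12, 9, 30),
  stringsAsFactors = FALSE
)

# TRUE if any value of `site` appears more than once
site_repeats <- any(duplicated(toy_surveys$site))

# Build a compound key from site and year
toy_surveys <- toy_surveys %>%
  mutate(key = paste(site, year, sep = "_"))

# The key is usable for lookups only if no value is duplicated
key_is_unique <- !any(duplicated(toy_surveys$key))

# List any key values that occur more than once (zero rows here)
dupes <- toy_surveys %>% count(key) %>% filter(n > 1)
```

Running a check like this before joining helps catch duplicated keys, which would otherwise silently multiply rows in the joined result.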
Joining A with B
left_join (keep all rows in A, add matching columns from B)
right_join (keep all rows in B, add matching columns from A)
inner_join (retain only rows with matching values in both A and B)
full_join (retain all rows from both A and B)
anti_join (filtering join, no actual joining - get all rows in A, not found in B)
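The joins listed above can be compared side by side on two small tables. This is only a sketch: A, B, and the columns site, count, and habitat are invented for illustration, chosen so that some keys match and some do not.

```r
library(dplyr)

# Toy tables (hypothetical): s1 is only in A, s4 is only in B
A <- data.frame(site = c("s1", "s2", "s3"),
                count = c(10, 4, 7),
                stringsAsFactors = FALSE)
B <- data.frame(site = c("s2", "s3", "s4"),
                habitat = c("forest", "desert", "marsh"),
                stringsAsFactors = FALSE)

left  <- left_join(A, B, by = "site")   # all rows of A; habitat is NA for s1
right <- right_join(A, B, by = "site")  # all rows of B; count is NA for s4
inner <- inner_join(A, B, by = "site")  # only s2 and s3
full  <- full_join(A, B, by = "site")   # s1 through s4, with NAs where unmatched
anti  <- anti_join(A, B, by = "site")   # rows of A with no match in B (s1), A's columns only
```

Passing by explicitly avoids relying on dplyr guessing the key from shared column names.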
Concept of one to many
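A small sketch of a one-to-many relationship, assuming a made-up site table with one row per site and a made-up count table with several rows per site. The names toy_sites, toy_counts, and their columns are hypothetical.

```r
library(dplyr)

# One row per site (the "one" side)
toy_sites <- data.frame(site = c("s1", "s2"),
                        habitat = c("forest", "marsh"),
                        stringsAsFactors = FALSE)

# Several rows per site (the "many" side)
toy_counts <- data.frame(site = c("s1", "s1", "s1", "s2"),
                         species = c("fox", "deer", "vole", "otter"),
                         stringsAsFactors = FALSE)

# Joining site info onto counts: row count is unchanged,
# and each site's habitat is repeated for every matching count row
counts_with_sites <- left_join(toy_counts, toy_sites, by = "site")

# Joining counts onto sites: each site row is duplicated
# once per matching count row, so 2 rows become 4
sites_with_counts <- left_join(toy_sites, toy_counts, by = "site")
```

Comparing nrow() before and after a join is a quick way to see whether a one-to-many match expanded the table.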