Data Wrangling with dplyr

A Presentation for Weill Cornell Medicine’s Biostatistics Computing Club

Image courtesy of Allison Horst’s Twitter: @allison_horst

Introduction

Why dplyr?

- Powerful but efficient
- Consistent syntax
- Fast
- Function chaining
- Works well with entire tidyverse suite

- Efficiency*
- Simple syntax
- Function chaining
- Ability to analyze external databases
- Works well with other packages in tidyverse suite
  - ggplot2
  - tidyr
  - stringr
  - forcats
  - purrr

*if you start dealing with data sets with > 1 million rows, data.