class: center, middle, inverse, title-slide

.title[
# Reproducible workflow
]
.author[
### INFO 5940
Cornell University
]

---
class: inverse, middle

# A holistic workflow

---

## Workspace

* Libraries with `library()`
* User-created objects

---

## Pets or cattle?

<img src="/img/pets-cattle.jpeg" width="80%" style="display: block; margin: auto;" />

--

.task[Think of your R processes as livestock, not pets.]

---
class: middle

<img src="/img/if-you-liked-it-you-should-have-saved-the-source-for-it.jpg" width="80%" style="display: block; margin: auto;" />

---

## Save code, not workspace

* Enforces reproducibility
* Easy to regenerate on-demand
* Always save commands
* Always start R with a blank state
* Restart R often

---

## Bad approaches

```r
rm(list = ls())
```

* Good intent, but poor execution
* Only deletes user-created objects
* Leaves hidden dependencies on things you ran before `rm(list = ls())` (attached packages, options, working directory)

---
class: middle, center

<iframe width="800" height="500" src="https://www.youtube.com/embed/GiPe1OiKQuk?start=7" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

---

## Avoid unknown unknowns

* Write every script like it's running in a fresh process
* Best way to ensure this: write every script in a fresh process

--

* Storing computationally demanding output
    * `cache = TRUE`
    * `write_rds()` & `read_rds()`

---
class: inverse, middle

# Project-based workflows

---

## How to store work

* Split work into projects
* **We already do this**
* But why?
---

## `setwd()`

```r
library(tidyverse)
setwd("/Users/bensoltoff/cuddly_broccoli/verbose_funicular/foofy/data")
foofy <- read_csv("raw_foofy_data.csv")
p <- ggplot(foofy, aes(x, y)) + geom_point()
ggsave("../figs/foofy_scatterplot.png")
```

---

## Project-based workflow

* File system discipline
* Working directory intentionality
* File path discipline

--

## Rationale for workflow

* Ensures portability
* Reliable, polite behavior

--

## RStudio Projects

* `.Rproj`

---

## Use safe filepaths

* Avoid `setwd()`
* Split work into projects
* Declare each folder as a project
* Use `here()`

---
class: small

## `here::here()`

```r
library(here)
here()
```

```
## [1] "/Users/soltoffbc/Projects/Computing for Social Sciences/course-site"
```

--

* Build a file path

```r
here("static", "extras", "awesome.txt")
## [1] "/Users/soltoffbc/Projects/Computing for Social Sciences/course-site/static/extras/awesome.txt"
cat(readLines(here("static", "extras", "awesome.txt")))
## OMG this is so awesome!
```

--

* What if we change the working directory?

```r
setwd(here("static"))
getwd()
## [1] "/Users/soltoffbc/Projects/Computing for Social Sciences/course-site/static"
cat(readLines(here("static", "extras", "awesome.txt")))
## OMG this is so awesome!
```

---

## Filepaths and R Markdown

```
data/
  scotus.csv
analysis/
  exploratory-analysis.Rmd
final-report.Rmd
scotus.Rproj
```

--

* `.Rmd` and assumption of working directory
* Run `read_csv("data/scotus.csv")`
* Run `read_csv(here("data", "scotus.csv"))`

---

## Here's a GIF of Nicolas Cage

<img src="https://media.giphy.com/media/l2Je5sSem0BybIKJi/giphy.gif" width="80%" style="display: block; margin: auto;" />
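---

## Sketch: why the relative path fails in `analysis/`

A minimal base R sketch of the R Markdown filepath problem, assuming a throwaway copy of the `scotus` layout in a temporary folder. knitr evaluates chunks from the folder that holds the `.Rmd`, which is simulated here by switching into `analysis/`. The `root` variable and the CSV contents are made up for illustration; `file.path(root, ...)` stands in for what `here("data", "scotus.csv")` would build in a real project.

```r
# Build a throwaway project that mirrors the layout on the earlier slide.
root <- file.path(tempdir(), "scotus")
dir.create(file.path(root, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "analysis"), showWarnings = FALSE)

# Placeholder file contents (illustrative only, not real data).
writeLines(c("x,y", "1,2"), file.path(root, "data", "scotus.csv"))

# Simulate knitting analysis/exploratory-analysis.Rmd:
# the working directory becomes analysis/, not the project root.
old_wd <- setwd(file.path(root, "analysis"))

# The relative path is resolved against analysis/, so the file is not found.
relative_ok <- file.exists("data/scotus.csv")

# A path anchored at the project root does not depend on the working directory.
rooted_ok <- file.exists(file.path(root, "data", "scotus.csv"))

# Restore the original working directory.
setwd(old_wd)

print(c(relative = relative_ok, rooted = rooted_ok))
```

In a real project, `here()` locates the root for you (for example, via the `.Rproj` file), so no hand-built `root` is needed.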
---
class: inverse, middle

# Personal R admin

---

## R startup procedures

* Customized startup
    * `.Renviron`
    * `.Rprofile`

---

## `.Renviron`

* Define sensitive information
* Set R-specific environment variables
* Does not use R code syntax
* `usethis::edit_r_environ()`

--

## Example `.Renviron`

```shell
R_HISTSIZE=100000
GITHUB_PAT=abc123
R_LIBS_USER=~/R/%p/%v
```

---

## `.Rprofile`

* R code to run when R starts up
* Runs after `.Renviron`
* Multiple `.Rprofile` files
    * Home directory (`~/.Rprofile`)
    * Each R Project folder
* `usethis::edit_r_profile()`

---

## Common items in `.Rprofile`

1. Set a default CRAN mirror
1. Write a welcome message
1. Customize the R prompt
1. Change options, screen width, numeric display
1. Store API keys/tokens that are necessary for only a single project

---

## Git tracking of `.Rprofile`

<img src="https://media.giphy.com/media/13e1PQJrKtqYKyO0FY/giphy.gif" width="80%" style="display: block; margin: auto;" />

---

## A couple of things America got right: [cars and freedom](https://www.youtube.com/watch?v=OnQXRxW9VcQ)

<img src="https://media.giphy.com/media/Sd8uqMJqpGpP2/giphy.gif" width="80%" style="display: block; margin: auto;" />
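---

## Sketch: a small `.Rprofile`

A hypothetical `.Rprofile` covering the first four common items listed earlier. Every value here (the mirror choice, prompt string, width, and digits) is an illustrative assumption, not a recommended setting.

```r
# Hypothetical .Rprofile; all values are illustrative.

# Set a default CRAN mirror so install.packages() does not ask for one.
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Customize the R prompt.
options(prompt = "R> ")

# Change screen width and numeric display.
options(width = 100, digits = 4)

# Write a welcome message, but only in interactive sessions so that
# scripts run with Rscript stay quiet.
if (interactive()) {
  message("Welcome back. Running ", R.version.string)
}
```

Anything that changes how code behaves (rather than how the console looks) is best kept out of `.Rprofile`, since scripts would then depend on a file that collaborators do not have.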