7  Background skills

This chapter provides information about data science skills that are crucial to completing the activities in the book (aka prerequisites).

7.1 R/RStudio Installation

The R for Data Science book provides instructions for installing R and RStudio.

7.2 R skills

The book uses R as its primary programming language and follows the Tidyverse approach to working with and visualizing data. The following functions and operators are commonly used: read_csv, write_csv, mutate, filter, group_by, summarize, ggplot, select, pipes (|> or %>%), pivot_wider, pivot_longer, arrange, and left_join. If you are new to R and the Tidyverse, there are many great materials on the internet. The Data Carpentry lesson “Data Analysis and Visualization in R for Ecologists” is an excellent starting point for learning. The R for Data Science book is an especially useful reference for learning the Tidyverse commands. Finally, I have created an introduction-to-Tidyverse module for my undergraduate Environmental Data Science class. You can use the module as a “test” of your Tidyverse skills.
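As a quick illustration of how several of these functions fit together, here is a minimal sketch on a made-up data set. The obs and sites tables, their column names, and their values are hypothetical and are not taken from the book's data; the sketch assumes the dplyr and tidyr packages are installed and that you are using R 4.1 or later (for the |> pipe).

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data (not from the book): water temperature at two sites
obs <- tibble(
  site   = c("A", "A", "B", "B"),
  day    = c(1, 2, 1, 2),
  temp_c = c(10, 12, 20, 22)
)

# Hypothetical site metadata, used only to demonstrate left_join()
sites <- tibble(
  site = c("A", "B"),
  type = c("lake", "stream")
)

# Convert units, average by site, attach metadata, and sort warmest first
site_means <- obs |>
  mutate(temp_f = temp_c * 9 / 5 + 32) |>
  group_by(site) |>
  summarize(mean_temp_f = mean(temp_f)) |>
  left_join(sites, by = "site") |>
  arrange(desc(mean_temp_f))

# Reshape to one row per site and one column per day
wide <- obs |>
  pivot_wider(names_from = day, values_from = temp_c, names_prefix = "day_")

print(site_means)
print(wide)
```

If you can predict what site_means and wide contain before running the code, you are likely comfortable with the core verbs used in the book.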

7.3 Git Skills

You will be required to use Git and GitHub to complete the assignments in the book. In particular, Git and GitHub are used for generating and submitting forecasts to the NEON Ecological Forecasting Challenge. Below are instructions for setting up Git and GitHub on your computer.

7.3.1 Setting up Git and GitHub

  1. Create a GitHub user account at https://github.com, if you don’t already have one. Here is advice about choosing a user name, because choosing a good user name is critical.

  2. Go to RStudio and install the usethis package.

install.packages("usethis")
  3. Run the following command, replacing user.name and user.email with your GitHub user name and the email address used for your GitHub account. You can learn more about the command here.
library(usethis)
use_git_config(user.name = "Jane Doe", user.email = "jane@example.org")

If you get an error at this step, it is likely because Git is not installed on your computer. Follow the instructions here about installing Git.

  4. Set up your GitHub credentials on your computer. Follow the instructions here about using the usethis::create_github_token() and gitcreds::gitcreds_set() functions. Also, save your GitHub PAT (personal access token) in a password manager so that you can find it in the future (in case you need to interact with GitHub from a different computer).

If you are having issues (e.g., your computer does not seem to have Git installed), here is an excellent resource to help you debug your Git + RStudio issues.
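The setup steps above can be sketched in a single script. This is only a rough check, not an official diagnostic: Sys.which() looks for Git on the system PATH, so a negative result is a hint rather than proof that Git is missing. The variable names git_path and git_installed are made up for this example, and the interactive calls assume the usethis and gitcreds packages are installed.

```r
# Look for a git executable on the system PATH (base R, no packages needed)
git_path <- unname(Sys.which("git"))
git_installed <- nzchar(git_path)

if (git_installed) {
  message("Git found at: ", git_path)
} else {
  message("Git was not found on the PATH; install Git before continuing.")
}

# The remaining calls open a browser or prompt for input, so they are only
# run in an interactive session
if (interactive() && git_installed) {
  usethis::create_github_token()  # opens GitHub in a browser to create a PAT
  gitcreds::gitcreds_set()        # prompts you to paste the PAT
  usethis::git_sitrep()           # prints a report of your Git/GitHub setup
}
```

The report printed by usethis::git_sitrep() is a convenient thing to share when asking for help, since it summarizes your Git user settings and whether a token was found.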

7.3.2 Working with GitHub: a Quarto example

This section provides instructions on working with Git and GitHub in RStudio in the context of creating, modifying, and rendering a Quarto document.

  1. Go to https://github.com/frec3044/git-rmd-intro. Find the “Fork” button near the top right. Click “Fork” and choose your personal GitHub account as the destination.

  2. Go to the repo on your personal GitHub account. It will be something like https://github.com/[your-user-name]/git-rmd-intro

  3. Under the green “Code” button, select the “Local” tab, and copy the URL.

  4. Open RStudio on your computer and create a new project: File -> New Project -> Version Control -> Git. Paste the URL from your repo in the first box, press Tab to fill in the repo name in the second, and then use Browse to select where you want the project on your computer (I recommend having a directory on your computer where you keep all repositories we use in the class). If you don’t see a Version Control option, then you may not have Git installed on your computer (use the instructions here to install Git).

  5. Your project will load. Then go to File -> New File -> Quarto Document.

  6. In the prompt use Title = “Assignment 1” and Author = [Your name]

  7. Save the file as “assignment1.qmd” in the assignment subdirectory of the project.

  8. Commit your assignment1.qmd file using the Git tab in the top right pane, with a useful commit message. You will need to check the box next to each file that you want to commit. A useful message helps you broadly remember what you did to the files included in the commit. The Git tab may not be in the top right pane if you have moved the panes around. If you don’t see the Git tab in any pane, then you may not have created the project from GitHub correctly or you may not have Git installed on your computer.

  9. Find the Source / Visual buttons right above the document. Select Source (which is the code view).

  10. Copy the code chunk on lines 21-24 and paste it at the end of the document. Change the echo option to echo: TRUE.

  11. Find the following code at the top

format: html

and change it so that all the necessary files are saved in a single html file.

format:
  html:
    embed-resources: true
  12. Find the Render button (above the document) and click it to render the document to html. You will see a file named “assignment1.html” appear. The html file is like a webpage version of your code. If you have a directory called assignment1_files, then you did not do step 11 correctly.
  13. Click on “assignment1.html” in your “Files” pane and select “View in Web Browser”. Confirm that it looks as expected.
  14. Commit the updated .qmd and new .html files to Git.
  15. Push to your repository on GitHub.
  16. Go to https://github.com/[your-user-name]/git-rmd-intro. You should see your two most recent commits.
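The steps above use the RStudio Render button and Git tab. For readers who prefer the console, the render, commit, and push steps can also be sketched in R. This is an alternative to the point-and-click workflow, not the method the book describes: render_commit_push() is a hypothetical helper written for this example, and it assumes the quarto and gert packages are installed, that your working directory is the root of the cloned repository, and that your GitHub credentials are already set up.

```r
# Hypothetical helper (not part of the book): render, stage, commit, and push.
# Assumes the working directory is the root of the cloned repository and that
# the quarto and gert packages are installed.
render_commit_push <- function(qmd, message, dry_run = FALSE) {
  # By default Quarto writes the html output next to the .qmd file
  html <- sub("\\.qmd$", ".html", qmd)
  files <- c(qmd, html)
  if (dry_run) {
    return(files)
  }
  quarto::quarto_render(qmd)  # same job as the Render button
  gert::git_add(files)        # same job as checking the boxes in the Git tab
  gert::git_commit(message)   # commit with your message
  gert::git_push()            # push to GitHub
  invisible(files)
}

# Preview which files would be committed, without touching Git
render_commit_push("assignment/assignment1.qmd", "Render assignment 1",
                   dry_run = TRUE)
```

Calling the function with dry_run = FALSE would perform the actual render, commit, and push, so only do that once you are happy with the file list from the dry run.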