7 Background skills
This chapter provides information about data science skills that are crucial to completing the activities in the book (aka prerequisites).
7.1 R/RStudio Installation
The R for Data Science book provides instructions for installing R and RStudio.
7.2 R skills
The book uses R as the focal programming language and uses the Tidyverse approach when working with and visualizing data. The following functions are commonly used: read_csv, write_csv, mutate, filter, group_by, summarize, ggplot, select, pivot_wider, pivot_longer, arrange, and left_join, along with pipes (|> or %>%). If you are new to R and the Tidyverse, there are many great materials on the internet. The Data Carpentry “Data Analysis and Visualization in R for Ecologists” lesson is an excellent starting point for learning. The R for Data Science book is an especially useful reference for learning the Tidyverse commands. Finally, I have created an introductory Tidyverse module for my undergraduate Environmental Data Science class. You can use the module as a “test” of your Tidyverse skills.
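As a quick self-check, the sketch below chains several of the functions listed above. The `obs` and `sites` tables and their column names are made up for illustration and do not come from the book's data; the example assumes the dplyr and tidyr packages are installed and that R is version 4.1 or later (for the `|>` pipe).

```r
library(dplyr)
library(tidyr)

# Hypothetical observations, invented for illustration only
obs <- tibble(
  site   = c("A", "A", "B", "B", "B"),
  year   = c(2021, 2022, 2021, 2022, 2022),
  temp_c = c(10, 12, 14, 15, 17)
)

# Hypothetical site metadata to join onto the summary
sites <- tibble(site = c("A", "B"), elevation_m = c(300, 50))

# Mean 2022 temperature per site in Fahrenheit, warmest site first
summary_2022 <- obs |>
  mutate(temp_f = temp_c * 9 / 5 + 32) |>
  filter(year == 2022) |>
  group_by(site) |>
  summarize(mean_temp_f = mean(temp_f), .groups = "drop") |>
  left_join(sites, by = "site") |>
  arrange(desc(mean_temp_f)) |>
  select(site, elevation_m, mean_temp_f)

# One row per site and one column per year (y2021, y2022)
wide <- obs |>
  group_by(site, year) |>
  summarize(mean_temp_c = mean(temp_c), .groups = "drop") |>
  pivot_wider(names_from = year, values_from = mean_temp_c, names_prefix = "y")

print(summary_2022)
print(wide)
```

If each step in the two pipelines reads naturally to you, you likely have the Tidyverse background the activities assume.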
7.3 Git Skills
You will be required to use Git and GitHub to complete the assignments in the book. In particular, Git and GitHub are used for the generation and submission of forecasts to the NEON Ecological Forecasting Challenge. Below are instructions for setting up Git and GitHub on your computer.
7.3.1 Setting up Git and GitHub
- Create a GitHub user account at https://github.com, if you don’t already have one. Here is advice about choosing a user name, because choosing a good user name is critical.
- Go to RStudio and install the usethis package.
install.packages("usethis")
- Run the following command, replacing user.name and user.email with your GitHub user name and the email you use for GitHub. You can learn more about the command here.
library(usethis)
use_git_config(user.name = "Jane Doe", user.email = "jane@example.org")
If you get an error at this step, it is likely because your computer does not have Git installed. Follow the instructions here about installing Git.
- Set up your GitHub credentials on your computer. Follow the instructions here about using the usethis::create_github_token() and gitcreds::gitcreds_set() functions. Also, save your GitHub PAT to a password manager so that you can find it in the future (in case you need to interact with GitHub from a different computer).
If you are having issues (e.g., your computer does not seem to have Git installed), here is an excellent resource to help you debug your Git + RStudio issues.
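Before digging into debugging, it can help to confirm from the R console whether Git is visible to R at all. The following is a minimal sketch using only base R; the helper name `git_config_value` is hypothetical and introduced here for illustration.

```r
# Check whether the git executable is on the PATH that R sees.
# Sys.which() returns an empty string when the program is not found.
git_path  <- Sys.which("git")[["git"]]
git_found <- nzchar(git_path)

# Hypothetical helper: read one global Git setting, or "<not set>" if missing
git_config_value <- function(key) {
  out <- suppressWarnings(
    system2("git", c("config", "--global", key), stdout = TRUE, stderr = FALSE)
  )
  if (length(out) == 0) "<not set>" else out[[1]]
}

if (git_found) {
  message("Git found at: ", git_path)
  message("user.name:  ", git_config_value("user.name"))
  message("user.email: ", git_config_value("user.email"))
} else {
  message("Git was not found on the PATH. Install Git, then restart RStudio.")
}
```

If Git is found and both values are set, the use_git_config() step worked. For a fuller report that also covers your GitHub credentials, you can run usethis::git_sitrep().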
7.3.2 Working with GitHub: a Quarto example
This section provides instructions on working with Git and GitHub in RStudio in the context of creating, modifying, and rendering a Quarto document.
- Go to https://github.com/frec3044/git-rmd-intro. Find the “Fork” button near the top right. Click “Fork” and tell it to fork to your personal GitHub account.
- Go to the repo on your personal GitHub account. It will be something like https://github.com/[your-user-name]/git-rmd-intro
- Under the green “Code” button, select the Local tab and copy the URL.
- Open RStudio on your computer and create a new project: File -> New Project -> Version Control -> Git. Paste the URL from your repo in the first box, hit tab to fill in the repo name in the second, and then use Browse to select where you want the project on your computer (I recommend having a directory on your computer where you keep all repositories we use in the class). If you don’t see a Version Control option, then you may not have Git installed on your computer (use the instructions here to install Git).
- Your project will load. Then go to File -> New File -> Quarto Document.
- In the prompt, use Title = “Assignment 1” and Author = [Your name].
- Save the file as “assignment1.qmd” in the assignment subdirectory of the project.
- Commit your assignment1.qmd file using the Git tab in the top right pane, with a useful commit message. You will need to check the box for the files that you want to commit. A useful message helps you broadly remember what you did to the files that are included in the commit. The Git tab may not be in the top right pane if you have moved the panes around. If you don’t see the Git tab in any pane, then you may not have created the project from GitHub correctly or you do not have Git installed on your computer.
- Find the Source / Visual buttons right above the document. Select Source (which is the code view).
- Copy the code chunk on lines 21-24 and paste it at the end of the document. In the pasted chunk, change the echo option to echo: TRUE.
- Find the following code at the top
format: html:
and change it so that all the necessary files are saved in a single html file.
format:
  html:
    embed-resources: true
- Find the Render button (above the document) and click it to render the document to an html document. You will see a file named “assignment1.html” appear. The html is like a webpage version of your code. If you have a directory called assignment1_files, then you did not set embed-resources: true correctly.
- Click on “assignment1.html” in your “Files” pane and select “View in Web Browser”. Confirm that it looks as expected.
- Commit the updated .qmd and new .html files to Git.
- Push to your repository on GitHub.
- Go to https://github.com/[your-user-name]/git-rmd-intro. You should see your two most recent commits.
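The render step above can also be run from the R console, which makes it easy to check that the header was edited correctly. The sketch below writes a placeholder Quarto document to a temporary directory (the body text and author are invented for illustration; your real file lives in the project's assignment subdirectory) and renders it only if the quarto R package and the Quarto command line tool are both available.

```r
# Write a minimal Quarto document whose header embeds all resources
# in a single HTML file. Paths and contents here are placeholders.
project_dir <- file.path(tempdir(), "git-rmd-intro", "assignment")
dir.create(project_dir, recursive = TRUE, showWarnings = FALSE)
qmd_file <- file.path(project_dir, "assignment1.qmd")

writeLines(c(
  "---",
  "title: \"Assignment 1\"",
  "author: \"Your name\"",
  "format:",
  "  html:",
  "    embed-resources: true",
  "---",
  "",
  "This is a placeholder body."
), qmd_file)

# Render only when both the quarto R package and the Quarto CLI are found
can_render <- requireNamespace("quarto", quietly = TRUE) &&
  !is.null(quarto::quarto_path())

if (can_render) {
  quarto::quarto_render(qmd_file)
  html_file   <- file.path(project_dir, "assignment1.html")
  support_dir <- file.path(project_dir, "assignment1_files")
  message("HTML created: ", file.exists(html_file))
  # With embed-resources: true, this directory should not be left behind
  message("Support directory present: ", dir.exists(support_dir))
}
```

Clicking the Render button in RStudio does the same job; the console version is just a convenient way to confirm that no assignment1_files directory is left next to the html file.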