10  Automated, iterative forecasting

In the third forecast generation assignment, you will be automating the submission of your forecast. By running your forecast code each day, you will be updating your model with the latest data, which is a simple form of data assimilation.

Automation tutorial

You will be generating forecasts of water temperature at NEON aquatics sites.

10.1 Key concepts

GitHub Actions is used to execute the forecast each day in the “cloud”. Here the “cloud” is a set of small computers running in a Microsoft data center (Microsoft owns GitHub). You create a YAML file in the .github/workflows directory of your GitHub repository that defines the time of day that your job runs, the environment (type of operating system and Docker container), and the steps in the job.
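As a minimal sketch of such a workflow file, the shell commands below create one from the root of a repository. The workflow file name (run_forecast.yaml), the job name, the 12:00 UTC schedule, and the script name (forecast_code.R) are assumptions chosen for illustration; the container name is taken from the docker run command in section 10.3. Adjust all of them to match your own repository.

```shell
# Create the directory that GitHub Actions scans for workflow definitions.
mkdir -p .github/workflows

# Write a hypothetical workflow file. Quoting 'EOF' stops the shell from
# expanding anything inside the YAML.
cat > .github/workflows/run_forecast.yaml <<'EOF'
on:
  workflow_dispatch:        # allows a manual run from the Actions tab
  schedule:
    - cron: "0 12 * * *"    # time of day: every day at 12:00 UTC

jobs:
  run_forecast:
    runs-on: ubuntu-latest                  # operating system of the runner
    container: eco4cast/rocker-neon4cast    # Docker container for the steps
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4
      # The script name below is an assumption; use your own .R file.
      - name: Run the forecast script
        run: Rscript forecast_code.R
EOF
```

The three parts described above map onto the file as follows: `schedule` sets the time of day (cron times in GitHub Actions are in UTC), `runs-on` and `container` set the environment, and `steps` lists what the job does.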

You will use a Docker container (eco4cast/rocker-neon4cast) to ensure a consistent computational environment each day. The container is based on a rocker container that already has R, RStudio, and many common packages (e.g., tidyverse). It also contains packages that are specific to the NEON Forecasting Challenge.

Nothing persists in the container after the job finishes in GitHub Actions, so each run must pull data into the Docker container and export the results. You will use neon4cast::noaa_stage2() and neon4cast::noaa_stage3() to read in the meteorology data. You will use read_csv with a URL to read in the targets. You will export your forecasts using the neon4cast::submit function, which uploads your forecast to the NEON Ecological Forecasting Challenge S3 submission bucket.

10.2 Docker Skills

Docker is a tool that the activities in the book use to improve the reproducibility and automation of data analysis and forecasting workflows. Below are the instructions for setting up and interacting with a Docker container (instructions are from Freya Olsson’s workshop).

Go to https://docs.docker.com/get-docker/ to download the relevant installer for your platform (available for PC, Mac, and Linux). Also see https://docs.docker.com/desktop/.

NOTE:

  * If you’re running Windows, you will need WSL (Windows Subsystem for Linux).
  * If you’re running a Linux distribution, you may have to enable virtualization on your computer (see here).
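Once the installer has finished, a quick check from the command line can confirm that the docker command is available. This is only a sketch: the variable name docker_status is made up for illustration, and the check confirms that the command-line client is installed, not that Docker Desktop is currently running.

```shell
# Check whether the docker command can be found on the PATH.
if command -v docker >/dev/null 2>&1; then
  # Print the installed client version.
  docker --version
  docker_status="installed"
else
  docker_status="missing"
fi

echo "Docker is ${docker_status}"
```

If the result is "missing", reopen your terminal after installation or revisit the installation pages linked above.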

10.3 Running a docker container

  1. Launch Docker Desktop (either from the Command Line or by starting the GUI)
  2. At the command line, run the following command, which tells Docker to run a container from the image eco4cast/rocker-neon4cast that has all the packages and libraries installed already. The -e PASSWORD=yourpassword option sets a simple password that you will use to open the container. The -ti option starts both a terminal and an interactive session. The -p 8787:8787 option makes port 8787 of the container available on port 8787 of your computer, and --rm removes the container when it stops.
docker run --rm -ti -e PASSWORD=yourpassword -p 8787:8787 eco4cast/rocker-neon4cast

This can take a few minutes to download and install. It will be quicker the next time you launch it.

  3. Open a web browser and navigate to http://localhost:8787/
  4. Enter the username: rstudio and password: yourpassword
  5. You should see an RStudio interface with all the packages etc. pre-installed and ready to go.

You can close this localhost window (and come back to it later), but if you stop the container from Docker (or turn off your computer, etc.), any changes will be lost unless you have pushed them to GitHub or exported them to your local environment.
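One possible way to keep files in your local environment is to share a folder from your computer with the container using the -v option of docker run. The sketch below is not part of the original workshop instructions: it mounts the current folder, and the container-side folder name (/home/rstudio/project) is an assumption chosen so that the files appear under the rstudio user's home directory. Files saved there from RStudio are written to the folder on your computer and remain after the container stops.

```shell
# Folder on your computer to share: here, the current working directory.
host_dir="$(pwd)"
# Where that folder appears inside the container (hypothetical path).
container_dir="/home/rstudio/project"
mount_arg="${host_dir}:${container_dir}"

# Only launch when Docker is installed and an interactive terminal is
# attached, because -ti needs a terminal.
if command -v docker >/dev/null 2>&1 && [ -t 0 ]; then
  docker run --rm -ti -e PASSWORD=yourpassword -p 8787:8787 \
    -v "$mount_arg" eco4cast/rocker-neon4cast
else
  echo "Docker or an interactive terminal is not available; skipping launch."
fi
```

Pushing to GitHub remains the better option for code, since it also gives GitHub Actions access to your latest script.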

10.4 Reading

Thomas, R. Q., Boettiger, C., Carey, C. C., Dietze, M. C., Johnson, L. R., Kenney, M. A., et al. (2023). The NEON Ecological Forecasting Challenge. Frontiers in Ecology and the Environment, 21(3), 112–113. https://doi.org/10.1002/fee.2616

10.5 Assignment

Pre-assignment set up: Complete Chapter 9

  1. Convert your template markdown code to a .R script
  2. Update your repository to include the GitHub Action files
  3. Commit updated files to GitHub
  4. Provide your instructor with a link to the GitHub repository that demonstrates multiple successful automated executions of the forecasting code.

10.6 Module reference

Olsson, F., C. Boettiger, C.C. Carey, M.E. Lofton, and R.Q. Thomas. Can you predict the future? A tutorial for the National Ecological Observatory Network Ecological Forecasting Challenge. In review at Journal of Open Source Education.