Install the package

Install the package fhpredict from GitHub, using the package remotes. Install from the “dev” branch to get the latest version:

# install.packages("remotes")

remotes::install_github("kwb-r/fhpredict@dev", build_vignettes = TRUE)

Users

In the app's data model, everything is assigned to a user, who is identified by a unique user id. To get an overview of the available users and their ids, run:

Bathing Spots

Overview of available bathing spots

Use the function api_get_bathingspot() to get an overview of the bathing spots that are stored in the Postgres database. You need to specify the user with whom the bathing spots are associated. See above for how to get a list of available user ids.

By default, the returned data frame contains only the first few bathing spots. Set the argument limit to a high number to get all available bathing spots:
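A minimal sketch of such a call, assuming database access and a valid API token are configured. The user id 3 and the limit of 1000 are illustrative values. Passing the user id as the first argument follows the call shown in the last section of this document, as do the column names id and name.

```r
# Assumption: user 3 exists and 1000 exceeds the number of stored bathing spots
user_id <- 3
spots <- fhpredict::api_get_bathingspot(user_id, limit = 1000)

# Show ids and names of the bathing spots (columns assumed as used further below)
spots[, c("id", "name")]
```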

Create a new bathing spot

Delete a bathing spot

Accessing the properties of a bathing spot

The variable spot now contains a list of properties of the selected bathing spot. As the list contains a lot of NULL elements, we remove these elements before looking at the overall structure of the list:

Use the $ operator to access the different properties of the bathing spot:
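Both steps can be sketched on a small stand-in list, because the real spot object requires database access. The element names and values below are hypothetical; only base R is used.

```r
# Hypothetical stand-in for the list of properties of a bathing spot
spot <- list(
  id = 18,
  name = "Example spot",
  water = NULL,
  latitude = 52.5,
  image = NULL
)

# Keep only the elements that are not NULL
spot_clean <- Filter(Negate(is.null), spot)

# Inspect the overall structure of the cleaned list
str(spot_clean)

# Access a single property with the $ operator
spot_clean$name
```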

Polygon defining the “catchment area”

We are especially interested in the coordinates of the polygon that defines the area over which the rain data are averaged. This rain is assumed to influence the water quality of the bathing spot. The polygon is stored in the list element area. It can be transformed into a GeoJSON string using the function toJSON() from the jsonlite package. There is also a list element area_coordinates that contains the same coordinates in the form of a two-column data frame.
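The conversion with toJSON() might look as follows. The polygon below is a hypothetical stand-in in GeoJSON-like list form; the actual layout of the list element area may differ.

```r
library(jsonlite)

# Hypothetical polygon (longitude, latitude pairs); the first and last
# points are equal so that the ring is closed
area <- list(
  type = "Polygon",
  coordinates = list(list(
    c(13.1, 52.4), c(13.2, 52.4), c(13.2, 52.5), c(13.1, 52.4)
  ))
)

# auto_unbox = TRUE writes length-one vectors as scalars instead of arrays
area_json <- toJSON(area, auto_unbox = TRUE)
area_json
```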

Water quality measurements

The spot object contains the results of water quality measurements in its list element measurements. Use the (non-exported) function flatten_recursive_list to convert the corresponding recursive list into a data frame.
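The idea of flattening can be illustrated in base R. The sketch below is not the package's flatten_recursive_list(), whose arguments are not shown here, but a simple stand-in that works on flat records. The field names date and conc_ec are hypothetical.

```r
# Hypothetical measurement records; the real fields may differ
measurements <- list(
  list(date = "2019-07-01", conc_ec = 30),
  list(date = "2019-07-08", conc_ec = 61)
)

# Convert each record to a one-row data frame, then stack the rows
measurements_df <- do.call(rbind, lapply(measurements, as.data.frame))
measurements_df
```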

Measurements

Measurements of water quality (concentration of E. coli) are imported from CSV files into the database by the end user. Use the following code to check for which bathing spots measurements are available:

Models

Overview of available models

Use the function api_get_model() to get an overview of the models that are stored in the Postgres database. You need to pass the user id (here: 3) and the id of the bathing spot (here: 18) to the function.
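A sketch of this call, assuming database access and a valid API token. The argument names user_id and spot_id are an assumption, chosen by analogy with the api_get_rain() call shown later in this document.

```r
# Assumption: the arguments are named user_id and spot_id, as in api_get_rain()
models <- fhpredict::api_get_model(user_id = 3, spot_id = 18)
models
```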

The function returns a data frame with one row per available model:

Use this overview of available models to look up the id of the model that you actually want to fetch from the database.

Fetching a model

Use the model id to fetch a specific model from the database:

Saving a model

For testing purposes we store a simple, small object instead of a Stan model. We use the cars dataset that ships with base R. Use the function api_add_model() to add the “model” to the database:

The function returns the id that the database assigned to the model. We stored this id in the variable model_id. To check that the model arrived in the database, we read it back, again using api_get_model():

We convince ourselves that what we get is identical to what we stored:
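The comparison itself can be done with identical(). The sketch below imitates the store-and-read-back cycle with saveRDS() and readRDS() on a temporary file, purely as a stand-in for the database round trip. The variable name model_read_back is hypothetical.

```r
# Stand-in for the database: write the object to a temporary file and read it back
path <- tempfile(fileext = ".rds")
saveRDS(cars, path)
model_read_back <- readRDS(path)

# TRUE only if the object read back matches the original exactly
identical(model_read_back, cars)
```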

Deleting a model

Use the function api_delete_model() to remove a model from the database. Let’s delete the model that we just added. Its id is given in the variable model_id.

Purification Plants

Use the following script to check whether any purification plants are defined. As there are many bathing spots, the script takes quite a long time. It is therefore not run here, so no output is shown.

Rains

This chapter describes how to

  • read binary rain data related to a given time interval from files on the Amazon File Server,
  • spatially select and aggregate rain data that are related to a given bathing spot,
  • store the rain data in the Postgres database.

Define bathing spot and reference time:

There is a top-level function that can be used to perform all steps at once. For the single steps that are performed within this function, see below.

In the following we reduce the range of days for which to load rain data to three days. Omitting date_range or setting it to NULL will instead load rain data for the whole range of dates for which measurements are available.

Check that the data arrived by reloading them from the database:

Rains: Details

Read rain data stored for a bathing spot

(rain <- fhpredict::api_get_rain(user_id = 3, spot_id = 43))

Delete all rain data stored for a bathing spot

system.time(fhpredict::api_delete_rain(user_id = 3, spot_id = 43))

Add some fake rain data

n <- 10000
new_rain <- data.frame(
  datum = seq(as.Date("2019-10-10"), by = 1, length.out = n),
  rain = 0.1 * sample(1:10, size = n, replace = TRUE)
)

fhpredict:::is_valid_postgres_api_token(fhpredict:::read_token())

fhpredict::api_delete_rain(user_id = 3, spot_id = 43)
fhpredict::api_add_rain(user_id = 3, spot_id = 43, rain = new_rain)

Spatially select and aggregate rain data

We want to use only the rain data that lie within a polygon around the bathing spot. This area is a piece of metadata that is stored in the Postgres database for each bathing spot. We first read all metadata about the bathing spot defined above and extract the area information.

In the next step we use the area information to cut the area from the rain data, which so far cover the whole of Germany:

We check the result by plotting the areas. Important note: the data seem to be cut not along the polygon itself but along its “extent”, i.e. the smallest rectangle that contains the polygon!

raster::plot(cropped)
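The difference between cutting to the extent and cutting to the polygon can be reproduced on toy data. The sketch below uses raster::crop() and raster::mask() with a made-up raster and triangle. It does not claim to show how fhpredict crops internally, and the object names are hypothetical.

```r
# Toy raster: 10 x 10 cells of size 1 with values 1 to 100
r <- raster::raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
r <- raster::setValues(r, seq_len(raster::ncell(r)))

# Triangular polygon with the same coordinate reference system as the raster
coords <- cbind(x = c(2, 8, 5, 2), y = c(2, 2, 8, 2))
polygon <- sp::SpatialPolygons(
  list(sp::Polygons(list(sp::Polygon(coords)), ID = "area")),
  proj4string = raster::crs(r)
)

# crop() keeps all cells within the bounding rectangle of the polygon
cropped_toy <- raster::crop(r, polygon)

# mask() additionally sets the cells outside the polygon to NA
masked_toy <- raster::mask(cropped_toy, polygon)

# Number of NA cells before and after masking
sum(is.na(raster::values(cropped_toy)))
sum(is.na(raster::values(masked_toy)))
```

If only the cells inside the polygon should enter the average, masking after cropping is one option to consider.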

We now calculate the average rain in each raster layer and provide a simple data frame:
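As an illustration of this kind of aggregation, the sketch below computes per-layer means with raster::cellStats() on a made-up stack of three layers. The object and column names are hypothetical, and the package may aggregate differently.

```r
# Three toy layers; layer i holds the values 1 to 16 multiplied by i
layers <- lapply(1:3, function(i) {
  r <- raster::raster(nrows = 4, ncols = 4, xmn = 0, xmx = 4, ymn = 0, ymx = 4)
  raster::setValues(r, seq_len(raster::ncell(r)) * i)
})
rain_stack <- raster::stack(layers)

# Mean over all cells, calculated separately for each layer
layer_means <- raster::cellStats(rain_stack, stat = "mean")

# Simple data frame with one row per layer
rain_df <- data.frame(
  layer = seq_along(layer_means),
  rain_mean = unname(layer_means)
)
rain_df
```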

Let’s have a look at the created data frame:

Irradiances and Generic Inputs

# Define user and bathing spot -------------------------------------------------
user_id <- 3
spot_id <- 42

# Define artificial time series ------------------------------------------------
n <- 10
timeseries <- data.frame(
  date = seq(as.Date("2018-07-12"), by = 1, length.out = n), 
  dateTime = "12:13:14",
  value = rnorm(n, 100)
)

# Get overview on generic inputs -----------------------------------------------
fhpredict:::api_get_generic(user_id, spot_id)
fhpredict:::api_get_generic(user_id, spot_id, generic_id = 16)

# Delete existing generic inputs -----------------------------------------------
fhpredict:::api_delete_generic(user_id, spot_id, generic_id = 6)

# Delete all generics!
fhpredict:::api_delete_generic(user_id, spot_id)

# Create a new generic input ---------------------------------------------------
fhpredict:::api_add_generic(user_id, spot_id, name = "generic_1")
fhpredict:::api_add_generic(user_id, spot_id, name = "generic_2")
fhpredict:::api_add_generic(user_id, spot_id, name = "generic_3")

# Add measurements to a specific generic input ---------------------------------
fhpredict:::api_add_generic_measurements(user_id, spot_id, 16, data = timeseries)

# Read back the generic input. Where are the measurements?
fhpredict:::postgres_get(
  fhpredict:::path_generic_measurements(user_id, spot_id, 16)
)

fhpredict:::api_get_generic(user_id, spot_id, generic_id = 16)

# Get irradiance measurements --------------------------------------------------
fhpredict:::api_get_irradiances(user_id, spot_id)

# Delete all irradiance measurements -------------------------------------------
fhpredict:::api_delete_irradiances(user_id, spot_id)

# Add irradiance measurements --------------------------------------------------
fhpredict:::api_add_irradiances(user_id, spot_id, data = timeseries)

Get an overview of available data

The following function reads all time series that are stored for a bathing spot and returns the range of dates covered as well as the number of data points.

user_id <- 9
spots <- fhpredict::api_get_bathingspot(user_id)
spot_ids <- setNames(spots$id, spots$name)
summaries <- lapply(spot_ids, fhpredict::get_data_summary, user_id = user_id)