Skip to contents

Background

One measurement chain consists of six sensors that are placed in one production well, but at different filter depths. In total, three measurement chains will be installed in three different production wells.

Table 1 below lists all individual sensors. The number of the sensor within the well is given in column sensor_endnummer. The identifier of the production well that the sensor is located in is given in column brunnen_nummer.

Each sensor (type: WLF05, TODO: add-link-to-factsheet-on-kwb-cloud) measures two parameters with the following characteristics:

  • electrical conductivity:
    • measurement range: 0 - 20 mS
    • measurement range detection: automatic
    • typical accuracy: +/- 1.5 % of measurement value
  • temperature:
    • measurement range: 0 - 50 Celsius
    • measurement resolution: 0.1 Celsius
    • typical accuracy: <= 0.1 Celsius

The maximum temporal resolution that a sensor can provide is “every second”. Within the GeoSalz project, measurements are taken every 5 minutes. This should be sufficient to detect potential salinity shifts that are caused by changes in the pumping regime.

Data Management

Define Paths

# Define paths to directories, using <placeholder> replacements
paths <- kwb.utils::resolve(list(
  
  # Local temporary directory
  local_dir = fs::path_abs(tempdir()),
  
  # Target directory for downloaded measurement chain files (.csv)
  download_dir = "<local_dir>/download", 
 
  # Local directory for aggregated data (.csv and .zip)
  export_dir = "<local_dir>/export", 
 
  # KWB cloud directory to which data in "export_dir" is uploaded
  upload_dir = "projects/GeoSalz/Monitoring/messketten"
))

# Print all paths
paths
#> $local_dir
#> C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU
#> 
#> $download_dir
#> [1] "C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/download"
#> 
#> $export_dir
#> [1] "C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export"
#> 
#> $upload_dir
#> [1] "projects/GeoSalz/Monitoring/messketten"

Define SFTP Login

Run usethis::edit_r_environ() to edit the .Renviron file. The file defines environment variables that are to be made accessible during an R session. Add the following three rows, defining three more environment variables, to the file. The environment variables are required to log in to the SFTP server from which input data are downloaded.

MESSKETTEN_SERVER=<sftp_url>
MESSKETTEN_USER=<sftp_username>
MESSKETTEN_PASSWORD=<sftp_userpassword>

Replace the placeholders <sftp_url>, <sftp_username>, <sftp_userpassword> with the URL, the user name, and the password, respectively, that are required to get access to the SFTP server.

Save the .Renviron file and restart the R session (e.g. with “Session/Restart R” from the menu in RStudio) to make the environment variables available in R.

In case that the SFTP login credentials are correct, the code below should work. It downloads the measurement chains data (i.e. parameters electrical conductivity and temperature) from the SFTP server to a user-defined directory on your local device.

Data Download

# Metadata of measurement chains (see also Table 1 above)
metadata <- kwb.geosalz::get_measurementchains_metadata()
str(metadata)
#> spc_tbl_ [18 × 10] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
#>  $ galerie               : chr [1:18] "K" "K" "K" "K" ...
#>  $ brunnen_nummer        : int [1:18] 9 9 9 9 9 9 10 10 10 10 ...
#>  $ dn                    : int [1:18] 400 400 400 400 400 400 600 600 600 600 ...
#>  $ einbau_pumpe          : chr [1:18] "über Filter" "über Filter" "über Filter" "über Filter" ...
#>  $ einbau_messkette      : chr [1:18] "unter Pumpe" "unter Pumpe" "unter Pumpe" "unter Pumpe" ...
#>  $ filteroberkante_muGOK : num [1:18] 21 21 21 21 21 21 11 11 11 11 ...
#>  $ filterunterkante_muGOK: num [1:18] 27 27 27 27 27 27 18 18 18 18 ...
#>  $ sensor_id             : int [1:18] 4182018 4182017 4182016 4182015 4182014 4182013 4182006 4182005 4182004 4182003 ...
#>  $ sensor_endnummer      : int [1:18] 8 7 6 5 4 3 6 5 4 3 ...
#>  $ einbau_sensor_muGOK   : num [1:18] 21.5 22.5 23.5 24.5 25.5 26.5 11.5 12.7 13.9 15.1 ...
#>  - attr(*, "spec")=
#>   .. cols(
#>   ..   galerie = col_character(),
#>   ..   brunnen_nummer = col_integer(),
#>   ..   dn = col_integer(),
#>   ..   einbau_pumpe = col_character(),
#>   ..   einbau_messkette = col_character(),
#>   ..   filteroberkante_muGOK = col_double(),
#>   ..   filterunterkante_muGOK = col_double(),
#>   ..   sensor_id = col_integer(),
#>   ..   sensor_endnummer = col_integer(),
#>   ..   einbau_sensor_muGOK = col_double()
#>   .. )
#>  - attr(*, "problems")=<externalptr>

# Information on available measurement chain files on SFTP server
mc_files <- kwb.geosalz::get_measurementchains_files()
#> Warning: Expected 2 pieces. Additional pieces discarded in 480 rows [1754, 1755, 1756,
#> 1757, 1758, 1759, 1760, 1761, 1762, 1763, 1764, 1765, 1766, 1767, 1768, 1769,
#> 1770, 1771, 1772, 1773, ...].
#> Warning: There was 1 warning in `dplyr::mutate()`.
#>  In argument: `sensor_id = as.integer(.data$sensor_id)`.
#> Caused by warning:
#> ! NAs introduced by coercion

str(mc_files)
#> tibble [3,283 × 15] (S3: tbl_df/tbl/data.frame)
#>  $ rights          : chr [1:3283] "-rw-r--r--" "-rw-r--r--" "-rw-r--r--" "-rw-r--r--" ...
#>  $ links           : chr [1:3283] "1" "1" "1" "1" ...
#>  $ ownername       : chr [1:3283] "1011" "1011" "1011" "1011" ...
#>  $ ownergroup      : chr [1:3283] "1002" "1002" "1002" "1002" ...
#>  $ filesize        : chr [1:3283] "4559" "4559" "4559" "4559" ...
#>  $ t1              : chr [1:3283] "Oct" "Oct" "Oct" "Oct" ...
#>  $ t2              : chr [1:3283] "24" "24" "24" "25" ...
#>  $ t3              : chr [1:3283] "09:05" "09:05" "09:05" "06:15" ...
#>  $ sftp_path       : chr [1:3283] "K09/4182013-2022-10-22-0606.csv" "K09/4182013-2022-10-23-0606.csv" "K09/4182013-2022-10-24-0606.csv" "K09/4182013-2022-10-25-0606.csv" ...
#>  $ sensor_id       : int [1:3283] 4182013 4182013 4182013 4182013 4182013 4182013 4182013 4182013 4182013 4182013 ...
#>  $ datum_uhrzeit   : POSIXct[1:3283], format: "2022-10-22 06:06:00" "2022-10-23 06:06:00" ...
#>  $ type            : chr [1:3283] "file" "file" "file" "file" ...
#>  $ galerie         : chr [1:3283] "K" "K" "K" "K" ...
#>  $ brunnen_nummer  : int [1:3283] 9 9 9 9 9 9 9 9 9 9 ...
#>  $ sensor_endnummer: int [1:3283] 3 3 3 3 3 3 3 3 3 3 ...
head(mc_files)
#> # A tibble: 6 × 15
#>   rights     links ownername ownergroup filesize t1    t2    t3    sftp_path    
#>   <chr>      <chr> <chr>     <chr>      <chr>    <chr> <chr> <chr> <chr>        
#> 1 -rw-r--r-- 1     1011      1002       4559     Oct   24    09:05 K09/4182013-…
#> 2 -rw-r--r-- 1     1011      1002       4559     Oct   24    09:05 K09/4182013-…
#> 3 -rw-r--r-- 1     1011      1002       4559     Oct   24    09:05 K09/4182013-…
#> 4 -rw-r--r-- 1     1011      1002       4559     Oct   25    06:15 K09/4182013-…
#> 5 -rw-r--r-- 1     1011      1002       4559     Oct   26    06:15 K09/4182013-…
#> 6 -rw-r--r-- 1     1011      1002       4559     Oct   27    06:15 K09/4182013-…
#> # ℹ 6 more variables: sensor_id <int>, datum_uhrzeit <dttm>, type <chr>,
#> #   galerie <chr>, brunnen_nummer <int>, sensor_endnummer <int>

# Download measurement chain files (.csv) from SFTP server
csv_files <- kwb.geosalz::download_measurementchains_data(
  sftp_paths = mc_files$sftp_path,
  target_directory = paths$download_dir,
  debug = TRUE
)
#> Download 3283 measurement chains files (using 2 CPU cores) ... ok. (1.20 hours)
#> Failed downloading data from the following FTP path(s):
#> K10/4182005_2022-12-13-0541.csv
#> K10/4182005_2022-12-17-0542.csv
#> K10/4182005_2022-12-19-0541.csv
#> K10/4182005_2022-12-20-0541.csv
#> K10/4182005_2022-12-22-0541.csv
#> K10/4182005_2022-12-29-0541.csv
#> K10/4182005_2023-01-07-0541.csv
#> K10/4182005_2023-01-11-0541.csv
#> K10/4182005_2023-01-12-0541.csv
#> K10/4182005_2023-01-17-0541.csv
#> K10/4182005_2023-01-19-0542.csv
#> K10/4182005_2023-01-22-0541.csv
#> K10/4182006_2022-09-27-1302.csv
#> K10/4182006_2022-09-29-0532.csv
#> K10/4182006_2022-09-30-0532.csv
#> K10/4182006_2022-10-01-0631.csv
#> K13/4182012-2023-02-28-0531.csv
#> K13/4182012-2023-03-07-0531.csv
#> K13/4182012-2023-03-09-0531.csv
#> K13/4182012-2023-03-11-0532.csv
#> K13/4182012-2023-03-12-0531.csv
#> K13/4182012-2023-03-13-0531.csv
#> K13/4182012-2023-03-14-0531.csv
#> K13/4182012-2023-03-22-0532.csv
#> K13/4182012-2023-03-23-0531.csv
#> K13/4182012-2023-03-24-0532.csv
#> K13/4182012-2023-03-27-0631.csv
#> K13/4182012-2023-03-29-0632.csv
#> K13/4182012-2023-04-03-0632.csv
#> K13/4182012-2023-04-10-0531.csv
#> K13/4182012-2023-04-12-0531.csv
#> K13/4182012_2022-10-20-0631.csv
#> K13/4182012_2022-10-21-0631.csv

# Print the paths to the downloaded files
csv_files
#> # A tibble: 3,250 × 3
#>    file_id sftp_path                       local_path                           
#>      <int> <chr>                           <chr>                                
#>  1       1 K09/4182013-2022-10-22-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#>  2       2 K09/4182013-2022-10-23-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#>  3       3 K09/4182013-2022-10-24-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#>  4       4 K09/4182013-2022-10-25-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#>  5       5 K09/4182013-2022-10-26-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#>  6       6 K09/4182013-2022-10-27-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#>  7       7 K09/4182013-2022-10-28-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#>  8       8 K09/4182013-2022-10-29-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#>  9       9 K09/4182013-2022-10-30-0506.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#> 10      10 K09/4182013-2022-10-31-0506.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#> # ℹ 3,240 more rows

Data Import

The following code imports the downloaded measurement chains files (.csv) into R:

# Stop in case that no csv file is available
stopifnot(nrow(csv_files) > 0L) 

# Import csv files using multiple CPU cores
mc_data <- kwb.geosalz::read_measurementchains_data(
  csv_files,
  run_parallel = TRUE,
  debug = TRUE
)
#> Importing 3250 measurement chains files (using 2 CPU cores) ... ok. (40.89 secs) 
#> Filtering out 'lab' measurements before '2022-09-27 11:00:00' (installation in K10) ... ok. (0.01 secs)

The following datasets were imported into R:

mc_data_stats <- mc_data %>%
  kwb.geosalz::get_measurmentchains_data_stats() %>% 
  dplyr::arrange(
    .data$parameter, 
    dplyr::desc(.data$sensor_id)
  )

These cover the time period from 2022-09-27 11:00:00 to 2023-04-14 23:00:00 with a total of 108820 samples.

Data Export

debug <- TRUE

# Export "mc_data" to csv file
data_csv_path <- kwb.geosalz::write_measurementchains_data(
  mc_data,
  target_directory = paths$export_dir,
  to_zip = FALSE,
  debug = debug
)
#> Exporting provided dataset 'mc_data' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_20220927-110000TZ+01_20230414-230000TZ+01.csv' ... ok. (1.14 secs)

size_data_csv <- fs::file_size(data_csv_path)

# Export "mc_data" to zip file (~10x less disk space for test dataset)
data_zip_path <- kwb.geosalz::write_measurementchains_data(
  mc_data,
  target_directory = paths$export_dir,
  to_zip = TRUE,
  debug = debug
)
#> Exporting provided dataset 'mc_data' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_20220927-110000TZ+01_20230414-230000TZ+01.csv' ... ok. (1.12 secs) 
#> Exporting provided dataset 'mc_data' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data.zip' and delete intermediate 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_20220927-110000TZ+01_20230414-230000TZ+01.csv' ... ok. (0.11 secs)

size_data_zip <- fs::file_size(data_zip_path)
size_data_csv / size_data_zip
#> 17.1

extract_data_timeperiod <- function(data_path) {
  basename(data_path) %>% 
    kwb.utils::replaceFileExtension("") %>%  
    stringr::str_remove("^mc[-|_]data")
}

# Define helper function to write a csv file in a target directory
write_csv <- function(df, postfix = "", target_dir) {
  fs::dir_create(target_dir)
  name <- deparse(substitute(df))
  file <- file.path(target_dir, paste0(name, postfix,  ".csv"))
  readr::write_csv(df, file)
  file
}

postfix <- extract_data_timeperiod(data_zip_path)
target_dir <- paths$export_dir

# Export data and metadata to csv files
stats_path <- write_csv(mc_data_stats, postfix, target_dir)
metadata_path <- write_csv(metadata, postfix, target_dir)
files_path <- write_csv(mc_files, postfix, target_dir)

# Define function that plots data to a pdf file
plot_to_pdf <- function(mc_data, para, target_dir = ".", debug = TRUE) {
  
  plots <- kwb.geosalz::plot_measurementchains(mc_data, para)
  
  sapply(names(plots), function(name) {
    path <-  file.path(target_dir, sprintf("mc_data_%s.pdf", name))
    kwb.utils::catAndRun(
      sprintf("Writting '%s' to '%s'", name, path),
      expr = {
        kwb.utils::preparePdf(path, width.cm = 25, height.cm = 15)
        print(plots[[name]])
        dev.off()
        path 
      }, 
      dbg = debug
    )
  })
}

pdf_files <- c(
  plot_to_pdf(mc_data, "Leitfaehigkeit", paths$export_dir), 
  plot_to_pdf(mc_data, "Temperatur", paths$export_dir)
)
#> Writting 'Leitfaehigkeit_K10' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K10.pdf' ...
#> ok. (1.19 secs) 
#> Writting 'Leitfaehigkeit_K13' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K13.pdf' ...
#> ok. (0.72 secs) 
#> Writting 'Leitfaehigkeit_K09' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K09.pdf' ...
#> ok. (0.75 secs) 
#> Writting 'Temperatur_K10' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K10.pdf' ...
#> ok. (1.19 secs) 
#> Writting 'Temperatur_K13' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K13.pdf' ...
#> ok. (0.66 secs) 
#> Writting 'Temperatur_K09' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K09.pdf' ...
#> ok. (0.75 secs)

data_zip_path
#> [1] "C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data.zip"

The file size of the mc_data_20220927-110000TZ+01_20230414-230000TZ+01.csv file is 17.1 times larger compared to compressing it into mc_data.zip.

Data Upload

In order to automate the data management process the exported data these data will be uploaded to a restricted shared folder on the KWB cloud. For doing so the following environment variables need to be defined in case the code below should be run from a client computer. However, as these variables are defined in a GitHub actions workflow there is no need to do this locally.

NEXTCLOUD_URL = "https://<replace-with-kwb-cloud-url>"
NEXTCLOUD_USER = "<your-kwb-cloud-username>" # your username
NEXTCLOUD_PASSWORD = "your-nextcloud-app-password" ### see details below

Subsequently the following code is run for uploading the data:


# Upload all files in "paths$export_dir"
#upload_files <- list.files(paths$export_dir, full.names = TRUE)


# List paths of files to upload explicitly
upload_files <- c(
  stats_path, 
  pdf_files,
  metadata_path, 
  files_path,
  data_zip_path
)

for (file in upload_files) {
  
  if (!file.exists(file)) {
    message("File does not exist: ", file)
    next
  }
  
  kwb.utils::catAndRun(
    messageText = paste("Uploading file", file),
    expr = try(kwb.nextcloud::upload_file(
      file = file, 
      target_path = paths$upload_dir
    )),
    dbg = TRUE
  )
}
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_stats.csv ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#>   C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_stats.csv
#> to
#>   projects/GeoSalz/Monitoring/messketten ... ok. (1.10 secs) 
#> ok. (2.43 secs) 
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K10.pdf ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#>   C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K10.pdf
#> to
#>   projects/GeoSalz/Monitoring/messketten ... ok. (2.04 secs) 
#> ok. (2.42 secs) 
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K13.pdf ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#>   C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K13.pdf
#> to
#>   projects/GeoSalz/Monitoring/messketten ... ok. (1.18 secs) 
#> ok. (1.55 secs) 
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K09.pdf ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#>   C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K09.pdf
#> to
#>   projects/GeoSalz/Monitoring/messketten ... ok. (1.22 secs) 
#> ok. (1.60 secs) 
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K10.pdf ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#>   C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K10.pdf
#> to
#>   projects/GeoSalz/Monitoring/messketten ... ok. (1.60 secs) 
#> ok. (1.97 secs) 
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K13.pdf ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#>   C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K13.pdf
#> to
#>   projects/GeoSalz/Monitoring/messketten ... ok. (1.39 secs) 
#> ok. (1.77 secs) 
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K09.pdf ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#>   C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K09.pdf
#> to
#>   projects/GeoSalz/Monitoring/messketten ... ok. (1.22 secs) 
#> ok. (1.60 secs) 
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/metadata.csv ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#>   C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/metadata.csv
#> to
#>   projects/GeoSalz/Monitoring/messketten ... ok. (1.17 secs) 
#> ok. (1.54 secs) 
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_files.csv ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#>   C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_files.csv
#> to
#>   projects/GeoSalz/Monitoring/messketten ... ok. (1.39 secs) 
#> ok. (1.77 secs) 
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data.zip ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#>   C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data.zip
#> to
#>   projects/GeoSalz/Monitoring/messketten ... ok. (1.65 secs) 
#> ok. (2.05 secs)