Background
One measurement chain consists of six sensors that are placed in one production well, but at different filter depths. In total, three measurement chains will be installed in three different production wells.
Table 1 below lists all individual sensors. The number of the sensor
within the well is given in column sensor_endnummer
. The
identifier of the production well that the sensor is located in is given
in column brunnen_nummer
.
Each sensor (type: WLF05, TODO: add-link-to-factsheet-on-kwb-cloud) measures two parameters with the following characteristics:
- electrical conductivity:
- measurement range: 0 - 20 mS
- measurement range detection: automatic
- typical accuracy: +/- 1.5 % of measurement value
- temperature:
- measurement range: 0 - 50 Celsius
- measurement resolution: 0.1 Celsius
- typical accuracy: <= 0.1 Celsius
The maximum temporal resolution that a sensor can provide is “every second”. Within the GeoSalz project, measurements are taken every 5 minutes. This should be sufficient to detect potential salinity shifts that are caused by changes in the pumping regime.
Data Management
Define Paths
# Define paths to directories, using <placeholder> replacements
paths <- kwb.utils::resolve(list(
# Local temporary directory
local_dir = fs::path_abs(tempdir()),
# Target directory for downloaded measurement chain files (.csv)
download_dir = "<local_dir>/download",
# Local directory for aggregated data (.csv and .zip)
export_dir = "<local_dir>/export",
# KWB cloud directory to which data in "export_dir" is uploaded
upload_dir = "projects/GeoSalz/Monitoring/messketten"
))
# Print all paths
paths
#> $local_dir
#> C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU
#>
#> $download_dir
#> [1] "C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/download"
#>
#> $export_dir
#> [1] "C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export"
#>
#> $upload_dir
#> [1] "projects/GeoSalz/Monitoring/messketten"
Define SFTP Login
Run usethis::edit_r_environ()
to edit the
.Renviron
file. The file defines environment variables that
are to be made accessible during an R session. Add the following three
rows, defining three more environment variables, to the file. The
environment variables are required to log in to the SFTP server from
which input data are downloaded.
MESSKETTEN_SERVER=<sftp_url>
MESSKETTEN_USER=<sftp_username>
MESSKETTEN_PASSWORD=<sftp_userpassword>
Replace the placeholders <sftp_url>
,
<sftp_username>
,
<sftp_userpassword>
with the URL, the user name, and
the password, respectively, that are required to get access to the SFTP
server.
Save the .Renviron
file and restart the R session
(e.g. with “Session/Restart R” from the menu in RStudio) to make the
environment variables available in R.
In case that the SFTP login credentials are correct, the code below should work. It downloads the measurement chains data (i.e. parameters electrical conductivity and temperature) from the SFTP server to a user-defined directory on your local device.
Data Download
# Metadata of measurement chains (see also Table 1 above)
metadata <- kwb.geosalz::get_measurementchains_metadata()
str(metadata)
#> spc_tbl_ [18 × 10] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
#> $ galerie : chr [1:18] "K" "K" "K" "K" ...
#> $ brunnen_nummer : int [1:18] 9 9 9 9 9 9 10 10 10 10 ...
#> $ dn : int [1:18] 400 400 400 400 400 400 600 600 600 600 ...
#> $ einbau_pumpe : chr [1:18] "über Filter" "über Filter" "über Filter" "über Filter" ...
#> $ einbau_messkette : chr [1:18] "unter Pumpe" "unter Pumpe" "unter Pumpe" "unter Pumpe" ...
#> $ filteroberkante_muGOK : num [1:18] 21 21 21 21 21 21 11 11 11 11 ...
#> $ filterunterkante_muGOK: num [1:18] 27 27 27 27 27 27 18 18 18 18 ...
#> $ sensor_id : int [1:18] 4182018 4182017 4182016 4182015 4182014 4182013 4182006 4182005 4182004 4182003 ...
#> $ sensor_endnummer : int [1:18] 8 7 6 5 4 3 6 5 4 3 ...
#> $ einbau_sensor_muGOK : num [1:18] 21.5 22.5 23.5 24.5 25.5 26.5 11.5 12.7 13.9 15.1 ...
#> - attr(*, "spec")=
#> .. cols(
#> .. galerie = col_character(),
#> .. brunnen_nummer = col_integer(),
#> .. dn = col_integer(),
#> .. einbau_pumpe = col_character(),
#> .. einbau_messkette = col_character(),
#> .. filteroberkante_muGOK = col_double(),
#> .. filterunterkante_muGOK = col_double(),
#> .. sensor_id = col_integer(),
#> .. sensor_endnummer = col_integer(),
#> .. einbau_sensor_muGOK = col_double()
#> .. )
#> - attr(*, "problems")=<externalptr>
# Information on available measurement chain files on SFTP server
mc_files <- kwb.geosalz::get_measurementchains_files()
#> Warning: Expected 2 pieces. Additional pieces discarded in 480 rows [1754, 1755, 1756,
#> 1757, 1758, 1759, 1760, 1761, 1762, 1763, 1764, 1765, 1766, 1767, 1768, 1769,
#> 1770, 1771, 1772, 1773, ...].
#> Warning: There was 1 warning in `dplyr::mutate()`.
#> ℹ In argument: `sensor_id = as.integer(.data$sensor_id)`.
#> Caused by warning:
#> ! NAs introduced by coercion
str(mc_files)
#> tibble [3,283 × 15] (S3: tbl_df/tbl/data.frame)
#> $ rights : chr [1:3283] "-rw-r--r--" "-rw-r--r--" "-rw-r--r--" "-rw-r--r--" ...
#> $ links : chr [1:3283] "1" "1" "1" "1" ...
#> $ ownername : chr [1:3283] "1011" "1011" "1011" "1011" ...
#> $ ownergroup : chr [1:3283] "1002" "1002" "1002" "1002" ...
#> $ filesize : chr [1:3283] "4559" "4559" "4559" "4559" ...
#> $ t1 : chr [1:3283] "Oct" "Oct" "Oct" "Oct" ...
#> $ t2 : chr [1:3283] "24" "24" "24" "25" ...
#> $ t3 : chr [1:3283] "09:05" "09:05" "09:05" "06:15" ...
#> $ sftp_path : chr [1:3283] "K09/4182013-2022-10-22-0606.csv" "K09/4182013-2022-10-23-0606.csv" "K09/4182013-2022-10-24-0606.csv" "K09/4182013-2022-10-25-0606.csv" ...
#> $ sensor_id : int [1:3283] 4182013 4182013 4182013 4182013 4182013 4182013 4182013 4182013 4182013 4182013 ...
#> $ datum_uhrzeit : POSIXct[1:3283], format: "2022-10-22 06:06:00" "2022-10-23 06:06:00" ...
#> $ type : chr [1:3283] "file" "file" "file" "file" ...
#> $ galerie : chr [1:3283] "K" "K" "K" "K" ...
#> $ brunnen_nummer : int [1:3283] 9 9 9 9 9 9 9 9 9 9 ...
#> $ sensor_endnummer: int [1:3283] 3 3 3 3 3 3 3 3 3 3 ...
head(mc_files)
#> # A tibble: 6 × 15
#> rights links ownername ownergroup filesize t1 t2 t3 sftp_path
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 -rw-r--r-- 1 1011 1002 4559 Oct 24 09:05 K09/4182013-…
#> 2 -rw-r--r-- 1 1011 1002 4559 Oct 24 09:05 K09/4182013-…
#> 3 -rw-r--r-- 1 1011 1002 4559 Oct 24 09:05 K09/4182013-…
#> 4 -rw-r--r-- 1 1011 1002 4559 Oct 25 06:15 K09/4182013-…
#> 5 -rw-r--r-- 1 1011 1002 4559 Oct 26 06:15 K09/4182013-…
#> 6 -rw-r--r-- 1 1011 1002 4559 Oct 27 06:15 K09/4182013-…
#> # ℹ 6 more variables: sensor_id <int>, datum_uhrzeit <dttm>, type <chr>,
#> # galerie <chr>, brunnen_nummer <int>, sensor_endnummer <int>
# Download measurement chain files (.csv) from SFTP server
csv_files <- kwb.geosalz::download_measurementchains_data(
sftp_paths = mc_files$sftp_path,
target_directory = paths$download_dir,
debug = TRUE
)
#> Download 3283 measurement chains files (using 2 CPU cores) ... ok. (1.20 hours)
#> Failed downloading data from the following FTP path(s):
#> K10/4182005_2022-12-13-0541.csv
#> K10/4182005_2022-12-17-0542.csv
#> K10/4182005_2022-12-19-0541.csv
#> K10/4182005_2022-12-20-0541.csv
#> K10/4182005_2022-12-22-0541.csv
#> K10/4182005_2022-12-29-0541.csv
#> K10/4182005_2023-01-07-0541.csv
#> K10/4182005_2023-01-11-0541.csv
#> K10/4182005_2023-01-12-0541.csv
#> K10/4182005_2023-01-17-0541.csv
#> K10/4182005_2023-01-19-0542.csv
#> K10/4182005_2023-01-22-0541.csv
#> K10/4182006_2022-09-27-1302.csv
#> K10/4182006_2022-09-29-0532.csv
#> K10/4182006_2022-09-30-0532.csv
#> K10/4182006_2022-10-01-0631.csv
#> K13/4182012-2023-02-28-0531.csv
#> K13/4182012-2023-03-07-0531.csv
#> K13/4182012-2023-03-09-0531.csv
#> K13/4182012-2023-03-11-0532.csv
#> K13/4182012-2023-03-12-0531.csv
#> K13/4182012-2023-03-13-0531.csv
#> K13/4182012-2023-03-14-0531.csv
#> K13/4182012-2023-03-22-0532.csv
#> K13/4182012-2023-03-23-0531.csv
#> K13/4182012-2023-03-24-0532.csv
#> K13/4182012-2023-03-27-0631.csv
#> K13/4182012-2023-03-29-0632.csv
#> K13/4182012-2023-04-03-0632.csv
#> K13/4182012-2023-04-10-0531.csv
#> K13/4182012-2023-04-12-0531.csv
#> K13/4182012_2022-10-20-0631.csv
#> K13/4182012_2022-10-21-0631.csv
# Print the paths to the downloaded files
csv_files
#> # A tibble: 3,250 × 3
#> file_id sftp_path local_path
#> <int> <chr> <chr>
#> 1 1 K09/4182013-2022-10-22-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#> 2 2 K09/4182013-2022-10-23-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#> 3 3 K09/4182013-2022-10-24-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#> 4 4 K09/4182013-2022-10-25-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#> 5 5 K09/4182013-2022-10-26-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#> 6 6 K09/4182013-2022-10-27-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#> 7 7 K09/4182013-2022-10-28-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#> 8 8 K09/4182013-2022-10-29-0606.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#> 9 9 K09/4182013-2022-10-30-0506.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#> 10 10 K09/4182013-2022-10-31-0506.csv C:/Users/RUNNER~1/AppData/Local/Temp…
#> # ℹ 3,240 more rows
Data Import
The following code imports the downloaded measurement chains files (.csv) into R:
# Stop in case that no csv file is available
stopifnot(nrow(csv_files) > 0L)
# Import csv files using multiple CPU cores
mc_data <- kwb.geosalz::read_measurementchains_data(
csv_files,
run_parallel = TRUE,
debug = TRUE
)
#> Importing 3250 measurement chains files (using 2 CPU cores) ... ok. (40.89 secs)
#> Filtering out 'lab' measurements before '2022-09-27 11:00:00' (installation in K10) ... ok. (0.01 secs)
The following datasets were imported into R:
mc_data_stats <- mc_data %>%
kwb.geosalz::get_measurmentchains_data_stats() %>%
dplyr::arrange(
.data$parameter,
dplyr::desc(.data$sensor_id)
)
These cover the time period from 2022-09-27 11:00:00 to 2023-04-14 23:00:00 with a total of 108820 samples.
Data Export
debug <- TRUE
# Export "mc_data" to csv file
data_csv_path <- kwb.geosalz::write_measurementchains_data(
mc_data,
target_directory = paths$export_dir,
to_zip = FALSE,
debug = debug
)
#> Exporting provided dataset 'mc_data' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_20220927-110000TZ+01_20230414-230000TZ+01.csv' ... ok. (1.14 secs)
size_data_csv <- fs::file_size(data_csv_path)
# Export "mc_data" to zip file (~10x less disk space for test dataset)
data_zip_path <- kwb.geosalz::write_measurementchains_data(
mc_data,
target_directory = paths$export_dir,
to_zip = TRUE,
debug = debug
)
#> Exporting provided dataset 'mc_data' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_20220927-110000TZ+01_20230414-230000TZ+01.csv' ... ok. (1.12 secs)
#> Exporting provided dataset 'mc_data' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data.zip' and delete intermediate 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_20220927-110000TZ+01_20230414-230000TZ+01.csv' ... ok. (0.11 secs)
size_data_zip <- fs::file_size(data_zip_path)
size_data_csv / size_data_zip
#> 17.1
extract_data_timeperiod <- function(data_path) {
basename(data_path) %>%
kwb.utils::replaceFileExtension("") %>%
stringr::str_remove("^mc[-|_]data")
}
# Define helper function to write a csv file in a target directory
write_csv <- function(df, postfix = "", target_dir) {
fs::dir_create(target_dir)
name <- deparse(substitute(df))
file <- file.path(target_dir, paste0(name, postfix, ".csv"))
readr::write_csv(df, file)
file
}
postfix <- extract_data_timeperiod(data_zip_path)
target_dir <- paths$export_dir
# Export data and metadata to csv files
stats_path <- write_csv(mc_data_stats, postfix, target_dir)
metadata_path <- write_csv(metadata, postfix, target_dir)
files_path <- write_csv(mc_files, postfix, target_dir)
# Define function that plots data to a pdf file
plot_to_pdf <- function(mc_data, para, target_dir = ".", debug = TRUE) {
plots <- kwb.geosalz::plot_measurementchains(mc_data, para)
sapply(names(plots), function(name) {
path <- file.path(target_dir, sprintf("mc_data_%s.pdf", name))
kwb.utils::catAndRun(
sprintf("Writting '%s' to '%s'", name, path),
expr = {
kwb.utils::preparePdf(path, width.cm = 25, height.cm = 15)
print(plots[[name]])
dev.off()
path
},
dbg = debug
)
})
}
pdf_files <- c(
plot_to_pdf(mc_data, "Leitfaehigkeit", paths$export_dir),
plot_to_pdf(mc_data, "Temperatur", paths$export_dir)
)
#> Writting 'Leitfaehigkeit_K10' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K10.pdf' ...
#> ok. (1.19 secs)
#> Writting 'Leitfaehigkeit_K13' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K13.pdf' ...
#> ok. (0.72 secs)
#> Writting 'Leitfaehigkeit_K09' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K09.pdf' ...
#> ok. (0.75 secs)
#> Writting 'Temperatur_K10' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K10.pdf' ...
#> ok. (1.19 secs)
#> Writting 'Temperatur_K13' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K13.pdf' ...
#> ok. (0.66 secs)
#> Writting 'Temperatur_K09' to 'C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K09.pdf' ...
#> ok. (0.75 secs)
data_zip_path
#> [1] "C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data.zip"
The file size of the mc_data_20220927-110000TZ+01_20230414-230000TZ+01.csv file is 17.1 times larger compared to compressing it into mc_data.zip.
Data Upload
In order to automate the data management process the exported data these data will be uploaded to a restricted shared folder on the KWB cloud. For doing so the following environment variables need to be defined in case the code below should be run from a client computer. However, as these variables are defined in a GitHub actions workflow there is no need to do this locally.
NEXTCLOUD_URL = "https://<replace-with-kwb-cloud-url>"
NEXTCLOUD_USER = "<your-kwb-cloud-username>" # your username
NEXTCLOUD_PASSWORD = "your-nextcloud-app-password" ### see details below
Subsequently the following code is run for uploading the data:
# Upload all files in "paths$export_dir"
#upload_files <- list.files(paths$export_dir, full.names = TRUE)
# List paths of files to upload explicitly
upload_files <- c(
stats_path,
pdf_files,
metadata_path,
files_path,
data_zip_path
)
for (file in upload_files) {
if (!file.exists(file)) {
message("File does not exist: ", file)
next
}
kwb.utils::catAndRun(
messageText = paste("Uploading file", file),
expr = try(kwb.nextcloud::upload_file(
file = file,
target_path = paths$upload_dir
)),
dbg = TRUE
)
}
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_stats.csv ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#> C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_stats.csv
#> to
#> projects/GeoSalz/Monitoring/messketten ... ok. (1.10 secs)
#> ok. (2.43 secs)
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K10.pdf ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#> C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K10.pdf
#> to
#> projects/GeoSalz/Monitoring/messketten ... ok. (2.04 secs)
#> ok. (2.42 secs)
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K13.pdf ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#> C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K13.pdf
#> to
#> projects/GeoSalz/Monitoring/messketten ... ok. (1.18 secs)
#> ok. (1.55 secs)
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K09.pdf ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#> C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Leitfaehigkeit_K09.pdf
#> to
#> projects/GeoSalz/Monitoring/messketten ... ok. (1.22 secs)
#> ok. (1.60 secs)
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K10.pdf ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#> C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K10.pdf
#> to
#> projects/GeoSalz/Monitoring/messketten ... ok. (1.60 secs)
#> ok. (1.97 secs)
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K13.pdf ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#> C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K13.pdf
#> to
#> projects/GeoSalz/Monitoring/messketten ... ok. (1.39 secs)
#> ok. (1.77 secs)
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K09.pdf ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#> C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data_Temperatur_K09.pdf
#> to
#> projects/GeoSalz/Monitoring/messketten ... ok. (1.22 secs)
#> ok. (1.60 secs)
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/metadata.csv ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#> C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/metadata.csv
#> to
#> projects/GeoSalz/Monitoring/messketten ... ok. (1.17 secs)
#> ok. (1.54 secs)
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_files.csv ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#> C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_files.csv
#> to
#> projects/GeoSalz/Monitoring/messketten ... ok. (1.39 secs)
#> ok. (1.77 secs)
#> Uploading file C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data.zip ...
#> Listing projects/GeoSalz/Monitoring/messketten
#> Uploading
#> C:/Users/RUNNER~1/AppData/Local/Temp/Rtmpuw0IaU/export/mc_data.zip
#> to
#> projects/GeoSalz/Monitoring/messketten ... ok. (1.65 secs)
#> ok. (2.05 secs)