Overview

kwb.tenders automates checking German public procurement portals (“Vergabeportale”) for tenders relevant to KWB research topics. The first supported portal is Vergabemarktplatz Brandenburg (VMP-BB).

Pipeline: open a browser → scrape all published tenders → score them for relevance against the keyword lists of the KWB research groups → write an Excel and Markdown report that flags what is new since the previous run. The browser is driven directly via chromote in headless mode, which works both locally and on headless CI runners.

One-shot run

check_tenders()                  # public search, all pages
check_tenders(max_pages = 2)     # quick test (first 2 pages)

This writes reports/vmp-bb_<date>.xlsx (sheets Relevant / Alle / Neu) and reports/latest.md.
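The "new since the previous run" flag can be pictured as a comparison of tender identifiers between two runs. The following is only a sketch of that idea, not the package's code: the helper flag_new() and the id column are hypothetical names chosen for illustration.

```r
# Hypothetical helper: mark tenders whose identifier was absent in the previous run.
# `id_col` names an assumed identifier column; the real column name may differ.
flag_new <- function(current, previous, id_col = "id") {
  current$is_new <- !(current[[id_col]] %in% previous[[id_col]])
  current
}

# Toy data standing in for two consecutive scrapes.
previous <- data.frame(id = c("A1", "A2"),
                       title = c("Brunnenbau", "Kanalsanierung"))
current  <- data.frame(id = c("A2", "A3"),
                       title = c("Kanalsanierung", "Grundwassermessstellen"))

flagged <- flag_new(current, previous)
flagged[flagged$is_new, ]   # the rows a "Neu"-style sheet would contain
```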

Login is optional

The public tender search returns results without logging in, so check_tenders() does not log in by default. If you have valid credentials and want to log in anyway, set the environment variables VMP_BB_USERNAME and VMP_BB_PASSWORD (e.g. in ~/.Renviron) and pass login = TRUE:

check_tenders(login = TRUE)
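Before requesting a login it can help to confirm that both variables are actually visible to the R session. A minimal sketch using base R only; have_credentials() is a hypothetical helper, not part of kwb.tenders:

```r
# Hypothetical helper: TRUE only if both variables are set to non-empty values.
# Sys.getenv() returns "" for unset variables, which nzchar() treats as missing.
have_credentials <- function(vars = c("VMP_BB_USERNAME", "VMP_BB_PASSWORD")) {
  all(nzchar(Sys.getenv(vars)))
}

have_credentials()
```

One possible use is check_tenders(login = have_credentials()), which falls back to the public search when no credentials are configured.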

Step by step

session <- vmp_bb_session()
# vmp_bb_login(session)                       # optional
tenders <- vmp_bb_scrape_tenders(session, max_pages = 2)

scored <- score_relevance(tenders)
write_tender_report(scored)

# session$view()   # optional: open a live view of the headless session in your browser (while it is still open)
session$close()
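If a step fails halfway through, the browser session should still be closed. One way to guarantee that is a small wrapper built on on.exit(). This is a sketch under the assumption that the session object exposes a close() method, as used above; with_session() is a hypothetical helper, not a package function.

```r
# Hypothetical wrapper: open a session, run `fn` on it, and always close it,
# even if `fn` raises an error.
with_session <- function(open_session, fn) {
  session <- open_session()
  on.exit(session$close(), add = TRUE)
  fn(session)
}
```

With the package loaded, a call might look like with_session(vmp_bb_session, function(s) vmp_bb_scrape_tenders(s, max_pages = 2)).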

Research groups & keywords

Tenders are scored against all KWB research groups and each relevant tender is tagged (column groups) with the matching group(s). The keyword lists live in inst/extdata/keywords_<slug>.yml – one file per group, each with a display name and strong / supporting vectors. A tender matches a group if it contains at least one strong keyword OR at least two supporting keywords, and is relevant if it matches at least one group. Matching is case-insensitive and folds umlauts (so “Klärschlamm” and “Klaerschlamm” both match).
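The matching rule can be sketched in a few lines of base R. This is an illustration of the rule as described, not the package's implementation: fold_text(), count_hits() and matches_group() are hypothetical names, and plain substring matching with each supporting keyword counted once is an assumption.

```r
# Fold German umlauts and case so "Klärschlamm" and "Klaerschlamm" compare equal.
# Umlauts are written as \u escapes to keep the script encoding-independent.
fold_text <- function(x) {
  from <- c("\u00e4", "\u00f6", "\u00fc", "\u00c4", "\u00d6", "\u00dc")
  to   <- c("ae", "oe", "ue", "ae", "oe", "ue")
  for (i in seq_along(from)) x <- gsub(from[i], to[i], x, fixed = TRUE)
  tolower(x)
}

# Number of keywords that occur as a substring of the (already folded) text.
count_hits <- function(text, keywords) {
  sum(vapply(fold_text(keywords),
             function(k) grepl(k, text, fixed = TRUE),
             logical(1)))
}

# A group matches on >= 1 strong keyword OR >= 2 supporting keywords.
matches_group <- function(text, strong, supporting) {
  text <- fold_text(text)
  count_hits(text, strong) >= 1 || count_hits(text, supporting) >= 2
}

# "Kl\u00e4rschlamm" is "Klärschlamm"; the keyword uses the "ae" spelling.
matches_group("Entsorgung von Kl\u00e4rschlamm",
              strong = "Klaerschlamm", supporting = character(0))
```

A tender would then count as relevant if matches_group() is TRUE for at least one group.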

kw <- tender_keywords()
names(kw)        # the research-group slugs
str(kw$groundwater)

# Score against a custom subset (e.g. only two groups):
scored <- score_relevance(tenders, keywords = kw[c("groundwater", "water-risk")])

Edit the inst/extdata/keywords_<slug>.yml files to tune the keywords, or add a new file to add a group – no code change needed.

Two relevance layers

Layer 1 matches the title shown in the result table. Layer 2, enabled by default via check_tenders(screen_details = TRUE), opens each ongoing tender’s public detail page and matches the full description text plus the CPV procurement codes (mapped to groups via inst/extdata/cpv_groups.yml). The report’s match_source column shows which layer flagged each tender (title / detail / cpv). No login is needed because the detail page is public. The scheduled job caches detail results (on gh-pages) and deep-screens only new tenders, so daily runs stay cheap while coverage grows; cap the per-run fetches with max_detail.

check_tenders(screen_details = TRUE, max_detail = 50)
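The combination of a detail cache and a per-run cap amounts to "skip what is already cached, then take at most max_detail of the rest". A minimal sketch of that selection step, assuming tenders carry some unique identifier; select_for_detail() and the example ids are hypothetical:

```r
# Hypothetical helper: choose which tenders to deep-screen in this run.
# Tenders already in the cache are skipped; at most `max_detail` remain.
select_for_detail <- function(tender_ids, cached_ids, max_detail = Inf) {
  new_ids <- setdiff(tender_ids, cached_ids)
  new_ids[seq_len(min(length(new_ids), max_detail))]
}

picked <- select_for_detail(c("A1", "A2", "A3", "A4"),
                            cached_ids = "A2",
                            max_detail = 2)
picked
```

Tenders left over by the cap would simply be picked up by a later run, which is how coverage can grow over successive days.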

Automation (GitHub Actions)

The workflow .github/workflows/check-tenders.yaml runs check_tenders() on a schedule (weekdays, 05:00 UTC by default), commits the updated report to the repository and uploads the Excel file as a build artifact. To adjust the frequency, change the cron: expression in the workflow file; in GitHub Actions cron syntax, weekdays at 05:00 UTC is written as 0 5 * * 1-5.