Introduction
The Barcamp ‘Open Science’ took place on 12 March 2018 in the rooms of the Wikimedia Deutschland, Tempelhofer Ufer 23-24, 10963 Berlin. The event is organised by the Leibnitz Research Allience Science 2.0.
Central aim of the Barcamp is to build a network between people working in the field of Open Science. Opposed to a Conference, the Barcamp concept follows a bottom-up approach.
Among the participants were researchers, software developers, librarians, e.g. from
- Bielefeld GitLab
- Leibniz Institute for Astrophysics Potsdam (AIP)
- Open Science Radio
- Max Planck Institute for Human Development
- Naturkundemuseum Berlin
In the introduction round they named e.g. the following topics:
- Metadata
- Open Knowledge Maps
- Open Methods
- Open Research Data Quality
- Research Data Management
- Training Open Science
- Transparency
- Workflows
The Barcamp started with a session planning where topics for sessions of 45 minutes duration were proposed.
Content
- Ignition Talk by Lambert Heller
- Session: What kind of tools should we use?
- Session: How do we motivate / reward doing Open Science?
- Session: Software citation
- Session: Valid reasons for opting out of open science
- Session: CC0/PD vs. (CC)Licences for Open Science
- Session: Pain Points in Open Science
- Session: Research software publishing/repositories
- Session: Metadata / Codebooks
- Session: Open Knowledge Maps
- Session: Doing research outside academia / Citizen Science
- OK-Lab Berlin: Presentation
Ignition Talk by Lambert Heller
Elsevier suggestion add geoblocking to open access
Lessons-learnt: platforms require trust, but often exploit it!
2001: BitTorrent. It turned the Client-Server approach of the internet upside-down
2013: Decentralised web movement (DAT). The “Dat Project” (A distributed data community) started on github.
We have a “trusted platforms” problem in science
Multiple tools are needed
The GHTorrentProject, Debian CD images with BitTorrent
Why move schoolary publishing to peer-to-peer (P2P) networks?
In order to get their research done, researchers should be able to get hold of lots of data without additional effort. It is up to the research infrastructure.
We owe it to the scholarly work incorporated in that data
Replacing privileged access with permissionless innovation levels the playing field for business model innovations.
Pragmatic school, https:/doi.org/ck99
Economic school
Internet Resources
Session: What kind of tools should we use?
https://etherpad.wikimedia.org/p/workshop_OpenScienceFellows_BarcampSession5
Background
Sind Tools wie Hypothes.is und GitHub zu empfehlen? - Are tools like Hypothes.is and GitHub recommended for open science?
https://nextcloud.gbv.de/nextcloud/index.php/s/Rfg199D4xg0RptB
Kriterien
1 Quelloffen (source code is open)
2 Migration muss möglich sein (Exit-Strategie - eingebauter Knopf) - migration needs to be possible
3 Langfristig gesicherter Anbieter ( >10 Jahre) - offering should be long term (more than 1o years)
4 Vertrauenswürdiger Anbieter (Datenschutz, Non-profit, pro-Europe…) - trustful offering (e.g. data security, non-profit, pro-europe…)
5 Wissenschaftlich verlässlicher Anbieter. z.B. Fighshares hat keine Tombstone: https://doi.org/10.6084/M9.FIGSHARE.1381402 - Scientifically reliable provider.
6 Sichtbarkeit (leicht in Google findbar) - visibility
7 Offene Beteiligungsmöglichkeit (externe Nutzer/Accounts leicht möglich) - open participation should be possible (external users/accounts)
8 Usability
A reference that should be considered: Bilder G, Lin J, Neylon C (2015) Principles for Open Scholarly Infrastructure-v1, retrieved [date], http://dx.doi.org/10.6084/m9.figshare.1314859“ - https://cameronneylon.net/blog/principles-for-open-scholarly-infrastructures/
Notes
Liste mit Tools für 101 innovations: https://101innovations.wordpress.com/ (tool list of 101 innovations)
Daten sind hier verfügbar: https://zenodo.org/record/49583#.WobxiRPwYWo (data from the list)
Reference: https://figshare.com/articles/NPOS_Workflow-perspective-Bosman-Kramer_pptx/5065534/1
Applied Criteria
ResearchGate: 6, 7, 8
Open Science Framework: 1, 2, 5, 6, 7, 8
FigShare 6, 7, 8
Zenodo 1, 2, 3, 4, 5, 6, 7, 8
Overleaf 2, 6, 7, 8
GitHub 2, 6, 7, 8
ScienceOpen
Criteria need to be applied:
arXiv 1?, 2?, 3, 4?, 5?, 6?, 7?, 8?
bioRxiv 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8?
Jupyter 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8?
Authorea 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8?
MyExperiment 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8?
protocols.io 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8?
OpenNotebookScience 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8?
GitLab (CE at institutions) 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8?
Dryad 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8?
Dataverse 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8?
AsPredicted 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8?
Hypothes.is 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8?
Zotero 1, 2, 3?, 4?, 5?, 6?, 7?, 8?
RIO 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8?
Eigene Blogwebseiten 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8?
Twitter 1?, 2?, 3?, 4?, 5?, 6?, 7?, 8
Barcamp Discussion
Session: How do we motivate / reward doing Open Science?
https://etherpad.wikimedia.org/p/oscibar2018_session5
Point of View: How open science helps researchers succeed https://doi.org/10.7554/eLife.16800.001
Felix Schönbrodt: https://twitter.com/nicebread303/status/973138967091654656
Discussion https://twitter.com/BrianNosek/status/949756218817630208
Data stories:
Bad: young researcher, who works in a lab who regulary perform scientific misconduct
Session: Software citation
Session: Valid reasons for opting out of open science
https://etherpad.wikimedia.org/p/oscibar2018_session13
Background:
Opt-out policy in H2020: ‘projects can “opt-out” of these provisions before or after the signature of the grant agreement (thereby freeing themselves from the associated obligations) on the following grounds:’
Incompatibility with the Horizon 2020 obligation to protect results that are expected to be commercially or industrially exploited
Incompatibility with the need for confidentiality in connection with security issues
Incompatibility with rules on protecting personal data
Incompatibility with the project’s main aim
If the project will not generate / collect any research data, or
If there are other legitimate reasons not to provide open access to research data
Problem:
To general. Too simple to out-out!
Make the reasons more detailed!
Session: CC0/PD vs. (CC)Licences for Open Science
https://etherpad.wikimedia.org/p/oscibar2018_session17
Session: Pain Points in Open Science
https://etherpad.wikimedia.org/p/oscibar2018_session3
Pain Points:
- Replication studies are rejected for publication. Impact factor is old style but it is the currency you are paid
- Often, Open Data are not allowed or researchers do not want to open their data
- Tools, e.g. git is a pain to use
- Efforts: it takes money and resources to change things
- Project proposals are not focused on data
Solution:
- Allocate money
- You need supporting structures
- You need an Open Data Strategy! Strategy could be: publish metadata of sensitive data, together with a contact information. Then, people interested in the data can contact the owner of the data and a contract about what is allowed with the data can be made.
- Birte Pfeiffer (Research data manager at GIGA, Hamburg): “You have to talk to the researchers! The question must be: How can I help you?”
- Take care of the data from the beginning -> Data management plan -> is a “living” document
Session: Research software publishing/repositories
https://etherpad.wikimedia.org/p/oscibar2018_session6
- Welche Probleme gibt es, Software zu finden?
- GitHub + DOI
- Allianzinitiative: FAIR principles
- Eigenentwicklung (wer?): Metadatenportal, nur aggregierte Daten, zugänglich ist ein (noch privates) GitHub repo, Software auf Django basierend
- Anja Busch, Leibniz-Informationszentrum Wirtschaft: Metadaten-Harvesting
- Birte Pfeiffer, GIGA Hamburg: Suche nach Datenerhebungs-Tools -> Wie soll ich das alles testen?
- Sandbox-Anwendungen und Ansprechpartner sind wichtig zum Evaluieren von Software -> using mybinder?
Resources
openscience.org, The OpenScience Project de-rse.org, Research Software Engineers
Session: Metadata / Codebooks
https://etherpad.wikimedia.org/p/oscibar2018_session9
- standardised vs. flexible, concise vs. comprehensive, incentives vs. requirements
- Ruben Arslan: Codebook
- Data Documentation Initiative (DDI)
- Overview of metadata standards is missing
- Standards should “talk to each other”
- e.g. biodiversity: ontologies to map metadata, controlled vocabularies, scientist does a pre-annotation, final annotation is done by a data curator (one per repository)
- PANGAEA. Data Publisher for Earth & Environmental Science, https://www.pangaea.de/
- each repository is using its own ontology and metadata standard
- collaboratively improve documentation of datasets in repositories?
- use wikidata for that?
- LabNotebook will automatically derive metadata
- “It takes time to take the right standard and to use it”
- JasonLD: JSON for Linking Data, https://json-ld.org/
Session: Open Knowledge Maps
https://etherpad.wikimedia.org/p/oscibar2018_session14
- Open Knowledge Maps, https://openknowledgemaps.org/ is a non-profit organisation
- https://github.com/openknowledgemaps
- Knowledge Discovery
- Many papers remain uncited
- 85 % of published data remains uncited
- similar papers are placed in bubbles, based on text similarity
- BASE: Bielefeld Academic Search Engine
- Partner:
Session: Doing research outside academia / Citizen Science
https://etherpad.wikimedia.org/p/oscibar2018_session20
- European Citizen Science Association (ECSA), https://ecsa.citizen-science.net/
- Citizen Science: inclusion of members of the public in some aspect of scientific research
- Levels:
- 1: Crows sourcing (citizens as sensors)
- 2: Distributed intelligence (Citizens as basic interpretors)
- 3: ?
- 4: ?
- EyesOnALZ: find cloggings in blood vessels
- BioBlitz
- ECSA: 10 Principles of Citizen Science
OK-Lab Berlin: Presentation
- What is Open Data?
- machine readable
- open licences
- without payment
- sharing and remixing allowed
- allowed for commercial purposes
- metadata: additional pdf or something
- Formats for open data in increasing quality: pdf -> xls -> csv -> rdf -> lod
Data portals
Example applications
- fragdenstaat.de
- Swimming spots: berlin.codefor.de/badestellen
- Where can I buy vegetables?
- schulsanierung.tursics.de
- www.muenchen-transparent.de
- luftdaten.info
- Deutsche Bahn Preiskalender, https://bahn.guru/
References
Personal Contacts
Alexander Struck, HU Berlin: We may have a look at Seafile as a software tool for storing our raw data.