Using a warehouse to disseminate data

The open dissemination of research data helps make research transparent and reproducible. Researchers are now required to open part of their data, following the principle “as open as possible, as closed as necessary”.

Storing and depositing data are two distinct operations: storage concerns the period when data is collected or processed, whereas deposit serves to preserve, share, and disseminate data, and allows data to be linked to publications. Depositing data thus supports transparent and reproducible research.

Which data should be opened?

Depositing data in a warehouse

Data deposited in a warehouse exists independently of the scientific article: it needs to be described with the richest possible metadata so that it is easy to find, which encourages sharing and reuse. A permanent identifier or accession number is assigned to each dataset, making it visible, accessible, and stable, just like the publication.

How to make your data FAIR (findable, accessible, interoperable, reusable):

Make your data findable by:

Tip: most warehouses assign a permanent archiving identifier to datasets.

Make your data accessible by ensuring that:

  • The warehouse used to share data assigns permanent identifiers that enable the data to be retrieved.

  • Metadata is accessible as far as possible, even when the data itself is not. The access procedure may involve authentication and authorisation steps if necessary.

Make your data interoperable by using:

  • Whenever possible, open and widely used formats, software, and languages, which enable exchange between IT systems and make it easier to combine metadata

  • Permanent identifiers: DOI, PMID, SWHID, arXiv ID

  • Reference registries: IdRef, ORCID, RNSR

  • Metadata standards and controlled vocabularies: DC (Dublin Core), RDF, FOAF, SKOS, BILBO, FaBiO
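
As an illustration of how the permanent identifiers listed above can be handled in practice, the following Python sketch turns a DOI into a resolvable link and verifies the check digit of an ORCID iD (ORCID uses the ISO 7064 MOD 11-2 checksum). The function names are hypothetical, the DOI in the usage note is a made-up placeholder, and the DOI pattern is a simplification covering only the common modern form.

```python
import re
from urllib.parse import quote

# Simplified pattern for modern DOIs: "10.", a 4-9 digit registrant code, "/", a suffix.
# Assumption: older or unusual DOIs may not match this pattern.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ link for a DOI, or raise ValueError."""
    doi = doi.strip()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognisable DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return "https://doi.org/" + quote(doi, safe="/")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the format and check digit of an ORCID iD such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "")
    if not re.fullmatch(r"\d{15}[\dX]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]
```

For example, `doi_to_url("10.1234/example.dataset")` returns `https://doi.org/10.1234/example.dataset`, and `is_valid_orcid("0000-0002-1825-0097")` returns `True` for the sample iD used in ORCID's own documentation. A check of this kind only confirms that an identifier is well formed, not that it is actually registered.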

Make your data reusable by ensuring that:

  • Data is well documented to support correct interpretation.

  • A clear and accessible licence is attached so that other researchers know which types of reuse are authorised.

  • Information on provenance is available to clearly indicate how, why, and by whom the data was created and processed.
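
The three points above (documentation, licence, provenance) can be captured in a small machine-readable record kept alongside the data before deposit. The Python sketch below is only an illustration: the field names loosely follow Dublin Core terms, every value is a placeholder, and `missing_fields` is a hypothetical helper rather than part of any warehouse's API. Each warehouse defines its own required metadata, so the list of fields is an assumption.

```python
import json

# Assumed minimum set of fields; a real warehouse publishes its own requirements.
REQUIRED_FIELDS = (
    "title", "creator", "description", "identifier", "license", "provenance",
)


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Placeholder record: none of these values refers to a real dataset.
record = {
    "title": "Example survey dataset",
    "creator": ["Doe, Jane"],
    "description": "Anonymised responses; variables are documented in README.txt.",
    "identifier": "10.1234/example.dataset",  # placeholder DOI
    "license": "CC-BY-4.0",                   # licence stating the permitted reuse
    "provenance": {
        "how": "Online questionnaire, cleaned with a documented script",
        "why": "Collected for a hypothetical research project",
        "who": "Jane Doe",
    },
}

print(missing_fields(record))         # prints []
print(json.dumps(record, indent=2))   # JSON form that could accompany the deposit
```

Running the check before deposit makes gaps visible early: a record containing only a title would be reported as missing the five other fields.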

Choosing a warehouse

A research data warehouse, or repository, is a database for gathering and preserving research data and making it visible and accessible. Its role is to enable data to be collected or deposited, accessed, and shared for reuse.

Each warehouse generally has its own policy for depositing, describing, and disseminating data. One criterion for choosing a warehouse may be whether it lets you attach a licence that requires the data's creators to be cited whenever the data is used.

There are several types of warehouse:

  • disciplinary
  • multidisciplinary
  • institutional
  • publisher-specific
  • project-specific

To choose a reliable warehouse, it is recommended that you:

  • Check if a warehouse is recommended by one of the parties involved in your project (your funder, publisher, or institution).

  • Find a warehouse adapted to your needs by using warehouse directories and/or looking for certified warehouses.

For the humanities and social sciences (HASS), a particularly noteworthy warehouse is Nakala, run by Huma-Num, which meets most needs.