FAIR principles

The FAIR data principles are guiding principles on how to make data Findable, Accessible, Interoperable and Reusable, formulated by Force11. On this website, we explain the principles (based on the DTLS website) and translate them into practical information for Radboud University researchers.

Why should you make your data FAIR?

Making digital data FAIR benefits the academic community and thereby supports discovery and innovation. The benefits include (based on the ANDS website):

  • Gaining maximum potential from data assets
  • Increasing the visibility and citations of research
  • Improving the reproducibility and reliability of research
  • Staying aligned with international standards and approaches
  • Attracting new partnerships with researchers, business, policy and broader communities
  • Enabling new research questions to be answered
  • Using new innovative research approaches and tools
  • Achieving maximum impact from research

Radboud University policy

Radboud University states in its strategic vision that, from 2020 onwards, all data belonging to publications should be stored in accordance with the FAIR principles.

FAIR: how are data made findable?

Findable: issues to be addressed and practical steps

F1. (Meta)data are assigned a globally unique and persistent identifier (PID)

Any data object should be uniquely and persistently identifiable. Use a public repository that issues a PID (e.g. a DOI or Handle) to archive your data at publication. A PID also allows the data to be cited when they are reused.

  • Most (disciplinary) repositories will assign a PID when archiving a dataset
  • When you use the Radboud Data Repository (RDR), a DOI is automatically assigned to your dataset
  • The campus network can’t be used for FAIR data archiving as it does not assign a PID
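
As an illustration of what a PID offers in practice, the sketch below turns a DOI into a resolvable link through the public https://doi.org/ resolver. The function name `doi_to_url`, the loose validation pattern and the DOI itself are assumptions made for this example; they are not part of the RDR.

```python
import re
from urllib.parse import quote

# Loose pattern for a DOI: "10.", a registrant code, "/", and a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def doi_to_url(doi: str) -> str:
    """Turn a bare DOI into a resolvable https://doi.org/ link."""
    doi = doi.strip()
    # Accept common prefixes that people paste along with the DOI.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the slash separators.
    return "https://doi.org/" + quote(doi, safe="/")

# Hypothetical DOI, used only for illustration.
print(doi_to_url("doi:10.1234/example-dataset.v1"))
# → https://doi.org/10.1234/example-dataset.v1
```

A link of this form can be used directly in a data citation, which is why a storage location without a PID is not sufficient for FAIR archiving.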

F2. Data are described with rich metadata

Metadata describe the data. The richer the metadata, the easier the data are to find

  • The RDR requires extensive metadata, based on the Dublin Core and DataCite metadata standards
  • For language resources consider CLARIN’s metadata categories
  • Include elaborate information about the context, content and characteristics of the data, for instance in the documentation of your data

F3. Metadata clearly and explicitly include the identifier of the data they describe

The metadata used to describe your data should always include the persistent identifier

  • This is automated when using the RDR for archiving data
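
Principles F2 and F3 can be checked together before depositing. The sketch below tests a metadata record for the six properties that the DataCite schema marks as mandatory, including the identifier. The key names are simplified for this example and the record contents are invented; a repository such as the RDR will have its own form and field names.

```python
# The six properties DataCite marks as mandatory; key names are simplified here.
REQUIRED_FIELDS = (
    "identifier", "creators", "titles",
    "publisher", "publication_year", "resource_type",
)

def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]

# Invented example record; the DOI is hypothetical.
record = {
    "identifier": "https://doi.org/10.1234/example-dataset.v1",
    "creators": ["Example, A."],
    "titles": ["Example survey data"],
    "publisher": "Example Repository",
    "publication_year": 2020,
    "resource_type": "Dataset",
    # Optional but valuable: context, content and characteristics of the data.
    "description": "Survey responses collected for an example study.",
}

print(missing_fields(record))  # → []
```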

F4. (Meta)data are registered or indexed in a searchable resource

Check whether the repository of your choice is indexed by regular search engines, such as Google Scholar

  • Repositories with a certificate (such as the CoreTrustSeal) are indexed. An overview of data repositories can be found here
  • This is automated when using the RDR for archiving data
  • For language resources this is automated in CLARIN’s Virtual Language Observatory

FAIR: how are data made accessible?

Accessible: issues to be addressed and practical steps

A1. (Meta)data are retrievable by their identifier using a standardized communications protocol

A1.1. The protocol is open, free, and universally implementable

A1.2. The protocol allows for an authentication and authorization procedure, where necessary

Limitations on, and protocols for, the use of data should be made explicit. Anyone with a computer and an internet connection who is authorized should be able to retrieve the data through a well-defined protocol

Accessible data does not automatically imply open or free access. Data published with restricted access can also be FAIR. Although Radboud University encourages open access, there can be ethical or legal reasons for not making data open access

  • Choose a repository that facilitates your access needs. An overview of data repositories can be found here
  • The RDR is suitable for data preservation for internal use as well as data sharing with the external scientific community
  • The metadata/documentation should include clear licenses and conditions of use: who can access the data and in what way?
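
To make A1 concrete: HTTPS is an open, free and universally implementable protocol, and it leaves room for authorization where needed. The sketch below only builds a request for machine-readable metadata by DOI; it does not send it. It assumes the DOI is registered with DataCite, whose content negotiation accepts the `application/vnd.datacite.datacite+json` media type. The function name, the DOI and the bearer-token scheme are assumptions for illustration; a given repository may handle authorization differently.

```python
from urllib.request import Request

def metadata_request(doi: str, token: str = None) -> Request:
    """Build (but do not send) an HTTPS request for a DOI's metadata."""
    # Ask the resolver for DataCite JSON instead of the landing page.
    headers = {"Accept": "application/vnd.datacite.datacite+json"}
    if token:
        # A1.2: restricted resources may require an authorization header.
        headers["Authorization"] = f"Bearer {token}"
    return Request(f"https://doi.org/{doi}", headers=headers)

# Hypothetical DOI, used only for illustration.
req = metadata_request("10.1234/example-dataset.v1")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual retrieval.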

A2. Metadata are accessible, even when the data are no longer available

Because of the costs and relevance of keeping (large) datasets online, datasets might no longer be available over time, or there may be updated versions.

Because metadata are valuable in themselves, it is important to preserve the accessibility of the metadata (for example for replication studies, or for tracking down researchers, institutions or publications)

  • According to the Radboud University policy, data have to be archived for at least ten years after publication of the results
  • If you request a dataset to be deleted, make sure you specifically request the metadata to remain in the repository
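
The idea behind A2 can be sketched as a "tombstone" record: the files are removed, but the descriptive metadata and the identifier remain. The record layout and the `withdraw` function are hypothetical and do not describe how any particular repository implements deletion.

```python
from copy import deepcopy

def withdraw(record: dict, reason: str) -> dict:
    """Return a tombstone: the metadata are kept, the data files are dropped."""
    tombstone = deepcopy(record)        # leave the original record untouched
    tombstone["files"] = []             # the data are no longer available
    tombstone["status"] = "withdrawn"
    tombstone["withdrawal_reason"] = reason
    return tombstone

# Invented example record; the DOI is hypothetical.
dataset = {
    "identifier": "https://doi.org/10.1234/example-dataset.v1",
    "titles": ["Example survey data"],
    "files": ["responses.csv", "codebook.pdf"],
    "status": "published",
}

tombstone = withdraw(dataset, "Superseded by a newer version")
print(tombstone["identifier"], tombstone["files"])
```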

FAIR: how are data made interoperable?

Interoperable: issues to be addressed and practical steps

I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation

It should be possible for people and computers to interpret the data and combine it with other datasets. Clearly, this is a very challenging requirement to meet

  • Include proper documentation, use preferred file formats and use a clear file structure
  • If they exist in your discipline, use standard vocabularies, ontologies and thesauri in your (meta)data, or provide a mapping of your data to these vocabularies, ontologies and thesauri
  • For language resources see CLARIN’s Concept Registry

I2. (Meta)data use vocabularies that follow FAIR principles

Vocabularies that are easy to find and access contribute to reuse

  • If possible, use existing vocabularies in your discipline when archiving data (e.g. MeSH classification, VEST registry)
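
Providing a mapping from local variable names to vocabulary terms, as suggested under I1 and I2, can be as simple as a lookup table that is archived with the data. In the sketch below the column names and the term URIs under `vocab.example.org` are placeholders; in practice you would use identifiers from a vocabulary that exists in your discipline.

```python
# Placeholder mapping from local column names to vocabulary term URIs.
VOCABULARY_MAP = {
    "age_yrs": "https://vocab.example.org/terms/age",
    "sex": "https://vocab.example.org/terms/sex",
    "bp_sys": "https://vocab.example.org/terms/systolic-blood-pressure",
}

def annotate(columns: list) -> dict:
    """Pair each column with its vocabulary term, or None if it is unmapped."""
    return {column: VOCABULARY_MAP.get(column) for column in columns}

columns = ["age_yrs", "sex", "bp_sys", "notes"]
annotated = annotate(columns)

# Columns without a term still need a description in the documentation.
unmapped = [column for column, term in annotated.items() if term is None]
print(unmapped)  # → ['notes']
```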

I3. (Meta)data include qualified references to other (meta)data

Include all meaningful links between (meta)data resources in order to enrich the contextual knowledge about the data

  • Enrich your metadata/documentation files
  • Use RIS to register your public data for internal and external exposure, and link it to the associated publications and datasets. This is automated when using the RDR
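
A "qualified" reference states not only that two resources are linked but also how. The sketch below follows the shape of DataCite's related identifier property, using relation types taken from its controlled list. The helper function, the small subset of allowed relations and the DOIs are chosen for illustration only.

```python
# A small subset of DataCite's controlled list of relation types.
ALLOWED_RELATIONS = {
    "IsSupplementTo", "IsDerivedFrom", "IsNewVersionOf", "IsDocumentedBy",
}

def related(identifier: str, relation: str, id_type: str = "DOI") -> dict:
    """Build one qualified reference from this dataset to another resource."""
    if relation not in ALLOWED_RELATIONS:
        raise ValueError(f"unknown relation type: {relation}")
    return {
        "relatedIdentifier": identifier,
        "relatedIdentifierType": id_type,
        "relationType": relation,
    }

# Hypothetical identifiers, used only for illustration.
links = [
    related("10.1234/example-article", "IsSupplementTo"),     # the publication
    related("10.1234/example-raw-data", "IsDerivedFrom"),     # the source data
]

for link in links:
    print(link["relationType"], link["relatedIdentifier"])
```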

FAIR: how are data made reusable?

Reusable: issues to be addressed and practical steps

R1. (Meta)data are richly described with a plurality of accurate and relevant attributes

Potential reusers should be able to decide easily whether the data are actually useful in their context, so that data can be replicated or combined in future research

  • In your metadata, describe the context in which the data were generated, such as the scope, particularities, limitations, conditions, parameters, type of data and variable names
  • If relevant and possible, include open source code or software in your dataset. If this is not possible, a reference to the required software should be included in the metadata/documentation

R1.1. (Meta)data are released with a clear and accessible data usage license

Specify if and how the data are licensed.

  • Read more about licenses and copyright at Radboud University
  • This is automated when using the RDR for archiving data
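
A license is clearest when it is recorded in a machine-readable way, with both a name and a link to the license text. The sketch below uses the SPDX identifier and URL of the Creative Commons Attribution 4.0 license as an example. The key names are modelled on DataCite's rights property and may differ per repository; the choice of license here is illustrative, not a recommendation.

```python
def rights_entry(spdx_id: str, name: str, url: str) -> dict:
    """Describe one license with a name, a URL and an SPDX identifier."""
    return {
        "rights": name,
        "rightsUri": url,
        "rightsIdentifier": spdx_id,
        "rightsIdentifierScheme": "SPDX",
    }

def has_license(record: dict) -> bool:
    """True if the record names at least one license by URL or identifier."""
    return any(
        entry.get("rightsUri") or entry.get("rightsIdentifier")
        for entry in record.get("rightsList", [])
    )

licensed = {
    "titles": ["Example survey data"],
    "rightsList": [
        rights_entry(
            "CC-BY-4.0",
            "Creative Commons Attribution 4.0 International",
            "https://creativecommons.org/licenses/by/4.0/",
        )
    ],
}

print(has_license(licensed))                       # → True
print(has_license({"titles": ["No license yet"]})) # → False
```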

R1.2. (Meta)data are associated with detailed provenance

A potential reuser needs to know who to cite and acknowledge

  • Include in your documentation a description of the process that resulted in your data: who generated and collected the data? How has it been processed? Has it been published before? Does it contain data from someone else that you have potentially transformed or completed?

R1.3. (Meta)data meet domain-relevant community standards

If standards or best practices for data archiving and sharing exist in your discipline, FAIR data should meet them. Note that the FAIR principles do not address quality: how reliable the data are is for the potential reuser to judge

  • Check out disciplinary standards or best practices when archiving data