OIML BULLETIN - VOLUME LXVI - NUMBER 3 - July 2025

 

evolution

 

The FAIR principles

Valid guidelines for legal metrology?

 

Till Biskup¹ (https://orcid.org/0000-0003-2913-0004), Daniel Lübbert¹ (https://orcid.org/0000-0003-3852-5665), and Giacomo Lanza² (https://orcid.org/0000-0002-2239-3955)

1. PTB (https://ror.org/05r3f7h03), Berlin, Germany
2. PTB (https://ror.org/05r3f7h03), Braunschweig, Germany

 

Citation: T. Biskup et al. 2025 OIML Bulletin LXVI(3) 20250303

 

Abstract

Digitization does not stop short of metrology, and neither does the ever-increasing stream of data we need to deal with. Data producers in both research and metrology are requested to comply with various “good practice” guidelines, regulations and criteria. On the one hand, legal metrology sets strict requirements to ensure the reliability and compliance of data with legal regulations, with a focus on data quality. On the other hand, scientific disciplines working with large amounts of data have designed a set of formal criteria, known as the FAIR principles, to improve the automated processing of and access to data and metadata.

We ask: How far do these two sets of requirements overlap? Are they compatible with and/or complementary to each other? Is it possible to apply one in the context where the other was conceived? What can the two perspectives, legal metrology and research data management, possibly learn from each other?

Our critical review addresses these questions, and discusses some aspects of the FAIR principles which could also benefit legal metrology in the 21st century, and vice versa. To this end, we outline the characteristics of both legal metrology and research data management, and analyse similarities between them. We strive to provide the context necessary to comprehensively understand the FAIR principles, and to allow for an informed answer to the question raised in the title. Finally, we argue that data availability and data quality are similarly important aspects, both of which should be highly valued. A combination of aspects of the FAIR principles with the quality standards of legal metrology could therefore make a true difference for the future data culture.

1 Introduction

The amount of available information has risen continuously and exponentially over the past centuries, and increasing digitization is neither an exception nor a particularly novel phenomenon in this regard [1, 2], although it is sometimes perceived as a “data deluge” [3]. While data form the basis of both the empirical sciences and metrology, they are neither information per se nor insight. To make use of the available data, they need to be organized, processed, and analysed, and the relevant information thus extracted. Nevertheless, there is growing demand [4, 5] for access to (raw) data, particularly those resulting from public funding. The increasing amount of available (digital) data has led to data-driven science, sometimes called the “Fourth Paradigm” [6], where the usual workflow of empirical science, starting with data acquisition, is abandoned in favour of working solely with already available data. While this is a valid approach for data from systematic screenings, observation campaigns, or large databases with sufficiently homogeneous data, it relies on the assumption that the available data are either complete or a representative sample, and in any case not biased. Furthermore, it requires the available data to be supplemented by sufficient context, i.e. metadata [7], to be useful.

This is where the FAIR guiding principles for scientific data management and stewardship, or FAIR principles in short, come into play. They were originally developed in the context of biomedical imaging [8] and focus on machine-actionable digital data as input for training machine-learning applications [9]. Nevertheless, they have made a huge impact in the academic world and are nowadays often regarded as synonymous with research data management. While building on the results of others is an intrinsic part of scientific progress and method, the FAIR principles provide no information about the quality of the data and metadata. In short, data may be perfectly FAIR according to the principles, but nevertheless useless due to the inferior quality of either data or metadata or a lack of reproducibility. At the same time, data may be of the highest quality possible, meticulously acquired, fully trackable and well documented, but neither findable nor accessible nor interoperable, and hence hardly reusable. Still, the FAIR principles can be used to address some important questions related to handling and managing data.

Metrology in general – and legal metrology in particular – follows strict standards and well-established protocols to ensure the highest quality of the results and full “trackability” from the final result all the way back to the original data acquisition. Trackability of data, i.e. the possibility of reconstructing the full data provenance, is complemented in the metrological context by traceability, i.e. the possibility of relating data to a primary reference through a documented unbroken chain of calibrations of all methods and devices involved. Moreover, legal metrology was from its very origins meant to provide industry and the economy with reliable, trusted and independent assessments and calibrations, hence making these results available to the customer and, in a sense, interoperable with the customers' internal workflows. However, given the steadily increasing digitization, the amount of data available, and the need for automated workflows integrating metrological data, legal metrology faces challenges in providing data in an interoperable, digital way while guaranteeing compliance with the legal requirements.

This leads to a simple yet important question: Can legal metrology learn from the FAIR principles and vice versa, and if so, which aspects could (and should) have a positive impact on the way data and workflows are organized?

2 FAIR principles

The FAIR principles [8] are often, and somewhat mistakenly, seen as synonymous with research data management. Before critically reviewing these principles, we provide a short overview and summary of their contents.

2.1 Overview and summary

While research data management [10] deals with all aspects of the research data life cycle, the FAIR principles [8] focus on digital data and define some very general demands concerning their presentation and propagation, namely that they should be Findable, Accessible, Interoperable and Reusable. A more detailed breakdown translates these into specific demands concerning metadata, terminology, and formats, as shown in Table 1.

Table 1: The FAIR guiding principles for scientific data management and stewardship, as proposed in [8], highlight four fields relevant for the reuse of data outside their original context, represented by the four letters of the acronym: Findability, Accessibility, Interoperability, and Reusability.
Findability
F1 (meta)data are assigned a globally unique and persistent identifier
F2 data are described with rich metadata (defined by R1 below)
F3 metadata clearly and explicitly include the identifier of the data they describe
F4 (meta)data are registered or indexed in a searchable resource
Accessibility
A1 (meta)data are retrievable by their identifier using a standardized communications protocol
A1.1 the protocol is open, free, and universally implementable
A1.2 the protocol allows for an authentication and authorization procedure, where necessary
A2 metadata are accessible, even when the data are no longer available
Interoperability
I1 (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
I2 (meta)data use vocabularies that follow FAIR principles
I3 (meta)data include qualified references to other (meta)data
Reusability
R1 (meta)data are richly described with a plurality of accurate and relevant attributes
R1.1 (meta)data are released with a clear and accessible data usage license
R1.2 (meta)data are associated with detailed provenance
R1.3 (meta)data meet domain-relevant community standards

We summarize the requirements listed in the 15 points above into the following five categories; a minimal machine-actionable sketch follows the list:

  • File formats
    Data should preferably be stored in an open, long-lived, standardized and well-documented format, in order to guarantee readability for a long time after generation.
  • Metadata
    Data need to be appropriately described by metadata, which constitute the “common language” of data storage platforms, data search engines, and software ingesting or producing data, and are hence crucial to guaranteeing the interoperability of the data. Therefore, metadata should be rich, accurate, persistent (even in case of data deletion), and contain all relevant (administrative and scientific) information. Field names (properties/attributes) and allowed field values (classes/data types) should be expressed in a standard language, i.e. drawn from controlled vocabularies wherever possible, or at least indexed.
  • Identifiers
    All referenced objects (publications, projects, persons, institutions, terms, …) should be designated by persistent (PID) and unique (UID) identifiers, in order to disambiguate homonyms and to remain valid despite changes in names or addresses.
  • Licenses
    All data and metadata should have a license assigned to clarify if and how they can be used legitimately. Licenses should be expressed in a clear, solid, and complete way, be always accessible, and be standardized wherever possible.
  • Protocols
    The way data are accessed should be open, free of charge, and universally implementable independently of the environment/architecture.
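To make these five categories more tangible, the following minimal sketch shows what a machine-actionable metadata record covering them might look like. All field names, identifiers, and URLs are purely illustrative placeholders, not a normative schema; they merely echo patterns found in common vocabularies such as DataCite or schema.org.

```python
import json

# Illustrative metadata record for a hypothetical dataset. All field names
# and identifiers are placeholders, not a normative schema.
record = {
    "identifier": "https://doi.org/10.0000/example-dataset",   # persistent ID (F1)
    "title": "Calibration measurement series, pressure sensor batch 42",
    "creators": [
        {"name": "Jane Doe", "orcid": "https://orcid.org/0000-0000-0000-0000"}
    ],
    "format": "text/csv",        # open, well-documented file format
    "license": "https://creativecommons.org/licenses/by/4.0/",  # usage license (R1.1)
    "accessProtocol": "https",   # open, universally implementable protocol (A1)
    "keywords": ["pressure", "calibration", "uncertainty"],     # rich metadata (F2)
    "provenance": {              # detailed provenance (R1.2)
        "instrument": "pressure balance, serial no. 0815",
        "acquired": "2025-05-12T09:30:00+02:00",
    },
}

# Serialized as JSON, the record becomes machine-actionable: a repository
# can index it (F4) and a harvester can retrieve it by its identifier (A1).
print(json.dumps(record, indent=2))
```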

2.2 FAIR principles: a critical review

The FAIR principles are primarily concerned with the possibility of sharing and reusing data, with a focus on automated search and reuse by machines. Therefore they enforce a mainly formal check of the data and particularly of the metadata used to describe them. No criterion is imposed on the meaningfulness, consistency, accuracy or quality of the data and metadata – this must be addressed by a subsequent evaluation step. Furthermore, questions and problems arising from reusing data beyond their original context or even across disciplinary boundaries [2] are not within the scope of the FAIR principles.

Due to their very generic character, it is difficult to unequivocally assess the “degree of FAIRness” of a dataset, or to compare it across datasets, as different FAIRness evaluators can give nonequivalent results. Yet it is possible to compare a dataset with itself at different stages, using identical tools, to quantify improvements in its FAIRness. For further detail, see the “FAIR Data Maturity Model” [11].
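To illustrate the idea of comparing a dataset with itself at different stages, the following toy evaluator scores a metadata record (such as the sketch above) against a handful of checks. The checks and their pass criteria are our own simplifications for illustration only; they are not the indicators of the FAIR Data Maturity Model [11].

```python
# Toy FAIRness checklist: each check is a predicate over a metadata record.
# The checks are deliberate simplifications for illustration only.
CHECKS = {
    "F1 persistent identifier": lambda r: str(r.get("identifier", "")).startswith("https://doi.org/"),
    "F2 rich metadata":         lambda r: len(r.get("keywords", [])) >= 3,
    "A1 standard protocol":     lambda r: r.get("accessProtocol") in {"https", "ftp"},
    "R1.1 license present":     lambda r: bool(r.get("license")),
    "R1.2 provenance present":  lambda r: bool(r.get("provenance")),
}

def fairness_score(record: dict) -> float:
    """Fraction of checks passed -- only meaningful for comparing the SAME
    dataset at different curation stages with the SAME set of checks."""
    return sum(1 for check in CHECKS.values() if check(record)) / len(CHECKS)

before = {"identifier": "dataset_final_v2.csv"}   # early, poorly described stage
after = {
    "identifier": "https://doi.org/10.0000/example-dataset",
    "keywords": ["pressure", "calibration", "uncertainty"],
    "accessProtocol": "https",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "provenance": {"instrument": "pressure balance, serial no. 0815"},
}

print(f"before curation: {fairness_score(before):.0%}")   # 0%
print(f"after curation:  {fairness_score(after):.0%}")    # 100%
```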

Conformity with the FAIR data principles should not be confused with other recommendations or indicators which arise in the context of handling (research) data. To name just the most important aspects:

  • Good Scientific Practice
    Good Scientific Practice, as described e.g. by the German Research Foundation [12] or the National Academies of Sciences, Engineering, and Medicine [13], provides guidelines for ethical behaviour with respect to one's own and external data, prescribes a minimum retention period for data for documentation purposes, and establishes an obligation to cite reused data (in addition to articles).
  • Data openness
    Moreover, data FAIRness should not be confused with data openness: the good practice implicitly prescribed by the FAIR data principles should be applied independently of the possibility of sharing a dataset.
  • Research Data Management
    Finally, the FAIR principles do not exhaust the whole subject of research data management, which comprises all tasks, tools, and solutions connected with research data, including aspects not related to reusability, such as security, protection, planning, and documentation.

3 Data exchange in legal metrology

Metrology in general, and legal metrology in particular, provides a highly regulated environment, quite in contrast to (fundamental) research. While based on scientific insight, metrology applies this knowledge in a controlled manner and provides reliable, qualified answers as to whether and to what extent a given specimen conforms to predefined criteria. Metrology and science share a common set of criteria, such as reliability, accuracy of results, and independent reproducibility. However, the degree of obligation differs between the two contexts: in science, conformity to these criteria is delegated to the personal responsibility and organizational capability of the individual scientists, while legal metrology rests on a statutory mandate, whose strictness depends on the regulations of each country. Furthermore, metrology requires and ensures strict traceability of calibration results to primary standards and fundamental (SI) units, and always accompanies quantity values with uncertainties. To this end, metrology follows clearly outlined, formalized, and documented protocols (standard operating procedures) that have been meticulously developed over the last 150 years or so, and employs quality management tools and procedures to ensure compliance with the required high standards and the legal mandate.

The primary “product” of legal metrology is a certificate of conformity or a calibration certificate. Internally, the entire process from the test or measurement setup via data acquisition, data processing and data analysis to the final certified result needs to be trackable and hence documented. Sharing the underlying data, however, is typically not required, in part due to confidentiality constraints typically encountered in collaboration with industrial partners. Hence, while the result of a certification could be regarded as data, it is clearly not research data in the sense the FAIR principles are typically concerned with. Furthermore, efforts have only recently been made to provide the results of a calibration in a standardized, digital form to its end users, namely the digital calibration certificate (DCC) [14]. This promises to facilitate the processing of calibration certificates within the context of the (increasingly digital) quality management systems established in industry. However, it can be seriously questioned whether the body of globally available certification results will ever amount to “big data” in the sense of valid input for machine-learning algorithms. They may, however, enter as “big metadata” into “Industry 4.0” processes, where calibration data for every sensor, including the respective uncertainty budget, are needed and eventually made available in machine-actionable form.
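The actual DCC is defined by an XML schema developed at PTB [14]. The following sketch is a deliberately simplified, hypothetical stand-in that merely illustrates the kind of information such a certificate carries: administrative data plus a measurement result with unit, uncertainty, and coverage factor. The element names are our own and do not reflect the real DCC schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a digital calibration
# certificate. Element names are illustrative only; the real DCC [14]
# is defined by an XML schema developed at PTB.
cert = ET.Element("calibrationCertificate")

admin = ET.SubElement(cert, "administrativeData")
ET.SubElement(admin, "laboratory").text = "Example Calibration Laboratory"
ET.SubElement(admin, "certificateId").text = "CAL-2025-0042"

result = ET.SubElement(cert, "measurementResult")
quantity = ET.SubElement(result, "quantity", name="pressure")
ET.SubElement(quantity, "value").text = "101.325"
ET.SubElement(quantity, "unit").text = "kilopascal"
# GUM-style statement of the expanded uncertainty with coverage factor:
unc = ET.SubElement(quantity, "expandedUncertainty")
ET.SubElement(unc, "value").text = "0.012"
ET.SubElement(unc, "coverageFactor").text = "2"

# Serialized as XML, the certificate can be validated against a schema and
# ingested automatically by a quality management system.
print(ET.tostring(cert, encoding="unicode"))
```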

Round-robin tests are another important aspect of legal metrology worth mentioning here. They require exchanging measured (digital) data between institutions, and to properly assess their quality and allow for sensible comparisons, these data need to be contextualized with rich and relevant metadata. Such data come quite close to research data and their reuse, as addressed by the FAIR principles. Round-robin tests cannot be regarded as data-driven science, and the amount of data involved is on the “little data” side. Nevertheless, data exchange between metrology institutions clearly benefits from the concepts the scientific community has developed for dealing with the challenges of exchanging data [15].
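A common figure of merit for such comparisons is the normalized error E_n, used e.g. in proficiency testing, which relates the deviation of a laboratory's value from the reference value to the combined expanded uncertainties. The sketch below illustrates the idea with made-up numbers.

```python
from math import sqrt

def e_n(x_lab: float, u_lab: float, x_ref: float, u_ref: float) -> float:
    """Normalized error E_n = (x_lab - x_ref) / sqrt(U_lab^2 + U_ref^2),
    with U denoting expanded uncertainties; |E_n| <= 1 is commonly taken
    as a satisfactory result in interlaboratory comparisons."""
    return (x_lab - x_ref) / sqrt(u_lab**2 + u_ref**2)

# Made-up round-robin results for one measurand: (value, expanded uncertainty)
reference = (100.000, 0.010)
laboratories = {
    "laboratory A": (100.004, 0.012),
    "laboratory B": (100.031, 0.015),
}

for name, (x, u) in laboratories.items():
    score = e_n(x, u, *reference)
    verdict = "satisfactory" if abs(score) <= 1 else "questionable"
    print(f"{name}: E_n = {score:+.2f} ({verdict})")
```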

4 Components of a constructive exchange of ideas

The FAIR data principles build on one another; their ultimate goal is to enable the reuse of data outside the discipline and context they were acquired in. This is not the primary scope of legal metrology regulation, whose highest principles are traceability and trustworthiness. From this point of view, the quality of the content is of higher importance than the correctness of the formal description. The data and their generation are checked thoroughly by means of procedures defined in norms, standards and quality management regulations, conformity to which is a necessary condition guaranteeing the quality and (re)usability of results within the foreseen context, in analogue or digital ways. In contrast, conformity to the FAIR principles guarantees formal data reusability (conditioned by accessibility), in a digital context and even automated, without any guarantee of the correctness of the information contained. Successfully sharing and reusing data requires a shared background and sufficient context, i.e. metadata. Knowing whether you have provided sufficient context (as data provider) and whether you have been provided with sufficient context (as data user/consumer) is far from trivial, rarely conscious and mostly unresolved [2, 15]. The practical use of the FAIR principles in the scientific and particularly in the research data management community seems to largely ignore these aspects. Nevertheless, the FAIR principles can be used to address relevant questions of sharing digital data, even in a local or highly restricted context, and the tools developed in the different scientific communities for handling and sharing (digital) data should definitely be considered and evaluated for their applicability within the context of legal metrology. Similarly, the research data community could learn a lot from legal metrology, particularly with respect to the quality and scientific rigour of data, results, and their description.

4.1 Aspects of the FAIR principles relevant for legal metrology

  • Data sharing
    In general, the idea of preparing and handling digital data as if they were meant for sharing – e.g. describing them thoroughly, ideally by means of machine-actionable metadata, and identifying them uniquely – should be more broadly applied. This also applies to relevant information on the uncertainty budget of instrumentation, as typically contained in (digital) calibration certificates.
  • Data provenance
    Two aspects need to be distinguished: data acquisition on the one hand, and data processing and analysis on the other. All relevant information related to data acquisition should be covered: sample/specimen, setup, measurement conditions. Furthermore, reproducibility requires automatically providing a gapless record of each individual data processing and analysis step, including all implicit and explicit parameters as well as the versions of the software routines used and crucial parameters of the computing environment, aiming to answer these questions: who did what, with what, when, how, and why? (A minimal sketch of such a record follows this list.) See the discussions in [16] and [17] for further details.
  • Data storage and access
    Data and their accompanying metadata need to be stored in a systematic way, and protected by technical means (backups) against corruption and accidental loss. Access to data and metadata, even when only protected and/or locally available, should be granted via unique, persistent and resolvable identifiers, i.e. via mechanisms similar to a locally resolvable “DOI”. See the discussion in [18] for some of these aspects.
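As a minimal illustration of the gapless provenance record mentioned above, each processing step could be logged together with the responsible person, its parameters, the software environment, and the reason for the step. The structure below is our own sketch, not a standardized provenance format; see [16, 17] for mature approaches.

```python
import datetime
import platform

# Home-grown provenance log: one entry per processing step, answering
# "who did what, with what, when, how, and why". The structure is an
# illustration only, not a standardized provenance format.
provenance = []

def log_step(operator: str, operation: str, parameters: dict, reason: str) -> None:
    provenance.append({
        "who": operator,
        "what": operation,
        "when": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "how": {
            "parameters": parameters,              # explicit parameters
            "python": platform.python_version(),   # computing environment
        },
        "why": reason,
    })

log_step("jdoe", "baseline correction",
         {"method": "polynomial", "order": 2},
         "remove instrument drift before peak integration")
log_step("jdoe", "smoothing",
         {"method": "savitzky-golay", "window": 11},
         "reduce high-frequency noise prior to peak detection")

for step in provenance:
    print(step["when"], "-", step["what"], step["how"]["parameters"])
```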

4.2 Aspects of legal metrology relevant for research data management

  • Quality management
    Quality management for data and measurement procedures is more established and mature in the controlled environment of legal metrology. Generally, science (and research) has high standards regarding the quality of processes and results. However, at least in the academic world and in more fundamental research, quality assurance is often lacking. For some ideas from medical research on introducing concepts of quality management into research, see [19].
  • Representation of numerical information
    Proper and reliable representation of numerical information is definitely the core expertise of metrology and should be adopted in the scientific community as an essential part of good scientific practice. This includes a semantically correct and machine-interpretable notation for quantities, numbers and units, as prescribed e.g. in the “International Vocabulary of Metrology” [20] as well as in the “IUPAC Green Book” [21] for the physical and chemical sciences, but nevertheless sometimes ignored even by scientific journal publishers (see the formatting sketch after this list). Furthermore, the provision of an uncertainty budget a priori for measured and simulated values, as prescribed in the “Guide to the expression of uncertainty in measurement” [22] and already described back in 1935 by Fisher [23], heavily conditions the experimental design necessary to allow a meaningful statistical analysis a posteriori. A digital counterpart of the aforementioned recommendations is being built within the SI Digital Framework, coordinated by the BIPM [24]. For a general introduction to the problems of estimating uncertainties and the quality of fits to data, see [25]. For a discussion of why quantifying uncertainties may be more important than reproducibility, see [26].
  • Metadata schemes
    The development and adoption of dedicated and specialized metadata schemes (collections of field names), of the corresponding terminologies (collections of allowed field values), and of data/metadata formats for different “application profiles” (combinations of subject and method) is necessary, but has so far taken hold only within a few communities. These schemes can take the form of standards (defined, published and governed by an authoritative institution) or at least of conventions/quasi-standards (agreed informally within a community). In this context, national metrology institutes can provide the necessary expertise, connections and political power to promote high-quality and reliable conventions to binding standards.
  • Formalized and controlled environments
    The provision of a highly formalized and controlled environment, as is common in legal metrology with its well-defined processes, can ease the use of controlled terminologies (controlled vocabularies or ontologies, depending on complexity) and the validation of data descriptions against them, thereby increasing their acceptance.
  • Documentation
    Regarding documentation, metrology typically has much higher standards and demands than are often encountered in research. However, documenting processes in a digital manner with as little media disruption as possible is a great challenge faced by both metrology and science. Science-wise, this typically comprises scientific record-keeping via lab notebooks, which are increasingly digital as well. For a more detailed discussion of electronic lab notebooks (ELNs) and what they are (not), see [27].
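As a small illustration of the “representation of numerical information” item above: a widespread metrological convention is to round the uncertainty to a fixed number of significant digits (often two) and to state the value to the matching decimal place. The helper below is our own sketch of this convention; details vary between style guides, and edge cases (e.g. an uncertainty that rounds up to a new order of magnitude) are not handled.

```python
from math import floor, log10

def format_quantity(value: float, uncertainty: float, unit: str,
                    sig_digits: int = 2) -> str:
    """Round the uncertainty to `sig_digits` significant digits and state
    the value to the matching decimal place, e.g. '(101.325 ± 0.012) kPa'.
    Illustrative sketch of a common convention; details vary between
    style guides."""
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    # Decimal place of the last significant digit of the uncertainty:
    decimals = max(sig_digits - 1 - floor(log10(uncertainty)), 0)
    return f"({value:.{decimals}f} ± {uncertainty:.{decimals}f}) {unit}"

print(format_quantity(101.32487, 0.0123, "kPa"))  # (101.325 ± 0.012) kPa
print(format_quantity(9.8137, 0.021, "m/s²"))     # (9.814 ± 0.021) m/s²
```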

5 Conclusions

Taken together, the FAIR principles are clearly not sufficient for future-proof data handling in legal metrology, and perhaps they do not even constitute a necessary prerequisite. Nevertheless, many aspects summarized in the FAIR principles can potentially be useful for legal metrology in the 21st century. This includes, but is not limited to, tools for describing data provenance, collecting all necessary metadata throughout the entire data life cycle, developing standards for open formats including metadata schemes, facilitating data exchange with customers and between metrology institutions, and promoting platforms such as data repositories and searchable catalogues that provide systematic access to metrological data.

Legal metrology, on the other hand, traditionally has its own specific strengths. Among these are a systematic use of legal units, a conscious handling of uncertainties, and generally a focus on data quality, accuracy and traceability of measured values. We are convinced that both worlds can and should learn from each other. If, in particular, the rigorous virtues of legal metrology were transferred and systematically applied in the broader area of research data management, the best of both worlds in terms of data culture could be within reach.

 

References

[1] Derek John de Solla Price, 1963, Little Science, Big Science, Columbia University Press, New York

[2] Christine L. Borgman, 2015, Big Data, Little Data, No Data: Scholarship in the Networked World, MIT Press, Cambridge, MA. ISBN: 978-0-262-52991-4

[3] Gordon Bell, Tony Hey, and Alex Szalay, 2009, Beyond the data deluge, Science 323, 1297–1298. doi: 10.1126/science.1170411

[4] OECD Principles and Guidelines for Access to Research Data from Public Funding, 2007, OECD Publishing, Paris. doi: 10.1787/9789264034020-en-fr

[5] UNESCO Recommendation on Open Science, 2021, UNESCO, Paris. doi: 10.54677/MNMH8546

[6] Tony Hey, Stewart Tansley, and Kristin Tolle, eds., 2009, The Fourth Paradigm, Microsoft Research, Redmond, Washington

[7] Jenn Riley, 2017, Understanding Metadata, NISO, Baltimore, MD

[8] Mark D. Wilkinson et al., 2016, The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data 3, 160018. issn: 2052-4463. doi: 10.1038/sdata.2016.18

[9] Barend Mons, 2020, Invest 5% of research funds in ensuring data are reusable, Nature 578, 491. doi: 10.1038/d41586-020-00505-7

[10] Carly Strasser, 2015, Research Data Management. NISO, Baltimore, MD

[11] Christophe Bahim et al., 2020, The FAIR Data Maturity Model: An Approach to Harmonise FAIR Assessments, Data Science Journal 19, 41. doi: 10.5334/dsj-2020-041

[12] Guidelines for Safeguarding Good Research Practice. Code of Conduct, 2022, Deutsche Forschungsgemeinschaft. doi: 10.5281/zenodo.6472827

[13] Fostering Integrity in Research, 2017, National Academies Press, Washington, DC. ISBN: 978-0-309-39125-2. doi: 10.17226/21896

[14] Siegfried Hackel et al., 2021, The fundamental architecture of the DCC, Measurement: Sensors 18, 100354. doi: 10.1016/j.measen.2021.100354

[15] Christine L. Borgman, 2012, The conundrum of sharing research data, J. Am. Soc. Information Science and Technology 63, 1059–1078. doi: 10.1002/asi.22634

[16] Bernd Paulus and Till Biskup, 2023, Towards more reproducible and FAIRer research data: documenting provenance during data acquisition using the Infofile format, Digital Discovery 2, 234–244. doi: 10.1039/D2DD00131D

[17] Jara Popp and Till Biskup, 2022, ASpecD: A modular framework for the analysis of spectroscopic data focussing on reproducibility and good scientific practice, Chemistry–Methods 2, e202100097. doi: 10.1002/cmtd.202100097

[18] Till Biskup, 2022, LabInform: A modular laboratory information system built from open source components, In: ChemRxiv. doi: 10.26434/chemrxiv-2022-vz360

[19] Ulrich Dirnagl et al., 2018, Quality management for academic laboratories: burden or boon?, EMBO Reports 19, e47143. doi: 10.15252/embr.201847143

[20] BIPM et al., 2012, International vocabulary of metrology – Basic and general concepts and associated terms (VIM), JCGM 200:2012 (3rd edition), Joint Committee for Guides in Metrology. doi: 10.59161/JCGM200-2012

[21] Ian Mills et al., eds., 1993, Quantities, Units and Symbols in Physical Chemistry (2nd edition), Blackwell Science, Oxford

[22] BIPM et al., 2023, Guide to the expression of uncertainty in measurement – Part 1: Introduction. JCGM GUM-1:2023. doi: 10.59161/JCGMGUM-1-2023

[23] R. A. Fisher, 1935, The Design of Experiments, Oliver and Boyd, London

[24] SI Reference Point. url: https://si-digital-framework.org/SI (visited on 2025-05-28)

[25] David W. Hogg, Jo Bovy, and Dustin Lang, 2010, Data analysis recipes: Fitting a model to data, arXiv: 1008.4686 [astro-ph.IM].

[26] Anne Plant and Robert Hanisch, 2020, Reproducibility in Science: A Metrology Perspective, Harvard Data Science Review 2(4). doi: 10.1162/99608f92.eb6ddee4

[27] Mirjam Schröder and Till Biskup, 2023, LabInform ELN: A lightweight and flexible electronic laboratory notebook for academic research based on the open-source software DokuWiki. In: ChemRxiv. doi: 10.26434/chemrxiv-2023-2tvct

 

 
