Collections Digitization Glossary

A defined term set in the domain of museum collections digitization

  • The Biological Collections Ontology
  • The BCO supports the interoperability of biodiversity and biodiversity related data, including data on museum collections, environmental/metagenomic samples, and ecological surveys. A key aspect of the BCO is distinguishing among material samples (i.e. specimens), observing processes, and data about either of those entities.
  • https://github.com/BiodiversityOntologies/bco
  • Basic Formal Ontology
  • The Basic Formal Ontology (BFO) is a small, upper level ontology that is designed for use in supporting information retrieval, analysis and integration in scientific and other domains.
  • https://basic-formal-ontology.org/
  • CHAracterisation MEthodology Ontology (CHAMEO)
  • A semantic framework designed to harmonize materials characterisation methodologies. CHAMEO provides a standardized terminology and structure to model common aspects across various characterisation techniques, facilitating interoperability and knowledge sharing in materials science.
  • https://emmo-repo.github.io/domain-characterisation-methodology/index.html
  • Core Ontology for Biology and Biomedicine
  • COB brings together key terms from a wide range of OBO projects to improve interoperability.
  • https://obofoundry.org/COB/
  • Croissant Format Specification
  • The Croissant metadata format simplifies how data is used by ML models. It provides a vocabulary for dataset attributes, streamlining how data is loaded across ML frameworks such as PyTorch, TensorFlow or JAX. In doing so, Croissant enables the interchange of datasets between ML frameworks and beyond, tackling a variety of discoverability, portability, reproducibility, and responsible AI (RAI) challenges.
  • https://docs.mlcommons.org/croissant/docs/croissant-spec.html
  • CIDOC Conceptual Reference Model
  • A formal ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information and similar information from other domains.
  • http://www.cidoc-crm.org/cidoc-crm/
  • Contributor Role Ontology
  • A classification of the diverse roles performed in the work leading to a published research output in the sciences. Its purpose is to provide transparency in contributions to scholarly published work, to enable improved systems of attribution, credit, and accountability.
  • https://ontobee.org/ontology/CRO
  • DataCite Metadata Schema
  • DataCite Metadata Working Group. (2024). DataCite Metadata Schema Documentation for the Publication and Citation of Research Data and Other Research Outputs. Version 4.6. DataCite e.V. [https://doi.org/10.14454/mzv1-5b55](https://doi.org/10.14454/mzv1-5b55)
  • https://doi.org/10.14454/mzv1-5b55
  • Data Catalog Vocabulary (Version 3)
  • DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. This document defines the schema and provides examples for its use.
  • https://www.w3.org/TR/vocab-dcat-3/
  • Darwin Core
  • Darwin Core is a standard maintained by the Darwin Core Maintenance Group. It includes a glossary of terms (in other contexts these might be called properties, elements, fields, columns, attributes, or concepts) intended to facilitate the sharing of information about biological diversity by providing identifiers, labels, and definitions.
  • https://dwc.tdwg.org/
  • EDAM - Ontology of bioscientific data analysis and data management
  • EDAM is a comprehensive ontology of well-established, familiar concepts that are prevalent within scientific data analysis and data management (both within and beyond life sciences and [imaging](https://github.com/edamontology/edam-bioimaging)). EDAM includes topics, operations, types of data and data identifiers, and data formats. EDAM provides a set of concepts with preferred terms and synonyms, related terms, definitions, and other information, organised into a simple and intuitive hierarchy for convenient use.
  • https://edamontology.org
  • Experimental Factor Ontology
  • The Experimental Factor Ontology (EFO) provides a systematic description of many experimental variables available in EBI databases, and for projects such as the GWAS Catalog. It combines parts of several biological ontologies, such as UBERON anatomy, ChEBI chemical compounds, and Cell Ontology.
  • https://www.ebi.ac.uk/efo/
  • Elementary Multiperspective Material Ontology
  • The Elementary Multiperspective Material Ontology (EMMO) is the result of a multidisciplinary effort within the EMMC, aimed at the development of a standard representational ontology framework based on current materials modelling and characterization knowledge. Instead of starting from general upper level concepts, as done by other ontologies, the EMMO development started from the very bottom level, using the actual picture of the physical world coming from applied sciences, and in particular from physics and material sciences.
  • https://emmo-repo.github.io/index.html
  • FRBR-aligned Bibliographic Ontology (FaBiO)
  • FaBiO, the FRBR-aligned Bibliographic Ontology, is an ontology for recording and publishing on the Semantic Web descriptions of entities that are published or potentially publishable, and that contain or are referred to by bibliographic references, or entities used to define such bibliographic references.
  • http://www.sparontologies.net/ontologies/fabio
  • The Funding, Research Administration and Projects Ontology
  • The Funding, Research Administration and Projects Ontology (FRAPO) is an ontology for describing the administrative information of research projects, e.g., grant applications, funding bodies, project partners, etc.
  • https://sparontologies.github.io/frapo/current/frapo.html
  • GConsent is a consent ontology based on the GDPR
  • GConsent is an OWL2 ontology for representing consent for GDPR compliance. The ontology is based on an analysis of modelling metadata requirements related to the consent lifecycle for GDPR compliance. It allows modelling and representation of information related to compliance in an extensible and comprehensive manner.
  • https://openscience.adaptcentre.ie/ontologies/GConsent/docs/ontology
  • Gender, Sex, and Sexual Orientation (GSSO) ontology
  • The Gender, Sex, and Sexual Orientation (GSSO) ontology has terms for annotating interdisciplinary information concerning gender, sex, and sexual orientation for primary usage in the biomedical and adjacent sciences.
  • https://gsso.research.cchmc.org/
  • International Virtual Observatory Alliance (IVOA) Provenance Data Model Version 1.0
  • Describes how provenance information can be modeled, stored and exchanged within the astronomical community in a standardized way.
  • http://www.ivoa.net/documents/ProvenanceDM
  • Ontology for Informatics Research Artifacts
  • Informatics and computer science researchers usually contribute to scientific knowledge by delivering tangible outputs, namely research artifacts. Typical examples of informatics research artifacts are software prototypes, datasets, ontologies, methodologies, and frameworks. Computer science conferences such as ISWC, ESWC, and others have started to use resource tracks to allow for resource papers that describe these artifacts. The goal of the Ontology for Informatics Research Artifacts (IRAO) is to fill this gap.
  • https://w3id.org/def/InformaticsResearchArtifactsOntology
  • ISO 12651-2:2014(en) Electronic document management — Vocabulary — Part 2: Workflow management
  • ISO 12651-2:2014 defines terms and concepts relevant to electronic document workflow management.
  • https://www.iso.org/standard/42673.html
  • ISO 16439:2014 Information and documentation — Methods and procedures for assessing the impact of libraries
  • ISO 16439:2014 defines terms for impact assessment of libraries and specifies methods for such assessment: for the purpose of strategic planning and internal quality management of libraries; to facilitate comparison of library impact over time and between libraries of similar type and mission; to promote the libraries' role and value for learning and research, education and culture, social and economic life; to support political decisions on levels of service and strategic goals for libraries.
  • https://www.iso.org/standard/56756.html
  • ISO 17369:2013(en) Statistical data and metadata exchange (SDMX)
  • ISO 17369:2013 provides an integrated approach to facilitating Statistical Data and Metadata Exchange (SDMX), enabling interoperable implementations within and between systems concerned with the exchange, reporting and dissemination of statistical data and related metadata.
  • https://www.iso.org/standard/52500.html
  • ISO 18461:2016 International museum statistics
  • ISO 18461:2016 specifies rules for the museum community on the collection and reporting of statistics. It provides definitions and counting procedures for all types of resources and services that museums offer to their users.
  • https://www.iso.org/standard/62504.html
  • ISO 18530:2021 Health informatics — Automatic identification and data capture marking and labelling — Subject of care and individual provider identification
  • This document outlines the standards needed to identify and label the Subject of Care (SoC) and the Individual Provider on objects such as identification (wrist) bands, identification tags or other objects, to enable automatic data capture using data carriers in the care delivery process.
  • https://www.iso.org/standard/77333.html
  • ISO 19101-1:2014(en) Geographic information — Reference model — Part 1: Fundamentals
  • ISO 19101-1:2014 defines the reference model for standardization in the field of geographic information. This reference model describes the notion of interoperability and sets forth the fundamentals by which this standardization takes place.
  • https://www.iso.org/standard/59164.html
  • ISO 19104:2016(en) Geographic information — Terminology
  • ISO 19104:2016 specifies requirements for the collection, management and publication of terminology in the field of geographic information.
  • https://www.iso.org/standard/63541.html
  • ISO 19126:2021(en) Geographic information — Feature concept dictionaries and registers
  • ISO 19126 specifies a schema for feature concept dictionaries to be established and managed as registers. It does not specify schemas for feature catalogues or for the management of feature catalogues as registers. However, as feature catalogues are often derived from feature concept dictionaries, this document does specify a schema for a hierarchical register of feature concept dictionaries and feature catalogues.
  • https://www.iso.org/standard/78898.html
  • ISO 21246:2019 Information and documentation — Key indicators for museums
  • ISO 21246 specifies a set of key indicators for assessing the quality of museums: for the purpose of strategic planning and internal management of museums; for reporting to stakeholders such as funding institutions, policy makers, or the public; to promote the museums' role and value for learning and research, education and culture, social and economic life; for comparing results over time and between museums.
  • https://www.iso.org/standard/70231.html
  • ISO 21506:2024 Project, programme and portfolio management — Vocabulary
  • ISO 21506:2024 defines terms used in the field of project, programme and portfolio management.
  • https://www.iso.org/standard/87900.html
  • ISO 22932-2:2020(en) Mining — Vocabulary — Part 2: Geology
  • ISO 22932-2:2020 specifies the geologic terms commonly used in mining. Only those terms that have a specific meaning in this field are included.
  • https://www.iso.org/standard/75387.html
  • ISO 24480:2024(en) Biotechnology — Validation of database used for nucleotide sequence evaluation
  • ISO 24480:2024 describes a practical procedure for nucleotide sequence database evaluation and validation.
  • https://www.iso.org/standard/78877.html
  • ISO 24635-1:2025(en) Language resource management — Corpus annotation project management
  • ISO 24635-1:2025 establishes a core model of project management for corpus annotation, to specify the work packages of project teams, required processes and deliverables.
  • https://www.iso.org/standard/79083.html
  • ISO 2789:2013(en) Information and documentation — International library statistics
  • This International Standard provides guidance to the library and information services community on the collection and reporting of statistics.
  • https://www.iso.org/standard/60680.html
  • ISO 5127:2017(en) Information and documentation — Foundation and vocabulary
  • ISO 5127:2017 provides a concept system and general vocabulary for the field of documentation within the whole information field.
  • https://www.iso.org/standard/59743.html
  • ISO 6707-2:2017(en) Buildings and civil engineering works — Vocabulary — Part 2: Contract and communication terms
  • ISO 6707-2:2017 defines terms applicable to contracts and communication in relation to buildings and civil engineering works.
  • https://www.iso.org/standard/70040.html
  • ISO 9241-125:2017(en) Ergonomics of human-system interaction — Part 125: Guidance on visual presentation of information
  • ISO 9241-125:2017 provides guidance for the visual presentation of information controlled by software, irrespective of the device. It includes specific properties such as the syntactic or semantic aspects of information, e.g. coding techniques, and gives provisions for the organization of information taking account of human perception and memory capabilities.
  • https://www.iso.org/standard/64839.html
  • ISO/IEC 23000-15:2016 Information technology — Multimedia application format (MPEG-A) — Part 15: Multimedia preservation application format
  • ISO/IEC 23000-15:2016 specifies the standard representation of the multimedia description information (MPDI) generated and used by an organization in the process of preserving a multimedia asset for the purpose of facilitating the exchange of multimedia content between archives or other stakeholders (e.g. publishers, broadcasters, service providers and the like), as well as subsequent preservation and use.
  • https://www.iso.org/standard/66430.html
  • ISO/IEC 5207:2024(en) Information technology — Data usage — Terminology and use cases
  • This document sets out terminology and use cases for data use, sharing and exchange. This document provides use cases detailing various types of data usage from both historical and hypothetical perspectives.
  • https://www.iso.org/standard/80998.html
  • ISO/IEC TR 15067-3-8:2020(en) Information technology — Home Electronic System (HES) application model — Part 3-8: GridWise transactive energy framework
  • ISO/IEC TR 15067-3-8:2020(E), which is a Technical Report, provides a conceptual framework for developing architectures and designing solutions related to transactive energy (TE).
  • https://www.iso.org/standard/81781.html
  • ISO/IEC/IEEE 24748-7000:2022 Systems and software engineering — Life cycle management
  • The standard establishes a set of processes by which engineers and technologists can include consideration of ethical values throughout the stages of concept exploration and development, which encompass system initiation, analysis, and design.
  • https://www.iso.org/standard/84893.html
  • ISO/TR 14872:2019(en) Health informatics — Identification of medicinal products
  • The purpose of this document is to describe the core principles and proposed service delivery model for supporting implementation and ongoing maintenance of IDMP terminologies.
  • https://www.iso.org/standard/65714.html
  • ISO/TS 16710-1:2024(en) Ergonomics methods — Part 1: Feedback method
  • This document describes the “Feedback Method”, a method designed specifically to collect the contribution of machinery end-users by reconstructing and understanding how work is actually performed (i.e. the real work). This method can help to improve technical standards, as well as the design, manufacturing, and use of machinery.
  • https://www.iso.org/standard/84153.html
  • Lightweight Information Describing Objects
  • LIDO is an XML schema intended for delivering metadata, for use in a variety of online services, from an organization’s collections database to portals of aggregated resources, as well as exposing, sharing and connecting data on the web.
  • http://lido-schema.org/schema/latest/lido.html
  • Medical Subject Headings
  • MeSH (Medical Subject Headings) is the National Library of Medicine's controlled vocabulary thesaurus. It consists of sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity. This thesaurus is used by NLM for indexing articles from biomedical journals, cataloguing of books, documents, etc.
  • https://w3id.org/biopragmatics/resources/mesh/mesh.ofn
  • Metadata4Ing: An ontology for describing the generation of research data within a scientific activity.
  • The ontology Metadata4Ing provides a framework for the semantic description of research data and of the whole data generation process, embracing the object of investigation, all sample and data manipulation methods and tools, the data files themselves, and the roles of persons and institutions. The structure and application of the ontology are based on the principles of modularity and inheritance.
  • https://nfdi4ing.pages.rwth-aachen.de/metadata4ing/metadata4ing/
  • NCI Thesaurus OBO Edition
  • NCI Thesaurus (NCIt) is a reference terminology that includes broad coverage of the cancer domain, including cancer related diseases, findings and abnormalities. The NCIt OBO Edition aims to increase integration of the NCIt with OBO Library ontologies. NCIt OBO Edition releases should be considered experimental.
  • https://ontobee.org/ontology/NCIT
  • Next generation Biobanking Ontology
  • Next Generation Biobanking Ontology (NGBO) is an open application ontology representing contextual data about omics digital assets in biobanks. The ontology focuses on capturing the information about three main activities: wet bench analysis used to generate omics data, bioinformatics analysis used to analyze and interpret data, and data management.
  • https://github.com/Dalalghamdi/NGBO
  • Ontology for Biomedical Investigations
  • The Ontology for Biomedical Investigations (OBI) project is developing an integrated ontology for the description of life-science and clinical investigations.
  • http://obi-ontology.org/
  • Ontology for BIoBanking (OBIB)
  • The Ontology for Biobanking (OBIB) is an ontology for the annotation and modeling of the activities, contents, and administration of a biobank. Biobanks are facilities that store specimens, such as bodily fluids and tissues, typically along with specimen annotation and clinical data. OBIB is based on a subset of the Ontology for Biomedical Investigations (OBI), has the Basic Formal Ontology (BFO) as its upper ontology, and is developed following OBO Foundry principles. The first version of OBIB resulted from the merging of two existing biobank-related ontologies, OMIABIS and biobank ontology.
  • http://purl.obolibrary.org/obo/obib.owl
  • Ontology of Precision Medicine and Investigation
  • The Ontology of Precision Medicine and Investigation (OPMI) aims to ontologically represent and standardize various entities and relations associated with precision medicine and related investigations at different conditions.
  • https://github.com/OPMI/opmi
  • The OPMW-PROV Ontology
  • The Open Provenance Model for Workflows (OPMW) is an ontology for describing workflow traces and their templates based on the Open Provenance Model. It has been designed as a profile for OPM, extending and reusing OPM's core ontologies [OPMV (OPM-Vocabulary)](http://purl.org/net/opmv/ns) and [OPMO (OPM-Ontology)](http://openprovenance.org/model/opmo). Since 2013, a standard for provenance has been approved by the W3C. Therefore, OPMW has been updated to extend the standard ([PROV-O](http://www.w3.org/ns/prov-o)). It also extends the [P-plan ontology](http://purl.org/net/p-plan), designed to represent scientific processes.
  • https://www.opmw.org/model/OPMW/
  • P-Plan Ontology
  • The Ontology for Provenance and Plans (P-Plan) is an extension of the PROV-O ontology [[PROV-O](https://vocab.linkeddata.es/p-plan/index.html#PROV-O)] created to represent the plans that guided the execution of scientific processes. P-Plan describes how the plans are composed and their correspondence to provenance records that describe the execution itself.
  • https://www.opmw.org/model/p-plan/
  • PAV - Provenance, Authoring and Versioning
  • PAV is a lightweight ontology for tracking Provenance, Authoring and Versioning. PAV specializes the W3C provenance ontology PROV-O in order to describe authorship, curation and digital creation of online resources.
  • https://pav-ontology.github.io/pav/
  • Physicalistic Interpretation of Modelling and Simulation - Interoperability Infrastructure
  • A mid-level ontology with a focus on documenting cognitive processes and epistemic metadata.
  • http://www.molmod.info/semantics/pims-ii/
  • PROV-O: The PROV Ontology
  • The PROV Ontology (PROV-O) provides a set of classes, properties, and restrictions that can be used to represent and interchange provenance information generated in different systems and under different contexts. It can also be specialized to create new classes and properties to model provenance information for different applications and domains.
  • https://www.w3.org/TR/prov-o/
  • ProvONE: A PROV Extension Data Model for Scientific Workflow Provenance
  • ProvONE, a standard for scientific workflow provenance representation, is defined as an extension of the W3C recommended standard PROV, aiming to capture the most relevant information concerning scientific workflow computational processes, and providing extension points to accommodate the specificities of particular scientific workflow systems.
  • http://purl.dataone.org/provone/2015/01/15/ontology#
  • The Publishing Workflow Ontology (PWO)
  • The Publishing Workflow Ontology (PWO) is a simple ontology for describing the steps in the workflow associated with the publication of a document or other publication entity.
  • https://sparontologies.github.io/pwo/current/pwo.html
  • Repository Asset Distribution (RADion)
  • RADion, and the higher level vocabularies that build upon it, are intended as a model that facilitates federation and co-operation. It is not the primary intention that repository owners redesign or convert their current systems and data to conform to RADion, but rather that it acts as a common layer among repositories that want to exchange data.
  • https://www.w3.org/ns/radion
  • International Council on Archives Records in Contexts Ontology (ICA RiC-O) version 1.1
  • RiC-O (Records in Contexts-Ontology) is an OWL ontology for describing archival record resources. As the third part of the Records in Contexts standard, it is a formal representation of the Records in Contexts Conceptual Model (RiC-CM). This version, which is v1.1, is the latest official release. It is compliant with RiC-CM v1.0.
  • https://www.ica.org/standards/RiC/RiC-O_1-1.html
  • Schema.org
  • Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond.
  • https://schema.org/
  • Semanticscience Integrated Ontology
  • The semanticscience integrated ontology (SIO) provides a simple, integrated upper level ontology (types, relations) for consistent knowledge representation across physical, processual and informational entities. It provides vocabulary for the Bio2RDF (http://bio2rdf.org) and SADI (http://sadiframework.org) projects.
  • http://sio.semanticscience.org/
  • Semantic Sensor Network Ontology
  • The Semantic Sensor Network (SSN) ontology is an ontology for describing sensors and their observations, the involved procedures, the studied features of interest, the samples used to do so, and the observed properties, as well as actuators.
  • https://www.w3.org/TR/vocab-ssn/
  • USGS Thesaurus
  • Topics and methods of scientific study carried out by USGS, with product types, scientific disciplines, geologic time, and types of institutional structure and activities. Broad and shallow, used to help people find scientific information.
  • https://apps.usgs.gov/thesaurus/download/USGSThesaurus.rdf
  • The Workflow Fragment Description Ontology
  • The Workflow Fragment Description Ontology (wf-fd) is a simple ontology designed to link the common workflow fragments detected by applying graph mining techniques to a collection of workflows to the original workflow collection. That is, wf-fd links the common workflow fragments to the workflows where they appear.
  • https://vocab.linkeddata.es/wffd/
  • The Workflow Motif Ontology
  • This document explains the ontology for the Workflow Motif catalogue described in [Workflow Catalogue]. The catalogue highlights the results obtained from a manual analysis performed over a set of real-world scientific workflows from Taverna [Taverna], Wings [Wings], Galaxy [Galaxy] and Vistrails [Vistrails]. Workflow Motifs outline the kinds of data-intensive activities that are observed in workflows (data-operation motifs) and the different manners in which activities are implemented within workflows (workflow-oriented motifs). These motifs are helpful to identify the functionality of the steps in a given workflow, to develop best practices for workflow design, and to develop approaches for automated generation of workflow abstractions.
  • http://purl.org/net/wf-motifs
  • The WICUS Software Stack ontology
  • The WICUS Software Stack ontology has been developed to describe the software elements of a computational resource. These descriptions can be used to describe both the already deployed software components and the software requirements of a workflow. This ontology is part of the [WICUS Ontology network](http://purl.org/net/wicus).
  • https://vocab.linkeddata.es/wicus/stack/
  • W3C Provenance Activity Type Codelist
  • This value set includes W3C PROV Data Model Activity concepts, which are treated as codes in this valueset. Some adaptations were made to make these concepts suitable values for the Provenance.activity element. Coded concepts are from PROV-DM and the display names are their counterparts in PROV-N (human readable notation syntax specification).
  • http://hl7.org/fhir/w3c-provenance-activity-type
  • CIDOC Conceptual Reference Model version 7.1.3
  • The CIDOC Conceptual Reference Model (“CIDOC CRM”) is a formal ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information and similar information from other domains.
  • https://cidoc-crm.org/sites/default/files/cidoc_crm_version_7.1.3.pdf