Open Access

Provenance Description of Metadata Vocabularies for the Long-term Maintenance of Metadata



Purpose

This paper discusses the provenance description of metadata terms and of metadata vocabularies, which are sets of metadata terms. Provenance information is crucial for keeping track of changes to metadata terms and vocabularies so that they can be maintained consistently.

Design/methodology/approach

The W3C PROV standard for general provenance description and the Resource Description Framework (RDF) are adopted as the base models for formally defining the provenance description of metadata vocabularies.

Findings

This paper defines a few primitive change types for metadata terms and a provenance description model for metadata terms based on those change types. We also provide examples of provenance descriptions as RDF graphs to illustrate the proposed model.

Research limitations

The proposed model is defined in terms of a few primitive relationships (e.g. addition, deletion, and replacement) between the pre-version and the post-version of a metadata term. The model is a simplification: changes to metadata terms in practice can be more complicated than the primitive relationships it covers.
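As a rough illustration of how such primitive relationships might be recorded, the following minimal Python sketch stores RDF-style triples as plain tuples and uses three properties from the W3C PROV-O vocabulary (prov:generatedAtTime, prov:invalidatedAtTime, prov:wasRevisionOf). It is not the model defined in the paper: the mapping of addition, deletion, and replacement onto these properties, the example namespace, the term names, the dates, and all function names are assumptions made for illustration only.

```python
from datetime import datetime

PROV = "http://www.w3.org/ns/prov#"   # W3C PROV-O namespace
EX = "http://example.org/vocab/"      # hypothetical vocabulary namespace


def add_term(graph, term, when):
    """Addition (assumed mapping): a term version comes into existence."""
    graph.add((term, PROV + "generatedAtTime", when.isoformat()))


def delete_term(graph, term, when):
    """Deletion (assumed mapping): a term version stops being valid."""
    graph.add((term, PROV + "invalidatedAtTime", when.isoformat()))


def replace_term(graph, old, new, when):
    """Replacement (assumed mapping): the pre-version is invalidated and
    the post-version is generated as a revision of it."""
    delete_term(graph, old, when)
    add_term(graph, new, when)
    graph.add((new, PROV + "wasRevisionOf", old))


def history(graph, term):
    """Follow prov:wasRevisionOf links from a term back to its earliest version."""
    chain = [term]
    while True:
        prev = [o for (s, p, o) in graph
                if s == chain[-1] and p == PROV + "wasRevisionOf"]
        # Stop at the earliest version, or if the links form a cycle.
        if not prev or prev[0] in chain:
            return chain
        chain.append(prev[0])


# Hypothetical example: one term is added, then replaced by a new version.
graph = set()
add_term(graph, EX + "creator-v1", datetime(2019, 1, 1))
replace_term(graph, EX + "creator-v1", EX + "creator-v2", datetime(2020, 6, 1))
print(history(graph, EX + "creator-v2"))
```

Keeping the triples in a plain set keeps the sketch free of dependencies; in practice an RDF library and serialization would likely be used instead, and changes more complex than these three primitives would need additional relationships.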

Practical implications

Formal provenance description of metadata vocabularies can improve their maintainability over time. Conventionally, metadata terms are maintained by maintaining the documents that describe them. The proposed model enables effective, automated tracking of the change history of metadata vocabularies using a simple formal description scheme based on widely used standards.

Originality/value

Changes in metadata vocabularies may cause inconsistencies in the long-term use of metadata. This paper proposes a simple, formal scheme for the provenance description of metadata vocabularies. The proposed model serves as a basis for the automated maintenance of metadata terms and their vocabularies and is applicable to various types of changes.

eISSN: 2543-683X
Language: English
Publication timeframe: 4 times per year
Journal Subjects: Computer Sciences, Information Technology, Project Management, Databases and Data Mining