Open Access

A Novel Method for Resolving and Completing Authors’ Country Affiliation Data in Bibliographic Records


Purpose

Our work seeks to overcome data quality issues related to incomplete author affiliation data in bibliographic records in order to support accurate and reliable measurement of international research collaboration (IRC).

Design/methodology/approach

We propose, implement, and evaluate a method that leverages the Web-based knowledge graph Wikidata to resolve publication affiliation data to particular countries. The method is tested with general and domain-specific data sets.
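As an illustration only, the following Python sketch shows one way an affiliation string could be resolved to a country through Wikidata. It is not the authors' implementation, which the abstract describes as R-based. The function names `build_country_query` and `resolve_country`, the comma-splitting heuristic, and the stubbed lookup table are all assumptions introduced for this example. The only Wikidata-specific detail relied on is the property P17 ("country").

```python
from typing import Callable, Optional

WIKIDATA_COUNTRY_PROPERTY = "P17"  # Wikidata property "country"


def build_country_query(org_label: str, lang: str = "en") -> str:
    """Build a SPARQL query asking for the country (P17) of items with this exact label."""
    # Escape backslashes and double quotes so the label is a valid SPARQL string literal.
    escaped = org_label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?org ?countryLabel WHERE {\n"
        f'  ?org rdfs:label "{escaped}"@{lang} ;\n'
        f"       wdt:{WIKIDATA_COUNTRY_PROPERTY} ?country .\n"
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{lang}". }}\n'
        "}"
    )


def resolve_country(
    affiliation: str,
    lookup: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the first country found for any comma-separated part of the affiliation.

    `lookup` is a hypothetical callable mapping an organization name to a
    country name (or None); in practice it might wrap a Wikidata query.
    """
    for part in affiliation.split(","):
        name = part.strip()
        if not name:
            continue
        country = lookup(name)
        if country is not None:
            return country
    return None


# Stub standing in for a Wikidata lookup (assumption: results are cached locally).
_STUB = {"University of Oslo": "Norway"}


def stub_lookup(name: str) -> Optional[str]:
    return _STUB.get(name)


if __name__ == "__main__":
    print(build_country_query("University of Oslo"))
    print(resolve_country("Department of Physics, University of Oslo", stub_lookup))
```

In a real pipeline the stub would presumably be replaced by a call that sends the generated query to a Wikidata query endpoint. Injecting the lookup as a parameter keeps the resolution logic testable without network access.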

Findings

Our evaluation covers the magnitude of improvement, accuracy, and consistency. Results suggest the method is beneficial, reliable, and consistent, and thus a viable and improved approach to measuring IRC.

Research limitations

Though our evaluation suggests the method works with both general and domain-specific bibliographic data sets, it may perform differently with data sets not tested here. Further limitations stem from the use of the R programming language and R libraries for country identification, and from imbalanced data coverage and quality in Wikidata, which may also change over time.

Practical implications

The new method helps to increase the accuracy of IRC studies and provides a basis for further development into a general tool that enriches bibliographic data using the Wikidata knowledge graph.

Originality

This is the first attempt to enrich bibliographic data using a peer-produced, Web-based knowledge graph like Wikidata.

eISSN: 2543-683X
Language: English
Publication timeframe: 4 times per year
Journal Subjects: Computer Sciences, Information Technology, Project Management, Databases and Data Mining