Published June 24, 2025 | https://doi.org/10.5438/bbr6-9590

Mapping Mathematics – Integrating zbMATH Open and the PID Graph

zbMATH: A Backbone of Mathematical Research

Imagine having a super-powered map that connects every important idea, tool, and discovery in mathematics. That is what zbMATH and its incorporated platforms provide, among them swMATH, a dedicated portal for mathematical research software. These platforms are like the ultimate librarians of math research, gathering and organizing knowledge from academic papers, software, datasets, and even citation networks. But they don't just collect information; they make it smarter.

They use persistent identifiers (PIDs) to weave a web of connections between research outputs, software, datasets, and authors. Think of it as a digital trail of breadcrumbs that never disappears—ensuring that every piece of research stays discoverable, traceable, and reusable for years to come.

Why does this matter? Because in the fast-moving world of math, discoverability and reproducibility are everything. With PIDs, zbMATH Open helps researchers pinpoint exactly how a theorem, algorithm, or software tool evolves across different studies. No more dead ends or broken links—just clearer, more transparent math.

In short, they're not just databases; they're the connective tissue of modern mathematical research, making sure that every breakthrough is built on solid, well-documented ground.

Strengthening the Backbone Through FAIRCORE4EOSC Collaboration

An opportunity to strengthen this backbone came through the FAIRCORE4EOSC (FC4E) project, a strategic initiative that develops core components to advance the FAIR (Findable, Accessible, Interoperable, Reusable) principles within the European Open Science Cloud (EOSC). A central focus of this initiative is the PIDGraph, a component designed to enhance research interoperability by establishing persistent, structured links between scholarly entities, including publications, datasets, software, and researchers. To demonstrate its effectiveness, FC4E incorporates case studies that explore the integration of domain-specific metadata into the PIDGraph.

One such case study, led by the mathematical community through FIZ Karlsruhe, seeks to integrate zbMATH Open services into the PIDGraph, enriching its metadata with manually curated and aggregated information. This integration improves discoverability, interlinks mathematical research software with broader scientific outputs, and extends the reach of zbMATH Open within the EOSC ecosystem, fostering cross-disciplinary collaboration. It also helps ensure that mathematicians who contribute to the development of research software receive proper recognition for their work.

Beginning in late 2023, following discussions at the in-person FC4E project meeting in The Hague ahead of the project's Beta Release Milestone, the teams from DataCite and FIZ Karlsruhe held a series of meetings to lay the groundwork for integrating the Mathematics Case Study with the PIDGraph component.

A key outcome of these meetings was the decision to focus on zbMATH Open records associated with existing DataCite-registered DOIs (primarily preprints registered through arXiv). The enhancements that zbMATH Open makes to these records would be used to enrich the PIDGraph by ingesting new connections between those records and other scholarly research outputs, such as citations and references, as well as other entities represented by PIDs within the graph.

By bringing in a new source of curated, high-quality metadata, the utility of the PIDGraph would be enhanced by the broadening of coverage, and downstream consumers of the graph data would have access to the zbMATH Open data through an existing and interoperable system, providing benefit to the wider scholarly community.

Metadata Mapping

As part of the FAIRCORE4EOSC initiative, significant collaborative efforts were undertaken to map zbMATH Open and swMATH metadata for articles and software respectively to the DataCite Metadata Schema and metadata guidelines using XSLT. This process was facilitated through close cooperation between Mike Bennett on behalf of DataCite and Shiraz Malla Mohamad and Maxence Azzouz-Thuderoz representing FIZ Karlsruhe.

The work started with an initial XSLT crosswalk from the zbMATH metadata to the DataCite Metadata Schema, produced by the zbMATH team. At the same time, DataCite carried out architecture planning and code development to prepare for the inclusion of the new data source.

Once the first crosswalk was created, a sample set of data from zbMATH Open was transformed to the DataCite Metadata Schema. With the assistance of other colleagues from the wider DataCite team, some recommendations for fixing inconsistencies with the schema and improving the provision of data in appropriate fields were produced and implemented by the FIZ team, ultimately leading to a comprehensive transformation of the data into a form compatible with DataCite systems.

There were some challenges along the way – for example, not every record contained enough data to populate all of the properties that are required by the DataCite Metadata Schema, and data licensing requirements meant that some data could not be openly shared. By making use of the standardised values for unknown information provided by the schema, both of these problems were overcome, allowing the XML records to meet the validation requirements for mandatory fields.
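As a rough illustration of the fallback described above, the following Python sketch fills empty mandatory text properties with one of the schema's standard values for unknown information. The flat dictionary layout, the `fill_mandatory` helper, and the sample record are hypothetical (the real crosswalk is written in XSLT), and which standard value was actually used for each field is an assumption here.

```python
# Hypothetical flat view of a record; the real crosswalk operates on XML via XSLT.
# Only free-text mandatory properties are listed, since a coded value such as
# ":unav" would not fit properties with a constrained format.
FALLBACK_FIELDS = ("creator", "title", "publisher")

# DataCite standard value meaning "value unavailable, possibly unknown".
UNAVAILABLE = ":unav"


def fill_mandatory(record: dict) -> dict:
    """Return a copy of `record` with empty mandatory fields set to ':unav'."""
    filled = dict(record)
    for field in FALLBACK_FIELDS:
        if not filled.get(field):
            filled[field] = UNAVAILABLE
    return filled


# Placeholder record with a missing creator and publisher.
sample = {"title": "An example title"}
print(fill_mandatory(sample))
```

The same idea would cover data that cannot be shared for licensing reasons: the field is present for validation purposes, but carries a coded value rather than the restricted content.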

Another issue was the strict order requirements for elements imposed by the XSD used for validating the schema, where any deviation would result in metadata validation errors despite the data being present and correct. This was corrected during the rounds of feedback and had the added bonus of providing useful input into discussions for the design of the next version of the DataCite Metadata Schema.
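To make the ordering problem concrete, here is a minimal Python sketch that puts the children of an element into a fixed sequence before validation. The `reorder_children` helper and the sample data are hypothetical, and the child order listed for `creator` is an assumption for illustration; the authoritative sequence is whatever the XSD defines.

```python
import xml.etree.ElementTree as ET

# Assumed child order for a <creator> element; check the XSD for the real sequence.
CREATOR_ORDER = ["creatorName", "givenName", "familyName",
                 "nameIdentifier", "affiliation"]


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def reorder_children(parent: ET.Element, order: list[str]) -> None:
    """Sort children in place by their position in `order`; unknown tags go last.

    sorted() is stable, so repeated elements keep their relative order.
    """
    rank = {name: i for i, name in enumerate(order)}
    parent[:] = sorted(
        parent, key=lambda child: rank.get(local_name(child.tag), len(order))
    )


# Placeholder element whose data is correct but whose children are out of order.
creator = ET.fromstring(
    "<creator><affiliation>Example University</affiliation>"
    "<familyName>Doe</familyName><givenName>Jane</givenName>"
    "<creatorName>Doe, Jane</creatorName></creator>"
)
reorder_children(creator, CREATOR_ORDER)
print([child.tag for child in creator])
```

In an XSLT crosswalk the equivalent fix is simply to emit the elements in schema order; the sketch only shows why content that is present and correct can still fail validation.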

Whilst the work on producing the crosswalks was ongoing at FIZ, the DataCite team was busy with adding a new ingest agent to the Event Data service to handle the ingestion of the data from the zbMATH OAI server. By integrating the new source into an existing system for sharing PID connections, it was possible to take advantage of connected functionality to provide increased utility and visibility of the data – for example, the DataCite Commons frontend would automatically include any new citations and references derived from the data in the visual display of DOI records.

This new code for ingesting from OAI also went through several rounds of development and improvement, with each pass improving the amount and quality of the data added to the system, and tackling challenges inherent in the OAI-PMH protocol and the sheer quantity of data provided by the zbMATH Open service.
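One of the challenges inherent in OAI-PMH is that large result sets are delivered in pages linked by a resumption token. The Python sketch below shows that paging loop in its simplest form; it is not the Event Data ingest agent. The `harvest` function, the `oai.example.org` URL, and the `oai_datacite` metadata prefix are placeholders, and the demo uses an in-memory `fake_fetch` so that it runs without network access.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def http_fetch(url: str) -> bytes:
    """Default fetcher: plain HTTP GET."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def harvest(base_url: str, metadata_prefix: str, fetch=http_fetch):
    """Yield every <record> element, following resumption tokens page by page."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(base_url + "?" + urllib.parse.urlencode(params)))
        yield from root.iter(OAI_NS + "record")
        token = root.find(f".//{OAI_NS}resumptionToken")
        if token is None or not (token.text or "").strip():
            break  # a missing or empty token marks the last page
        # A resumption token replaces all other arguments except the verb.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}


# In-memory stand-ins for two response pages (placeholders, not real data).
PAGE_1 = ('<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
          '<record/><record/><resumptionToken>page-2</resumptionToken>'
          '</ListRecords></OAI-PMH>')
PAGE_2 = ('<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
          '<record/><resumptionToken/></ListRecords></OAI-PMH>')


def fake_fetch(url: str) -> str:
    return PAGE_2 if "resumptionToken=page-2" in url else PAGE_1


records = list(harvest("https://oai.example.org/", "oai_datacite", fetch=fake_fetch))
print(len(records))
```

A production harvester would also need retries, handling of expired tokens, and incremental harvesting by date, which is where much of the iterative improvement described above is likely to have gone.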

Shiraz and Mike worked meticulously to get the implementation right. This iterative approach, supported by regular online discussions and strengthened through in-person meetings at FC4E events, led to the successful integration of the metadata, meeting the expected standards and enhancing interoperability between zbMATH Open, swMATH, and DataCite services.

Results

Once both the crosswalk and the ingestion code were complete, the focus switched to ingesting the data into DataCite production systems and making it available as part of the PIDGraph services. The full set of transformed data was processed to extract the relevant connections between PIDs and add them to the Event Data service, instantly surfacing them through all PIDGraph systems.
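The extraction step can be pictured roughly as follows. This Python sketch reads a record in the DataCite Metadata Schema and returns one connection per DOI-typed related identifier. The `extract_connections` helper, the output keys, and the sample DOIs are hypothetical; the output is not the actual Event Data payload format.

```python
import xml.etree.ElementTree as ET

NS = {"d": "http://datacite.org/schema/kernel-4"}


def extract_connections(datacite_xml: str) -> list[dict]:
    """Return source/relation/target links for DOI-typed related identifiers."""
    root = ET.fromstring(datacite_xml)
    source = root.findtext("d:identifier", default="", namespaces=NS).strip()
    links = []
    for rel in root.iterfind("d:relatedIdentifiers/d:relatedIdentifier", NS):
        if rel.get("relatedIdentifierType") != "DOI":
            continue  # this sketch keeps only PID-to-PID links expressed as DOIs
        links.append({
            "source": source,
            "relation": rel.get("relationType"),
            "target": (rel.text or "").strip(),
        })
    return links


# Placeholder record; the DOIs below are not real.
SAMPLE = (
    '<resource xmlns="http://datacite.org/schema/kernel-4">'
    '<identifier identifierType="DOI">10.0000/example.1</identifier>'
    '<relatedIdentifiers>'
    '<relatedIdentifier relatedIdentifierType="DOI" relationType="References">'
    '10.0000/example.2</relatedIdentifier>'
    '<relatedIdentifier relatedIdentifierType="URL" relationType="IsDescribedBy">'
    'https://example.org/description</relatedIdentifier>'
    '</relatedIdentifiers>'
    '</resource>'
)
print(extract_connections(SAMPLE))
```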

Overall, several million new connections were ingested into the graph, significantly enhancing the records covered. One record, for example, gained 82 new references, none of which were previously known to the graph.

The ingested data also became part of the latest release of the DataCite PID Links Data File and was published to Zenodo as a dedicated dataset, providing additional avenues for consumption and reuse.

By using the well-adopted DataCite Metadata Schema for the records and providing the data via the standardised OAI-PMH protocol, we committed to promoting interoperable methods of data exchange. This also ensured that the data can be harvested and reused by other interested parties, both from the PIDGraph services and from the zbMATH Open OAI server.

Next Steps

The scope of this work, connections between PIDs, was defined by the requirements of the FAIRCORE4EOSC project, where the PIDGraph component had a requirement to ingest new vertices into the graph. The zbMATH Open records, however, carry further metadata enhancements (for example, subject classification using controlled vocabularies) that have the potential to enrich DataCite DOI records further in the future.

As DataCite works towards the goal of enriching and enhancing DOI metadata, we look forward to exploring future possibilities and partnerships to enhance the scholarly record for the benefit of the entire community, building on the work done during this collaboration.


Acknowledgments

The work to produce the data, ingest it, and generate the PID Links data file was made possible through support from the FAIRCORE4EOSC project. FAIRCORE4EOSC has received funding from the EU's Horizon Europe research and innovation programme under Grant Agreement no. 101057264.

Additional details

Identifiers

UUID
e3960070-e601-47ea-9511-f8b8107ea31f
GUID
https://datacite.org/?p=13097
URL
https://datacite.org/blog/mapping-mathematics-integrating-zbmath-open-and-the-pid-graph/

Dates

Issued
2025-06-24T08:07:46
Updated
2025-06-24T16:59:50