Building Better DOI Metadata: Lessons from the PID Network Germany Pilot at LMU Munich
The quality and interoperability of metadata are essential for connecting research outputs, people, organizations, and infrastructures across the scholarly record. As a project partner in the PID Network Germany project, DataCite addresses this need by bringing together stakeholders from research and cultural heritage sectors to strengthen the adoption, standardization, and practical implementation of persistent identifier (PID) systems in Germany. As part of this work, pilot implementations with LMU Munich and Technical University Dortmund explore how community-informed metadata guidance can be translated into concrete improvements in local repository and DOI workflows. These experiences closely align with DataCite's vision of metadata as a shared global asset that is open, connected, reusable, and maintained through community collaboration. This interview shines a light on the work done in the pilot at LMU Munich and how it contributes to DataCite's overall goal of improving metadata.
Improving DataCite DOI Metadata at Ludwig Maximilian University of Munich
At Ludwig Maximilian University of Munich, the University Library (UB) and the IT-Gruppe Geisteswissenschaften (ITG) have been working together for several years on improving the quality, consistency, and reuse of DataCite metadata. This collaboration resulted in an updated version of a DataCite Metadata Generator for schema version 4.3 as well as a DataCite Best Practice Guide, both originally developed within the project eHumanities interdisziplinär (funded by the Bavarian State Ministry of Science and the Arts from 2018 to 2023). The goal of this work was to establish and extend a FAIR-oriented metadata model for research data in the Digital Humanities in order to improve discoverability, interoperability, and reuse across repositories and library systems. Because research data in the Digital Humanities often require both formal and detailed subject-specific descriptions, the project combined the widely used DataCite Metadata Schema with recommendations for more structured and standardized metadata creation.
Building on user feedback and practical experience gained from institutional workflows following the initial release, the DataCite Metadata Generator was completely reimplemented, modernized to align with current technical standards, and enhanced to meet evolving functional requirements. As part of the project Aufbau HITS FDM within the Digitalverbund Bayern, the project partners ITG and UB once again collaborated closely throughout the development process. The result is a modern web-based application that anyone can use to create, validate, reuse, and enrich DataCite metadata in both XML and JSON formats.


This work aligns closely with the Praxisorientierte Leitlinien für Lieferanten von DataCite DOI-Metadaten ("Practical guidelines for providers of DataCite DOI metadata") published by the PID Network Germany project, in which LMU Munich took part as a pilot institution during the implementation phase, with funding provided through PID Network Germany. Both the DataCite Metadata Generator and the accompanying DataCite Best Practice Guide aim to make high-quality metadata creation more practical, interoperable, and sustainable for repositories, research institutions, and data providers.
- Which concrete challenge around DataCite DOI metadata did you want to address through the pilot?
One of the main challenges we observed was that creating high-quality DataCite metadata often requires substantial manual work and metadata expertise. Researchers and repository staff frequently struggle with incomplete metadata, inconsistent entries, or uncertainty about how to apply the DataCite Metadata Schema correctly.
Another issue was the limited reuse of existing metadata. In many workflows, metadata records already exist in XML or JSON formats, but they cannot easily be validated, improved, or adapted for new publication processes. This often results in duplicated work and inconsistent metadata quality across systems.
With the newly created and improved DataCite Metadata Generator, we wanted to address these issues by creating a workflow that supports both metadata creation and metadata reuse in a more efficient and user-friendly way.
- How do the DataCite Metadata Generator and the DataCite Best Practice Guide currently support your metadata workflows?
The updated DataCite Metadata Generator supports our workflows by providing a structured, web-based interface for creating and editing DataCite metadata. Users can generate metadata records by completing an interactive form that directly generates XML or JSON metadata files.
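To make the form-to-file step concrete, here is a minimal Python sketch. It is not the generator's actual code. The form field names (`doi`, `creators`, `title`, `publisher`, `year`, `resource_type`), the function names, and the example DOI are assumptions made for illustration. The JSON keys follow the property names commonly used in DataCite JSON, the XML uses the DataCite kernel-4 namespace, and only a handful of mandatory properties are covered.

```python
import json
import xml.etree.ElementTree as ET

# Namespace of the DataCite Metadata Schema 4.x XML serialization.
DATACITE_NS = "http://datacite.org/schema/kernel-4"


def form_to_record(form: dict) -> dict:
    """Map flat form input (hypothetical field names) to a DataCite-style record."""
    return {
        "doi": form["doi"],
        "creators": [{"name": name} for name in form["creators"]],
        "titles": [{"title": form["title"]}],
        "publisher": form["publisher"],
        "publicationYear": int(form["year"]),
        "types": {"resourceTypeGeneral": form["resource_type"]},
    }


def record_to_json(record: dict) -> str:
    """Serialize the record as pretty-printed JSON."""
    return json.dumps(record, indent=2, ensure_ascii=False)


def record_to_xml(record: dict) -> str:
    """Serialize the same record as DataCite XML (mandatory properties only)."""
    ET.register_namespace("", DATACITE_NS)

    def q(tag: str) -> str:
        # Qualify a tag name with the DataCite namespace.
        return f"{{{DATACITE_NS}}}{tag}"

    root = ET.Element(q("resource"))
    identifier = ET.SubElement(root, q("identifier"), identifierType="DOI")
    identifier.text = record["doi"]

    creators = ET.SubElement(root, q("creators"))
    for creator in record["creators"]:
        creator_el = ET.SubElement(creators, q("creator"))
        ET.SubElement(creator_el, q("creatorName")).text = creator["name"]

    titles = ET.SubElement(root, q("titles"))
    for title in record["titles"]:
        ET.SubElement(titles, q("title")).text = title["title"]

    ET.SubElement(root, q("publisher")).text = record["publisher"]
    ET.SubElement(root, q("publicationYear")).text = str(record["publicationYear"])

    general = record["types"]["resourceTypeGeneral"]
    resource_type = ET.SubElement(root, q("resourceType"), resourceTypeGeneral=general)
    resource_type.text = general  # free-text type; reuses the general type for brevity

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    example_form = {  # hypothetical form input
        "doi": "10.1234/example",
        "creators": ["Doe, Jane"],
        "title": "Sample Dataset",
        "publisher": "Example Repository",
        "year": "2024",
        "resource_type": "Dataset",
    }
    record = form_to_record(example_form)
    print(record_to_json(record))
    print(record_to_xml(record))
```

Keeping one intermediate record and deriving both serializations from it is one way to ensure the XML and JSON exports never drift apart.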
A key aspect of the tool is that it also allows users to upload existing metadata files. These records can then be checked for completeness, edited, enriched, and exported again. This significantly improves metadata reuse and reduces redundant manual work.
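A completeness check on an uploaded record could look roughly like the following Python sketch. It assumes a JSON upload and checks only the mandatory DataCite properties (identifier, creators, titles, publisher, publication year, resource type). The function names and the sample record are hypothetical, and the real tool presumably checks far more than this.

```python
import json

# JSON keys for the mandatory DataCite properties (an assumption about the
# serialization; the schema itself names them Identifier, Creator, Title,
# Publisher, PublicationYear, and ResourceType).
MANDATORY = ("doi", "creators", "titles", "publisher", "publicationYear", "types")


def missing_mandatory(record: dict) -> list:
    """Return the mandatory properties that are absent or empty."""
    missing = [key for key in MANDATORY if not record.get(key)]
    types = record.get("types") or {}
    # A "types" object without resourceTypeGeneral is still incomplete.
    if "types" not in missing and not types.get("resourceTypeGeneral"):
        missing.append("types.resourceTypeGeneral")
    return missing


def check_uploaded(text: str) -> list:
    """Parse an uploaded JSON document and report what is missing."""
    return missing_mandatory(json.loads(text))


if __name__ == "__main__":
    uploaded = '{"doi": "10.1234/example", "titles": [{"title": "Sample"}], "creators": []}'
    print(check_uploaded(uploaded))
```

Returning a list of missing properties, rather than a plain pass/fail flag, lets an interface point the user to exactly the fields that still need attention before the record is exported again.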

The DataCite Best Practice Guide and the Praxisorientierte Leitlinien für Lieferanten von DataCite DOI-Metadaten served as an important reference throughout the development process. These guides provide practical orientation for implementing the DataCite Metadata Schema.
- Which metadata fields or recommendations were especially relevant for your pilot, and why?
Persistent identifiers (PIDs) played a central role in the updated version of the DataCite Metadata Generator. In particular, ORCID identifiers for creators and contributors and ROR identifiers for organizations (e.g., creator, contributor, publisher, or funder) were highly relevant because they improve consistency, disambiguation, and machine readability.
To support this, the generator integrates external APIs such as ORCID and ROR. This allows metadata to be automatically enriched with additional information linked to these identifiers, reducing manual input and helping users create more standardized records.
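As a rough illustration of how identifiers end up in a record, the Python sketch below validates an ORCID iD using its published check-digit algorithm (ISO 7064 MOD 11-2) and then builds a creator entry carrying ORCID and ROR identifiers. It deliberately makes no network calls; in the real tool, the name and affiliation would presumably come from the ORCID and ROR API lookups. The function names are hypothetical, and the JSON keys reflect our understanding of DataCite JSON rather than the generator's internals.

```python
import re

# Four groups of four characters; only the last character may be "X".
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_digit(base15: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base15:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check both the format and the check character of an ORCID iD."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


def creator_with_pids(name: str, orcid: str, affiliation_name: str, ror_url: str) -> dict:
    """Build a creator entry with an ORCID iD and a ROR-identified affiliation."""
    if not is_valid_orcid(orcid):
        raise ValueError(f"Invalid ORCID iD: {orcid}")
    return {
        "name": name,
        "nameType": "Personal",
        "nameIdentifiers": [
            {
                "nameIdentifier": f"https://orcid.org/{orcid}",
                "nameIdentifierScheme": "ORCID",
                "schemeUri": "https://orcid.org",
            }
        ],
        "affiliation": [
            {
                "name": affiliation_name,
                "affiliationIdentifier": ror_url,
                "affiliationIdentifierScheme": "ROR",
            }
        ],
    }


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
    print(is_valid_orcid("0000-0002-1825-0097"))
```

Checking the check character locally catches most typing errors before any API request is made, which keeps the form responsive and avoids unnecessary lookups.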

We also placed a strong focus on making metadata as complete, consistent, and reusable as possible. To support this, the application offers dropdown menus for fields with fixed value lists in the DataCite Metadata Schema (e.g., Resource Type and Identifier Schema) and provides guidance on entering licensing information, languages, and subject schemes in structured form. These features improve metadata interoperability and downstream reuse.
- What changes, improvements, or decisions emerged during the pilot work?
The pilot phase led to several important improvements and architectural decisions. One major step was expanding the generator beyond the creation of new metadata records. The new version now also supports uploading, validating, editing, and re-exporting existing records, making the tool much more useful in real-world repository workflows.
Another key insight from the pilot was the importance of openness, flexibility, and maintainability. To support easy adoption by other institutions, the architecture was intentionally kept as lightweight and modular as possible, simplifying integration into existing infrastructures. Particular attention was also given to long-term sustainability: instead of tightly coupling the application logic to a single DataCite Metadata Schema version, the system was designed to accommodate future schema updates with minimal effort.
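One common way to avoid coupling application logic to a single schema version is to describe each version as data and keep the validation code generic. The Python sketch below illustrates that idea only; it is not a description of the generator's architecture. The `SchemaProfile` class, the registry, and the function names are hypothetical, and the resource type lists are small illustrative subsets rather than the full controlled lists (the 4.4 entries shown are among the values added in that version).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaProfile:
    """Version-specific rules, kept as data rather than code."""
    version: str
    resource_types: frozenset


# Illustrative subset of resourceTypeGeneral values, not the complete list.
_BASE_TYPES = frozenset({"Dataset", "Software", "Text", "Collection", "Other"})

PROFILES = {}


def register_profile(profile: SchemaProfile) -> None:
    """Make a schema version known to the validator."""
    PROFILES[profile.version] = profile


register_profile(SchemaProfile("4.3", _BASE_TYPES))
register_profile(SchemaProfile("4.4", _BASE_TYPES | {"JournalArticle", "Dissertation"}))


def check_resource_type(record: dict, version: str) -> list:
    """Validate resourceTypeGeneral against the profile of the given version."""
    profile = PROFILES.get(version)
    if profile is None:
        return [f"unsupported schema version: {version}"]
    value = (record.get("types") or {}).get("resourceTypeGeneral")
    if value not in profile.resource_types:
        return [f"resourceTypeGeneral {value!r} is not valid in schema {version}"]
    return []


if __name__ == "__main__":
    record = {"types": {"resourceTypeGeneral": "JournalArticle"}}
    print(check_resource_type(record, "4.3"))
    print(check_resource_type(record, "4.4"))
```

With this layout, supporting a future schema version means registering one more profile; `check_resource_type` itself stays unchanged.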
To encourage reuse and further development, the source code will be published openly on GitHub soon, enabling other institutions to integrate the tool into their own infrastructures and extend it with institution-specific requirements.
- What did you learn about making metadata guidance more practical and usable for researchers, repository staff, or other data providers?
One key lesson was that guidance becomes significantly more useful when it is directly embedded into workflows and tools. Many users do not want to work through extensive documentation while entering metadata. Instead, they benefit from contextual support, validation feedback, and automated enrichment integrated directly into the interface.
We also learned that flexibility is essential. Different institutions and repositories have different metadata requirements, workflows, and levels of technical expertise. Providing reusable and adaptable tooling therefore proved just as important as providing recommendations themselves.
Finally, the pilot highlighted that metadata quality improves when users can easily reuse and refine existing records instead of starting from scratch each time.
- What advice would you give to other institutions that want to improve the quality and completeness of their DOI metadata?
We would recommend focusing on practical workflows rather than only on formal metadata requirements. High-quality metadata is much easier to achieve when tools actively support users through validation, PID integration, and reusable metadata structures.
Institutions should also prioritize interoperability and openness from the beginning. Using standardized identifiers such as ORCID and ROR, supporting common exchange formats like XML and JSON, and designing workflows that can evolve alongside new schema versions are all important long-term investments.
The post Building Better DOI Metadata: Lessons from the PID Network Germany Pilot at LMU Munich appeared first on DataCite.
Additional details
Identifiers
- UUID: c15280f1-47ac-4fe4-91b7-fcd3e894a72d
- GUID: https://datacite.org/?p=15321
- URL: https://datacite.org/blog/building-better-doi-metadata-lessons-from-the-pid-network-germany-pilot-at-lmu-munich/
Dates
- Issued: 2026-06-02T15:41:45
- Updated: 2026-06-02T17:07:59