Data sharing, visibility and evaluation at PLOS: An interview with Veronique Kiermer
DOI: 10.60804/71BR-9Z42
Can you tell us a bit about your role at PLOS, and your involvement with open science?
I'm the Chief Scientific Officer at PLOS and I have responsibility for the Editorial department. PLOS's mission is to drive open science forward through meaningful changes in publishing. Therefore, open science is really everyone's priority at PLOS and everyone contributes in one way or another. I focus on strategic projects to push this mission forward — trying to understand how we can leverage and adapt what we do as a publisher to contribute to the open science ecosystem. I also try to contribute to initiatives outside of PLOS either to advocate for an open science perspective or to help with tools and infrastructures that enable open science. For example, I've done some work with the US National Academies on inclusive and fair authorship culture, I serve on the NISO Standing Committee for the CRediT taxonomy, the Scientific Advisory Board of Europe PMC — and of course, I'm honored to be an advisor for Make Data Count.
PLOS pioneered the requirement for data sharing with publications through its data policy and was an early supporter of the Data Citation Principles. Why do you view it as important for journals to adopt data citation practices? What are key milestones and lessons learnt from PLOS' experience with open data?
What we've learned from the data availability policy is that, in general, researchers are willing to share their data, but they often lack the resources, knowledge or motivation to do it in a way that really supports open science, that is, through deposition in repositories and with good citation practices. The data sharing policy at PLOS requires all authors to make data openly available at publication. We've measured adherence to this policy with Open Science Indicators, an initiative we started with DataSeer to help us measure where we start and how much progress we are making with our interventions to promote open science. We estimate that 77% of articles published in the PLOS portfolio in the first half of 2025 have associated openly shared data. We do make exceptions to our data sharing policy for legitimate scenarios when open sharing is not desirable (for example, for privacy concerns), which explains only some of that gap in compliance. But notably, only 32% of articles have data shared in an open repository; the vast majority of the rest is shared in the article or its supplementary information. While not ideal, it's probably sufficient in some cases, but importantly it seems to indicate that the obstacle is not the willingness to share but the means or motivation to put in the additional effort to share in repositories. We've experimented with supporting authors in using repositories, with mixed success (the findings and data are available).
In terms of data citations, while it is a very important principle, this practice is not commonly adopted by researchers. We put effort into ensuring that the Data Availability Statement associated with each article (in a human- and machine-readable way) is complete and accurate, but this doesn't often translate into citations. This is why I think the approach that Make Data Count has taken with the Data Citation Corpus is so important. We can achieve more recognition of the impact of data by detecting mentions in articles, via methods like those that Make Data Count is developing, than by relying on structured data citations, because those still require a big cultural change.
PLOS is working on a project to enhance the visibility of research outputs (e.g., data, code, and methods) connected to journal publications. Can you tell us about the motivation and goals for this project?
Through our experience with data but also other research outputs (like code and protocols), by looking at survey results year after year and through many conversations with researchers, I have come to believe that one of the biggest obstacles to increasing open science practices is that these practices are not sufficiently recognized in the academic reward system. Researchers are evaluated primarily on the basis of the number of articles they publish and where they publish them. Early career researchers are disproportionately affected because open science takes time and resources but in a competitive environment they feel they must favor other objectives, like publishing in 'prestige journals', which they feel are essential for their career advancement. On the other hand, research funders who promote open science and want their grantees to share their results openly often struggle to find the open research outputs associated with the work they fund. This is a systemic problem, and solving it will require sustained multi-stakeholder efforts.
At PLOS we want to address the parts of the problem that publishers can influence. It won't be sufficient in and of itself, but it can enable others to make changes too. There are already influential movements afoot among other stakeholders, such as the various research assessment reform movements that are getting traction in different regions of the world. We want to complement these efforts by demonstrating that publishers can change their systems to play their part. Specifically, the approach we will be pursuing is to make open science contributions (the sharing of diverse research outputs like data, code and methods) more visible and discoverable, to link them effectively to other related outputs (in order to facilitate context and reuse) and importantly, to enable them to be attributed to individual researchers and evaluated for their intrinsic quality and impact. Our vision is that if you develop, say, some code that is essential to the findings of an article and becomes an important resource used by other research groups, you should be able to showcase that contribution in and of itself, be recognized for it academically, and not merely be "author number n" on a long author list. We are exploring how publishers could do more to ensure this contribution is discoverable and usable to enable its effective recognition.
As part of this project, PLOS seeks to create more connections between articles and other open research outputs, and maximize their discoverability. What would you highlight as milestones achieved in the project, and what challenges have you encountered?
Our project so far has been focused on research and design (R&D); we haven't built anything yet. We acknowledge the complexity of the systemic change required, and therefore we wanted to invest in an extended R&D phase to work with different stakeholders in the research enterprise and understand their motivations and challenges. It was also important for us to include those underrepresented in the current open science system, to avoid unintended consequences, build awareness and facilitate buy-in. We've assembled a panel of more than 2500 volunteer researchers who give feedback on design prototypes and participate in user testing and surveys. They represent different regions, disciplines and economic contexts. In addition, we are consulting with other stakeholders: research funders, leaders of academic institutions and libraries. We've done a lot of interviews and we've organized three workshops on three continents so far. (We still have one workshop in preparation, focused on open infrastructure and how to integrate with other initiatives to support the discovery, context, assessment and reuse of content.)
For me, it's been incredibly useful to hear the different perspectives and to validate that different stakeholders all want to see better representation of the work and outputs involved in a research project beyond an article. But we also have had clear signals that researchers are very much attached to communicating their research in the form of a narrative article, and while they want to get recognition for different outputs, this is more likely to happen in the context of data and code associated with a publication. This isn't surprising given the cultural context we are in, and it gives us a place to start: providing attributions and experimenting with means of evaluating data and code associated with journal articles.
Make Data Count is developing the Data Citation Corpus, a large aggregate of data-article connections. What opportunities do you envision for the Data Citation Corpus to support your ongoing work to increase the visibility and recognition of outputs associated with publications?
The Data Citation Corpus is one of the important resources that we hope we can integrate with. Ultimately, research assessors will want to know how much potential a dataset has for reuse (for example, whether it meets the FAIR criteria and is well documented) and, over time, whether it has been reused or referred to in different contexts. Being able to track mentions of data in the scientific literature, as the Data Citation Corpus does, could contribute to these measures of impact. Publishers should be able to integrate with that corpus to both contribute to it effectively and surface measures of impact.
In your view, what are the most pressing needs in relation to data evaluation and responsible data metrics? What advances would you like to see in this space?
One of the challenges we hear from those assessing researchers' work is how to evaluate the quality of a dataset and its potential impact. It's particularly difficult for hiring and promotion committees in which the committee members may not have the specialist expertise required to make these judgments. But even researchers who are looking at reusing someone else's data find it challenging to get all the information they need to do so efficiently. Responsible data metrics will require more than mere counting, but qualitative assessments are often discipline-specific and quickly become complex and costly.
I'm optimistic that with the advances in machine learning, we will see new tools to automate basic assessments for completeness of information and matching with context elements such as a publication or a protocol. I'd like to see advances on two fronts. First, automation of evaluations that provide a sense of the reuse potential of data. Second, better tracking of data mentions in different contexts to give a sense of realized impact. By integrating advances in these two areas, there is an opportunity not only for better metrics but importantly to facilitate, value and reward data reuse. For me, this is important to reduce waste and increase the ability to scrutinize results, but also to enable more diverse actors to participate in research, including people who are in resource-limited settings and can only produce or only analyze/use data. Both contributions are important and should be equally rewarded.
Additional details
Identifiers
- UUID
- 423396a5-1086-4b2b-88f8-9ff5c23b1b28
- GUID
- https://makedatacount.org/?p=1603
- URL
- https://makedatacount.org/read-our-blog/veronique-kiermer/
Dates
- Issued
- 2025-12-02T16:40:10
- Updated
- 2025-12-04T09:56:33