Published August 20, 2024 | https://doi.org/10.59350/rjt2m-2hy19

Supporting the core work in research software

(Please cite this post as https://doi.org/10.59350/rjt2m-2hy19)

A few things recently have made me think about how we support the core work of maintaining research software and supporting its users.

  1. I was interviewed by Georgia Iacovou for her report, “The state of the open source ecosystem & how to fund away 'software collapse'”.
  2. I’ve been chairing some of the meetings of the Research Software Alliance (ReSA) Funders Forum, and co-organized part of a side meeting on research software at the Global Research Council’s 2024 meeting.
  3. I attended the 2024 NSF CSSI/Cyber Training/SCIPE PI Meeting.
  4. We’re getting towards the end of awards to Parsl for sustainability and maintenance from NSF (award1, award2) and CZI (award3).
  5. I’ve spent significant time planning and now working on the CORSA project.
  6. I’ve been chatting with Mike Woster from the Linux Foundation.

In these discussions and this work on Parsl, it’s become clear to me that there are parts of a medium-size open-source research software project like Parsl that can be sustained by a community. This includes adding new features and some user support, which we can do through a combination of community contributions and collaborations, funding from projects that use Parsl, and funding for our own research that requires us to maintain Parsl.

However, as the project grows both in code and users and as it ages, ongoing core work is needed, which we cannot always provide without some funding or voluntary effort specifically for this purpose. This core work includes user support, which grows with the number of users, though some of this can be done, and for us is now being done, by advanced users. Some maintenance work will also always be needed to avoid software collapse, but how its magnitude changes over time is less clear, as our goal has been to pay down the technical debt that came with our initial “experimental” code, and arguably, we are succeeding in doing this. The core work also includes our community manager, who has provided tremendous value in helping Parsl grow; however, this is a function that is very hard to convince existing users to pay for. In both cases, volunteer effort alone isn’t really sufficient: we don’t just need someone to do these things, but we need someone specific with the right experience in the project to do them.

My thinking on this is probably specific to projects that are similar to Parsl in size (of code and community) and in research orientation. I think there are valid models to support the core work for other types of projects, such as both smaller (e.g., a single researcher) and larger (e.g., 5+ core developers) research projects as well as many open-source projects that are not research focused, some of which have a large user community and a solid business and financial model. But it’s not clear to me that we understand how to support the core work for a project with a community and developers similar in size to Parsl’s.

Around the time I finished working at NSF, where I led activities to fund sustainable research software, I was optimistic and created a set of models that I thought would be sufficient, based in very large part on work done around how developers in open source could fund their work. Today, I think that projects in Parsl’s position are going to remain dependent on grants, at least for this core maintenance and community work. For Parsl in particular, I think we really need about 0.5 FTE/year for core maintenance, and about 0.5 FTE/year for our community work.

In the longer term, I could imagine a few additional options. One would be institutional support from universities and other research institutions. In the case of Parsl, this would be the University of Chicago and University of Illinois Urbana-Champaign, and perhaps other institutions that want to get involved and take more of an ownership and governance stake in the project, by supporting the project via internal (not grant) funds and treating the project as a part of the institution’s assets rather than just a research project that happens to have some contributors at the institution. This would be similar to how some non-research open-source projects work, with multiple companies contributing time from their employees.

Another option would be more indirect funding from funders (government, philanthropic, industry) who want to support the open-source research software ecosystem, because doing so will support their overall goals. This might be similar to what DOE is trying with the Consortium for the Advancement of Scientific Software (CASS) (where I’m a co-PI in one of the member organizations, CORSA). My interpretation of DOE’s future vision is that DOE provides funding to CASS to sustain key software, and CASS in turn uses a decision-making process, which hasn’t yet been fully determined, to distribute funds to maintain the software that DOE research depends on. In this way, DOE funds the ecosystem without directly deciding which projects are funded, leaving that decision to a set of software and user stakeholders. It seems that this model could be expanded to include multiple global funders and their goals, since most research software is developed, maintained, and used globally, and typically is useful for more than one research project and research goal.

I should mention here that this discussion is fairly specific to projects like Parsl (open-source research software, medium community size, most users are publicly funded research projects). There are many other situations where other models might work, such as Globus with its model of closed source software provided as a service and supported by institutions, or Charm++, which was free for non-commercial use and has just recently switched to an OSI-compliant license. There’s also software like Amber, which continues to be free for non-commercial use, but has a license fee for commercial use. These projects have all used fees to support the core software work. Outside research software, more general open source models, such as the Free and Open Source Software (FOSS) Sustainability Fund and GitHub Sponsors, seem promising, but I can’t see them helping research software.

In some sense, none of this thinking is new, and it all goes back to the fact that while open-source research software is typically seen as free because it is free to obtain, it is not free to develop or maintain. As ReSA has expanded its work and impact, I think more government and philanthropic funding agencies have started to recognize this and create programs to address it, but these programs aren’t coordinated, and they aren’t sufficient to support all of the software that research needs. And industry overall hasn’t been brought into this discussion about research software, though a small number of companies have proven willing to support some of the software they use.

To wrap up this discussion for now, I think the open-source ecosystem has some types of projects that can be successfully supported, while others currently don’t fit well into these existing models, due at least in part to the size of the developer and user communities, and the fact that they are primarily research software. As discussed above, there are potentially two paths forward that I see:

  1. Institutions start to see open-source software as reputationally and operationally important and invest in it. For institutions that have a strong, well-resourced research software organization (e.g., an RSE organization, a research computing center) this seems possible, although given the current forces to reduce institutional spending, it seems relatively unlikely to happen, at least in universities.
  2. The model of CASS might provide hope for the future, particularly if it could be expanded to more funders. I hope funders and the community will continue to talk about this, in part via the ReSA Funders Forum.

In addition, it’s clear to me that part of the job of Parsl and similar projects is to raise awareness about this, and to share lessons and successes. For Parsl, our successes do include that the community growth has led to members who are willing to get more involved in parts of the software project, and infrastructure providers (e.g., NERSC) who make sure Parsl works with their resources and their users.

Acknowledgements: My thinking on this has greatly benefited from many discussions, including with Michelle Barker, Greg Watson, Addi Thakur Malviya, Elaine M. Raybourn, Bill Hoffman, Dana Robinson, Mike Heroux, Neil Chue Hong, Ian Foster, Kyle Chard, Ben Clifford, Mike Woster, Florian Mannseicher, Matthias Katerbow, Maria Cruz, Kenton McHenry, Tom Honeyman, and many members of the RSE and SSI communities.

Additional details

Identifiers

UUID
bd69ca5e-7d40-4032-9ef8-b8e77db63b44
GUID
https://danielskatzblog.wordpress.com/?p=1741
URL
https://danielskatzblog.wordpress.com/2024/08/20/supporting-core-research-softtware-work/

Dates

Issued
2024-08-20T23:24:30
Updated
2024-08-31T11:20:39