Preprints
https://doi.org/10.5194/gmd-2024-97
Submitted as: development and technical paper | 05 Jun 2024
Status: this preprint is currently under review for the journal GMD.

Software sustainability of global impact models

Emmanuel Nyenah, Petra Döll, Daniel S. Katz, and Robert Reinecke

Abstract. Research software for simulating Earth processes enables estimating past, current, and future states of the world and guides policy. However, this modelling software is often developed by scientists with limited training, time, and funding, leading to software that is hard to understand, (re)use, modify, and maintain, and that is, in this sense, non-sustainable. Here we evaluate the sustainability of global-scale impact models across ten research fields using nine sustainability indicators. Five of these indicators – documentation, version control, open-source licensing, provision of the software in containers, and the number of active developers – relate to best practices in software engineering and characterize overall software sustainability. The remaining four – comment density, modularity, automated testing, and adherence to coding standards – contribute to code quality, an important factor in software sustainability. We found that 29 % (32 of 112) of the global impact models (GIMs) participating in the Inter-Sectoral Impact Model Intercomparison Project were accessible without contacting the developers. Regarding best practices in software engineering, 75 % of these 32 GIMs have some kind of documentation, 81 % use version control, and 69 % have an open-source license. Only 16 % provide the software in containerized form; the lack of containerization can limit the reproducibility of results. Four models had no active development after 2020. Regarding code quality, we found that the models generally suffer from low code quality, which impedes model improvement, maintenance, reusability, and reliability. Key issues include a non-optimal comment density in 75 % of the GIMs, insufficient modularity in 88 %, and the absence of a testing suite in 72 %. Furthermore, of the 10 models whose source code is written partly or entirely in Python, only 5 show good compliance with the PEP 8 coding standard; the rest show low compliance.
To improve the sustainability of GIMs and other research software, we recommend best practices for sustainable software development to the scientific community. As an example of implementing these practices, we show how reprogramming a legacy model according to them improved its software sustainability.
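For illustration, the comment-density indicator mentioned in the abstract can be approximated in a few lines of code. This is a hypothetical sketch only: the function name, the restriction to `#`-style Python comments, and the sample snippet are our assumptions, not the authors' exact methodology or thresholds.

```python
# Hypothetical sketch of a comment-density metric:
# fraction of non-blank source lines that are comments.
# This is an illustrative approximation, not the paper's exact procedure.
def comment_density(source: str) -> float:
    """Return the fraction of non-blank lines that are '#' comments."""
    lines = [ln.strip() for ln in source.splitlines()]
    lines = [ln for ln in lines if ln]  # ignore blank lines
    if not lines:
        return 0.0
    comments = sum(1 for ln in lines if ln.startswith("#"))
    return comments / len(lines)

# Illustrative sample: 2 comment lines out of 4 non-blank lines.
sample = """\
# load forcing data
x = 1

# run model
y = x + 1
"""
print(round(comment_density(sample), 2))  # prints 0.5
```

A real assessment would additionally need to handle docstrings, inline comments, and the comment syntax of each language a model is written in.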

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: open (until 31 Jul 2024)


Data sets

Software sustainability of global impact models (Dataset and analysis script) Emmanuel Nyenah, Petra Döll, Daniel S. Katz, and Robert Reinecke https://doi.org/10.5281/zenodo.11217739


Viewed

Total article views: 244 (including HTML, PDF, and XML), calculated since 05 Jun 2024:
  • HTML: 178
  • PDF: 59
  • XML: 7
  • Supplement: 11
  • BibTeX: 6
  • EndNote: 4

Viewed (geographical distribution)

Total article views: 244, thereof 244 with geography defined and 0 with unknown origin.
Latest update: 13 Jun 2024
Short summary
Research software is crucial for scientific progress but is often developed by scientists with limited training, time, and funding, leading to software that is hard to understand, (re)use, modify, and maintain. Our study across 10 research sectors highlights strengths in version control, open-source licensing, and documentation while emphasizing the need for containerization and code quality. Recommendations include workshops, code quality metrics, funding, and adherence to FAIR standards.