Sebastian Neumaier, Oct 22, 2025
In the CrOSSD project, we are exploring how the health of Open Source Software (OSS) can be measured objectively and comparably. Sustainability, maintainability, and trustworthiness are crucial, especially for research software.
On our CrOSSD Health platform, we already collect key metrics that describe the state of a project, such as activity, churn, and code dependencies.
These metrics are based, among other things, on the CHAOSS metrics of the Linux Foundation and research such as “Is this GitHub Project Maintained?” (2020). They provide a solid foundation, but the real challenge lies in evaluating this information automatically, at scale, and in context. This is where we are currently exploring the use of LLMs and agentic AI systems.
Many OSS metrics in the CrOSSD platform require combining multiple data sources, such as Git commits, issues, package dependencies, and release tags. LLMs can help interpret and enrich this partly unstructured information.
For example, an LLM can automatically determine whether a commit represents a bug fix, a feature, or a refactoring. Research such as Amit and Feitelson (2020) shows that these kinds of commit classifications significantly improve the explanatory power of traditional metrics.
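As a rough illustration of what such a classification step could look like, here is a minimal Python sketch. It is not the CrOSSD implementation: the label set, the prompt wording, and the `complete` callable (a stand-in for whatever LLM client is used) are all assumptions made for this example. The main point is that the model's free-text reply is validated against a fixed set of labels before it is used in any metric.

```python
from typing import Callable

# Assumed label set, taken from the three categories named in the text.
LABELS = ("bug fix", "feature", "refactoring")


def build_prompt(message: str) -> str:
    """Ask for exactly one label so the reply is easy to validate."""
    options = ", ".join(LABELS)
    return (
        f"Classify the following commit message as one of: {options}. "
        "Answer with the label only.\n\n"
        f"Commit message: {message}"
    )


def classify_commit(message: str, complete: Callable[[str], str]) -> str:
    """Return one of LABELS, or "unknown" if the reply is not a valid label.

    `complete` is a hypothetical function that sends a prompt to an LLM
    and returns its text reply; it is not a real library API.
    """
    reply = complete(build_prompt(message)).strip().lower().rstrip(".")
    return reply if reply in LABELS else "unknown"


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, only here to make the sketch runnable."""
    message = prompt.rsplit("Commit message: ", 1)[-1].lower()
    return "Bug fix." if "fix" in message else "I am not sure"


print(classify_commit("Fix null pointer in parser", fake_llm))
print(classify_commit("Update docs", fake_llm))
```

Mapping anything unexpected to "unknown" keeps malformed or evasive replies from silently ending up in a metric.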
Metrics alone do not tell a story: if, for example, the churn value doubles, is that a problem or a sign of active development?
LLMs can provide context here, for example by comparing projects, identifying trends, or generating reports:
“The project had 12 active contributors in the last quarter (–25%). The average issue resolution time increased from 3 to 7 days. Code churn is at 22%, above the median of comparable projects (18%).”
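A report like this does not have to be generated entirely by the model. One option is to compute the numbers deterministically and leave only the phrasing to the LLM. The sketch below shows the deterministic part in Python, with a plain template standing in for the LLM. The dictionary keys and function names are hypothetical, and the previous contributor count of 16 is inferred from the 25% drop in the example report rather than taken from real project data.

```python
def percent_change(previous: float, current: float) -> float:
    """Relative change from previous to current, in percent."""
    return (current - previous) / previous * 100


def describe_quarter(prev: dict, curr: dict, median_churn: float) -> str:
    """Turn two metric snapshots into a short, factual summary."""
    change = percent_change(prev["contributors"], curr["contributors"])
    position = "above" if curr["churn"] > median_churn else "at or below"
    return (
        f"The project had {curr['contributors']} active contributors "
        f"in the last quarter ({change:+.0f}%). "
        f"The average issue resolution time changed from "
        f"{prev['resolution_days']} to {curr['resolution_days']} days. "
        f"Code churn is at {curr['churn']}%, {position} the median "
        f"of comparable projects ({median_churn}%)."
    )


# Illustrative values matching the example report (16 is an inferred value).
previous = {"contributors": 16, "resolution_days": 3, "churn": 20}
current = {"contributors": 12, "resolution_days": 7, "churn": 22}

print(describe_quarter(previous, current, median_churn=18))
```

Keeping the arithmetic outside the model means every figure in the generated text can be traced back to a computed value.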
LLMs can be highly accurate on well-structured tasks, but they are not infallible. That is why transparency in metric calculation remains essential: users should be able to understand where a score comes from and how it was calculated. Metrics also require context: high code churn can indicate either chaos or innovation.
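One way to make a score traceable is to store the formula and the raw inputs alongside the value. The following Python sketch illustrates the idea; the `MetricResult` structure is hypothetical, and the churn formula used here (lines added plus lines deleted, relative to total lines of code) is only one common definition, assumed for this example and not necessarily the one used in CrOSSD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricResult:
    """A metric value together with the information needed to audit it."""
    name: str
    value: float
    formula: str
    inputs: dict


def churn_rate(lines_added: int, lines_deleted: int, total_lines: int) -> MetricResult:
    """Code churn in percent, under the assumed definition described above."""
    value = 100 * (lines_added + lines_deleted) / total_lines
    return MetricResult(
        name="code_churn_percent",
        value=round(value, 1),
        formula="100 * (lines_added + lines_deleted) / total_lines",
        inputs={
            "lines_added": lines_added,
            "lines_deleted": lines_deleted,
            "total_lines": total_lines,
        },
    )


# Illustrative numbers only, chosen to yield the 22% from the example report.
result = churn_rate(lines_added=1500, lines_deleted=700, total_lines=10000)
print(result.value, "from", result.formula, "with", result.inputs)
```

With the inputs attached, a user (or an LLM writing a report) can explain a score rather than just state it.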
In the next phase, we plan to:
Traditional OSS metrics, as currently implemented in CrOSSD (activity, churn, code dependencies, etc.), are well established. With LLMs, we aim to interpret them more intelligently and evaluate them automatically, thereby generating understandable, context-aware insights into the health of Open Source Software.