A Scientific Reproducibility Platform

As scientific research relies more and more on compute-intensive workflows, the environmental footprint of digital research has become impossible to ignore. With major contributions from the Technical University of Munich (TUM) and the University of Amsterdam (UvA), GreenDIGIT Milestone 14 establishes a Reproducibility-as-a-Service (RaaS) framework. The platform is designed to make complex computational experiments not only technically repeatable but also accountable for their environmental impact.

A Three-Layer Eco-Aware Architecture

The cornerstone of the platform is a modular, extensible architecture tailored for research infrastructures. It organizes notebook execution, metric logging, and packaging across three separate functional layers:

  1. Experiment Execution Layer: The EcoJupyter Virtual Research Environment (VRE) allows researchers to author and execute interactive Jupyter notebook-based workflows.

  2. Monitoring and Infrastructure Layer: Connects Jupyter notebook tasks to backend resources (Kubernetes, OpenStack, JupyterHub) and leverages node-level agents (like Scaphandre) to track real-time CPU package energy use via host-level RAPL counters, maintaining complete user isolation.

  3. Reproducibility and Metadata Layer: Dynamically harmonizes, packages, and registers the software configurations, dependencies, and runtime metadata of each execution.
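
As a rough illustration of the measurement behind the monitoring layer: RAPL exposes cumulative energy counters that eventually wrap around, so an agent has to take the difference of two readings and correct for overflow. The sketch below is not Scaphandre's or GreenDIGIT's code. The function names are hypothetical, and it assumes the counter wraps at most once between two readings. On Linux, such values are typically read from the powercap sysfs files energy_uj and max_energy_range_uj.

```python
def rapl_energy_joules(start_uj: int, end_uj: int, max_range_uj: int) -> float:
    """Energy consumed between two cumulative RAPL readings, in joules.

    start_uj / end_uj: counter values in microjoules at the start and end
    of the measured interval.
    max_range_uj: value at which the counter wraps back to zero.
    Assumption: at most one wraparound happened between the two readings.
    """
    if end_uj >= start_uj:
        delta_uj = end_uj - start_uj
    else:
        # The counter wrapped: add what was left before the wrap
        # to what has accumulated since.
        delta_uj = (max_range_uj - start_uj) + end_uj
    return delta_uj / 1_000_000


def average_power_watts(energy_joules: float, duration_seconds: float) -> float:
    """Mean power draw over the interval (watts = joules per second)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return energy_joules / duration_seconds


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the platform.
    joules = rapl_energy_joules(1_000_000, 3_500_000, 10_000_000)
    print(joules, average_power_watts(joules, 5.0))
```

Sampling often enough that the counter cannot wrap twice between readings is what keeps the single-wraparound assumption safe.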

Flexible Re-execution: Record-Replay vs. Malleable Workflows

To support different scientific validation scenarios, the system leverages machine-readable RO-Crates (Research Object Crates) to drive automatic reproduction.
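
To make "machine-readable" concrete, the following minimal sketch builds the kind of ro-crate-metadata.json document that a replay tool could parse. It follows the general shape of the RO-Crate 1.1 specification (a metadata descriptor plus a root dataset), but the helper names, file names, and example values are hypothetical and do not reflect the actual GreenDIGIT crate layout.

```python
import json


def build_minimal_crate(name, description, date_published, license_url, files):
    """Return an RO-Crate 1.1 style metadata document as a Python dict."""
    graph = [
        {
            # Metadata descriptor: points at the root dataset.
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            # Root dataset: describes the packaged experiment as a whole.
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "datePublished": date_published,
            "license": {"@id": license_url},
            "hasPart": [{"@id": path} for path in files],
        },
    ]
    # One File entity per packaged artifact.
    graph.extend({"@id": path, "@type": "File"} for path in files)
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


def list_files(crate):
    """Return the @id of every File entity, e.g. to decide what to re-run."""
    return [e["@id"] for e in crate["@graph"] if e.get("@type") == "File"]


if __name__ == "__main__":
    # Hypothetical example values.
    crate = build_minimal_crate(
        name="Example notebook run",
        description="One recorded notebook execution with energy data.",
        date_published="2025-01-01",
        license_url="https://creativecommons.org/licenses/by/4.0/",
        files=["analysis.ipynb", "energy.csv"],
    )
    print(json.dumps(crate, indent=2))
    print(list_files(crate))
```

Because the crate is plain JSON-LD, a re-execution tool only needs to walk the @graph entries to discover which notebooks and data files belong to a run.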

LLM-Enhanced Automated Documentation

Figure: LLM enhancement

To address the widespread issue of incomplete experiment documentation, TUM integrated automated LLM-based metadata enrichment. Operating as a post-processing step on the RO-Crate data, this pipeline automatically generates concise, non-speculative text descriptions for scripts, source configurations, and results, which substantially reduces manual documentation work for scientists. In a second step, the RO-Crate data are visualized in a single-page web application.

Figure: Screenshot of the experiment command visualization

Figure: Screenshot of a detailed script view

Figure: Screenshot of contained energy measurements

Harmonization via the Common Information Model (CIM)

Developed by UvA, the Common Information Model (CIM) acts as the shared schema that translates raw Jupyter notebook metadata into comparable metrics. It pairs computation logs directly with localized environmental telemetry, such as regional grid carbon intensity, hardware architecture specs, and Power Usage Effectiveness (PUE). This prepares published experiment artifacts for the Federated Data Management Infrastructure (FDMI) and allows them to obtain indexed DOIs via Zenodo.
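
To show why pairing energy readings with grid carbon intensity and PUE makes runs comparable, here is a minimal sketch of the commonly used estimate: emissions = IT energy x PUE x grid carbon intensity. The record type and its field names are hypothetical and are not the CIM schema. The formula is the widely used first-order approximation, not necessarily the exact calculation GreenDIGIT applies.

```python
from dataclasses import dataclass

JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 MJ


@dataclass(frozen=True)
class ExecutionRecord:
    """Hypothetical harmonized record for one notebook execution."""
    energy_joules: float             # measured IT energy (e.g. from RAPL)
    pue: float                       # Power Usage Effectiveness of the site
    grid_intensity_g_per_kwh: float  # regional grid carbon intensity, gCO2e/kWh


def estimated_emissions_g(record: ExecutionRecord) -> float:
    """First-order CO2e estimate in grams for one execution."""
    it_energy_kwh = record.energy_joules / JOULES_PER_KWH
    # PUE scales IT energy up to total facility energy (cooling, power loss).
    facility_energy_kwh = it_energy_kwh * record.pue
    return facility_energy_kwh * record.grid_intensity_g_per_kwh


if __name__ == "__main__":
    # Illustrative numbers only: the same workload at two hypothetical sites.
    site_a = ExecutionRecord(3_600_000, pue=1.5, grid_intensity_g_per_kwh=400.0)
    site_b = ExecutionRecord(3_600_000, pue=1.2, grid_intensity_g_per_kwh=50.0)
    print(estimated_emissions_g(site_a), estimated_emissions_g(site_b))
```

The same measured energy can map to very different emissions depending on where and when it ran, which is why the site-level context has to travel with the computation log.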

Moving Toward DevGreenOps

With upcoming validation campaigns utilizing the climate-focused IceNet notebook workflow and expanding into GPU acceleration metrics via NVIDIA DCGM, GreenDIGIT is paving the way for DevGreenOps. By weaving carbon and energy awareness straight into the early stages of system and code development, GreenDIGIT turns sustainability from a distant administrative monitoring goal into an active feature of everyday software engineering.