raw2ready

The increasing adoption of automation and digitalization in biomanufacturing has led to a proliferation of high-frequency, multi-source sensor data from bioreactors and associated analytical devices. However, raw data produced by equipment from different vendors are typically heterogeneous in format, structure, sampling frequency, and naming conventions, which poses a substantial barrier to automated data processing and meaningful cross-instrument integration. Without harmonization, sensor time series, off-gas measurements, and other process signals often remain locked in proprietary or inconsistent representations, and significant manual intervention is needed to extract, align, and organize them into analysis-ready datasets. This fragmentation complicates time alignment, unit standardization, and variable renaming, and it hinders downstream workflows such as statistical analysis, predictive modelling, and digital-twin construction, all of which rely on coherent, machine-readable data.

raw2ready addresses these challenges by providing a modular, end-to-end Python framework that parses, structures, and merges raw bioprocess data into standardized, human- and machine-readable formats. The tool encapsulates device-specific parsing logic that converts vendor output into a common data layout, applies configurable translation tables to produce meaningful column headers, and aligns time series from disparate sources by nearest-timestamp matching. In addition, raw2ready supports dynamic calculation of derived metrics through user-defined formulas driven by experiment-specific constants, enabling the computation of composite variables that are crucial for modelling and digital-twin workflows. Beyond its command-line interface (CLI), raw2ready now features a graphical user interface (GUI) through which users can interactively select and rename columns, assign official units to measurements, and configure processing and calculation steps. This lowers the usability barrier for non-programmers while maintaining the flexibility needed for automated, reproducible data harmonization in biomanufacturing.
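
The three processing steps named above (header translation, nearest-timestamp alignment, and formula-driven derived metrics) can be illustrated with a minimal pandas sketch. This is not raw2ready's actual API: the function names, vendor column names, the 30-second tolerance, the constant `gas_flow_l_min`, and the example formula are all hypothetical and chosen only to show the general technique.

```python
import pandas as pd

# Hypothetical translation tables: vendor header -> harmonized header.
BIOREACTOR_TRANSLATION = {"Time": "timestamp", "pO2 [%]": "dissolved_oxygen_pct"}
OFFGAS_TRANSLATION = {"DateTime": "timestamp", "CO2 out [%]": "co2_out_pct"}


def harmonize(df, translation):
    """Rename vendor-specific columns and parse the timestamp column."""
    out = df.rename(columns=translation)
    out["timestamp"] = pd.to_datetime(out["timestamp"])
    return out


def align_nearest(left, right, tolerance="30s"):
    """Attach to each row of `left` the row of `right` nearest in time.

    Rows of `left` with no partner within `tolerance` get NaN values.
    """
    return pd.merge_asof(
        left.sort_values("timestamp"),
        right.sort_values("timestamp"),
        on="timestamp",
        direction="nearest",
        tolerance=pd.Timedelta(tolerance),
    )


def add_derived(df, name, formula, constants):
    """Evaluate a user-defined formula over the columns plus scalar constants.

    Constants are exposed to the formula as temporary columns, so their
    names must be valid identifiers that do not clash with real columns.
    """
    out = df.copy()
    out[name] = df.assign(**constants).eval(formula)
    return out


# Made-up example data from two devices with different sampling times.
bioreactor_raw = pd.DataFrame({
    "Time": ["2026-02-05 10:00:00", "2026-02-05 10:01:00", "2026-02-05 10:02:00"],
    "pO2 [%]": [80.0, 75.0, 70.0],
})
offgas_raw = pd.DataFrame({
    "DateTime": ["2026-02-05 10:00:10", "2026-02-05 10:01:05", "2026-02-05 10:02:20"],
    "CO2 out [%]": [0.5, 0.6, 0.7],
})

merged = align_nearest(
    harmonize(bioreactor_raw, BIOREACTOR_TRANSLATION),
    harmonize(offgas_raw, OFFGAS_TRANSLATION),
)

# Illustrative formula only: CO2 outflow from fraction and a constant gas flow.
merged = add_derived(
    merged,
    name="co2_flow_l_min",
    formula="co2_out_pct / 100 * gas_flow_l_min",
    constants={"gas_flow_l_min": 2.0},
)
```

In this sketch the bioreactor time grid is kept as the reference, and each off-gas reading is matched to it only if it lies within the tolerance. A real pipeline would likely make the reference device, tolerance, and formulas configurable per experiment.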

Programme: Bioindustry 4.0

SEEK ID: https://ibisbahub.eu/projects/105

Public web page: https://gitlab.com/arc3972240/bioindustry/raw2ready

Organisms: No Organisms specified

IBISBA PALs: No PALs for this Project

Project created: 5th Feb 2026

Copyright © 2008 - 2026 The University of Manchester and HITS gGmbH
IBISBA is a pan-European research infrastructure currently funded through multiple EU projects: IBISBA 1.0 (H2020 grant agreement No. 730976), PREP-IBISBA (H2020 grant agreement No. 871118) and the follow-on project IBISBA-DIALS (grant agreement No. 101131085) and BIOINDUSTRY 4.0 (Grant agreement No. 101094287). Registering data or other knowledge assets on this platform is the sole responsibility of Users. IBISBA cannot be held responsible for misuse or misappropriation of data and assets belonging to a Third Party.