Early Sepsis Prediction using Time-Series Machine Learning

A Multi-Dataset Analysis and Dynamic Risk Stratification

Martina Falasca¹ · Agostino Aiezzo² · Silvia Angeletti³ · Letizia Chiodo⁶ · Simonetta Filippi⁶
Mario Garofano⁴ · Alessandro Loppini⁶ · Maria Vittoria Ristori³ · Adolfo Santoro² · Silvia Spoto⁵ · Silvia Scarpetta¹

Sepsis remains a time-critical condition whose prognosis deteriorates markedly with each hour of diagnostic delay, underscoring the need for robust early-warning systems capable of identifying prodromal physiological derangements. In the present work, we introduce an integrated computational framework that combines exploratory phenotypic analysis with temporally aware deep learning and permutation-based interpretability methods, yielding a toolkit for early detection and dynamic risk stratification. The framework is validated across two complementary cohorts: SepsisExp, a purpose-collected dataset with expert-adjudicated ground-truth labels, and MIMIC-III, the de facto benchmark for retrospective intensive care research. The phenotypic exploration phase employs interquartile-range-based outlier filtering, bivariate and partial correlation mapping, and ts-KMeans clustering to delineate heterogeneous clinical trajectories preceding sepsis onset. Building on this characterisation, Long Short-Term Memory (LSTM) architectures are trained on multivariate time-series windows spanning the 24 hours prior to onset, with the final h hours excluded to simulate realistic deployment latency across prediction horizons h ∈ {1, 2, 3, 6, 9, 12}. To bridge the interpretability gap between high-capacity neural models and bedside decision-making, we apply PermFit, a permutation-based feature-importance algorithm that quantifies each biomarker's contribution by measuring the degradation in predictive performance under random shuffling. The resulting importance trajectories reveal a clinically coherent temporal shift: haemodynamic and respiratory variables dominate predictions at short horizons, while biochemical and inflammatory indices gain prominence as the prediction horizon extends. By making complex model behaviour transparent and clinically interpretable, the proposed methodology links high-performance computation to actionable decision support in time-critical settings.

2 Datasets validated
6 Prediction horizons (h ∈ {1, 2, 3, 6, 9, 12})
0.72 Minute ventilation (VE) ↔ respiratory rate (RR) correlation
−0.28 Heart rate (HR) ↔ stroke volume (SV) inverse correlation

DATA: SepsisExp (expert labels) · MIMIC-III (ICU gold standard)
PHENOTYPIC EXPLORATION: IQR outlier detection · Correlation analysis · K-Means · DBSCAN · Hierarchical clustering · ts-KMeans
LSTM MODEL: 24h pre-onset input · Final h hours excluded · h ∈ {1, 2, 3, 6, 9, 12} · Multi-class output
PERMFIT: Permutation-based feature importance · Dynamic biomarker trajectories
CLINICAL OUTPUT: Early warning · Risk stratification · Interpretable decision support

Phenotypic Exploration

Interquartile-range-based outlier analysis revealed substantial dispersion across respiratory, metabolic, inflammatory, and tissue-perfusion biomarkers. Bivariate correlation mapping identified a strong positive association between minute ventilation and respiratory rate (r = 0.72) and a weak-to-moderate inverse relationship between heart rate and stroke volume (r = −0.28). Clustering experiments with K-Means, DBSCAN, and agglomerative hierarchical methods separated septic from non-septic patients more cleanly in the binary formulation than in the multi-class, severity-stratified formulation.
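
As an illustration of the two preprocessing steps named above, the following is a minimal sketch of IQR-based outlier filtering and a Pearson correlation, using only the Python standard library. The fence multiplier k = 1.5, the function names, and the toy respiratory-rate readings are assumptions made for this example; the exact thresholds and tooling used in the study are not stated here.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) Tukey fences for a list of numbers.

    k=1.5 is the conventional multiplier and is an assumption of this sketch.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_filter(values, k=1.5):
    """Keep only the values that lie inside the Tukey fences."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if lower <= v <= upper]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sum(a * b for a, b in zip(dx, dy)) / denom


# Toy respiratory-rate readings (breaths/min); 80 is an implausible spike.
resp_rate = [14, 16, 15, 18, 17, 16, 80, 15]
cleaned = iqr_filter(resp_rate)  # the 80 reading falls outside the fences
```

In a real pipeline the filter would be applied per biomarker, and the correlation would be computed on paired measurements such as minute ventilation and respiratory rate.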

LSTM Temporal Model

A Long Short-Term Memory architecture processes multivariate time-series clinical data drawn from the 24-hour window preceding sepsis onset. The model is evaluated across six prediction horizons (h ∈ {1, 2, 3, 6, 9, 12} hours before onset), with the terminal h hours systematically excluded to emulate realistic clinical deployment constraints. Training was performed on both SepsisExp and MIMIC-III under stratified k-fold cross-validation, with hyperparameters optimised via grid search over learning rate, hidden-unit dimensionality, and dropout regularisation strength.
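
The horizon-dependent truncation described above can be sketched as follows. This is an illustrative reading of the text rather than the authors' preprocessing code: it assumes one observation row per hour, ordered oldest first, and the names `truncate_window` and `build_inputs` are hypothetical. Whether truncated windows are padded back to a fixed length before being fed to the LSTM is not specified, so the sketch leaves them at their natural length.

```python
HORIZONS = (1, 2, 3, 6, 9, 12)  # prediction horizons in hours, from the text
WINDOW_HOURS = 24               # length of the pre-onset window, from the text


def truncate_window(hourly_rows, horizon, window_hours=WINDOW_HOURS):
    """Drop the final `horizon` hours from a pre-onset window.

    hourly_rows: list of per-hour feature vectors, oldest first, whose last
    row is the hour immediately before onset (hourly sampling is assumed).
    """
    if len(hourly_rows) != window_hours:
        raise ValueError(f"expected {window_hours} hourly rows")
    if not 0 < horizon < window_hours:
        raise ValueError("horizon must lie strictly inside the window")
    return hourly_rows[: window_hours - horizon]


def build_inputs(hourly_rows):
    """Return one truncated window per prediction horizon."""
    return {h: truncate_window(hourly_rows, h) for h in HORIZONS}


# Toy window: each row is [hour index, a made-up heart-rate value].
window = [[t, 70 + t] for t in range(WINDOW_HOURS)]
inputs = build_inputs(window)
# inputs[6] holds the first 18 hours; the last 6 hours before onset are hidden.
```

Under this scheme, the model for horizon h never sees data from the final h hours, which is what makes its prediction an h-hour-ahead forecast.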

PermFit Interpretability

PermFit assesses the marginal contribution of each input variable by quantifying the degradation in a pre-specified performance metric following random permutation of that feature's values, thereby breaking its association with the target while preserving the marginal distribution. The resulting importance trajectories exhibit a clinically coherent temporal progression: haemodynamic and respiratory variables dominate at prediction horizons proximal to onset (h = 1–3), whereas biochemical and inflammatory markers, consistent with their slower response kinetics, acquire greater relative importance at extended horizons (h = 6–12).
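
The core permutation idea can be sketched in a few lines. This is a generic permutation-importance routine, not the PermFit implementation: the published procedure may include further steps, such as cross-fitting and significance testing, which are omitted here. The accuracy metric, the `toy_model` stand-in for a trained network, and the synthetic data are all assumptions for illustration. For time-series inputs, the analogous operation would shuffle a feature's whole trajectory across patients.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction equals the label."""
    return sum(1 for row, label in zip(X, y) if model(row) == label) / len(y)


def permutation_importance(model, X, y, metric=accuracy, n_repeats=20, seed=0):
    """Mean drop in `metric` when each feature column is shuffled in turn.

    Shuffling a column breaks its link to the target while keeping its
    marginal distribution; a larger drop suggests a more important feature.
    """
    rng = random.Random(seed)
    baseline = metric(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - metric(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical stand-in for a trained model: uses feature 0 only."""
    return int(row[0] > 0.5)


data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]
scores = permutation_importance(toy_model, X, y)
# Feature 1 is ignored by toy_model, so shuffling it leaves accuracy unchanged.
```

Repeating this calculation for the model trained at each horizon h would yield one importance score per biomarker per horizon, which is the kind of trajectory described above.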

Dynamic biomarker relevance as clinical events approach

Legend: Haemodynamic & respiratory · Biochemical & inflammatory
* Representative values for visualisation. Refer to the full paper for exact figures.

LSTM performance metrics by prediction horizon

* Representative values for visualisation. Refer to the full paper for exact figures.

Top biomarker importance: h=1 vs h=12 before onset

* Representative values. Refer to the paper for exact PermFit scores.

Correlation network, selected clinical variables

Dataset: SepsisExp
Nodes: Minute Ventilation · Respiratory Rate · Tidal Volume · Heart Rate · Stroke Volume · SpO₂ · SBP · Temp · WBC
Edge labels: 0.72 · 0.65 · −0.28

"Haemodynamic signals speak first.
Biochemistry answers last."

As the prediction window extends from 1 to 12 hours before sepsis onset, the dominant biomarker class undergoes a systematic transition from real-time physiological signals (heart rate, respiratory rate, blood pressure, and oxygen saturation) to slower-responding inflammatory and metabolic markers, including white blood cell count, lactate, creatinine, bilirubin, and C-reactive protein. This temporal progression is consistent with the established pathophysiology of septic deterioration, wherein haemodynamic decompensation precedes the full elaboration of the systemic inflammatory response.

Timeline: −12h · −9h · −6h · −3h · −1h · Onset · +post
Legend: Haemodynamic & respiratory dominance · Biochemical & inflammatory dominance
PRIMARY

SepsisExp

Expert-annotated clinical dataset with gold-standard sepsis labels derived through adjudicated physician review. Serves as the primary training and validation corpus. Comprises multivariate time-series observations spanning 24-hour pre-onset windows across multiple severity classes.

expert-validated · primary training · multi-class labels
CROSS-VALIDATION

MIMIC-III

Medical Information Mart for Intensive Care III, the de facto open-access benchmark for ICU clinical research, comprising de-identified health data associated with over 40,000 critical care patients. Employed for cross-dataset validation to assess the generalisability of the framework beyond the primary training distribution.

gold standard · ICU benchmark · generalisability
↗ PhysioNet

The authors gratefully acknowledge the support of the Italian National Recovery and Resilience Plan (NRRP), Mission 4, Component 2, funded by the European Union, NextGenerationEU, through the Research Programme "National Centre for HPC, Big Data and Quantum Computing", Project CN00000013, Spoke 6 (Cascade Funding Spoke 6, Project "PRISM, Predictive Risk Identification of Sepsis with Machine Learning").

European Union NextGenerationEU · Spoke 6 · PNRR Italia Domani · M4C2 · Inv. 1.4 · ICSC Centro Nazionale HPC · CN00000013

[1] Schamoni S, Hagmann M, Riezler S. Ensembling Neural Networks for Improved Prediction and Privacy in Early Diagnosis of Sepsis. arXiv:2209.00439, 2022. doi:10.48550/arXiv.2209.00439
[2] Camacho J, Bonet I, Gil B, Iadanza E. Machine Learning Models for Early Prediction of Sepsis on Large Healthcare Datasets. Electronics. 2022;11:1507. doi:10.3390/electronics11091507
[3] Moor M, Horn M, Rieck B, Roqueiro D, Borgwardt K. Early recognition of sepsis with Gaussian process temporal convolutional networks and dynamic time warping. MLHC Conference. PMLR, 2019.
[4] Goh KH, Wang L, Yeow AYK, Poh H, Li K, Yeow JJL, Tan GYH. Artificial intelligence in sepsis early prediction and diagnosis using unstructured data in healthcare. Nat Commun. 2021;12:711. doi:10.1038/s41467-021-20910-4
[5] Mi X, Zou B, Zou F, Hu J. Permutation-based identification of important biomarkers for complex diseases via machine learning models. Nat Commun. 2021;12(1):3008. doi:10.1038/s41467-021-22756-2