Advancing Laboratory Medicine Practice With Machine Learning: Swift yet Exact
Ann Lab Med 2025; 45(2): 117-120
Published online January 8, 2025 https://doi.org/10.3343/alm.2024.0696
Copyright © Korean Society for Laboratory Medicine.
Department of Laboratory Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
Correspondence to: Sollip Kim, M.D., Ph.D.
Department of Laboratory Medicine, Asan Medical Center, University of Ulsan College of Medicine, 88 Olympic-ro 43-gil, Songpa-Gu, Seoul 05505, Korea
E-mail: sollip_kim@amc.seoul.kr
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Cardiac arrest is the sudden cessation of cardiac mechanical activity, necessitating chest compressions or defibrillation. It remains a major global health challenge, responsible for 15–20% of deaths worldwide [1]. Despite advancements in first responder systems and resuscitation techniques, survival rates remain low, with many patients not achieving a return of spontaneous circulation [1]. Cardiac arrests are classified by location into in-hospital cardiac arrest (IHCA) and out-of-hospital cardiac arrest, which differ in epidemiology, comorbidities, care processes, and provider characteristics [2]. While both require ongoing research, IHCA has received less research attention.
Despite advances in IHCA management and healthcare, the incidence of IHCA, along with associated mortality, has increased over the past decade [3]. Early identification of IHCA is life-saving [3]. Conventional scoring systems aimed at detecting patient deterioration, such as the Modified Early Warning Score [4] and the UK National Early Warning Score [5], often lack sensitivity, produce high false alarm rates, and rely heavily on staff interpretation, limiting their utility [3]. Advances in artificial intelligence (AI), particularly in machine learning (ML), allow highly accurate analysis of complex data [6, 7], making AI a promising tool for predicting IHCA risk [1, 3]. IHCA often occurs in high-risk patients, and timely prediction can prevent arrests, optimize outcomes, and improve survival. Real-time monitoring and clinical data availability in hospital settings allow for the development and implementation of AI-driven prediction tools.
Over the past decade, multiple studies have utilized ML to predict IHCA in various clinical settings [3, 8–16] (Table 1). Most studies incorporated vital signs from conventional scoring systems, such as systolic blood pressure, respiratory rate, body temperature, and heart rate. Some also included the consciousness level, along with predictors such as clinical data, demographics, laboratory values, and heart rate variability metrics. Common ML methods include support vector machines, which classify data through hyperplanes in high-dimensional spaces; random forests, which utilize ensemble learning with decision trees; and neural networks, which are loosely modeled after the human brain. However, whether ML significantly outperforms traditional methods in predicting IHCA remains uncertain.
Authors (publication year) | Country | Patient group | Sample size, N | Key variables | Outcome | Best ML model | IHCA prediction performance, AUROC | Reference |
---|---|---|---|---|---|---|---|---|
Ong, et al. (2012) | Singapore | ED | 925 | Demographics, vital signs, HRV metrics | Cardiac arrest | SVM | 0.781 | [29] |
Liu, et al. (2014) | Singapore | ED | 702 | Vital signs, HRV metrics | Major adverse cardiac events* | SVM | 0.812 | [30] |
Churpek, et al. (2014) | US | GW | 269,999 | Demographics, vital signs, laboratory values | Cardiac arrest, ICU transfer, or death | RF (eCART™) | 0.77 | [31] |
Green, et al. (2018) | US | GW | 107,868 | Demographics, vital signs, laboratory values | Cardiac arrest, ICU transfer, or death | RF (eCART™) | 0.801 | [32] |
Bartkowiak, et al. (2018) | US | Postoperative | 32,537 | Demographics, vital signs, laboratory values | Cardiac arrest, ICU transfer, or death | RF (eCART™) | 0.79 | [33] |
Kwon, et al. (2018) | South Korea | GW | 52,131 | Vital signs | Cardiac arrest or ICU transfer | LSTM (DeepCARS™) | 0.850 | [34] |
Jang, et al. (2019) | South Korea | ED | 374,605 | Demographics, chief complaint, vital signs, consciousness level | Cardiac arrest | MLP-LSTM | 0.936 | [35] |
Kim, et al. (2019) | South Korea | ICU | 29,181 | Vital signs, treatment history, health status, recent surgery | Cardiac arrest | LSTM | 0.896 | [36] |
Cho, et al. (2020) | South Korea | GW | 8,039 | Vital signs | Cardiac arrest or ICU transfer | LSTM (DeepCARS™) | 0.865 | [37] |
Chae, et al. (2021) | South Korea | GW | 83,543 | Demographics, vital signs, laboratory values | Cardiac arrest | Various | No data | [9] |
Kim, et al. (2022) | South Korea | ED | 1,350,693 | Demographics, vital signs, oxygen supply, oxygen saturation, ED occupancy | Cardiac arrest | XGBoost | 0.927 | [11] |
Lee, et al. (2023) | South Korea | ICU | 4,821 | HRV metrics | Cardiac arrest | LGBM | 0.881 | [13] |
Cho, et al. (2023) | South Korea | GW | 55,083 | Vital signs | Cardiac arrest or ICU transfer | LSTM (DeepCARS™) | 0.869 | [12] |
Ding, et al. (2023) | China | GW | 7,779 | Laboratory values | Cardiac arrest | ETC | 0.920 | [14] |
Lu, et al. (2023) | Taiwan | ED | 316,465 | Demographics, chief complaints, vital signs, BMI, oxygen saturation, consciousness | Cardiac arrest | RF | 0.931 | [10] |
Wu, et al. (2024) | Taiwan | GW | 32,719 | Demographics, vital signs, laboratory values, BMI, CNS medication use | Cardiac arrest | SVM | 0.811 | [16] |
Lee, et al. (2024) | Taiwan | GW | 114,276 | Demographics, comorbidities, presenting illness, vital signs | Cardiac arrest | SVM | 0.945 | [15] |
Park, et al. (2025) | South Korea | GW, ICU | 62,061 | Demographics, vital signs, laboratory values, ICD-10 code | Cardiac arrest | XGBoost | 0.934 | [17] |
*Major adverse cardiac events include death, cardiac arrest, sustained ventricular tachycardia, and hypotension requiring inotropes or intra-aortic balloon pump insertion.
Abbreviations: ML, machine learning; IHCA, in-hospital cardiac arrest; AUROC, area under the ROC curve; BMI, body mass index; CNS, central nervous system; ED, emergency department; ETC, extra trees classifier; GW, general ward; HRV, heart rate variability; ICU, intensive care unit; LGBM, light gradient boosting machine; LR, logistic regression; LSTM, long short-term memory; MLP, multilayer perceptron; RF, random forest; RNN, recurrent neural network; SVM, support vector machine; XGBoost, eXtreme Gradient Boosting; ICD-10, International Classification of Diseases, Tenth Revision.
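As a reading aid for the AUROC column of Table 1: the AUROC can be interpreted as the probability that a randomly chosen patient who experienced the event receives a higher risk score than a randomly chosen patient who did not, with ties counted as one half. The minimal Python sketch below computes it by direct pairwise comparison. The function name `auroc` and the eight labels and scores are invented for illustration and are not taken from any study in the table.

```python
def auroc(labels, scores):
    """Pairwise (rank-based) AUROC.

    Returns the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = event occurred, 0 = no event; scores are model outputs.
labels = [0, 0, 0, 0, 0, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.40, 0.70, 0.35, 0.80, 0.90]

example_auroc = auroc(labels, scores)
print(round(example_auroc, 3))  # 13 of 15 pairs are ordered correctly -> 0.867
```

An AUROC of 0.5 corresponds to scores that carry no ranking information, and 1.0 to perfect separation. Because the metric depends only on ranking, it does not by itself indicate how many false alarms a given alert threshold would produce.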
In this issue of Annals of Laboratory Medicine, the study by Park and Park [17], titled “A machine learning approach for predicting in-hospital cardiac arrest using single-day vital signs, laboratory test results, and International Classification of Disease-10 Block for Diagnosis,” advances IHCA prediction by employing an ML-based approach that integrates multiple clinical data types. The authors conducted a comprehensive retrospective cohort study involving more than 62,000 patients over 12 years at a single healthcare institution [17]. The large scale of the study, which incorporated data from both general wards (GWs) and intensive care units (ICUs) and covered a wide range of patient scenarios, strengthens confidence in the model’s reliability across care settings.
The study explored three tiers of predictive variables: vital signs, laboratory test results, and International Classification of Disease-10 Block codes (ICD10BD) [17]. Previous IHCA prediction models often relied on limited datasets, such as vital signs and/or laboratory values. While these models demonstrated utility, their scope was limited, and they frequently failed to capture the full complexity of a patient’s clinical condition. The study by Park and Park [17] overcame these limitations by integrating all three data sources, reflecting the nuanced interplay of physiological, biochemical, and diagnostic factors that underpin IHCA risk. In addition, the study findings highlighted the critical role of laboratory test results and diagnostic codes in enhancing predictive accuracy. Notably, a feature importance analysis revealed that ICD10BD variables were among the top predictors, underscoring the value of diagnostic insights provided by clinicians.
Park and Park [17] utilized the eXtreme Gradient Boosting (XGBoost) algorithm, a decision tree-based boosting method that learns by sequentially connecting decision trees and compensating for their errors [18]. XGBoost, known for its efficiency and robust performance in handling complex datasets, achieved high predictive accuracy, with area under the ROC curve (AUROC) scores of 0.934 and 0.896 for patients in the GW and ICU, respectively. Compared with earlier models that relied on logistic regression or random forest methods, the XGBoost-based algorithm can better handle complex, high-dimensional data, ensuring a more precise and actionable output.
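The idea of sequentially adding decision trees that compensate for earlier errors can be illustrated with a minimal Python sketch. It implements plain gradient boosting with one-split trees (stumps) and squared-error loss on a single feature. It is not the XGBoost library, which adds regularization and second-order gradient information among other refinements, and it is not the model of Park and Park [17]. All names (`fit_stump`, `boost`, `predict`) and the eight data points are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (stump) to the residuals by least squares."""
    thresholds = sorted(set(xs))[:-1]  # splitting at the largest x would leave the right side empty
    if not thresholds:
        raise ValueError("need at least two distinct feature values")
    best = None
    for t in thresholds:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    return best[1:]  # (threshold, value if x <= threshold, value otherwise)


def stump_value(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def boost(xs, ys, n_trees=20, learning_rate=0.5):
    """Add stumps one at a time, each fitted to the errors left by the current ensemble."""
    base = sum(ys) / len(ys)          # start from the mean outcome
    preds = [base] * len(xs)
    stumps, history = [], []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]   # what the ensemble still gets wrong
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_value(stump, x) for p, x in zip(preds, xs)]
        history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return (base, learning_rate, stumps), history


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_value(s, x) for s in stumps)


# Hypothetical data: one feature (e.g., a heart rate) and a binary outcome.
xs = [62, 70, 75, 88, 95, 110, 118, 130]
ys = [0, 0, 0, 0, 1, 0, 1, 1]

model, history = boost(xs, ys)
print([round(mse, 4) for mse in history[:5]])  # training error after each of the first five trees
```

In this sketch the training error cannot increase from one round to the next, because each stump is a least-squares fit to the current residuals and the learning rate is below 1. That monotone decrease on training data says nothing about performance on unseen patients, which is why held-out and external validation remain necessary.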
Despite its strengths, the study acknowledges certain limitations. While the inclusion of ICD10BD codes enhances predictive accuracy, it introduces potential variability due to differences in diagnostic coding practices among clinicians. Additionally, because the exact timing of cardiac arrests was unavailable, the study relied on cardiopulmonary resuscitation prescription records as a proxy for IHCA events, which introduces temporal imprecision. Standardizing coding and documentation practices or incorporating natural language processing to interpret unstructured clinical notes may help address these challenges. Furthermore, the single-center study design limits the generalizability of the findings. External validation using diverse datasets across multiple institutions is essential to confirm the model’s robustness.
When using laboratory values, ensuring data quality requires addressing several key considerations. Healthcare data, often not systematically collected for research, frequently lack standardization in terminology and traceability [19–22]. Results can vary because of differences in reagents, instruments [23], and the quality status of laboratories [24, 25], even for standardized tests. Developing robust ML models necessitates clear documentation of data standardization, detailed information about reagents and instruments used, and consideration of the quality status of the laboratories involved. Further studies are required to enhance model generalizability across institutions or environments while accounting for these factors. Existing methods for evaluating laboratory data quality offer valuable guidance [26–28].
The study by Park and Park [17] contributes to the growing body of evidence supporting the integration of ML into clinical practice. By combining diverse data sources, the proposed model exemplifies how AI can bridge gaps in traditional warning systems. However, implementing such models in practice requires addressing key challenges, including data standardization, the interoperability of electronic health records, and training clinicians in interpreting ML outputs.
The author confirms sole responsibility for manuscript conception and preparation.
None declared.