
Original Article

Ann Lab Med 2025; 45(2): 209-217

Published online December 13, 2024 https://doi.org/10.3343/alm.2024.0315

Copyright © Korean Society for Laboratory Medicine.

A Machine Learning Approach for Predicting In-Hospital Cardiac Arrest Using Single-Day Vital Signs, Laboratory Test Results, and International Classification of Disease-10 Block for Diagnosis

Haeil Park, M.D., Ph.D.1 and Chan Seok Park, M.D., Ph.D.2

1Department of Laboratory Medicine, College of Medicine, The Catholic University of Korea, Seoul, Korea; 2Division of Cardiology, Department of Internal Medicine, College of Medicine, The Catholic University of Korea, Seoul, Korea

Correspondence to: Chan Seok Park, M.D., Ph.D.
Division of Cardiology, Department of Internal Medicine, College of Medicine, The Catholic University of Korea, 222 Banpo-daero, Seocho-gu, Seoul 06591, Korea
E-mail: chanseok@catholic.ac.kr

Received: June 23, 2024; Revised: August 15, 2024; Accepted: December 5, 2024

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background: Predicting in-hospital cardiac arrest (IHCA) is crucial for potentially reducing mortality and improving patient outcomes. However, most models rely solely on vital signs and may not comprehensively capture patients’ risk profiles. We aimed to improve IHCA predictions by combining vital sign indicators with laboratory test results and, optionally, International Classification of Disease-10 block for diagnosis (ICD10BD).
Methods: We conducted a retrospective cohort study in the general ward (GW) and intensive care unit (ICU) of a 680-bed secondary healthcare institution. We included 62,061 adults admitted to the Department of Internal Medicine from January 2010 to August 2022. IHCAs were identified based on cardiopulmonary resuscitation prescriptions. Patient-days within three days preceding IHCAs were labeled as case days; all others were control days. The eXtreme Gradient Boosting (XGBoost) model was trained using daily vital signs, 14 laboratory test results, and ICD10BD.
Results: In the GW, among 1,299,448 patient-days from 62,038 patients, 1,367 days linked to 713 patients were cases. In the ICU, among 117,190 patient-days from 16,881 patients, 1,119 days from 444 patients were cases. The area under the ROC curve of the IHCA prediction model was 0.934 and 0.896 in the GW and ICU, respectively, using the combination of vital signs, laboratory test results, and ICD10BD; 0.925 and 0.878, respectively, with vital signs and laboratory test results; and 0.839 and 0.828, respectively, with only vital signs.
Conclusions: Incorporating laboratory test results or combining laboratory test results and ICD10BD with vital signs as predictor variables in the XGBoost model potentially enhances clinical decision-making and improves patient outcomes in hospital settings.

Keywords: Cardiac arrest, Diagnosis, Hospital, International Classification of Disease, Machine learning, Prediction

The incidence rate of in-hospital cardiac arrest (IHCA) per 1,000 hospital admissions varies globally; it is reportedly 1.5–2.8 in Europe [1], 1.6 in the United Kingdom, 9–10 in the United States [2], and 2.46 in South Korea [3]. The approximate ratio of IHCA occurrences in the intensive care unit (ICU) to those in the general ward (GW) is 6:4 [4]. In the United States, the survival-to-discharge rate of IHCA is reportedly 25% [2]. Disturbances in vital signs often precede IHCAs by several hours [4]. The early detection of such deterioration has been associated with improved survival-to-discharge rates and better neurological outcomes [5]. Hospitals use early warning scoring systems to identify patients at high risk of deterioration and significant adverse events [6–12]. The Modified Early Warning Score is a benchmark for evaluating predictive models [7, 13]. Predictor variables can include vital signs [6, 9, 10, 14–16], but some models incorporate both vital signs and laboratory test results [6–8, 17–19]. Laboratory data can be an important variable in studies that utilize machine learning [20].

Few studies have compared models that solely use vital signs with those that combine vital signs and laboratory test results, particularly when using the same machine learning model to predict IHCA. We found one study that reported no significant difference in the area under the ROC curve (AUROC) between the two approaches [17]. However, no studies that evaluated the generalizability of these findings to other patient populations or alternative prediction models were identified. Furthermore, none of the previous studies that used machine learning to predict IHCA included diagnosis as a predictive variable [6–11, 14–19].

We hypothesized that diagnoses recorded in the hospital information system reflect physicians’ clinical perceptions of patient conditions, even when provisional. We aimed to determine whether using vital signs, laboratory test results, and diagnosis blocks as predictors would enhance the predictive accuracy of eXtreme Gradient Boosting (XGBoost) for IHCA compared with models using both vital signs and laboratory test results or vital signs alone. Our secondary aim was to evaluate whether using both vital signs and laboratory test results as predictors would enhance the predictive capability of the model compared with one relying solely on vital signs.

Clinical setting

We obtained data from the Clinical Data Warehouse (CDW) of the Catholic Medical Center (Seoul, Korea) [21]. The data comprised adults aged ≥18 yrs hospitalized for at least 1 day in the Department of Internal Medicine of Bucheon St. Mary’s Hospital, The Catholic University of Korea, between January 2010 and September 2022. This hospital is a secondary healthcare institution with 680 beds located in the metropolitan area of Seoul.

Data processing

Data tables, including comprehensive information about patient cohorts, admission specifics, vital signs, laboratory test results, diagnoses, prescriptions, and nursing records, were consolidated into a structured CSV file format. Entities were defined as patient-days rather than individual patients, with a single patient potentially contributing multiple patient-days.

Outcome labels

The model aimed to predict the occurrence of IHCA within 1–3 days after measuring the predictor variables. The occurrence and date of IHCA were determined based on the day of cardiopulmonary resuscitation (CPR), with CPR events identified by locating CPR codes in the prescription records. Patient-days occurring 1–3 days before CPR were designated as cases, whereas patient-days occurring ≥4 days before CPR and all days for patients without CPR events were categorized as controls. The days of CPR administration, along with the ensuing days, were excluded from model training, assessment, and analysis (Fig. 1).

Figure 1. Assignment of patient-days to case and control groups. IHCA events were assigned using CPR codes in prescription records, categorizing 1–3 days before CPR as cases and ≥4 days before cardiopulmonary resuscitation or all days for patients without CPR as controls. Days of and after CPR were excluded from the modeling dataset.
Abbreviations: IHCA, in-hospital cardiac arrest; CPR, cardiopulmonary resuscitation.
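To make this labeling rule concrete, the following is a minimal pandas sketch of the case/control assignment; the column names (patient_id, day, cpr_date) are hypothetical stand-ins for the corresponding CDW fields.

```python
import pandas as pd

def label_patient_days(df: pd.DataFrame) -> pd.DataFrame:
    """Assign case/control labels to patient-days relative to a CPR date.

    Expects one row per patient-day with hypothetical columns
    'patient_id', 'day' (datetime), and 'cpr_date' (datetime; NaT if no CPR).
    """
    days_before = (df["cpr_date"] - df["day"]).dt.days
    # Exclude the day of CPR and all subsequent days from modeling.
    keep = df["cpr_date"].isna() | (days_before >= 1)
    out = df.loc[keep].copy()
    db = days_before.loc[keep]
    # 1-3 days before CPR -> case (1); >=4 days before CPR, and all days
    # for patients without a CPR event (db is NaN) -> control (0).
    out["label"] = ((db >= 1) & (db <= 3)).astype(int)
    return out
```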

Predictor variables

Basic clinical information included sex, age, weight, height, and body mass index (BMI). Vital signs encompassed body temperature (BT), systolic blood pressure (SBP), diastolic blood pressure (DBP), mean blood pressure (MBP), pulse rate (PR), and respiration rate (RR). Measurements were categorized into three intervals: 00:00–08:00 (T1), 08:00–16:00 (T2), and 16:00–24:00 (T3). Multiple readings within an interval were averaged, and missing interval values were filled forward and backward within the same day. Daily averages derived from T1, T2, and T3 were used as predictor variables instead of the individual interval values. For the analysis and modeling, we used periods with complete vital sign data. Laboratory test items were selected to minimize the need for imputing missing values; accordingly, we prioritized the items with the highest availability in the dataset. We ranked all available laboratory test items by frequency of occurrence and retained only those with an occurrence rate of ≥1%. The final selection comprised 14 items, with the following frequencies: Hb 1.8%, platelet count 1.8%, white blood cell (WBC) count 1.8%, creatinine 1.6%, potassium 1.6%, sodium 1.6%, urea nitrogen 1.6%, AST 1.4%, ALT 1.4%, albumin 1.4%, total protein 1.3%, chloride 1.2%, total bilirubin 1.1%, and glucose 1.1%. When multiple daily results were available, the daily average was used.
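As an illustration of this aggregation, the following is a minimal pandas sketch, assuming long-format vital-sign records with hypothetical columns patient_id, timestamp, item, and value:

```python
import pandas as pd

def daily_vital_averages(vitals: pd.DataFrame) -> pd.DataFrame:
    """Average vital signs within the three 8-hr intervals, then over the day."""
    vitals = vitals.copy()
    vitals["day"] = vitals["timestamp"].dt.normalize()
    vitals["interval"] = vitals["timestamp"].dt.hour // 8  # 0=T1, 1=T2, 2=T3
    # Mean of multiple readings within each 8-hr interval.
    per_interval = (
        vitals.groupby(["patient_id", "day", "item", "interval"])["value"]
        .mean()
        .unstack("interval")
    )
    # Missing interval values are filled forward/backward within the same day.
    per_interval = per_interval.ffill(axis=1).bfill(axis=1)
    # The daily average over T1-T3 becomes the predictor value per item.
    return per_interval.mean(axis=1).unstack("item")
```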

Diagnosis code as a predictor variable

Attending physicians enter ICD-10 diagnosis codes into the EMR during patient hospitalization, often while writing notes or prescriptions, with the date and time recorded. The CDW data used in this study include these codes as logged on specific days. Once recorded, these codes remain unaltered; new codes are added along with their entry timestamps. The diagnosis codes for cases were those recorded 1–3 days before the occurrence of the IHCA, which remained unmodified thereafter. From each ICD-10 diagnosis code, we extracted the first three characters to establish an ICD-10 block code and corresponding block title for the diagnosis (ICD10BD).
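The block extraction can be sketched as follows; the lookup table here is truncated to a few illustrative entries, whereas a full implementation would cover every ICD-10 block:

```python
# Hypothetical, truncated lookup from three-character ICD-10 categories
# to block codes and titles; the full table spans all ICD-10 blocks.
ICD10_BLOCKS = [
    ("I30", "I52", "I30-I52 Other forms of heart disease"),
    ("I60", "I69", "I60-I69 Cerebrovascular diseases"),
    ("R00", "R09", "R00-R09 Symptoms and signs involving the circulatory and respiratory systems"),
]

def to_block(icd10_code: str) -> str | None:
    """Map a full ICD-10 code (e.g., 'I50.9') to its block (ICD10BD)."""
    category = icd10_code[:3].upper()       # first three characters
    for start, end, block in ICD10_BLOCKS:
        if start <= category <= end:        # lexicographic range check
            return block
    return None
```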

Imputation of missing values

The imputation of missing values was conducted in two stages. The first stage was performed before splitting the data into training/validation and test sets and was applied to the entire dataset of patient-day entities using forward and backward filling. This approach mirrors clinical practice, in which a physician relies on the most recent available patient information until new information becomes available; performing this imputation before the data split therefore reflects how the data would be used clinically. The second stage was conducted after splitting the data into training/validation and test sets: any values still missing after the first stage were imputed separately within each set using median filling.
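A minimal sketch of this two-stage scheme, with hypothetical column names and index-based set membership, might look as follows:

```python
import pandas as pd

def impute_two_stage(df, train_idx, test_idx, cols):
    """Two-stage imputation over patient-day rows (column names hypothetical)."""
    # Stage 1 (entire dataset): carry the most recent value forward, then
    # backward, within each patient -- mirroring reliance on the latest
    # available information.
    df = df.sort_values(["patient_id", "day"]).copy()
    df[cols] = df.groupby("patient_id")[cols].transform(lambda s: s.ffill().bfill())
    # Stage 2 (after the split): median-fill any remaining gaps separately
    # within each set, using that set's own medians.
    train, test = df.loc[train_idx].copy(), df.loc[test_idx].copy()
    train[cols] = train[cols].fillna(train[cols].median())
    test[cols] = test[cols].fillna(test[cols].median())
    return train, test
```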

Data split

Daily patient locations in the GW or ICU were determined based on nursing records. The entire dataset was divided using stratified random sampling, ensuring a 3:1 ratio between the training/validation and test sets while maintaining consistent class ratios within each set. Notably, this stratified random sampling was performed at the patient-day entity level rather than at the patient level. This approach was chosen because the training and performance evaluations were based on data structured at the patient-day entity level. The rationale for using patient-day entities instead of patient-level data is grounded in the need for consistency in predictor variable length across all instances. Given the variability in hospital stay durations among patients, using a fixed period for predictor variables would have excluded data from patients with shorter stays. To ensure the practical applicability of the prediction model in clinical settings, we opted to simplify the input data by using predictor variable values from a single day rather than over a more extended period. Therefore, the data were organized and split at the patient-day entity level.
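With scikit-learn, this patient-day-level split can be sketched as follows; the data here are synthetic stand-ins, and test_size=0.25 produces the 3:1 ratio:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))               # stand-in patient-day features
y = rng.binomial(1, 0.02, size=2000)         # rare IHCA case labels

# stratify=y keeps the case:control ratio consistent across both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)
```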

The split was performed only once, and once the datasets were separated, they were never combined or shuffled together again. The training/validation set was solely used for the learning process, including model training and hyperparameter tuning. The test set was exclusively used for the final performance evaluation.

Significant differences between the training/validation and test sets were analyzed using the Wilcoxon rank-sum test for numerical continuous variables and the chi-squared test for categorical variables.
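For illustration, both tests can be run with SciPy on synthetic stand-in data as follows:

```python
import numpy as np
import pandas as pd
from scipy.stats import ranksums, chi2_contingency

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "set": rng.choice(["train", "test"], size=400, p=[0.75, 0.25]),
    "creatinine": rng.lognormal(0.0, 0.3, size=400),   # stand-in lab value
    "sex": rng.choice(["F", "M"], size=400),
})

# Continuous variable: Wilcoxon rank-sum test between the two sets.
a = df.loc[df["set"] == "train", "creatinine"]
b = df.loc[df["set"] == "test", "creatinine"]
print(ranksums(a, b).pvalue)

# Categorical variable: chi-squared test on a contingency table.
chi2, p, dof, expected = chi2_contingency(pd.crosstab(df["set"], df["sex"]))
print(p)
```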

Predictive model building

For predictive modeling with logistic regression (LR) and XGBoost, we used the “linear_model” module of scikit-learn 1.3.0 and the “xgboost” Python library [22]. Hyperparameters were optimized for the AUROC through grid searches with stratified five-fold cross-validation. To address the class imbalance, cases and controls received weights based on their relative frequencies, and the Youden method was used to determine the optimal cutoff value [23].
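A condensed sketch of this tuning and cutoff-selection procedure follows; the hyperparameter grid and stand-in data are illustrative assumptions, not the settings used in this study:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.metrics import roc_curve
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(2000, 5))          # stand-in feature matrix
y_train = rng.binomial(1, 0.02, size=2000)    # rare IHCA case labels

# Weight the rare case class by its relative frequency.
spw = (y_train == 0).sum() / max((y_train == 1).sum(), 1)

grid = {"max_depth": [3, 6], "n_estimators": [200, 400]}  # illustrative grid
search = GridSearchCV(
    XGBClassifier(scale_pos_weight=spw),
    grid,
    scoring="roc_auc",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
)
search.fit(X_train, y_train)

# Youden's J = sensitivity + specificity - 1 = TPR - FPR; the optimal
# cutoff is the threshold maximizing J along the ROC curve.
fpr, tpr, thr = roc_curve(y_train, search.predict_proba(X_train)[:, 1])
cutoff = thr[np.argmax(tpr - fpr)]
```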

We constructed and evaluated six prediction models, comprising combinations of two machine learning techniques (LR and XGBoost) and the incremental inclusion of predictor variable groups: vital signs (feature set 1 [FS1]), vital signs combined with laboratory test results (feature set 2 [FS2]), and the integration of vital signs, laboratory test results, and ICD10BD (feature set 3 [FS3]). Numerical predictor variables were subjected to robust scaling, whereas ICD10BD was subjected to one-hot encoding. The models were applied to patients in the GW and ICU. All models included the following core variables: sex, age, weight, height, and BMI.
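This preprocessing can be expressed as a scikit-learn pipeline; the column names below are hypothetical placeholders for the FS3 variables:

```python
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, RobustScaler
from xgboost import XGBClassifier

numeric_cols = ["age", "weight", "height", "bmi",
                "bt", "sbp", "dbp", "mbp", "pr", "rr",
                "creatinine"]                 # plus the other lab items
categorical_cols = ["sex", "icd10bd"]

preprocess = ColumnTransformer([
    ("num", RobustScaler(), numeric_cols),                              # robust scaling
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),  # one-hot
])

model = Pipeline([("prep", preprocess), ("clf", XGBClassifier())])
```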

Predictive model performance

The predictive models were assessed using the test set, with the AUROC as the primary discrimination metric. Secondary metrics were the average precision score (APS), a further measure of discrimination, and the F1-score, precision, and recall computed at the selected cutoff. Differences in performance measures were considered significant when the 95% confidence intervals did not overlap and insignificant when they overlapped.
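The confidence-interval method is not stated in the Methods; a percentile bootstrap, shown below, is one common choice and is presented here only as an assumption:

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

def bootstrap_ci(y_true, y_score, metric, n_boot=1000, seed=42):
    """Percentile-bootstrap 95% CI for a score-based metric."""
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        if y_true[idx].sum() == 0:            # resample must contain a case
            continue
        stats.append(metric(y_true[idx], y_score[idx]))
    return np.percentile(stats, [2.5, 97.5])

# e.g., bootstrap_ci(y_test, proba, roc_auc_score)
#       bootstrap_ci(y_test, proba, average_precision_score)
```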

Feature importance

We conducted a feature importance analysis for XGBoost, as it demonstrated superior predictive performance compared with LR.
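The importance definition used is not specified; the sketch below ranks features by gain, one of several importance measures XGBoost provides:

```python
import pandas as pd
from xgboost import XGBClassifier

def top_features(clf: XGBClassifier, feature_names, n=10) -> pd.Series:
    """Rank features of a fitted XGBClassifier by gain-based importance."""
    booster = clf.get_booster()
    booster.feature_names = list(feature_names)
    gain = booster.get_score(importance_type="gain")
    return pd.Series(gain).sort_values(ascending=False).head(n)
```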

Ethics approval and consent to participate

This study was approved by the Institutional Review Board of The Catholic University of Korea (Seoul, Korea) on October 4, 2022 (approval No. HC22WISI0073) and was conducted in accordance with the 1975 Helsinki Declaration. Informed consent was waived because of the use of fully anonymized clinical data.

The dataset initially comprised 1,423,818 patient-days, encompassing 62,144 individual patients and including daily records from three variable groups: vital signs, laboratory test results, and ICD10BD. For patients who survived and recovered from IHCA, subsequent patient-days, totaling 5,956, were excluded. For patients who experienced an IHCA, the specific day of the event was omitted, accounting for 1,224 patient-days. Consequently, we employed data corresponding to 1,416,638 patient-days from 62,061 patients. Among these, 1,299,448 patient-days (62,038 patients) and 117,190 patient-days (16,881 patients) were derived from the GW and ICU, respectively. In the GW group, the case group comprised 1,367 patient-days from 713 patients. In the ICU, the case group included 1,119 patient-days, representing 444 patients (Fig. 2). Using patient-days as the unit of analysis, the distribution of values for each variable is presented in Supplemental Data Table S1 for the GW and in Supplemental Data Table S2 for the ICU. In both the GW and ICU, no statistically significant differences were observed between the training/validation and test sets for the variables.

Figure 2. Flowchart of data compilation. Patient-day was used as the unit of analysis.

XGBoost consistently demonstrated superiority over LR in terms of the AUROC and APS. While the AUROC was consistently higher in the GW, the APS was notably elevated in the ICU. In the GW, the AUROC for FS3 (0.934) exceeded that of FS2 (0.925), although the difference was not statistically significant. The AUROC for FS2 was significantly higher than that for FS1 (0.839). Similarly, the AUROC for FS3 was significantly higher than that for FS1 (0.934 vs. 0.839) (Fig. 3A). The APS for FS3 (0.034) was greater than that for FS2 (0.026), but the difference was not statistically significant. The APS for FS2 was significantly greater than that for FS1 (0.007). Furthermore, the APS for FS3 was significantly higher than that for FS1 (0.034 vs. 0.007) (Supplemental Data Fig. S1A). Similarly, in the ICU, the AUROC for FS3 (0.896) was higher than that for FS2 (0.878); however, the difference was not statistically significant. The AUROC for FS2 was significantly higher than that for FS1 (0.828). The AUROC for FS3 was also significantly higher than that for FS1 (0.896 vs. 0.828) (Fig. 3B). The APS for FS3 (0.179) was greater than that for FS2 (0.115), albeit not significantly. The APS for FS2 was higher than that for FS1 (0.097), but not significantly so. The APS for FS3 was significantly higher than that for FS1 (0.179 vs. 0.097) (Supplemental Data Fig. S1B). These findings are presented in Table 1.

Figure 3. ROC curves for three feature sets derived from the three-stage expansion of variable groups in (A) the general ward and (B) the intensive care unit.
Abbreviations: LR, logistic regression; XGBoost, eXtreme Gradient Boosting; FS1, feature set 1; FS2, feature set 2; FS3, feature set 3.

Table 1. Discriminatory performances of LR and XGBoost

Location | Data set | Model | Predictor variables | AUROC (95% CI) | APS (95% CI)
GW | Training/validation set | LR | FS1 | 0.809 (0.796–0.823) | 0.010 (0.008–0.012)
GW | Training/validation set | LR | FS2 | 0.867 (0.855–0.878) | 0.011 (0.010–0.013)
GW | Training/validation set | LR | FS3 | 0.899 (0.890–0.908) | 0.013 (0.012–0.016)
GW | Training/validation set | XGBoost | FS1 | 0.969 (0.966–0.973) | 0.107 (0.090–0.126)
GW | Training/validation set | XGBoost | FS2 | 0.992 (0.991–0.993) | 0.226 (0.199–0.252)
GW | Training/validation set | XGBoost | FS3 | 0.993 (0.992–0.993) | 0.201 (0.180–0.224)
GW | Test set | LR | FS1 | 0.795 (0.774–0.819) | 0.008 (0.006–0.012)
GW | Test set | LR | FS2 | 0.862 (0.842–0.881) | 0.011 (0.009–0.015)
GW | Test set | LR | FS3 | 0.876 (0.859–0.894) | 0.013 (0.010–0.017)
GW | Test set | XGBoost | FS1 | 0.839 (0.817–0.859) | 0.007 (0.006–0.010)
GW | Test set | XGBoost | FS2 | 0.925 (0.912–0.937) | 0.026 (0.018–0.040)
GW | Test set | XGBoost | FS3 | 0.934 (0.923–0.944) | 0.034 (0.024–0.050)
ICU | Training/validation set | LR | FS1 | 0.719 (0.703–0.735) | 0.026 (0.022–0.030)
ICU | Training/validation set | LR | FS2 | 0.785 (0.769–0.798) | 0.030 (0.027–0.034)
ICU | Training/validation set | LR | FS3 | 0.818 (0.806–0.831) | 0.036 (0.032–0.041)
ICU | Training/validation set | XGBoost | FS1 | 1.000 (0.999–1.000) | 0.946 (0.933–0.957)
ICU | Training/validation set | XGBoost | FS2 | 0.994 (0.993–0.995) | 0.726 (0.700–0.755)
ICU | Training/validation set | XGBoost | FS3 | 1.000 (1.000–1.000) | 0.978 (0.968–0.985)
ICU | Test set | LR | FS1 | 0.729 (0.700–0.759) | 0.026 (0.021–0.037)
ICU | Test set | LR | FS2 | 0.792 (0.769–0.815) | 0.031 (0.025–0.038)
ICU | Test set | LR | FS3 | 0.816 (0.798–0.837) | 0.035 (0.030–0.044)
ICU | Test set | XGBoost | FS1 | 0.828 (0.803–0.850) | 0.097 (0.073–0.135)
ICU | Test set | XGBoost | FS2 | 0.878 (0.859–0.896) | 0.115 (0.083–0.151)
ICU | Test set | XGBoost | FS3 | 0.896 (0.875–0.917) | 0.179 (0.142–0.228)

Data are AUROC and APS for three feature sets derived from the three-stage expansion of variable groups in the GW and the ICU.

Abbreviations: LR, logistic regression; XGBoost, eXtreme Gradient Boosting; AUROC, area under the ROC curve; CI, confidence interval; APS, average precision score; GW, general ward; FS1, feature set 1; FS2, feature set 2; FS3, feature set 3; ICU, intensive care unit.



The F1-score, precision, and recall with the predictor variables from the GW and ICU are presented in Supplemental Data Table S3. When XGBoost and FS3 were applied in the ICU, the precision for case identification was 0.257.

When employing FS3 as predictor variables in the XGBoost model, the top five predictors in the GW were all ICD10BDs (Fig. 4A). In the ICU, they were creatinine and four ICD10BDs (Fig. 4B).

Figure 4. Top 10 predictor variables identified by the XGBoost algorithm. The predictor variables are categorized by the three stages of variable group expansion in (A) the general ward and (B) the intensive care unit. (A) K90-K93, other diseases of the digestive system; R10-R19, symptoms and signs involving the digestive system and abdomen; I30-I52, other forms of heart disease; M86-M90, other osteopathies; and G40-G47, episodic and paroxysmal disorders. (B) I60-I69, cerebrovascular diseases; F99-F99, unspecified mental disorders; C15-C26, malignant neoplasms of the digestive organs; R00-R09, symptoms and signs involving the circulatory and respiratory systems; K65-K67, diseases of the peritoneum; K55-K64, other diseases of the intestines; and I30-I52, other forms of heart disease.
Abbreviation: XGBoost, eXtreme Gradient Boosting; FS1, feature set 1; FS2, feature set 2; FS3, feature set 3; RR, respiration rate; PR, pulse rate; DBP, diastolic blood pressure; MBP, mean blood pressure; BMI, body mass index; BT, body temperature; WBC, white blood cell; SBP, systolic blood pressure.

Using the XGBoost algorithm, we found that the fully integrated model (vital signs, laboratory test results, and ICD10BD) performed better than the model combining vital signs and laboratory test results, although the difference did not reach statistical significance, and substantially outperformed the model based solely on vital signs. Consistently, the feature importance analysis revealed that ICD10BDs accounted for most of the top five predictors in both the GW and ICU settings, indicating that these predictors contributed substantially to the prediction of IHCA. Furthermore, adding laboratory test results to vital signs notably increased the predictive capability.

In the GW, both the AUROC and the APS for models with FS3 and FS2 were markedly superior to those from the model using results from FS1. Likewise, in the ICU, FS3 metrics outperformed those of FS1, with the AUROC for FS2 also surpassing that based on FS1 outcomes, as shown in Fig. 3 and Table 1.

Direct comparisons among studies on IHCA prediction remain complex because of diverse machine-learning approaches, variable compositions, data collection timings, and patient demographics. Nonetheless, we attempted approximate comparisons. Ueno, et al. [17] used a random forest model with predictor variables spanning 48 hrs for IHCA prediction. In the GW, they reported AUROCs of 0.879 (0.871–0.886) for vital signs and 0.866 (0.858–0.874) for vital signs and laboratory test results, with no significant difference. In contrast, in the ICU, the AUROCs were 0.580 (0.571–0.590) for vital signs and 0.648 (0.635–0.661) for vital signs combined with laboratory test results, with the latter being significantly higher. We employed XGBoost for prediction, and our results differed somewhat from those reported by Ueno, et al. Specifically, the AUROC for FS2 notably surpassed that of FS1 in both the GW and ICU. We found AUROCs for FS2 of 0.925 (0.912–0.937) in the GW and 0.878 (0.859–0.896) in the ICU. These values were significantly higher than the corresponding AUROCs of 0.866 (0.858–0.874) in the GW and 0.648 (0.635–0.661) in the ICU reported by Ueno, et al. Notably, both studies revealed a consistently higher AUROC in the GW than in the ICU. Kwon, et al. [9] used vital signs as predictor variables and test sets sourced from two hospitals, labeled A and B. Using a random forest model, they achieved AUROCs of 0.780 (0.776–0.787) and 0.823 (0.812–0.828) and areas under the precision-recall curve (AUPRCs) of 0.014 (0.012–0.014) and 0.203 (0.184–0.218), respectively. When using deep learning, the AUROCs were 0.850 (0.847–0.853) and 0.837 (0.829–0.857), and the AUPRCs were 0.044 (0.040–0.046) and 0.239 (0.219–0.257). When employing FS3 as a predictor variable in XGBoost, we achieved an AUROC of 0.934 (0.923–0.944) and an APS of 0.034 (0.024–0.050) in the GW and an AUROC of 0.896 (0.875–0.917) and an APS of 0.179 (0.142–0.228) in the ICU (Table 1). The AUROC value with our method was significantly higher than that reported by Kwon, et al. However, the relative performance of our APS varied depending on the test set used for comparison.

Determining which predictor variables optimize an IHCA prediction model is essential. Ueno, et al. [17] explored the inclusion of laboratory test results as predictors within a consistent prediction model; our results confirm their added value. Additionally, the effect of adding a diagnosis, reflective of the patient’s clinical status, to the predictor variables had not previously been evaluated. Our findings indicate that including ICD10BD and laboratory test results significantly enhances prediction accuracy relative to models using vital signs alone. Instead of using sequential data spanning more than a day, we used data from a single day. This approach was chosen to maximize the amount of clinically relevant data available for model training and to improve applicability to real-world scenarios. Nevertheless, our model performed as well as previously published IHCA prediction models that utilized sequential data [9–11, 15, 17–19].

In the ICU, the frequent measurement of predictor variables offers a dense dataset that, when condensed into daily summaries, might result in a greater loss of information than in data from the GW. This may contribute to lower AUROC scores for ICU patients as the summarization process potentially overlooks nuanced fluctuations indicative of IHCAs. Conversely, the higher APS in the ICU might be attributable to the greater prevalence of IHCA in this setting, where critical events are more common. These insights suggest that the mode of data aggregation and the context of the care environment both play a significant role in predictive model performance. Future model refinements may benefit from considering the frequency of data collection and the inherent differences in patient populations between the ICU and GW.

Our model offers several notable advantages. First, by employing 1-day predictor variables in our model, we ensured minimal data wastage relative to that when using multi-day sequences for training, aligning well with real-world clinical applications. Second, we comprehensively trained and evaluated the model in both the GW and ICU settings, enabling us to tailor the model to each environment and to obtain valuable insights into its performance.

This study also has several limitations. First, precise timestamps for IHCA events were inaccessible, necessitating the use of CPR prescription registration dates. Consequently, our pre-IHCA timeframe was expressed in days rather than hours. A more accurate temporal resolution might improve model effectiveness. In future studies, we aim to extract the exact IHCA timing from unstructured free-text data. Second, our model was trained and evaluated solely using data from our institution, reflecting the unique characteristics of our patient population. To assess its broader applicability, external validation using independent datasets from various institutions and diverse patient populations is imperative. Third, we did not apply the model in a real clinical setting or assess its prospective performance. Such evaluations are vital for gauging future clinical performance. We compared model performance with and without ICD10BD; however, this approach has not yet been implemented clinically. Finally, we used a unified set of predictor variables for outcome prediction, irrespective of patients’ clinical characteristics. Given that associations or correlations among variables may differ across various subgroups, the possibility of overfitting in certain subgroups cannot be entirely ruled out. Future studies may benefit from incorporating statistical feature selection methods to mitigate this issue and potentially improve prediction accuracy.

In conclusion, although the improvement did not reach statistical significance, using a combination of vital signs, laboratory test results, and the ICD10BD as predictor variables in the XGBoost algorithm improves the predictive performance for IHCA in admitted patients compared with models that use only vital signs and laboratory test results. This enhanced performance is in line with the feature importance analysis results, emphasizing the value of integrating diverse data sources for more accurate IHCA prediction, which can lead to better patient care and timely interventions.

Conceptualization: Park H and Park CS. Methodology: Park H. Software: Park H. Validation: Park H. Formal analysis: Park H. Investigation: Park H. Resources: Park H. Data curation: Park H. Writing - Original draft: Park H. Writing - Review & editing: Park CS. Visualization: Park H. Supervision: Park CS. Project administration: Park H.

  1. Gräsner JT, Herlitz J, Tjelmeland IBM, Wnent J, Masterson S, Lilja G, et al. European Resuscitation Council Guidelines 2021: epidemiology of cardiac arrest in Europe. Resuscitation 2021;161:61-79.
  2. Andersen LW, Holmberg MJ, Berg KM, Donnino MW, Granfeldt A. In-hospital cardiac arrest: a review. JAMA 2019;321:1200-10.
  3. Choi Y, Kwon IH, Jeong J, Chung J, Roh Y. Incidence of adult in-hospital cardiac arrest using national representative patient sample in Korea. Healthc Inform Res 2016;22:277-84.
  4. Andersen LW, Kim WY, Chase M, Berg KM, Mortensen SJ, Moskowitz A, et al. The prevalence and significance of abnormal vital signs prior to in-hospital cardiac arrest. Resuscitation 2016;98:112-7.
  5. Brady WJ, Gurka KK, Mehring B, Peberdy MA, O'Connor RE; American Heart Association's Get With The Guidelines Investigators. In-hospital cardiac arrest: impact of monitoring and witnessed event on patient survival and neurologic status at hospital discharge. Resuscitation 2011;82:845-52.
  6. Churpek MM, Yuen TC, Park SY, Gibbons R, Edelson DP. Using electronic health record data to develop and validate a prediction model for adverse outcomes in the wards. Crit Care Med 2014;42:841-8.
  7. Churpek MM, Yuen TC, Winslow C, Robicsek AA, Meltzer DO, Gibbons RD, et al. Multicenter development and validation of a risk stratification tool for ward patients. Am J Respir Crit Care Med 2014;190:649-55.
  8. Churpek MM, Yuen TC, Winslow C, Meltzer DO, Kattan MW, Edelson DP. Multicenter comparison of machine learning methods and conventional regression for predicting clinical deterioration on the wards. Crit Care Med 2016;44:368-74.
  9. Kwon JM, Lee Y, Lee Y, Lee S, Park J. An algorithm based on deep learning for predicting in-hospital cardiac arrest. J Am Heart Assoc 2018;7.
  10. Kim J, Chae M, Chang HJ, Kim YA, Park E. Predicting cardiac arrest and respiratory failure using feasible artificial intelligence with simple trajectories of patient data. J Clin Med 2019;8.
  11. Cho KJ, Kwon O, Kwon JM, Lee Y, Park H, Jeon KH, et al. Detecting patient deterioration using artificial intelligence in a rapid response system. Crit Care Med 2020;48:e285-e9.
  12. Romero-Brufau S, Whitford D, Johnson MG, Hickman J, Morlan BW, Therneau T, et al. Using machine learning to improve the accuracy of patient deterioration predictions: Mayo Clinic Early Warning Score (MC-EWS). J Am Med Inform Assoc 2021;28:1207-15.
  13. Subbe CP, Kruger M, Rutherford P, Gemmel L. Validation of a modified Early Warning Score in medical admissions. QJM 2001;94:521-6.
  14. Churpek MM, Adhikari R, Edelson DP. The value of vital sign trends for detecting clinical deterioration on the wards. Resuscitation 2016;102:1-5.
  15. Su CF, Chiu SI, Jang JR, Lai F. Improved inpatient deterioration detection in general wards by using time-series vital signs. Sci Rep 2022;12:11901.
  16. Chen MC, Huang TY, Chen TY, Boonyarat P, Chang YC. Clinical narrative-aware deep neural network for emergency department critical outcome prediction. J Biomed Inform 2023;138:104284.
  17. Ueno R, Xu L, Uegami W, Matsui H, Okui J, Hayashi H, et al. Value of laboratory results in addition to vital signs in a machine learning algorithm to predict in-hospital cardiac arrest: a single-center retrospective cohort study. PLoS One 2020;15:e0235835.
  18. Chae M, Han S, Gil H, Cho N, Lee H. Prediction of in-hospital cardiac arrest using shallow and deep learning. Diagnostics (Basel) 2021;11.
  19. Choi A, Choi SY, Chung K, Chung HS, Song T, Choi B, et al. Development of a machine learning-based clinical decision support system to predict clinical deterioration in patients visiting the emergency department. Sci Rep 2023;13:8561.
  20. Kim S, Min WK. Toward high-quality real-world laboratory data in the era of healthcare big data. Ann Lab Med 2025;45:1-11.
  21. Choi IY, Kim TM, Kim MS, Mun SK, Chung YJ. Perspectives on clinical informatics: integrating large-scale clinical, genomic, and health information for clinical care. Genomics Inform 2013;11:186-90.
  22. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: machine learning in Python. J Mach Learn Res 2011;12:2825-30.
  23. Fluss R, Faraggi D, Reiser B. Estimation of the Youden Index and its associated cutoff point. Biom J 2005;47:458-72.