Original Article

Ann Lab Med 2025; 45(2): 178-184

Published online December 16, 2024 https://doi.org/10.3343/alm.2024.0304

Copyright © Korean Society for Laboratory Medicine.

Artificial Intelligence in Diagnostics: Enhancing Urine Test Accuracy Using a Mobile Phone–Based Reading System

Hyun Jin Kim , M.D., Ph.D.1,2, Manmyung Kim , Ph.D.3, Hyunjae Zhang , M.E.3, Hae Ri Kim , M.D., Ph.D.1,4, Jae Wan Jeon , M.D., Ph.D.1,4, Yuri Seo , M.D., Ph.D.5, and Qute Choi, M.D., Ph.D.1,2

1Department of Laboratory Medicine, Chungnam National University School of Medicine, Daejeon, Korea; 2Department of Laboratory Medicine, Chungnam National University Sejong Hospital, Sejong, Korea; 3Robosapiens, Inc., Daejeon, Korea; 4Division of Nephrology, Department of Internal Medicine, Chungnam National University Sejong Hospital, Sejong, Korea; 5Department of Family Medicine, Chungnam National University Sejong Hospital, Sejong, Korea

Correspondence to: Qute Choi, M.D., Ph.D.
Departments of Laboratory Medicine, Chungnam National University School of Medicine and Chungnam National University Sejong Hospital, 20 Bodeum 7-ro, Sejong 30099, Korea
E-mail: qutechoi@gmail.com

Received: June 17, 2024; Revised: August 15, 2024; Accepted: December 6, 2024

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background: Urinalysis, an essential diagnostic tool, faces challenges in terms of standardization and accuracy. The use of artificial intelligence (AI) with mobile technology can potentially solve these challenges. Therefore, we investigated the effectiveness and accuracy of an AI-based program in automatically interpreting urine test strips using mobile phone cameras, an approach that may revolutionize point-of-care testing.
Methods: We developed novel urine test strips and an AI algorithm for image capture. Sample images from the Chungnam National University Sejong Hospital were collected to train a k-nearest neighbor classification algorithm to read the strips. A mobile application was developed for image capturing and processing. We assessed the accuracy, sensitivity, specificity, and ROC area under the curve for 10 parameters.
Results: In total, 2,612 urine test strip images were collected. The AI algorithm demonstrated 98.7% accuracy in detecting urinary nitrite and 97.3% accuracy in detecting urinary glucose. The sensitivity and specificity were high for most parameters. However, this system could not reliably determine the specific gravity. The optimal time for capturing the test strip results was 75 secs after dipping.
Conclusions: The AI-based program accurately interpreted urine test strips using smartphone cameras, offering an accessible and efficient method for urinalysis. This system can be used for immediate analysis and remote testing. Further research is warranted to refine test parameters such as specific gravity to enhance accuracy and reliability.

Keywords: Artificial intelligence, Mobile application, Smartphone, Urinalysis

Urinalysis is a fundamental diagnostic tool that has been used for centuries. Modern dipstick tests are simple and cost-effective tools for disease surveillance and management. Despite their widespread use, these tests are limited by subjective interpretations and potential inaccuracies in their results. Integrating artificial intelligence (AI) with mobile technology may address these issues by providing standardized and accurate analyses. This interdisciplinary innovation may revolutionize point-of-care testing, making it more reliable and accessible worldwide [1-3].

Traditional urine test strip analysis, although crucial for rapid diagnostic assessment, remains challenging: manual readings are subject to inter-operator variability and environmental factors that can affect the results [4]. Automated systems offer improved consistency but are not universally adopted because of their high cost and complexity. These systems are typically immobile, limiting point-of-care testing, particularly in resource-constrained settings. Moreover, mobile screening applications remain uncommon because nuanced interpretation of the results is required.

Mobile technology, which includes high-resolution imaging and computational power, can potentially bridge these gaps [5]. AI-based urinalysis has shown better diagnostic accuracy than conventional methods [5], suggesting the potential for a shift toward more automated, precise, and patient-centric diagnostic solutions. Notably, the evolution of mobile technology has significantly influenced clinical urinalysis. Innovations such as reflectometric analysis for quantitative urine test reading and complementary metal oxide–based semiconductor technologies have improved laboratory testing sensitivity and precision [6]. Concurrently, the digitalization of healthcare, accelerated by the coronavirus disease 2019 (COVID-19) pandemic, has expanded the scope of urinalysis, with smartphone technology enabling remote patient management [7]. These technological advances suggest a trend toward more decentralized and patient-directed healthcare, where diagnostic processes benefit from the widespread availability and advancing capabilities of smartphone technology.

Despite technological advances, the potential of mobile phone technology in clinical settings remains underutilized for urinalysis. Remote monitoring capabilities are necessary in the field of nephrology, and the feasibility of smartphone-based urine testing was demonstrated during the COVID-19 pandemic [7]. However, smartphone technology has not been widely adopted in standard practice, highlighting the need for research to develop more accessible urinalysis methods.

Therefore, we aimed to develop an AI-based program capable of automatically interpreting urine test strips using mobile phone cameras. This study was conducted to validate the effectiveness and accuracy of the program in comparison with standard laboratory equipment.

Development of urine test strips and an AI algorithm for capturing images

We developed novel urine test strips capable of measuring 10 parameters—blood, leukocytes, protein, glucose, ketones, nitrite, bilirubin, urobilinogen, pH, and specific gravity—using existing test strips (REF 0044) as a reference. An AI algorithm was developed to capture images of the novel test strips. We used the Hough circle transform technique from the OpenCV library to locate the circles marked at all four corners of the urine test strip images. The OpenCV perspective transform function was used to map the locations of all four detected circles to the corresponding corners of a new 600×800-pixel image. Finally, to minimize the influence of potential outliers and ensure accurate color representation, we cropped a 16×16-pixel square from the center of each of the 10 designated reagent areas in the transformed image. The average red, green, and blue (RGB) values were calculated to determine the representative color values. This approach, combined with consistent imaging conditions, helped ensure the reliability of our color measurements. A urine test strip and reference chart are shown in Fig. 1.

Figure 1. Urine test strip and reference chart

Data collection

This study was conducted from August 1 to October 31, 2023 and was approved by the Institutional Review Board of Chungnam National University Sejong Hospital (approval number: 2023-11-004). Residual urine samples were obtained from routine clinical testing at our institution. The samples were subjected to urine test strip analysis within 48 hrs of the initial routine analysis. As inclusion criteria, we selected: (1) leftover specimens after automated urine analyses and (2) specimens with a sufficient volume for strip testing. Samples suspected of contamination and those with an insufficient volume were excluded. Images were captured using a single Samsung Galaxy S9 smartphone at various intervals (60, 75, 90, 105, and 120 secs) after dipping a test strip in a urine sample. The HTML media capture feature was employed to capture images directly from the smartphone camera. A stand was used to maintain a consistent distance and angle, ensuring reproducibility, and two researchers alternately captured the images. As a reference standard, we used urine analysis results obtained with the CLINITEK Novus Automated Urine Chemistry Analyzer (Siemens Healthineers, Erlangen, Germany). We routinely use this automated analyzer in our hospital laboratory to provide official hospital laboratory reports.

AI algorithm development and validation strategies for urine test strip analysis

The AI algorithm was developed to interpret the color values of the urine test strips using the k-nearest neighbor (k-NN) classification algorithm. This method classifies each data point based on its k nearest neighbors in the feature space. To ensure unbiased and robust validation of our AI algorithm, we employed systematic approaches for data preprocessing and validation. We utilized the RepeatedStratifiedKFold function of the scikit-learn library (https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RepeatedStratifiedKFold.html) for data division and cross-validation. This method randomly divides the dataset while maintaining the proportion of samples in each class, ensuring balanced representation in the training and test sets. The dataset was divided into four folds, each reflecting the overall class distribution; in each cross-validation fold, 75% of the data served as the training set and 25% as the test set. We repeated the cross-validation process 10 times for each parameter to obtain a robust estimate of the model's performance. This approach guaranteed that the training and test sets were independent and non-overlapping for each fold, preventing data leakage. We evaluated the performance of the AI algorithm in terms of accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (ROC AUC) for each parameter. To assess the algorithm's accuracy and reliability in real-world scenarios, we compared its results with those obtained with automated laboratory equipment, providing a benchmark for evaluating its practical applicability and potential generalizability in urine test strip analysis.
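A rough sketch of this validation scheme with scikit-learn is shown below. The data are synthetic stand-ins (mean RGB of one reagent pad as features, a binary analyzer call as labels), and k = 5 is an assumed value, as the paper does not report the k used.

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, roc_auc_score

# Synthetic stand-in data: mean (R, G, B) per strip image and a binary label.
rng = np.random.default_rng(0)
X = rng.uniform(0, 255, size=(400, 3))
y = (X[:, 0] - X[:, 2] > 20).astype(int)   # invented label rule for illustration

# 4 folds (75%/25% train/test splits) repeated 10 times, preserving class ratios.
cv = RepeatedStratifiedKFold(n_splits=4, n_repeats=10, random_state=42)
knn = KNeighborsClassifier(n_neighbors=5)  # k is an assumption

accs, aucs = [], []
for train_idx, test_idx in cv.split(X, y):
    knn.fit(X[train_idx], y[train_idx])
    pred = knn.predict(X[test_idx])
    prob = knn.predict_proba(X[test_idx])[:, 1]
    accs.append(accuracy_score(y[test_idx], pred))
    aucs.append(roc_auc_score(y[test_idx], prob))

print(f"mean accuracy {np.mean(accs):.3f}, mean ROC AUC {np.mean(aucs):.3f}")
```

Each of the 40 train/test splits is disjoint within its fold, so performance estimates are not inflated by data leakage.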

Mobile application development

The mobile application, which enables users to capture and upload photos to a server, was developed in two main stages. First, a webpage was designed using HTML/CSS as part of the client interface to enable users to capture images through the app. JavaScript was used to send the captured images to the server. The server operating environment was set up using Node.js, and the web application structure was organized using the Express.js framework, enabling the server to receive and process images. MySQL version 5.7 was used for database management.

Fig. 2 provides an overview of the experimental workflow used. It illustrates the progression from urine sample collection through image capture and processing to the final AI algorithm analysis. The figure summarizes the key steps of our methodology, which are described in detail in the following subsections.

Figure 2. Schematic representation of the experimental workflow used for AI-based urine test strip analysis.
Abbreviations: RGB, red, green, blue; AI, artificial intelligence; ROC AUC, area under the ROC curve.

We analyzed 2,612 urine test strip images (each representing a unique urine specimen) and their corresponding results from the automated urine test instrument. To ensure data integrity and avoid potential bias from repeated measurements, no urine sample was tested more than once. During validation, 653 images (25%) were used as the test set, and 1,959 (75%) were used as the training set for each fold.

Overall, the accuracy of most parameters improved as the number of datasets increased (Fig. 3). The final accuracy for blood was 95.9%. The accuracy of the AI algorithm for specific gravity initially increased but then gradually decreased. The specific gravity measurements showed the lowest accuracy among all 10 parameters tested.

Figure 3. Comparative analysis of the diagnostic test performance of AI-based urine test strip analysis with multiple datasets in terms of accuracy (A), sensitivity (B), specificity (C), and ROC AUC (D).
Abbreviations: AI, artificial intelligence; blo, blood; leu, leukocytes; pro, protein; glu, glucose; ket, ketones; nit, nitrite; bil, bilirubin; uro, urobilinogen; sg, specific gravity; ROC AUC, area under the ROC curve.

The sensitivity of the algorithm for most parameters also increased throughout the study. The sensitivity of the novel system for urobilinogen, leukocytes, blood, nitrite, and ketones increased as the number of validation sets increased. The sensitivity of the AI algorithm for bilirubin and specific gravity also increased but remained lower than that for the other parameters.

The specificity of the system for most parameters tested was >90%. The specificity of the system was the lowest for pH and specific gravity, although it remained >80%. The specificity for urobilinogen and specific gravity decreased progressively as the number of datasets increased. Glucose and nitrite had high ROC AUC values throughout the study. The ROC AUC values were lower for bilirubin and specific gravity than for other parameters.

Overall, the AI algorithm had the highest accuracy for nitrite (98.7%), followed by glucose (97.3%) and bilirubin (97.1%) (Table 1). The AI algorithm had the highest sensitivity for nitrite (84.6%), and the lowest for bilirubin (23.9%). The AI algorithm had the highest specificity for nitrite (99.5%), followed by glucose (98.4%). The ROC AUC was the highest for glucose (93.5%) and nitrite (92.1%), and the lowest for specific gravity (59.1%).

Table 1. Performance of the AI algorithm

Parameter         Accuracy (%)  Sensitivity (%)  Specificity (%)  ROC AUC
Blood             95.9          83.4             97.8             0.906
Leukocytes        88.9          62.2             94.0             0.781
Protein           91.6          80.9             94.7             0.878
Glucose           97.3          88.6             98.4             0.935
Ketones           92.9          66.2             96.2             0.812
Nitrite           98.7          84.6             99.5             0.921
Bilirubin         97.1          23.9             98.6             0.613
Urobilinogen      84.1          63.4             90.0             0.767
pH                84.8          73.8             89.6             0.817
Specific gravity  75.3          33.4             84.8             0.591

Abbreviations: AI, artificial intelligence; AUC, area under the curve.



The performance of the AI algorithm for blood, leukocytes, protein, and glucose peaked at 75 secs after dipping (Fig. 4).

Figure 4. Time-based performance metrics of the artificial intelligence-based urine test strip analysis in terms of accuracy (A), sensitivity (B), specificity (C), and ROC AUC (D).
Abbreviations: blo, blood; leu, leukocytes; pro, protein; glu, glucose; ket, ketones; nit, nitrite; bil, bilirubin; uro, urobilinogen; sg, specific gravity; mean_4, mean of the blo, glu, pro, and nit values; ROC AUC, area under the ROC curve.

We developed and evaluated an innovative approach for urinalysis using AI and mobile technologies. The diagnostic performance of the AI algorithm improved as the number of datasets increased. The results of the novel AI-driven method aligned closely with traditional laboratory results for glucose and nitrite, yielding high accuracy and reliability. However, the algorithm also showed limitations, particularly for specific gravity, suggesting the need for further refinement. The optimal time for capturing the urine test strip results was approximately 75 secs after dipping.

Several attempts have been made to develop smartphone-based colorimetric urinalysis [8-12]. Azhar, et al. [10] evaluated the consistency of RGB colorimetric measurements of urine images for patients with dengue fever that were captured using different smartphones under different lighting conditions. They reported that color correction significantly improved the concordance of the blue and green values, highlighting the potential of smartphone-based urine colorimetry to noninvasively assess the dehydration status of patients. Flaucher, et al. [11] developed an automated urinalysis system using smartphone camera images that could be applied with standard urine test strips in home settings without additional hardware. Their method was feasible for at-home prenatal care, showed improved accuracy and fewer human errors than a comparator method, and reduced the workloads of medical staff. Our AI algorithm captures urine strip images using a smartphone camera and automatically interprets the results via image analysis.

Various algorithms have been used in studies involving urine or urine metabolites [13-15]. Shao, et al. [13] used urine metabolites to predict the presence of bladder cancer with a decision tree algorithm that demonstrated 76.60% accuracy, 71.88% sensitivity, and 86.67% specificity. Sanghvi, et al. [14] adapted a convolutional neural network algorithm to analyze urothelial cells, achieving a sensitivity of 79.5%, a specificity of 84.5%, and an AUC of 0.88. Eisner, et al. [15] explored various machine learning algorithms (including support vector machines, k-NN, and the least absolute shrinkage and selection operator) to analyze urinary metabolites and effectively predict the need for colonoscopy, with a sensitivity and specificity of 64% and 65%, respectively. We used our k-NN algorithm for color classification of the urinalysis dipsticks. The k-NN algorithm, known for its simplicity and ease of implementation, classifies new objects into existing categories using similarity measurements [16]. Using the k-NN algorithm, we achieved an accuracy and specificity of >80% for most parameters and a sensitivity of >80% for four main parameters (blood, glucose, nitrite, and protein). The ROC AUC (the overall performance metric) was >87% for the four main parameters and >75% for all but two parameters. These results are similar to those of previous studies involving an AI algorithm for urinalysis [13-15].
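The core idea of similarity-based color classification can be illustrated with a toy nearest-neighbor lookup against reference colors. The RGB values and grade labels below are invented for illustration; the actual algorithm was trained on labeled strip images rather than on a fixed chart.

```python
import numpy as np

# Hypothetical reference colors for one analyte's grades (not the actual chart).
reference = {
    "negative": (250, 240, 180),
    "trace":    (225, 215, 160),
    "1+":       (190, 200, 140),
    "2+":       (140, 180, 130),
}

def classify_color(rgb, chart=reference):
    """Assign a pad's mean RGB to the grade with the nearest reference color (1-NN)."""
    grades = list(chart)
    dists = [np.linalg.norm(np.subtract(rgb, chart[g])) for g in grades]
    return grades[int(np.argmin(dists))]

print(classify_color((230, 218, 158)))  # -> "trace"
```

With k > 1 and many labeled examples per grade, the same distance-based principle yields a vote among the k closest training points instead of a single nearest match.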

Urine test strip results are typically read between 60 and 120 secs after dipping a strip. The optimal time for image capture with our AI algorithm was 75 secs after dipping. However, this time reflects only the results of four parameters (blood, glucose, protein, and nitrite) and may change when other parameters are included. Further research is needed to determine the optimal read time for each parameter individually and for all parameters combined.

This study has some limitations. First, simultaneous testing was not possible because residual samples from routine testing had to be used, which may have affected the urine test strip results. Second, the lack of positive samples for some parameters posed substantial challenges for the learning and verification phases of the proposed AI algorithm. The scarcity of positive cases may have affected the accuracy and reliability of our results, particularly because a balanced dataset is crucial for algorithm training and validation. Third, because of inherent limitations of k-NN classification, increasing the dataset size did not always improve performance. The k-NN algorithm is suitable for color classification; however, it should be refined to perform well for all parameters, for example, by adopting recently published high-performing algorithms or by combining multiple algorithms. Finally, we did not collect or analyze demographic data for the patients who provided urine samples, as we focused primarily on the technical performance of the AI algorithm versus standard equipment, using anonymized residual samples to ensure privacy and ethical compliance. Although demographic data can be valuable in clinical studies, their absence likely did not significantly influence our results, as colorimetric analysis should be independent of patient demographics in this technical validation. Future studies on clinical outcomes or population-specific performance may benefit from incorporating such data.

In conclusion, our novel AI-based algorithm facilitates the automatic interpretation of urine test strips with a mobile phone camera. The algorithm achieves satisfactory accuracy, sensitivity, specificity, and ROC AUC for most urinalysis parameters, including blood, glucose, protein, and nitrite. Although additional improvements are required for some parameters, the algorithm is a user-friendly diagnostic tool expected to cause a paradigm shift in urine testing.

Kim HJ designed the study, analyzed the data, and wrote the draft; Kim M analyzed the data, visualized the results, and reviewed and edited the manuscript; Zhang H interpreted the data and reviewed the manuscript; Kim HR supported the data collection and discussed the data; Jeon JW supported the data collection and discussed the data; Seo Y investigated and discussed the data; Choi Q conceived the study, analyzed the data, finalized the draft, and acquired funding. All authors read and approved the final manuscript.

This research was supported by Chungnam National University Sejong Hospital Research Fund, 2021.

  1. Lei R, Huo R, Mohan C. Current and emerging trends in point-of-care urinalysis tests. Expert Rev Mol Diagn 2020;20:69-84.
  2. Kavuru V, Vu T, Karageorge L, Choudhury D, Senger R, Robertson J. Dipstick analysis of urine chemistry: benefits and limitations of dry chemistry-based assays. Postgrad Med 2020;132:225-33.
  3. Khan AI, Khan M, Khan R. Artificial intelligence in point-of-care testing. Ann Lab Med 2023;43:401-7.
  4. Coppens A, Speeckaert M, Delanghe J. The pre-analytical challenges of routine urinalysis. Acta Clin Belg 2010;65:182-9.
  5. De Bruyne S, De Kesel P, Oyaert M. Applications of artificial intelligence in urinalysis: is the future already here? Clin Chem 2023;69:1348-60.
  6. Oyaert M, Delanghe J. Progress in automated urinalysis. Ann Lab Med 2019;39:15-22.
  7. Stauss M, Dhaygude A, Ponnusamy A, Myers M, Woywodt A. Remote digital urinalysis with smartphone technology as part of remote management of glomerular disease during the SARS-CoV-2 virus pandemic: single-centre experience in 25 patients. Clin Kidney J 2021;15:903-11.
  8. Nixon M, Outlaw F, Leung TS. Accurate device-independent colorimetric measurements using smartphones. PLoS One 2020;15:e0230561.
  9. Hong JI, Chang BY. Development of the smartphone-based colorimetry for multi-analyte sensing arrays. Lab Chip 2014;14:1725-32.
  10. Noor Azhar M, Bustam A, Naseem FS, Shuin SS, Md Yusuf MH, Hishamudin NU, et al. Improving the reliability of smartphone-based urine colorimetry using a colour card calibration method. Digit Health 2023;9:20552076231154684.
  11. Flaucher M, Nissen M, Jaeger KM, Titzmann A, Pontones C, Huebner H, et al. Smartphone-based colorimetric analysis of urine test strips for at-home prenatal care. IEEE J Transl Eng Health Med 2022;10:2800109.
  12. Nixon M, Outlaw F, MacDonald LW, Leung TS. The importance of a device specific calibration for smartphone colorimetry. Proc IS&T 27th Color and Imaging Conf 2019;27:49-54. https://doi.org/10.2352/issn.2169-2629.2019.27.10
  13. Shao CH, Chen CL, Lin JY, Chen CJ, Fu SH, Chen YT, et al. Metabolite marker discovery for the detection of bladder cancer by comparative metabolomics. Oncotarget 2017;8:38802-10.
  14. Sanghvi AB, Allen EZ, Callenberg KM, Pantanowitz L. Performance of an artificial intelligence algorithm for reporting urine cytopathology. Cancer Cytopathol 2019;127:658-66.
  15. Eisner R, Greiner R, Tso V, Wang H, Fedorak RN. A machine-learned predictor of colonic polyps based on urinary metabolomics. Biomed Res Int 2013;2013:303982.
  16. Sarang P. K-nearest neighbors. Thinking data science. The Springer series in applied machine learning. Cham: Springer International Publishing, 2023:131-41.