Ann Lab Med 2021; 41(5): 447-454
Published online September 1, 2021 https://doi.org/10.3343/alm.2021.41.5.447
Copyright © Korean Society for Laboratory Medicine.
Haeil Park, M.D., Ph.D. and Younsuk Ko, M.T.
Department of Laboratory Medicine, Bucheon St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea
Correspondence to: Haeil Park, M.D., Ph.D.
Department of Laboratory Medicine, Bucheon St. Mary’s Hospital, The Catholic University of Korea, 327 Sosa-ro, Wonmi-gu, Bucheon 14647, Korea
Tel: +82-32-340-2093
Fax: +82-32-340-2219
E-mail: phi@catholic.ac.kr
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Background: Urine reagent strip test (URST) results are semi-quantitative; therefore, the precision of URSTs is evaluated as the proportion of categorical results from repeated measurements of a sample that are concordant with an expected result. However, URST analyzers produce quantitative readout values before converting them to ordinal results, which raises the question of whether statistical monitoring with control rules can be applied to internal quality control (IQC). This study aimed to determine the sigma metric of URSTs and derive appropriate control rules for IQC.
Methods: The URiSCAN Super Plus fully automated urine analyzer (YD Diagnostics, Yongin, Korea) was used for URSTs. Change in reflectance rate (change %R) data from IQC for URSTs performed between November 2018 and May 2020 were analyzed. Red blood cells, bilirubin, urobilinogen, ketones, protein, glucose, leukocytes, and pH were measured from 2-3 levels of control materials. The total allowable error (TEa) for a grade was the difference in midpoints of a predefined change %R range between two adjacent grades. The sigma metric was calculated as TEa/SD. Sigma metric-based control rules were determined with Westgard EZ Rules 3 software (Westgard QC, Madison, WI, USA).
Results: Seven out of the eight analytes had a sigma metric >4 in the control materials with a negative grade (−), which were closer to the cut-offs. Corresponding control rules ranged from 1:2.5s to 1:3.5s.
Conclusions: Although the URST is a semi-quantitative test, statistical IQC can be performed using the readout values. According to the sigma metric, control rules recommended for URST IQC in routine clinical practice are 1:2.5s to 1:3.5s.
Keywords: Urine reagent strip test, Internal quality control, Sigma metric, Control rule
Urinalysis plays a major role in the screening, diagnosis, and monitoring of renal and urological conditions [1-4]. The urine reagent strip test (URST) is a semi-quantitative test that has become increasingly sensitive owing to advances in electronic detection [5]. The analytical performance of a URST is typically evaluated based on precision, calculated as the proportion of categorical results from repeated measurements of a sample that are concordant with an expected result [6-11]. In addition, the precision of a URST can be evaluated in terms of a cut-off value [12].
Internal quality control (IQC) of a URST typically involves the measurement of control materials once per run or over a time interval. As in precision evaluation, test results from the control materials are compared with the expected values for the material; QC is assured when the difference is not more than one grade. The statistical QC commonly used in clinical chemistry tests using mean and SD cannot be applied to URST IQC with semi-quantitative ordinal data.
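The conventional ordinal acceptance criterion described above (a control result is accepted when it differs from the expected grade by no more than one grade) can be sketched as follows. This is a minimal illustration only; the grade scale and the function name `within_one_grade` are assumptions made for the example and are not taken from any analyzer software.

```python
# Ordinal grade scale, lowest to highest (illustrative; scales differ by analyte).
GRADES = ["-", "±", "+", "2+", "3+"]


def within_one_grade(observed, expected, scale=GRADES):
    """Return True when the observed grade is at most one step from the expected grade."""
    return abs(scale.index(observed) - scale.index(expected)) <= 1


# A trace result on a negative control is accepted; a 1+ result is rejected.
print(within_one_grade("±", "-"))  # True
print(within_one_grade("+", "-"))  # False
```

A check of this kind yields only pass or fail per measurement, which is why mean- and SD-based control rules cannot be applied to the ordinal result itself.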
Currently, the performance of quantitative laboratory tests in clinical chemistry is often expressed as a sigma metric, and control rules for IQC can be obtained using a sigma metric [13-21]. Quantitative readout values, such as the reflectance rate, are used to evaluate the performance of a URST [9, 22-26]. Although the precision of a URST has been analyzed quantitatively, there is no IQC report based on quantitative data. If quantitative data are available from a laboratory test, a sigma metric with control rules for IQC can be obtained. Specifically, data on the precision, bias, and total allowable error (TEa) would be required to establish sigma metric-based control rules [27]. The sigma metric depends on the TEa [17, 21, 28]. However, our literature search failed to identify studies that covered the TEa for URSTs and associated sigma metrics, or IQC control rules based on the sigma metric. Therefore, this study aimed to determine the sigma metric of the URST and to derive appropriate control rules based on quantitative readout values from URST IQC data.
This study was conducted retrospectively at Bucheon St. Mary’s Hospital, Bucheon, Korea, using IQC data obtained from November 2018 to May 2020. The study was approved by the Institutional Review Board of the Catholic Medical Center, the Catholic University of Korea (HC20DISI0094).
The URiSCAN Super Plus fully automated urine analyzer (YD Diagnostics, Yongin, Korea) was used with a URiSCAN strip (YD Diagnostics) to assess 10 analytes: red blood cells (RBC), bilirubin, urobilinogen, ketones, protein, nitrite, glucose, leukocytes, specific gravity (SG), and pH.
URiSCAN Super Plus is one of a series of URST systems that is commonly used in Korea [29]. On reaction with a urine specimen, the degree of color development of the pads on a reagent strip is measured by a charge-coupled device (CCD) color image sensor under illumination with a light-emitting diode. The CCD takes a reading at each wavelength for red (630 nm), green (540 nm), and blue (460 nm). The difference in the reflectance rate before and after the reaction is then converted to the change in the reflectance rate (change %R) value, from which a grade on an ordinal scale (e.g., −, ±, +, 2+, and 3+) is generated as the test result based on a range that is predefined by the manufacturer [29].
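The conversion from a change %R value to an ordinal grade can be illustrated with the RBC ranges listed in Table 1. The sketch below is not the manufacturer's algorithm; it assumes that the ranges are contiguous and that a non-integer value falling between two integer bounds is assigned to the lower grade. The names `RBC_LOWER_BOUNDS` and `grade_from_change_r` are hypothetical.

```python
from bisect import bisect_right

# RBC grades and the lower bound of change %R for each grade above negative,
# taken from Table 1 (18-27 is trace, 28-55 is +, 56-90 is 2+, 91 and above is 3+).
RBC_GRADES = ["-", "±", "+", "2+", "3+"]
RBC_LOWER_BOUNDS = [18, 28, 56, 91]


def grade_from_change_r(change_r, lower_bounds=RBC_LOWER_BOUNDS, grades=RBC_GRADES):
    """Map a change %R readout to its ordinal grade by locating its range."""
    return grades[bisect_right(lower_bounds, change_r)]


print(grade_from_change_r(-0.68))   # mean readout of the level-1 RBC control
print(grade_from_change_r(123.85))  # mean readout of the level-2 RBC control
```

With the RBC means reported in Table 3, the two calls return the negative grade and 3+, which agrees with the expected grades of the two control levels.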
The analyte concentration and predefined instrument range of change %R for each grade of the URST result were obtained from the manufacturer’s user manual (Table 1). The midpoint of change %R was determined for each grade, and the difference in the midpoint change %R between a given grade and the adjacent grade (below or above) was calculated. Considering that the allowable difference in the evaluation of precision or IQC is one grade, the TEa for a particular grade was assumed to be the difference of the midpoint change %R between that grade and the adjacent grade. If the predefined range of change %R at a given level of the control material had no valid lower or upper limit, the midpoint change %R could not be determined. In that case, the difference in midpoint change %R between adjacent grades could not be calculated, and the TEa was unavailable. If the TEa for a level of a control material was unavailable, that of the nearest grade was used instead (Table 1). Despite this approach, the TEa was still unavailable for nitrite and SG, since nitrite was measured on a binary rather than an ordinal scale, and the predefined change %R range table of the manufacturer did not include information about SG.
Table 1. TEa for quantitative readout values of each grade from URSTs derived from the range of change %R
Analyte | Grade | Concentration* or count | Instrument range of change %R | Midpoint | TEaL | TEaH |
---|---|---|---|---|---|---|
RBC | − | 0 RBC/μL | −99–17 | | | |
RBC | ± | 5 RBC/μL | 18–27 | 22.5 | | 19.00 |
RBC | + | 10 RBC/μL | 28–55 | 41.5 | 19.00 | 31.50 |
RBC | 2+ | 50 RBC/μL | 56–90 | 73.0 | 31.50 | |
RBC | 3+ | 250 RBC/μL | ≥91 | | | |
Bilirubin | − | 0 mg/dL | −99–33 | | | |
Bilirubin | + | 0.5 mg/dL | 34–46 | 40.0 | | 14.50 |
Bilirubin | 2+ | 1 mg/dL | 47–62 | 54.5 | 14.50 | |
Bilirubin | 3+ | 3 mg/dL | ≥63 | | | |
Urobilinogen | ± | 0.1 mg/dL | −99–34 | | | |
Urobilinogen | + | 1 mg/dL | 35–46 | 40.5 | | 12.00 |
Urobilinogen | 2+ | 4 mg/dL | 47–58 | 52.5 | 12.00 | 14.50 |
Urobilinogen | 3+ | 8 mg/dL | 59–75 | 67.0 | 14.50 | |
Urobilinogen | 4+ | 12 mg/dL | ≥76 | | | |
Ketone | − | 0 mg/dL | −99–20 | | | |
Ketone | ± | 5 mg/dL | 21–30 | 25.5 | | 13.00 |
Ketone | + | 10 mg/dL | 31–46 | 38.5 | 13.00 | 15.00 |
Ketone | 2+ | 50 mg/dL | 47–60 | 53.5 | 15.00 | |
Ketone | 3+ | 100 mg/dL | ≥61 | | | |
Protein | − | 0 mg/dL | −99–25 | | | |
Protein | ± | 10 mg/dL | 26–34 | 30.0 | | 18.50 |
Protein | + | 30 mg/dL | 35–62 | 48.5 | 18.50 | 27.50 |
Protein | 2+ | 100 mg/dL | 63–89 | 76.0 | 27.50 | 21.00 |
Protein | 3+ | 300 mg/dL | 90–104 | 97.0 | 21.00 | |
Protein | 4+ | 1,000 mg/dL | | | | |
Glucose | − | 0 mg/dL | −99–30 | | | |
Glucose | ± | 100 mg/dL | 31–100 | 65.5 | | 77.50 |
Glucose | + | 250 mg/dL | 101–185 | 143.0 | 77.50 | 65.00 |
Glucose | 2+ | 500 mg/dL | 186–230 | 208.0 | 65.00 | |
Glucose | 3+ | 1,000 mg/dL | ≥231 | | | |
Leukocytes | − | 0 WBC/μL | −99–10 | | | |
Leukocytes | ± | 10 WBC/μL | 11–22 | 16.5 | | 16.00 |
Leukocytes | + | 25 WBC/μL | 23–42 | 32.5 | 16.00 | 16.50 |
Leukocytes | 2+ | 75 WBC/μL | 43–55 | 49.0 | 16.50 | |
Leukocytes | 3+ | 500 WBC/μL | ≥56 | | | |
pH | 5 | | −99–26 | | | |
pH | 5.5 | | 26–40 | 33.0 | | 15.00 |
pH | 6 | | 41–55 | 48.0 | 15.00 | 20.00 |
pH | 6.5 | | 56–80 | 68.0 | 20.00 | 27.50 |
pH | 7 | | 81–110 | 95.5 | 27.50 | 42.50 |
pH | 7.5 | | 111–165 | 138.0 | 42.50 | 42.50 |
pH | 8 | | 166–195 | 180.5 | 42.50 | 22.50 |
pH | 8.5 | | 196–210 | 203.0 | 22.50 | 402.00 |
pH | 9 | | 211–999 | 605.0 | 402.00 | 402.00 |
*Conversion factors from conventional units to Système International (SI) units are: bilirubin, from mg/dL to μmol/L multiply by 17.1; urobilinogen, from mg/dL to μmol/L multiply by 16.93; ketone, from mg/dL to mmol/L multiply by 0.1721; protein, from mg/dL to mg/L multiply by 10; glucose, from mg/dL to mmol/L multiply by 0.0555.
Abbreviations: TEa, total allowable error; URSTs, urine reagent strip tests; TEaL, difference of the midpoints of adjacent lower grades; TEaH, difference of the midpoints of adjacent higher grades; RBC, red blood cells; WBC, white blood cells.
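The midpoint and TEa columns of Table 1 follow from simple arithmetic on the predefined ranges, which the sketch below reproduces for RBC. It is an illustration under stated assumptions rather than the authors' code: the −99 lower limit and the open upper limit are encoded as `None` (no valid bound), and the fallback in `applied_tea` takes the first available TEa of the nearest grade, preferring the lower-side difference when a grade has both, which is a choice made here and not specified in the text. All function names are hypothetical.

```python
# (grade, lower, upper) for RBC from Table 1; None marks a bound with no valid limit.
RBC_RANGES = [
    ("-", None, 17),
    ("±", 18, 27),
    ("+", 28, 55),
    ("2+", 56, 90),
    ("3+", 91, None),
]


def midpoint(lower, upper):
    """Midpoint of a range, or None when either bound is missing."""
    if lower is None or upper is None:
        return None
    return (lower + upper) / 2


def tea_table(ranges):
    """Compute the midpoint, TEaL, and TEaH for every grade."""
    mids = [midpoint(lo, hi) for _, lo, hi in ranges]
    rows = []
    for i, (grade, _, _) in enumerate(ranges):
        below = mids[i - 1] if i > 0 else None
        above = mids[i + 1] if i + 1 < len(mids) else None
        here = mids[i]
        rows.append({
            "grade": grade,
            "mid": here,
            # Difference from the adjacent lower and higher grade midpoints.
            "tea_low": here - below if None not in (here, below) else None,
            "tea_high": above - here if None not in (here, above) else None,
        })
    return rows


def applied_tea(rows, index):
    """TEa for a grade, falling back to the nearest grade that has one."""
    for distance in range(len(rows)):
        for j in (index - distance, index + distance):
            if 0 <= j < len(rows):
                values = [v for v in (rows[j]["tea_low"], rows[j]["tea_high"]) if v is not None]
                if values:
                    return values[0]
    return None


rows = tea_table(RBC_RANGES)
print(applied_tea(rows, 0))  # negative grade borrows 19.0 from the trace grade
print(applied_tea(rows, 4))  # 3+ grade borrows 31.5 from the 2+ grade
```

The two borrowed values match the applied TEa listed for the RBC control levels in Table 2.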
URiTROL liquid urinalysis control (YD Diagnostics) was used for IQC. Control materials are listed in Table 2; two or three levels of the control materials were analyzed once in each of two runs on a weekday. During the study period, 4–5 lots per level of control material were used (Fig. 1). Three lots of reagent strips were used; different lots were assumed to be equivalent, and lots were therefore not considered in the data analysis.
Table 2. TEa for quantitative readout values of the control materials (URiTROL)
Analyte | Level | Grade | Applied TEa |
---|---|---|---|
RBC | 1 | − | 19.00 |
RBC | 2 | 3+ | 31.50 |
Bilirubin | 1 | − | 14.50 |
Bilirubin | 2 | − | 14.50 |
Bilirubin | 3 | 3+ | 14.50 |
Urobilinogen | 1 | ± | 12.00 |
Urobilinogen | 2 | ± | 12.00 |
Ketone | 1 | − | 13.00 |
Ketone | 2 | 3+ | 15.00 |
Protein | 1 | − | 18.50 |
Protein | 2 | − | 18.50 |
Protein | 3 | 3+ | 21.00 |
Glucose | 1 | − | 77.50 |
Glucose | 2 | 3+ | 65.00 |
Leukocytes | 1 | − | 16.00 |
Leukocytes | 2 | 3+ | 16.50 |
pH | 1 | 5.0 | 15.00 |
pH | 2 | 7.5 | 42.50 |
Abbreviations: TEa, total allowable error; RBC, red blood cells.
Quantitative readout values for eight of the 10 analytes (excluding nitrite and SG) were used for data analysis. The mean, SD, and CV of the change %R at each level of control material were calculated for each lot and across all lots. Because the number of readout values (N) differed among lots, N was used as the weight for each lot in the calculation across all lots [30]. The sigma metric was calculated as TEa/SD. The bias term in the original equation, (TEa−bias)/SD, was assumed to be zero; this assumption was necessary because the target value of change %R was unknown, so the bias could not be calculated. Sigma values of 4 and 3 are considered to indicate average performance and minimum quality in industry, respectively. A sigma of 6 is considered to indicate the best performance and represents a “world-class quality” product [31]. Westgard EZ Rules 3 software (Westgard QC, Madison, WI, USA) was used to determine sigma metric-based control rules together with the total number of control measurements (Nc). The probability of false rejection (Pfr) and the probability of error detection (Ped) of IQC were also calculated, along with the number of runs over which the control rules were applied (R) [31].
In a URST, a negative grade (−) is usually normal, and positive grades (+, 2+, and 3+) are abnormal. It is clinically important to distinguish whether a patient’s URST result is normal or abnormal. To this end, the accuracy of the URST must be ensured in the borderline range, where the patient’s URST result is positive but the analyte concentration is low. To monitor accuracy in that range, the control material must also be positive but with a low concentration [9, 10]. Therefore, except for pH, data from level-1 control materials were considered more meaningful than those from other levels, because the expected value of the level-1 material, with mostly negative results in grade (−), would be closer to the cut-off values between normal and abnormal test results.
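The sigma metric calculation can be sketched as below. The function `sigma_metric` follows the equation given in the text with bias defaulting to zero. The function `combine_lots` is only one plausible reading of the N-weighted calculation across lots (an N-weighted mean plus an N-weighted variance that includes between-lot shifts); the exact formula from the cited reference is not given in the text, so this part is an assumption, and the example lot values are hypothetical.

```python
from math import sqrt


def sigma_metric(tea, sd, bias=0.0):
    """Sigma metric as (TEa - |bias|) / SD; bias was assumed to be zero in this study."""
    return (tea - abs(bias)) / sd


def combine_lots(lots):
    """Combine per-lot (n, mean, sd) tuples using n as the weight.

    Assumption: the overall variance is the n-weighted average of each lot's
    variance plus the squared deviation of its mean from the overall mean.
    """
    total_n = sum(n for n, _, _ in lots)
    overall_mean = sum(n * mean for n, mean, _ in lots) / total_n
    variance = sum(n * (sd ** 2 + (mean - overall_mean) ** 2) for n, mean, sd in lots) / total_n
    return overall_mean, sqrt(variance)


# TEa and SD for the RBC level-1 and pH level-2 controls, taken from Table 3.
print(round(sigma_metric(19.00, 3.11), 2))
print(round(sigma_metric(42.50, 21.7), 2))

# Hypothetical lots (n, mean, sd), for illustration only.
print(combine_lots([(100, 0.5, 2.0), (300, -0.5, 3.0)]))
```

The first two calls reproduce the sigma metrics of 6.11 and 1.96 reported in Table 3 for those control levels.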
The mean, SD, CV, and sigma metrics obtained from the change %R values for all lots are presented in Table 3. Considering the results from each lot separately, the sigma metric ranged from 3.59 (urobilinogen) to 27.19 (pH) in level-1 control material. Except for one lot showing the worst precision for urobilinogen (U11805), all other analytes in level-1 control material had a sigma metric above 4, which is considered at least average quality in the industry. The sigma metric ranged from 1.35 (pH) to 9.76 (glucose) in level-2 control material and from 2.66 (bilirubin) to 6.27 (protein) in level-3 control material. Except for urobilinogen, all analytes in level-1 control material showed a sigma 6 performance for at least one lot, indicating world-class quality. Control rules with Nc determined for level-1 control material varied from 1:3s/2:2s/R:4s/4:1s/8:x with Nc=4 for urobilinogen to 1:3.5s with Nc=2 for pH. For level-2 control material, the control rules varied from 1:3s/2:2s/R:4s/4:1s/8:x with Nc=4 for pH to 1:3.5s with Nc=2 for glucose. For level-3 control material, the control rules varied from 1:3s/2of3:2s/R:4s/3:1s/6:x with Nc=6 for bilirubin to 1:3.5s with Nc=3 for protein.
Table 3. Sigma metrics and control rules for IQC of URSTs derived from each level of control material across all lots (N=856)
Analyte | Level | Grade | Mean | SD | CV (%) | TEa | Sigma metric | Control rules | Pfr | Ped | Nc | R |
---|---|---|---|---|---|---|---|---|---|---|---|---|
RBC | 1 | − | −0.68 | 3.11 | 460.56 | 19.00 | 6.11 | 1:3.5s | < 0.01 | > 0.89 | 2 | 1 |
RBC | 2 | 3+ | 123.85 | 8.84 | 7.14 | 31.50 | 3.56 | 1:3s/2:2s/R:4s/4:1s/8:x | 0.03 | 0.94 | 4 | 2 |
Bilirubin | 1 | − | −0.62 | 2.18 | 352.32 | 14.50 | 6.64 | 1:3.5s | < 0.01 | > 0.96 | 3 | 1 |
Bilirubin | 2 | − | 1.76 | 1.93 | 109.50 | 14.50 | 7.50 | 1:3.5s | < 0.01 | > 0.96 | 3 | 1 |
Bilirubin | 3 | 3+ | 73.66 | 3.87 | 5.25 | 14.50 | 3.75 | 1:2.5s | 0.06 | 0.91 | 6 | 1 |
Urobilinogen | 1 | ± | −0.07 | 2.65 | 3,972.33 | 12.00 | 4.54 | 1:2.5s | 0.04 | 0.98 | 4 | 1 |
Urobilinogen | 2 | ± | 3.03 | 1.94 | 63.99 | 12.00 | 6.19 | 1:3.5s | < 0.01 | > 0.89 | 2 | 1 |
Ketone | 1 | − | 0.30 | 1.89 | 621.45 | 13.00 | 6.89 | 1:3.5s | < 0.01 | > 0.89 | 2 | 1 |
Ketone | 2 | 3+ | 73.08 | 3.93 | 5.38 | 15.00 | 3.82 | 1:3s/2:2s/R:4s/4:1s/8:x | 0.03 | 0.98 | 4 | 2 |
Protein | 1 | − | −2.07 | 3.27 | 158.12 | 18.50 | 5.65 | 1:3.5s | < 0.01 | > 0.96 | 3 | 1 |
Protein | 2 | − | 11.66 | 2.52 | 21.57 | 18.50 | 7.35 | 1:3.5s | < 0.01 | > 0.96 | 3 | 1 |
Protein | 3 | 3+ | 94.21 | 3.62 | 3.84 | 21.00 | 5.81 | 1:3.5s | < 0.01 | > 0.96 | 3 | 1 |
Glucose | 1 | − | −1.01 | 5.94 | 587.91 | 77.50 | 13.05 | 1:3.5s | < 0.01 | > 0.89 | 2 | 1 |
Glucose | 2 | 3+ | 235.63 | 9.69 | 4.11 | 65.00 | 6.71 | 1:3.5s | < 0.01 | > 0.89 | 2 | 1 |
Leukocytes | 1 | − | −0.79 | 2.76 | 351.21 | 16.00 | 5.79 | 1:3.5s | < 0.01 | > 0.89 | 2 | 1 |
Leukocytes | 2 | 3+ | 74.06 | 5.49 | 7.41 | 16.50 | 3.01 | 1:3s/2:2s/R:4s/4:1s/8:x | 0.03 | 0.63 | 4 | 2 |
pH | 1 | 5 | 18.43 | 0.81 | 4.40 | 15.00 | 18.51 | 1:3.5s | < 0.01 | > 0.89 | 2 | 1 |
pH | 2 | 7.5 | 121.13 | 21.70 | 17.91 | 42.50 | 1.96 | 1:3s/2:2s/R:4s/4:1s/8:x | 0.03 | 0.05 | 4 | 2 |
Control rules are written as n:ks, for example 1:3.5s for one control measurement exceeding the mean ± 3.5 SD.
Abbreviations: IQC, internal quality control; URST, urine reagent strip test; TEa, total allowable error; Pfr, probability of false rejection; Ped, probability of error detection; Nc, number of control measurements made; R, number of runs over which the control rules are applied; RBC, red blood cells.
Combining the results from all lots, the sigma metric ranged from 4.54 (urobilinogen) to 18.51 (pH) for level-1 control material, from 1.96 (pH) to 7.50 (bilirubin) for level-2 control material, and from 3.75 (bilirubin) to 5.81 (protein) for level-3 control material. Control rules with Nc determined for level-1 and level-3 control materials varied from 1:2.5s with Nc=6 (for bilirubin in level-3 control material) to 1:3.5s with Nc=2 (for RBC, ketone, glucose, leukocytes, and pH in level-1 control material). For level-2 control material, control rules with Nc ranged from 1:3s/2:2s/R:4s/4:1s/8:x with Nc=4 (for RBC, ketone, leukocytes, and pH) to 1:3.5s with Nc=2 (for urobilinogen and glucose) (Table 3).
We determined the sigma metric and derived appropriate control rules for IQC from URST quantitative readout values. For level-1 control material with an expected result of negative grade (−), the sigma metrics of URSTs mostly exceeded 4, representing the performance level of average quality in the industry. Control rules determined based on the sigma metric ranged from 1:2.5s with Nc=2 or 3 to 1:3.5s with Nc=2 or 3.
In URSTs, quantitation of an analyte present in a urine sample is important, but detecting the presence of the analyte may be considered clinically more important. The cut-off can be considered to lie somewhere between a negative grade (−) and a positive grade (+) test result. Although the control material set was not designed to assign a cut-off value for the analyte, it is presumed that manufacturers have ensured a high URST measurement capability at the decision cut-off between the absence and presence of an analyte. This presumption is consistent with the trend toward a higher sigma metric for the lower-grade (−) control material than for the higher-grade (3+) control material. It can also explain the findings for bilirubin, urobilinogen, and protein, for which the level-2 control material, which had a slightly higher change %R, tended to show a higher sigma metric than the level-1 control material, even though both levels had the same expected test result of negative grade (−) or trace grade (±). This is because the analyte concentration of the level-2 control material, with its higher change %R, will be closer to the cut-off than that of the level-1 control material.
The sigma metric and corresponding QC rules are obtained for each of the two or three levels of control materials constituting the control material set. However, two or three control rules cannot be used simultaneously. Therefore, to determine the QC rules to be applied in practice, only one level of control material must be chosen. When the control material set is designed to have expected test results of negative (−) and 3+ grades, it would be better to select the control rule obtained from level-1 control material, which has an expected test result of negative grade (−). When the control material set contains two levels with the same expected grade, such as negative grade (−) or trace grade (±), it is considered desirable to select the control rule derived from level-2 control material, which shows a higher change %R and is closer to the cut-off.
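The selection logic described in this paragraph can be expressed compactly: among the control levels whose expected grade is negative or trace, choose the one with the highest mean change %R. The sketch below is a hypothetical helper written for illustration (the name `choose_reference_level` and the dictionary layout are assumptions); it does not cover pH, whose grades are pH values rather than negative or positive results.

```python
LOW_GRADES = ("-", "±")


def choose_reference_level(levels):
    """Pick the control level whose control rule would be applied in practice.

    levels: list of dicts with keys "level", "grade", and "mean" (mean change %R).
    """
    candidates = [lv for lv in levels if lv["grade"] in LOW_GRADES]
    if not candidates:
        raise ValueError("no control level with a negative or trace grade")
    # Among low-grade levels, the highest mean change %R is closest to the cut-off.
    return max(candidates, key=lambda lv: lv["mean"])


# Means taken from Table 3.
bilirubin = [
    {"level": 1, "grade": "-", "mean": -0.62},
    {"level": 2, "grade": "-", "mean": 1.76},
    {"level": 3, "grade": "3+", "mean": 73.66},
]
rbc = [
    {"level": 1, "grade": "-", "mean": -0.68},
    {"level": 2, "grade": "3+", "mean": 123.85},
]

print(choose_reference_level(bilirubin)["level"])  # 2
print(choose_reference_level(rbc)["level"])        # 1
```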
Although the SD was used to calculate the sigma metric and derive control rules in this study, the same precision based on the CV could be compared with that obtained in a previous study. Cho,
Quantitative datasets used in most studies on URST precision evaluation were small and were obtained during a short period [9, 22–26]. Another important aspect to consider is the different types of readout values among studies and systems. The CVs obtained in studies using reflectance rate data were lower for control materials with expected test results of lower grades than for those with expected test results of higher grades, because the reflectance rate is inversely related to the analyte concentration [6, 22, 24, 26, 32, 33]. Under such conditions, even when using the CV and TEa as a percentage, it is possible to obtain a sigma metric and control rules at lower analyte concentrations, in which the quantitative readout value is high and the CV is low. URST precision using the change %R would not be comparable with that obtained using the reflectance rate for either SD or CV at low concentrations.
In previous studies on URSTs using quantitative data, the readout values from a grade were compared with the analyte concentrations obtained from a chemistry analyzer, or with the blood cell count obtained from a urine particle analyzer or microscope [6, 22–26, 29, 32, 33]. This is the first study to use URST quantitative readout data for statistical IQC. The readout values from each analyte were quantitatively evaluated for precision, which was then used to derive control rules. At low concentrations, the CV of change %R was too large to obtain a sigma metric, which was overcome by using the SD instead of the CV [27]. We believe that the present data better reflect the real-world performance of URSTs, because we analyzed data from 19 months of use with 4–5 lots of control materials as opposed to data collected over a short period.
The current study had some limitations. First, because the analyte concentrations in the control materials were at the low or high extreme of the scale, in most cases the midpoint difference of a neighboring higher or lower grade had to be used as the TEa for that grade. In particular, the TEa for a level-1 control material was taken from an upper grade, and it may therefore have been underestimated. The predefined range of change %R for level-1 control material was wider than that for other levels for all analytes. Therefore, the sigma metric also may have been underestimated. Second, bias was not considered in determining the sigma metric, which may have led to overestimation. These two effects on the sigma metric may have offset each other. Third, the concentration of the level-1 control material with an expected test result of negative grade was close to, but not exactly at, the clinically important concentration. Therefore, in future studies, performance needs to be evaluated using control materials with low analyte concentrations obtained from a third-party manufacturer. Fourth, the use of the difference in the midpoint change %R between two adjacent grades as the TEa may require in-depth investigation in subsequent studies. Moreover, the TEa was derived from the manufacturer’s predefined range table of change %R; therefore, the TEa of other URSTs may vary depending on the product. Finally, and most importantly, the difference in midpoint change %R between adjacent grades used as the TEa does not correspond to any of the three models that can be used to set analytical performance specifications (APS): outcome, biological variation, or state of the art [34]. However, with the collection of data from multicenter laboratories using the same type of URST, the total error of change %R can be calculated and adopted as the TEa, which would provide an APS based on the state of the art of the measurement [34]. Until such data are available, the TEa set by the method devised in this study can be used.
In conclusion, although the URST is a semi-quantitative test, statistical IQC can be performed using the quantitative readout value, as in routine practice for clinical chemistry tests. At low, clinically significant analyte concentrations, most of the sigma metrics exceeded 4. Depending on the sigma metric, control rules from 1:2.5s with Nc=2 or 3 to 1:3.5s with Nc=2 or 3 could be recommended for IQC of a URST in routine clinical practice.
We would like to thank JeongSun Park and Yong Chan Yun of YD Diagnostics for the technical support with the URiSCAN Super Plus fully automated urine analyzer.
Conceptualization: Park H; Study design: Park H and Ko Y; Administration: Park H; Collection and assembly of data: Park H and Ko Y; Data analysis and interpretation: Park H; Manuscript writing (original draft): Park H; Manuscript writing (review and editing): Park H and Ko Y; Final approval of manuscript: Park H and Ko Y.
None declared.
None declared.