Original Article

Ann Lab Med 2024; 44(6): 507-517

Published online July 2, 2024 https://doi.org/10.3343/alm.2023.0447

Copyright © Korean Society for Laboratory Medicine.

Evaluating the Commutability of Reference Materials for α-Fetoprotein: Accurate Value Assignment With Multiple Systems and Trueness Verification

Jianping Zhang, M.M.*, Jing Zhao, M.M.*, Qingtao Wang, M.M., Rui Zhang, M.D., and Yuhong Yue, M.M.

Department of Clinical Laboratory, Beijing Chaoyang Hospital, Capital Medical University; Beijing Center for Clinical Laboratories, Beijing, China

Correspondence to: Rui Zhang, M.D.
Department of Clinical Laboratory, Beijing Chaoyang Hospital, Capital Medical University, Beijing Center for Clinical Laboratories, No. 8 Gongti South Road, Chaoyang District, Beijing 100020, China
E-mail: zr189169@163.com

Yuhong Yue, M.M.
Department of Clinical Laboratory, Beijing Chaoyang Hospital, Capital Medical University, Beijing Center for Clinical Laboratories, No. 8 Gongti South Road, Chaoyang District, Beijing 100020, China
E-mail: yueyh2017@163.com

* These authors contributed equally to this study as co-first authors.

Received: November 14, 2023; Revised: March 19, 2024; Accepted: June 14, 2024

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background: The accurate measurement of α-fetoprotein (AFP) is critical for clinical diagnosis. However, different AFP immunoassays may yield different results. Appropriate AFP reference materials (RMs) were selected and assigned accurate values for applications with external quality assessment (EQA) programs to standardize AFP measurements.
Methods: Forty individual clinical samples and six different concentrations of candidate RMs (Can-RMs, L1–L6) were prepared by the Beijing Center for Clinical Laboratories. The Can-RMs were assigned target values by performing five immunoassays, using WHO International Standard 72/225 as a calibrator, and sent to 45 clinical laboratories in Beijing for AFP measurements. The commutability of all RMs was assessed based on CLSI and the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) approaches. Analytical performance was assessed for compliance based on accuracy (total error, TE), trueness (bias), and precision (CV).
Results: The Can-RMs were commutable for all immunoassays using the CLSI approach and for 6 of 10 assay combinations using the IFCC approach. WHO RM 72/225 diluted in the different matrices was commutable among all assays with the CLSI approach, except in the serum matrix (Autolumo vs. Roche analyzer) and the distilled water matrix (Abbott vs. Roche/Mindray analyzer), whereas some inconclusive and non-commutable results were found using the IFCC approach. The average pass rates based on the TE, bias, and CV were 91%, 81%, and 95%, respectively.
Conclusions: The commutability of the RMs differed between the two evaluation approaches. The Can-RMs exhibited good commutability with the CLSI approach, suggesting that they are suitable as commutable EQA materials with assigned values and for monitoring the performance of AFP measurements.

Keywords: Alpha-fetoprotein, Commutability, External quality assessment, Immunoassay, Reference standard

Alpha-fetoprotein (AFP) is a glycoprotein mainly produced in the yolk sac and liver cells during the embryonic stage. With continued research interest in understanding the functions of serum AFP and continuous improvement in measurement methods, the serum AFP concentration has been found to be significantly elevated in patients with various cancers. Related findings have identified AFP as a sensitive serum marker for diagnosing primary hepatocellular carcinoma (PHC). AFP is widely used for early screening, diagnosis, monitoring disease development, guiding treatment decisions, and prognosis predictions for patients with PHC [1-5].

In human serum samples, AFP is measured using various automated systems in clinical laboratories, most commonly by chemiluminescent or electrochemiluminescence immunoassays. The accurate measurement of AFP is crucial for diagnostic decision-making and patient management. Metrological traceability is the property of a laboratory result whereby it can be related to a certified reference material (RM) and a reference-measurement procedure through a series of calibrations [6]. The AFP RM 72/225, designated by the WHO as a higher-order RM, is used to assign values to manufacturers’ calibrators through an uninterrupted chain of traceability [7, 8]. Ensuring RM commutability in the traceability process is imperative [7]. One factor contributing to the lack of agreement among measurement procedures is the use of non-commutable RMs as calibrators in the calibration hierarchy for clinical laboratory-measurement procedures. Commutability is defined as equivalence in the mathematical relationships between the results of different measurement procedures for a control material and for representative clinical samples (CSs). Commutability of control materials is one of the most important factors affecting the implementation of external quality assessment (EQA) programs. Employing commutable materials allows the laboratory performance observed in such programs to be transferred to CSs [9]. Ideally, an EQA program involves control materials that are commutable with patient samples and have values assigned using higher-order reference methods [10]. Providers of reference and control materials should be responsible for assessing the commutability of such materials before use [11].

To standardize AFP measurements, we assessed differences and commutability with the WHO 72/225 standard diluted with six different matrices, using five different AFP immunoassays, and selected the appropriate WHO 72/225 diluent matrix to improve the comparability of AFP measurements in the different assays. We also examined the optimal approach for deciding whether prepared candidate RMs (Can-RMs) for AFP exhibit suitable commutability for use with EQA or proficiency-testing samples, which would support their use as common calibrators for measurement procedures or trueness control materials provided by manufacturers to validate calibrations [12].

Instruments and reagents

The five automated immunoassays and reagents used in this study were as follows: Abbott Architect i2000 (Abbott Diagnostics, Abbott Park, IL, USA), Beckman DxI 800 (Beckman Coulter Inc., Brea, CA, USA), Roche Cobas E601 (Roche Diagnostics GmbH, Branchburg, NJ, USA), Mindray CL-2000i (Mindray Co., Ltd., Shenzhen, China), and Autolumo A2000 (Autobio Co., Ltd., Zhengzhou, China).

The commercial diluents were phosphate-buffered saline (PBS) (Na2HPO4, KH2PO4, NaCl, and KCl; pH 7.2–7.4), glucose-free DMEM (a14430-01; Gibco, Grand Island, NY, USA), minimal essential medium (MEM; 51200-038; Gibco), and Roswell Park Memorial Institute 1640 (RPMI) medium (r7509-500 mL; Sigma-Aldrich, St. Louis, MO, USA). A Milli-Q water system was obtained from EMD Millipore (Billerica, MA, USA). In addition, a healthy human serum pool prepared from leftover patient serum samples (collected from Beijing Chaoyang Hospital) was used as a diluent for WHO 72/225 and had a mean native AFP concentration of 3.0 µg/L.

The standard material used for AFP analysis was WHO 72/225 (60,500 µg/L), which was purchased from the National Institute for Biological Standards and Control (Hertfordshire, United Kingdom).

Preparation of Can-RMs

We collected 40 leftover patient serum samples with low, medium, and high AFP concentrations from the Laboratory Department of Beijing Chaoyang Hospital, where the AFP concentrations ranged from 5.1 to 730.9 µg/L (as measured using the Roche Cobas E601 system). Each sample (at least 2.5 mL) was evenly divided into five aliquots, which were stored at −80°C until further use. This study was approved by the Ethics Committee of Beijing Chaoyang Hospital (approval number 2023/7/11-2) and was performed in compliance with the Declaration of Helsinki. The requirement for informed consent was waived due to the retrospective nature of the study design.

Based on the CLSI C37 guidelines [13], human serum pools free of hemolysis, lipemia, and icterus were prepared from leftover serum samples that had been submitted for AFP measurement to the Laboratory Department of Beijing Chaoyang Hospital. The samples had different AFP concentrations and were collected directly into tubes and frozen at −80°C daily. The frozen serum aliquots were thawed at room temperature, pooled, and analyzed using a Beckman DxI 800 analyzer within 1 day. These pools were thoroughly mixed, filtered through 0.22-µm membranes, aliquoted into 2 mL cryogenic vials (1 mL/vial), and stored at −80°C for later use as trueness controls or for EQA.

WHO 72/225 standards

The WHO 72/225 standard material (60,500 µg/L) was diluted in six different matrices, including PBS (P), a healthy human serum pool (S), DMEM (D), MEM (M), RPMI 1640 (RP) medium, and distilled water (DW) to theoretical standard concentrations of 726.0, 363.0, 181.5, 90.8, 36.3, and 18.2 µg/L. These samples were labeled as P1–P6, S1–S6, D1–D6, M1–M6, RP1–RP6, and DW1–DW6, respectively. Each diluted WHO 72/225 sample was divided into five equal portions. Samples from the same batch were analyzed using the following five immunoassays on the same day: Roche Cobas E601, Beckman DxI 800, Abbott Architect i2000, Mindray CL-2000i, and Autolumo A2000. All samples were analyzed in triplicate.

Commutability study

According to the CLSI EP30A guidelines [14], six concentrations of the WHO 72/225 standard in each of the six diluent matrices and six levels of Can-RMs were randomly interspersed among the 40 individual serum samples, and all samples were measured in triplicate with the five assays within 1 day. For commutability evaluations, the measurements were transformed to a logarithmic scale, which stabilized the SDs. The transformed data were analyzed using Deming regression, and 95% prediction intervals were calculated for each pair of assays. An AFP RM was considered commutable when its log-transformed value fell within the prediction interval for the CSs measured with the paired assays [8, 14, 15]. Commutability was also assessed by difference-in-bias analysis, based on the recommendations of the IFCC Working Group on Commutability [12, 16]. The difference between the bias for an RM and the average bias for the clinical samples (BCS) was denoted as dRM, and its standard uncertainty, U(dRM), was calculated based on the distribution of the mean bias for patient samples over the entire concentration interval [12]. The maximum allowable |dRM| value for the RM was designated as the commutability criterion (C). U(dRM) is shown for each RM as error bars around the difference in bias between the CSs and the RM. The uncertainty consisted of two components: the uncertainty of the estimate of bias for the CSs and the uncertainty of the estimate of bias for each RM. When the uncertainty interval lay entirely within BCS±C, the RM was commutable; when it lay entirely outside BCS±C, the RM was non-commutable; and when it overlapped a C limit, the result was inconclusive [16].
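The three-way decision rule of the difference-in-bias approach can be illustrated with a minimal sketch. The function names, the argument `half_width` (the half-width of the uncertainty interval around dRM, however it is derived from U(dRM)), and the example numbers are hypothetical and are not taken from the study; the sketch only assumes that biases are computed on the natural-log scale and that dRM is expressed relative to the mean CS bias, so the criterion limits sit at ±C.

```python
import math
from statistics import mean


def difference_in_bias(rm_x, rm_y, cs_x, cs_y):
    """Return d_RM: the RM bias between two assays minus the mean CS bias.

    All biases are differences of natural logarithms (assay y minus assay x).
    """
    cs_bias = mean(math.log(y) - math.log(x) for x, y in zip(cs_x, cs_y))
    rm_bias = math.log(rm_y) - math.log(rm_x)
    return rm_bias - cs_bias


def classify_commutability(d_rm, half_width, criterion):
    """Classify one RM from d_RM, its uncertainty half-width, and criterion C."""
    low, high = d_rm - half_width, d_rm + half_width
    if -criterion <= low and high <= criterion:
        return "commutable"        # interval entirely inside the C limits
    if high < -criterion or low > criterion:
        return "non-commutable"    # interval entirely outside the C limits
    return "inconclusive"          # interval overlaps a C limit


# Hypothetical illustration values (not study data).
print(classify_commutability(0.05, 0.04, 0.177))
print(classify_commutability(0.15, 0.05, 0.177))
print(classify_commutability(0.30, 0.05, 0.177))
```

Keeping the interval half-width as an explicit argument avoids fixing a coverage factor in the sketch, since the text above does not state which one was applied.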

Value assignments for Can-RMs

The WHO 72/225 standard was diluted in PBS (P1–P6) and Can-RMs (L1–L6) and then measured in the same analytical sequence in triplicate on 2 consecutive days using the five assays. For each assay, WHO 72/225 was used as a common standard for traceability and transmission of the measured value. The linear-regression results for the theoretical concentration of the diluted WHO 72/225 standards and the actual measured concentrations were plotted pairwise. The assigned values of the Can-RMs were calculated after standardization with the WHO 72/225 standards. The values were assigned for Can-RMs using the average results from the five clinical analyzers.
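The value-assignment steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors' exact calculation: it assumes that, for each assay, the measured concentrations of the PBS-diluted WHO 72/225 series are regressed on the theoretical concentrations by ordinary least squares, that each Can-RM result is standardized by inverting that line, and that the assigned value is the unweighted mean across assays. The function names and the example readings are hypothetical.

```python
from statistics import mean

# Theoretical concentrations of the diluted WHO 72/225 series (µg/L).
THEORETICAL = [726.0, 363.0, 181.5, 90.8, 36.3, 18.2]


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def standardize(measured, slope, intercept):
    """Map an assay reading back onto the WHO 72/225 scale."""
    return (measured - intercept) / slope


def assign_value(who_measured_by_assay, can_rm_measured_by_assay):
    """Average the WHO-standardized Can-RM results over all assays."""
    standardized = []
    for assay, who_measured in who_measured_by_assay.items():
        slope, intercept = fit_line(THEORETICAL, who_measured)
        reading = can_rm_measured_by_assay[assay]
        standardized.append(standardize(reading, slope, intercept))
    return mean(standardized)


# Hypothetical readings: assay A reads 10% high, assay B reads 10% low.
who = {
    "A": [1.1 * c for c in THEORETICAL],
    "B": [0.9 * c for c in THEORETICAL],
}
can_rm = {"A": 110.0, "B": 90.0}
print(round(assign_value(who, can_rm), 3))
```

In this constructed example both assays recover the same standardized value, which is the behavior the common-calibrator step is meant to produce when an assay's deviation from the WHO scale is purely proportional.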

AFP EQA survey

The Beijing Center for Clinical Laboratories (BCCL) sent trueness-verification samples (Can-RMs L2 and L4) to 45 Beijing Municipal Health Commission-affiliated laboratories for AFP measurements, and each laboratory analyzed each sample in triplicate on each of 3 consecutive days. Nine AFP measurements were thus obtained for each sample in each laboratory, and the results were recorded and tabulated. Precision (CV), trueness (bias), and accuracy (TE) data were used to evaluate the measurement performance, and the tolerance limit derived from the desirable biological variation (https://www.westgard.com/biodatabase1.htm) was used to evaluate the acceptability of the results [17, 18]. Within-subject and between-subject biological variations were expressed as percentages [19]. The difference between the tested and target values was determined as the bias. CVs were calculated from the nine replicate results, and the TE was calculated as the bias+2CV [20].
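The per-laboratory calculation can be sketched as below. The sketch assumes that bias is expressed as a percentage of the target value, that the CV uses the sample standard deviation of the nine replicates, and that the absolute bias enters the TE sum; these details and the function names are illustrative assumptions. The default limits are the desirable specifications for bias, CV, and TE quoted in this article (11.8%, 6.1%, and 21.9%), and the example readings are hypothetical.

```python
from statistics import mean, stdev


def eqa_metrics(results, target):
    """Return (bias %, CV %, TE %) for one laboratory and one sample."""
    m = mean(results)
    bias_pct = 100.0 * (m - target) / target
    cv_pct = 100.0 * stdev(results) / m
    # Assumption: the magnitude of the bias is used in the TE sum.
    te_pct = abs(bias_pct) + 2.0 * cv_pct
    return bias_pct, cv_pct, te_pct


def passes(bias_pct, cv_pct, te_pct, limits=(11.8, 6.1, 21.9)):
    """Compare each metric with its limit; returns (bias_ok, cv_ok, te_ok)."""
    bias_lim, cv_lim, te_lim = limits
    return abs(bias_pct) <= bias_lim, cv_pct <= cv_lim, te_pct <= te_lim


# Hypothetical nine replicates for a sample with a 27.0 µg/L target.
replicates = [26.0, 27.0, 28.0] * 3
bias, cv, te = eqa_metrics(replicates, 27.0)
print(round(bias, 2), round(cv, 2), round(te, 2), passes(bias, cv, te))
```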

Statistical analysis

The commutability of all RMs was assessed based on CLSI and IFCC approaches. Regression analyses were conducted using Deming regression, and graphs were prepared using Microsoft Excel 2010 (Microsoft, Redmond, WA, USA). Bias, CV, and TE evaluation criteria were calculated from data on biological variation (https://www.westgard.com/biodatabase1.htm) to evaluate the results of the EQA program. Intralaboratory CV was calculated from the nine repeated results, and TE was calculated as the bias+2CV. Accordingly, the optimal analytical quality specifications for AFP bias, CV, and TE were 5.9%, 3.1%, and 10.9%, respectively; the desirable specifications were 11.8%, 6.1%, and 21.9%, respectively; and the minimal specifications were 17.7%, 9.2%, and 32.8%, respectively.
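The three tiers of specifications can be reproduced with the conventional biological-variation formulas, as in the sketch below. Two things here are assumptions that are not stated in the article: the formulas themselves (desirable CV = 0.5·CVI, desirable bias = 0.25·√(CVI² + CVG²), TE limit = 1.65·CV + bias, with optimal and minimal tiers scaled by 0.5 and 1.5), and the within-subject and between-subject variations for AFP (CVI = 12.2%, CVG = 45.6%), which were chosen because they reproduce the reported limits to within rounding.

```python
import math

# Assumed biological variation for AFP (%); not given in the article.
CV_I = 12.2  # within-subject
CV_G = 45.6  # between-subject


def quality_specs(cv_i, cv_g):
    """Return {tier: (bias %, CV %, TE %)} from biological variation."""
    tiers = {"optimal": 0.5, "desirable": 1.0, "minimal": 1.5}
    specs = {}
    for name, scale in tiers.items():
        cv_limit = scale * 0.5 * cv_i
        bias_limit = scale * 0.25 * math.sqrt(cv_i ** 2 + cv_g ** 2)
        te_limit = 1.65 * cv_limit + bias_limit
        specs[name] = (bias_limit, cv_limit, te_limit)
    return specs


for tier, (bias, cv, te) in quality_specs(CV_I, CV_G).items():
    print(f"{tier}: bias {bias:.2f}%, CV {cv:.2f}%, TE {te:.2f}%")
```

Note that this TE limit uses a 1.65 multiplier, whereas the observed TE for each laboratory is described in the text as bias+2CV; the sketch only concerns how the limits are derived.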

Comparability of the diluted WHO 72/225 standards in different AFP immunoassays

Diluted WHO 72/225 RMs (samples P1–P6, S1–S6, D1–D6, M1–M6, RP1–RP6, and DW1–DW6) were assessed using five immunoassays. The deviations of the measured values for the diluted WHO 72/225 samples from their standard concentrations of 726.0, 363.0, 181.5, 90.8, 36.3, and 18.2 µg/L were calculated. The results of all five assays for WHO 72/225 prepared in the six different buffer matrices are shown in Fig. 1. The measured concentrations of 72/225 in each buffer differed with each assay system. When PBS, DMEM, MEM, and RPMI were used as diluent buffers, the largest positive deviations between the measured values and the standard concentrations for WHO 72/225 were observed with the Roche and Mindray assays, which overestimated the AFP concentrations. The Abbott and Beckman systems showed negative deviations and underestimated the AFP concentrations. The Autolumo system showed a trend from a positive deviation to a negative deviation. In the serum buffer matrix, only the Mindray assay showed a positive deviation, and the Autolumo system showed a relatively significant variation. In the DW matrix, a trend toward a negative deviation was observed with decreasing AFP concentrations with the Beckman, Abbott, and Autolumo assay systems (Fig. 1).

Figure 1. Comparability of the results obtained when the WHO72/225 standard was diluted in six different buffer matrices and tested using five different alpha-fetoprotein immunoassays. Panels A–F show the results obtained with six different buffer matrices, including (A) phosphate-buffered saline, (B) a healthy human serum pool, (C) DMEM, (D) minimal essential medium, (E) Roswell Park Memorial Institute 1640 medium, and (F) distilled water.

Commutability of the RMs for AFP measurements

Fig. 2 shows the regression curves for the AFP measurements in 40 individual serum samples using the five assays, according to the CLSI method. Can-RM samples L1–L6 (developed by BCCL) were commutable across all five assays in all 10 pairwise comparisons. In addition, the WHO 72/225 standard was commutable after dilution to six concentrations in all matrices, except for the serum matrix (Autolumo vs. Roche) and DW matrix (Abbott vs. Roche/Mindray). However, based on the IFCC method, the Can-RM L1 and L2 results were inconclusive for the Roche vs. Autolumo comparisons, and the Can-RM L6 results were inconclusive for the Roche/Abbott/Beckman vs. Mindray comparisons (Fig. 3). With the IFCC approach, the minimal bias specification derived from biological variation (17.7%) was used as the C value. Under this approach, the WHO 72/225 standard diluted in the different matrices showed commutability among 4 of 10 assay combinations in PBS, 5 of 10 assay combinations in serum, 2 of 10 assay combinations in RPMI, 3 of 10 assay combinations in DMEM, 4 of 10 assay combinations in MEM, and 5 of 10 assay combinations in DW. The commutability results for Can-RMs L1–L6 and the WHO standard diluted at six concentrations were determined using both the CLSI and IFCC approaches (Table 1 and Figs. 2 and 3).

Figure 2. Assessing the commutability of reference materials across five analytical systems for determining alpha-fetoprotein concentrations based on the CLSI document EP30-A method. (A) Roche assay results – Abbott assay results. (B) Roche assay – Beckman assay. (C) Roche assay – Mindray assay. (D) Roche assay – Autolumo assay. (E) Abbott assay – Beckman assay. (F) Abbott assay – Mindray assay. (G) Abbott assay – Autolumo assay. (H) Beckman assay – Autolumo assay. (I) Beckman assay – Mindray assay. (J) Autolumo assay – Mindray assay. The solid black lines are regression curves, and the dashed lines are two-tailed 95% prediction lines.

Figure 3. Assessing the commutability of reference materials (RMs) across five analytical systems for determining alpha-fetoprotein (AFP) concentrations, based on the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) method [16]. (A) Roche assay results – Abbott assay results. (B) Roche assay – Beckman assay. (C) Roche assay – Mindray assay. (D) Roche assay – Autolumo assay. (E) Abbott assay – Beckman assay. (F) Abbott assay – Mindray assay. (G) Abbott assay – Autolumo assay. (H) Beckman assay – Autolumo assay. (I) Beckman assay – Mindray assay. (J) Autolumo assay – Mindray assay. Each panel shows the bias between two measuring systems, calculated from the natural logarithms (ln) of the measured concentrations. The purple diamonds represent the six Beijing Center for Clinical Laboratories (BCCL) Can-RMs. The black circles represent the clinical samples (CSs). The error bars indicate the uncertainty of the difference between the bias for each BCCL Can-RM or diluted WHO standard and the average bias for the CSs. The solid gray line is the bias-for-the-clinical-samples (BCS) line, which represents the mean bias for all CSs. The red dashed lines (C lines) indicate the maximum allowable commutability-related bias (i.e., the commutability criterion [C]).

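Table 1. Commutability assessments for the Can-RMs and diluted WHO standard for AFP measurements, according to the CLSI EP30A and IFCC approaches

| Assay combination | Can-RMs CLSI | Can-RMs IFCC | WHO-PBS CLSI | WHO-PBS IFCC | WHO-Serum CLSI | WHO-Serum IFCC | WHO-RPMI CLSI | WHO-RPMI IFCC | WHO-DMEM CLSI | WHO-DMEM IFCC | WHO-MEM CLSI | WHO-MEM IFCC | WHO-DW CLSI | WHO-DW IFCC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Abbott vs. Roche | C | C | C | C | C | C | C | C | C | C | C | C | NC | I |
| Beckman vs. Roche | C | C | C | C | C | I | C | I | C | C | C | I | C | C |
| Mindray vs. Roche | C | I | C | C | C | NC | C | I | C | I | C | C | C | C |
| Autolumo vs. Roche | C | I | C | I | NC | NC | C | I | C | I | C | I | C | C |
| Beckman vs. Abbott | C | C | C | C | C | C | C | C | C | C | C | C | C | I |
| Mindray vs. Abbott | C | I | C | I | C | I | C | NC | C | I | C | I | NC | I |
| Autolumo vs. Abbott | C | C | C | NC | C | NC | C | I | C | I | C | NC | C | I |
| Autolumo vs. Beckman | C | C | C | NC | C | I | C | I | C | I | C | NC | C | C |
| Mindray vs. Beckman | C | I | C | I | C | C | C | I | C | I | C | C | C | C |
| Mindray vs. Autolumo | C | C | C | I | C | C | C | I | C | I | C | I | C | I |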

Abbreviations: Can-RM, candidate reference material; AFP, alpha-fetoprotein; IFCC, International Federation of Clinical Chemistry and Laboratory Medicine [16]; PBS, phosphate-buffered saline; RPMI, Roswell Park Memorial Institute 1640; MEM, minimal essential medium; DW, distilled water; C, I, and NC: commutable, inconclusive, and non-commutable results for Can-RMs L1–L6 and diluted WHO standards at six diluted concentrations, respectively.



Final assigned values of the BCCL AFP Can-RMs

The average assigned AFP concentrations for the BCCL Can-RMs obtained using five assay platforms were calculated by interpolation from linear-regression calibration curves. Each result was expressed as the standard value ± total uncertainty. The total uncertainty of the standard value reflected the uncertainties introduced by the measurement uncertainty, homogeneity, long-term stability, and the uncertainty of international RMs. The factors contributing to the uncertainty, including those introduced by assignments with multiple systems, were comprehensively calculated for the obtained results. The final assigned concentrations for the BCCL Can-RMs L1–L6 were 12.0±0.7, 27.0±3.7, 84.2±8.1, 126.0±18.9, 232.2±21.9, and 666.4±97.3 µg/L, respectively (Table 2).

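Table 2. The assigned concentrations and uncertainties for human serum AFP concentrations in the BCCL Can-RMs

| Variables | L1 | L2 | L3 | L4 | L5 | L6 |
|---|---|---|---|---|---|---|
| uchar (%) | 1.2 | 5.9 | 3.9 | 6.8 | 2.8 | 5.8 |
| ubb (%) | 0.8 | 0.5 | 0.7 | 0.2 | 0.7 | 0.9 |
| ults (%) | 0.8 | 2.5 | 1.2 | 1.9 | 2.8 | 3.6 |
| uWHO (%) | 2.4 | 2.4 | 2.4 | 2.4 | 2.4 | 2.4 |
| Relative standard uncertainty uc (%) | 2.9 | 6.9 | 4.8 | 7.5 | 4.7 | 7.3 |
| Relative expanded uncertainty, 2× uc (%) | 5.9 | 13.7 | 9.7 | 14.9 | 9.4 | 14.6 |
| Expanded uncertainty (µg/L) | 0.7 | 3.7 | 8.1 | 18.9 | 21.9 | 97.3 |
| Assigned value (µg/L) | 12.0 | 27.0 | 84.2 | 126.0 | 232.2 | 666.4 |

Columns L1–L6 are the candidate reference materials.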

Abbreviations: AFP, alpha-fetoprotein; BCCL, Beijing Center for Clinical Laboratories; Can-RM, candidate reference material; uchar, uncertainty of the measurement; ubb, uncertainty from homogeneity of the material; ults, uncertainty due to the long-term stability; uWHO, uncertainty of WHO standard materials; uc, combined standard uncertainty.
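The uncertainty budget can be checked with a short calculation. The sketch assumes that the four relative components are combined as a root sum of squares and expanded with a coverage factor of 2; the article does not spell out the combination rule, but this assumption reproduces the tabulated uc values to within rounding. The component and assigned values are those reported for the Can-RMs, and the function names are illustrative.

```python
import math

# Relative standard uncertainties (%) per Can-RM: (uchar, ubb, ults, uWHO).
COMPONENTS = {
    "L1": (1.2, 0.8, 0.8, 2.4),
    "L2": (5.9, 0.5, 2.5, 2.4),
    "L3": (3.9, 0.7, 1.2, 2.4),
    "L4": (6.8, 0.2, 1.9, 2.4),
    "L5": (2.8, 0.7, 2.8, 2.4),
    "L6": (5.8, 0.9, 3.6, 2.4),
}
ASSIGNED = {"L1": 12.0, "L2": 27.0, "L3": 84.2,
            "L4": 126.0, "L5": 232.2, "L6": 666.4}  # µg/L


def combined_uncertainty(components):
    """Root-sum-of-squares combination of relative uncertainties (%)."""
    return math.sqrt(sum(u * u for u in components))


def expanded_uncertainty(assigned, components, k=2.0):
    """Expanded uncertainty in concentration units (µg/L)."""
    return assigned * k * combined_uncertainty(components) / 100.0


for level, parts in COMPONENTS.items():
    uc = combined_uncertainty(parts)
    big_u = expanded_uncertainty(ASSIGNED[level], parts)
    print(f"{level}: uc = {uc:.1f}%, U = {big_u:.1f} µg/L")
```

Because the sketch carries unrounded intermediate values, the expanded uncertainties it prints can differ from the tabulated ones in the first decimal place.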



Analysis of the EQA AFP trueness program

The CV, bias, and TE evaluation criteria were determined based on the desirable level of biological variability and were 6.1%, 11.8%, and 21.9%, respectively. Using these values, the pass rates were calculated separately. For the two sample levels tested, the percentages of laboratories meeting the desirable bias criterion were 74% (Can-RM L2) and 87% (Can-RM L4). In addition, 96% and 93% of the laboratories met the desirable CV criterion for Can-RM L2 and L4, respectively, and the percentages of laboratories meeting the allowable TE were 85% and 96%, respectively. For all participating laboratories, the mean bias was 1.2% (range, −17.5% to 19.7%) for Can-RM L2 and −1.5% (range, −18.5% to 13.1%) for Can-RM L4 (Table 3).

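Table 3. AFP EQA pass rates of the BCCL Can-RMs according to bias, CV, and TE criteria

| Instrument | N* | L2 acceptable rate, bias (%) | L2 acceptable rate, CV (%) | L2 acceptable rate, TE (%) | L2 mean bias, % (minimum to maximum) | L4 acceptable rate, bias (%) | L4 acceptable rate, CV (%) | L4 acceptable rate, TE (%) | L4 mean bias, % (minimum to maximum) |
|---|---|---|---|---|---|---|---|---|---|
| All instruments | 45 | 74 | 96 | 85 | 1.2 (−17.5 to 19.7) | 87 | 93 | 96 | −1.5 (−18.5 to 13.1) |
| Abbott Architect i2000SR/i2000/i1000SR | 6 | 100 | 100 | 100 | −3.9 (−5.2 to 0.5) | 100 | 100 | 100 | −5.3 (−12.6 to −0.2) |
| Abbott AxSym | 3 | 100 | 100 | 100 | −4.0 (−4.4 to −3.6) | 100 | 100 | 100 | −4.6 (−4.3 to −5.0) |
| Beckman Access, Access2 | 3 | 33 | 67 | 33 | −0.01 (−1.5 to 18.2) | 33 | 67 | 33 | −4.5 (−18.4 to 13.1) |
| Beckman DxI600, DxI800 | 6 | 67 | 100 | 100 | −8.9 (−3.1 to −14.3) | 67 | 100 | 100 | −10.2 (−15.1 to −3.6) |
| Roche Cobas E601/E602 | 13 | 77 | 92 | 85 | 6.8 (−17.5 to 15.4) | 92 | 100 | 100 | 3.9 (−18.5 to 10.7) |
| Roche Elecsys 1010/2010, Cobas E411 | 6 | 83 | 100 | 100 | 3.1 (−1.9 to 12.1) | 100 | 100 | 100 | 1.0 (−3.8 to 8.9) |
| Roche ES 300/600 | 2 | 100 | 100 | 100 | 0.9 (−8.3 to 10.1) | 100 | 100 | 100 | −0.5 (−10.1 to 9.2) |
| Siemens ADVIA Centaur CP/XP | 4 | 25 | 100 | 25 | 11.9 (−3.2 to 19.7) | 100 | 50 | 100 | 2.7 (−2.4 to 5.4) |
| Bioscience Peteck 96-I | 1 | 100 | 100 | 100 | −3.1 | 100 | 100 | 100 | −5.8 |
| SNIBE Maglumi | 1 | 100 | 100 | 100 | −11.6 | 100 | 100 | 100 | −7.5 |

L2 and L4 denote BCCL Can-RM L2 and BCCL Can-RM L4.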

*The total number of participating institutions and the number of institutions utilizing the indicated analytical platform are indicated.

The difference between the tested and target values was defined as the bias. The CV was calculated from nine replicate results, and the TE was calculated as the bias+2CV. The mean bias was calculated from the mean bias results of the participating institutions utilizing each indicated platform. The evaluation criteria for CV, bias, and TE were 6.1%, 11.8%, and 21.9%, respectively.

Abbreviations: AFP, alpha-fetoprotein; EQA, external quality assessment; BCCL, Beijing Center for Clinical Laboratories; Can-RM, candidate reference material; TE, total error.


The purpose of an EQA program is to provide a service enabling laboratories to enhance their analytical performance to satisfactory levels and improve the quality of the results. Most routine EQA programs use materials without verified commutability and use consensus means (either from a peer group or all laboratories) as target values [10]. According to ISO17043, proficiency-test items should match in terms of the matrix, measurands, and concentrations [21]. The commutability of EQA samples with CSs is important in laboratory quality evaluations [11, 22]. In addition to using commutable RMs, it is necessary to assign values (and uncertainty) to the RMs and to define and apply clinically permissible analytical-performance specifications to substantiate the validity of laboratory measurements in clinical settings [9, 23].

We evaluated measurement differences when the WHO 72/225 standard was diluted in PBS, serum, DMEM, MEM, RPMI, or DW at six concentrations (18.15–726 µg/L). We also examined how the measured concentrations of the WHO 72/225 standard diluted in the same buffers differed with different assay systems. We calculated the deviations of the measured values at six concentrations from the standard concentrations of the diluted WHO 72/225 standards. The results revealed different deviations in the measured values with all five assays for samples in all six buffers at different concentrations. Consistent trends in terms of positive and negative deviations were observed for all five assay systems, with PBS, DMEM, MEM, or RPMI serving as the dilution buffer. Differing deviations were observed in the different assays when serum or DW was used as the matrix. These findings demonstrate how the measured AFP concentration may be affected by the diluent selected for the WHO 72/225 RM [8].

Currently, no recognized reference method exists for AFP. The commutabilities of the BCCL Can-RMs and WHO 72/225 RM across six diluent matrices were analyzed using five routine assays from different manufacturers according to the CLSI EP30A guidelines, and the difference in bias was evaluated using the IFCC approach. The commutability of the tested samples differed between the two approaches. The BCCL Can-RMs were commutable according to the EP30A criteria but provided inconclusive results for IFCC-based bias estimations in the Autolumo/Mindray vs. Roche and Abbott/Beckman vs. Mindray system comparisons. The WHO 72/225 RM, when diluted in PBS, DMEM, MEM, and RPMI, exhibited commutability across all six diluted concentrations for the five measurement systems with the CLSI approach. However, inconclusive or non-commutable results were obtained for some immunoassay comparisons with the IFCC approach.

Linear regression with prediction intervals (according to the CLSI approach) has been commonly used to assess the commutabilities of different RMs. This approach is based on the statistical distribution of differences in the results obtained with CSs between two assays, where the precision may vary among different pairs of assays [12]. It does not quantify how closely the RM agrees with the average relationship for the CSs at the concentration of interest [16]. Consequently, the commutability results obtained for the RMs with this approach correlate weakly with clinical data. When differences in bias are assessed instead, the standard for evaluating commutability is that the deviation/difference ± uncertainty of the RM should fall within a fixed range of allowable deviation, so the same judgment criteria can be used to evaluate all measurement procedures. However, the evaluation criterion should be selected carefully so that it is not too strict for methods to pass the evaluation. Differences in the uncertainties of RMs can also affect commutability evaluations [24]. The main sources of uncertainty, such as the number of replicates performed with CSs and Can-RMs, position effects, and sample-specific differences, can be specified. Because of this uncertainty, the outcome of the commutability assessment may become inconclusive or indicate non-commutability [25]. Many factors can lead to non-commutability, such as the nature of the analyte and its matrix and the principles of specific assays [26]; thus, further investigation is necessary.

We assessed the commutability of the international standard material WHO 72/225 with five different AFP immunoassays (Abbott Architect i2000, Beckman DxI 800, Roche Cobas E601, Mindray CL-2000i, and Autolumo A2000). The measured mean values of the five assay systems were used as the assigned values for six concentrations of Can-RMs. Two concentrations of Can-RMs were selected as EQA samples for use with the trueness-verification program in Beijing. An average of 95% of testing laboratories met the requirement for intra-laboratory CV, and 81% and 91% met the requirements for bias and TE, respectively, indicating that more laboratories met the tolerance limit for precision (CV) than the tolerance limits for trueness (bias) and accuracy (TE). Accordingly, bias may be the main limiting factor for accurate and comparable AFP results. However, owing to the limitation of having only a few specific concentrations of samples available, it was challenging to conduct assay comparisons across the entire range of assays [27].

A major limitation of this study is that only five assays were used for AFP measurements. The second limitation is that only two concentrations were used for the EQA samples, which did not cover the entire measurement range. Future validation studies with more assays and EQA samples covering more concentrations are required. The third limitation is that the individual samples for the commutability study were stored at −80°C until use. Further research is needed to confirm whether the commutability outcomes were affected by protein-structure changes caused by freezing.

In conclusion, the commutability results for the RMs differed between the two evaluation approaches. Owing to differences in their intended use, the two approaches evaluate commutability in different ways, which may explain the observed variations in evaluation results. Specific diluents for the WHO 72/225 AFP standard are needed to ensure accurate value transmission in the traceability process. The Can-RMs exhibited good commutability and, thus, are suitable for use as commutable EQA materials with assigned values and might improve the performance of laboratory AFP measurements. The trueness-verification results provide reliable data for promoting the improvement of the corresponding quality specifications.

Yue Y, Zhang R, and Wang Q conceived and designed the research. Yue Y, Zhang J, and Zhao J collected and tabulated the data. Yue Y and Zhang J conducted the research. Yue Y and Zhang R analyzed and interpreted the data. Zhang J and Zhao J wrote the initial draft. Yue Y and Zhang R revised the manuscript. All authors have accepted responsibility for the entire content of this manuscript and approved its submission.

This study was supported by the Beijing Municipal Administration of Hospitals Clinical Medicine Development of Special Funding Support (grant number ZYLX202137) and Beijing Chaoyang Hospital Science and Technology Innovation Fund (grant number 22kcjjzd-7).

  1. Gillespie JR, Uversky VN. Structure and function of alpha-fetoprotein: a biophysical overview. Biochim Biophys Acta 2000;1480:41-56.
    Pubmed CrossRef
  2. Wong RJ, Ahmed A, Gish RG. Elevated alpha-fetoprotein: differential diagnosis - hepatocellular carcinoma and other disorders. Clin Liver Dis 2015;19:309-23.
    Pubmed CrossRef
  3. Chang TS, Wu YC, Tung SY, Wei KL, Hsieh YY, Huang HC, et al. Alpha-fetoprotein measurement benefits hepatocellular carcinoma surveillance in patients with cirrhosis. Am J Gastroenterol 2015;110:836-44; quiz 845.
  4. Cao W, Chen Y, Han W, Yuan J, Xie W, Liu K, et al. Potentiality of α-fetoprotein (AFP) and soluble intercellular adhesion molecule-1 (sICAM-1) in prognosis prediction and immunotherapy response for patients with hepatocellular carcinoma. Bioengineered 2021;12:9435-51.
  5. Hanif H, Ali MJ, Susheela AT, Khan IW, Luna-Cuadros MA, Khan MM, et al. Update on the applications and limitations of alpha-fetoprotein for hepatocellular carcinoma. World J Gastroenterol 2022;28:216-29.
  6. International Organization for Standardization. In vitro diagnostic medical devices - Requirements for establishing metrological traceability of values assigned to calibrators, trueness control materials and human samples. 2020;ISO 17511:2020. https://www.iso.org/standard/69985.html
  7. Vesper HW, Thienpont LM. Traceability in laboratory medicine. Clin Chem 2009;55:1067-75.
  8. Yue Y, Zhang S, Xu Z, Chen X, Wang Q. Commutability of reference materials for α-fetoprotein in human serum. Arch Pathol Lab Med 2017;141:1421-7.
  9. Braga F, Pasqualetti S, Panteghini M. The role of external quality assessment in the verification of in vitro medical diagnostics in the traceability era. Clin Biochem 2018;57:23-8.
  10. Jones GRD, Delatour V, Badrick T. Metrological traceability and clinical traceability of laboratory results - the role of commutability in external quality assurance. Clin Chem Lab Med 2022;60:669-74.
  11. Braga F, Panteghini M. Commutability of reference and control materials: an essential factor for assuring the quality of measurements in laboratory medicine. Clin Chem Lab Med 2019;57:967-73.
  12. Miller WG, Schimmel H, Rej R, Greenberg N, Ceriotti F, Burns C, et al. IFCC Working Group recommendations for assessing commutability part 1: general experimental design. Clin Chem 2018;64:447-54.
  13. Danilenko U, Vesper HW, Myers GL, Clapshaw PA, Camara JE, Miller WG. An updated protocol based on CLSI document C37 for preparation of off-the-clot serum from individual units for use alone or to prepare commutable pooled serum reference materials. Clin Chem Lab Med 2020;58:368-74.
  14. CLSI. Characterization and qualification of commutable reference materials for laboratory medicine; approved guideline. 1st ed. CLSI EP30-A. Wayne, PA: Clinical and Laboratory Standards Institute, 2010.
  15. Deprez L, Toussaint B, Zegers I, Schimmel H, Grote-Koska D, Klauke R, et al. Commutability assessment of candidate reference materials for pancreatic α-amylase. Clin Chem 2018;64:1193-202.
  16. Nilsson G, Budd JR, Greenberg N, Delatour V, Rej R, Panteghini M, et al. IFCC working group recommendations for assessing commutability part 2: using the difference in bias between a reference material and clinical samples. Clin Chem 2018;64:455-64.
  17. Trapé J, Botargues JM, Porta F, Ricós C, Badal JM, Salinas R, et al. Reference change value for alpha-fetoprotein and its application in early detection of hepatocellular carcinoma in patients with hepatic disease. Clin Chem 2003;49:1209-11.
  18. Trapé J, Franquesa J, Sala M, Domenech M, Montesinos J, Catot S, et al. Determination of biological variation of α-fetoprotein and choriogonadotropin (β chain) in disease-free patients with testicular cancer. Clin Chem Lab Med 2010;48:1799-801.
  19. Coşkun A, Aarsand AK, Sandberg S, Guerra E, Locatelli M, Díaz-Garzón J, et al. Within- and between-subject biological variation data for tumor markers based on the European Biological Variation Study. Clin Chem Lab Med 2022;60:543-52.
  20. Korchia J, Freeman KP. Total observed error, total allowable error, and QC rules for canine serum and urine cortisol achievable with the Immulite 2000 Xpi cortisol immunoassay. J Vet Diagn Invest 2022;34:246-57.
  21. International Organization for Standardization. Conformity assessment - General requirements for the competence of proficiency testing providers. 2023;ISO/IEC 17043:2023. https://www.iso.org/standard/80864.html
  22. Jennings I, Kitchen D, Kitchen S, Woods T, Walker I. The importance of commutability in material used for quality control purposes. Int J Lab Hematol 2019;41:39-45.
  23. Miller WG, Jones GRD, Horowitz GL, Weykamp C. Proficiency testing/external quality assessment: current challenges and future directions. Clin Chem 2011;57:1670-80.
  24. Xing T, Liu J, Sun H, Gao Y, Ju Y, Liu X, et al. Commutability assessment of reference materials for homocysteine. Clin Chem Lab Med 2022;60:1562-9.
  25. Guo Q, Wang J, Yi X, Zeng J, Zhou W, Zhao H, et al. Commutability of reference materials for alkaline phosphatase measurements. Scand J Clin Lab Invest 2020;80:388-94.
  26. Apple FS, Collinson PO; IFCC Task Force on Clinical Applications of Cardiac Biomarkers. Analytical characteristics of high-sensitivity cardiac troponin assays. Clin Chem 2012;58:54-61.
  27. Zhang TJ, Pu YG, Zhou HJ, Ma R, Zhang JT, Wang DG, et al. Application of IFCC reference assay in the evaluation of five glycosylated hemoglobin detection assays. Chin J Lab Med 2018;12:821-6.