Ann Lab Med 2022; 42(2): 121-140
Published online March 1, 2022 https://doi.org/10.3343/alm.2022.42.2.121
Copyright © Korean Society for Laboratory Medicine.
Laboratory Corporation of America Holdings, Research Triangle Park, NC, USA
Correspondence to: Brian A. Rappold, B.S.
Laboratory Corporation of America Holdings, 1904 TW Alexander Drive, Research Triangle Park, NC 27709, USA
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
The process of method development for a diagnostic assay based on liquid chromatography-tandem mass spectrometry (LC-MS/MS) involves several disparate technologies and specialties. Additionally, method development details are typically not disclosed in journal publications. Method developers may need to search widely for pertinent information on their assay(s). This review summarizes the current practices and procedures in method development. Additionally, it probes aspects of method development that are generally not discussed, such as how exactly to calibrate an assay or where to place quality controls, using examples from the literature. This review intends to provide a comprehensive resource and induce critical thinking around the experiments for and execution of developing a clinically meaningful LC-MS/MS assay.
Keywords: Tandem Mass Spectrometry, Liquid Chromatography, Method Development, Sample Preparation, Calibration, Internal Standards, Matrix Effects, Quality Control
The development of liquid chromatography-tandem mass spectrometry (LC-MS/MS) clinical assays can be a complex task. Challenges span a wide range of chemical, physical, biological, and infrastructure issues that must be mitigated to provide a meaningful test result. With few exceptions, diagnostic MS testing is fully designed and executed in house [1-3]. The preparation of most testing materials, from calibration solutions to mobile phases, is the responsibility of the laboratory staff. In a development setting, control of nearly all steps is advantageous; the number of possible pathways to a final assay is constrained only by the developer’s imagination, experience, and resources. The details of each step in a procedure can be optimized to best fit the workflow, layout, and assets of the laboratory.
In general, method development involves an amalgamation of tool kits. Some tools are provided by chemical or hardware manufacturers, others through learned skills or training, and still others from first-principle experimentation. In-depth descriptions of method development are largely absent in publications. From the author’s experience of hundreds of assays developed, far more experiments in method development fail than pass (for good reason). Unsuccessful experiments are not frequently published. Rather, a “final method” is described in a “method development” section, providing little opportunity for vicarious learning. This review sheds light on the scant literature resources that can be utilized in clinical assay development, with a focus on small-molecule-based measures. This bias towards compounds <1,000 Da is due to the relative dearth of publications related to the full utilization of peptide/protein measurement by MS in a clinical assay (i.e., assay data are used to make a meaningful clinical decision). However, the fundamental principles for many of the processes are consistent for both small and large compounds; important distinctions will be identified where appropriate. Additionally, the majority of measurement procedures focus on the use of chromatographic separation prior to MS detection; the utility of matrix-assisted laser desorption-ionization-MS in the clinic has been reviewed elsewhere [4, 5].
In this review, we provide examples from the landscape of LC-MS/MS utilized in clinical laboratories, while attempting to elucidate the components involved in assay development that are often not described in publications. Sections are organized according to the order in which method development considerations are made. Though this review is linear in presentation, the reader is advised to revisit relevant sections during method development for continued process refinement until the assay is in the final validation stage.
The development of an MS assay should not begin with any equipment but rather with the acquisition of a high-quality standard material. Few MS assays in clinical practice do not use a known amount of the measurand in both development and operations. Notable exceptions to the need for a standard material include measurands assayed in newborn screening, such as acyl carnitines and urinary organic acids, which depend on substantial post-analysis data review [6, 7]. Proper sourcing of the analytical standard to be used as a test material is essential. From experience, we recommend attempting to procure at least two lots of standard material from discrete manufacturers. This allows for experimentation to determine errors or evaluate data absent in the certificate of analysis. In some cases, the chemical synthesis and purification of a neat standard is imperfect, resulting in bias in concentration assignment. Such inaccuracies may not be observed until the method comparison phase of validation or even after the launch of the assay. Additional care should be taken when multiple analytes with high degrees of similarity are prepared from distinct solutions, especially for compounds that are metabolites or have only subtle modifications. Such manufacturing or degradation by-products may contribute to biases in measurand concentrations.
In certain cases, a certified reference material exists that is linked to higher-order metrology and is provided with value and error assignments. Available compounds can be obtained from various organizations, such as the International Federation of Clinical Chemistry, National Institute of Standards and Technology, Joint Committee for Traceability in Laboratory Medicine, National Measurement Institute Australia, National Metrology Institute of Japan, and Korea Research Institute of Standards and Science. These materials may also be included in commercial vendor catalogs. With the recent uptake of quantitative nuclear magnetic resonance, traceable materials with well-characterized concentrations are becoming more readily available and will certainly improve calibration-related metrology [11, 12].
An analytical standard is required for an MS assay to enable quantification of the substance in patient samples. The lack of a purified material leads to a limited experimental space or at least experimental designs that have far more assumptions than variables. Important evaluations, such as spike and recovery, can be unambiguously performed with a neat compound. Alternative approaches often leave outstanding questions that must be addressed before applying the developed assay to patient samples and require significant validation.
An internal standard (IS) is essential to the application of MS to clinical analysis. While certain aspects of MS technology, such as relatively good response functions at low concentrations and perceived specificity due to collisional dissociation or high resolution, are important, the ability to correct for all analytical steps with a true physicochemical mimic is profound. For endogenous analytes, the capacity to execute experiments using an IS should not be undervalued. As a surrogate, recovery of the IS can be used to assess assay efficiency in patient samples without concern for the endogenous analyte biasing results. Time-course studies may be executed to understand the influence of endogenous protein binding or adsorptive loss [14, 15]. The equilibration time for the IS is critical for the normalization of recovery of the IS and analyte. The time and conditions necessary for IS equilibration in the patient sample should be determined during method development [16, 17].
Appropriate labeling of the IS should fully resolve isotopic contribution from high levels of an analyte. In fortuitous circumstances, there may be more than one stable, isotopically labeled IS available. In these cases, the degree of the deuterium isotope effect (for deuterium-labeled species), cost, availability, and reliability of product ion formation are all important considerations in IS selection [18, 19].
The rationale for the concentration of the IS required for an assay is generally not discussed in journal articles. Numerous factors, including cost, pipetting precision, solubility, stability, assay compatibility, unlabeled analyte, and overall analytical precision, have to be considered. Broadly, the lowest concentration of IS (cost) should be used to provide a response within the linear range of the detector with the lowest influence of noise and imprecise integration (analytical precision). The obvious development experiment is to perform titrations of the IS, inject those titrations in replicates, and record the detection precision. Such an experiment is, to our knowledge, not described in most journal articles as it can be considered trivial. However, using data to rationalize the decision-making process is fundamental to “evidence-based medicine.”
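The titration experiment described above can be reduced to a simple decision rule. The following is a minimal sketch, not a prescribed procedure: the function names, the 5% CV target, and the replicate peak areas are all hypothetical and chosen only to illustrate selecting the lowest IS concentration that still yields acceptable detection precision.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Coefficient of variation (%) of replicate peak areas."""
    return 100.0 * stdev(values) / mean(values)


def lowest_precise_is_level(titration, max_cv=5.0, max_area=None):
    """Return the lowest IS concentration whose replicate CV meets max_cv.

    titration: dict mapping IS concentration -> list of replicate peak areas.
    max_cv:    hypothetical precision target (%), to be set by the laboratory.
    max_area:  optional ceiling standing in for the detector's linear range.
    """
    for conc in sorted(titration):
        areas = titration[conc]
        if max_area is not None and mean(areas) > max_area:
            continue  # response would exceed the assumed linear range
        if percent_cv(areas) <= max_cv:
            return conc
    return None  # no tested level met the target


# Hypothetical titration: IS concentration (ng/mL) -> replicate peak areas
titration = {
    1: [900, 1300, 1050, 700, 1150],
    5: [5100, 5600, 4700, 5300, 4900],
    25: [25500, 25100, 24800, 25300, 24900],
    100: [101000, 100200, 99500, 100800, 99900],
}

for conc in sorted(titration):
    print(conc, round(percent_cv(titration[conc]), 1))
print("Selected IS level:", lowest_precise_is_level(titration))
```

Recording the CV at every level, rather than only the selected one, preserves the evidence behind the choice should the IS concentration need to be revisited later in development.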
The solvent for storage of the IS should prevent adsorptive loss to the walls of the container (solubility) and have a composition or additives to prevent degradation and/or in-solution deuterium exchange (stability). Additionally, the IS should be added to the assay in a volume that is reproducible for the laboratory (e.g., >20 μL is preferred by pipette manufacturers) and such that the aliquot does not interfere with the extraction procedure. For example, excessive organic aliquots added to plasma prior to hydrophobic solid-phase extraction (SPE) may result in precipitants clogging the SPE bed and release of the analyte into the organic solvent.
With analytical materials in hand, the physical work of method development can begin. Logically, this starts with the establishment of provisional MS parameters, such as precursor and product ions, and collision energies. Note that any source conditions should be appropriate for low-flow infusion and should not represent source conditions of the final method. Significant refinement is appropriate when the variables that affect source performance, such as LC flow rate and solvent composition, have been implemented [22, 23].
Though relatively infrequent, the measured precursor can differ from the expected protonated precursor. In-source dissociation can result in loss of a moiety, commonly water [24-26]. In-source dissociation is influenced by temperature and electronic settings (e.g., declustering potential). These should be evaluated during initial infusion. Alternatively, certain analytes poorly form a (de)protonated precursor ion and instead preferably yield adducts. Common adducts are generated via sodium, ammonium, or lithium, though others may be observed, depending on the molecule(s) being assayed [27, 28]. The addition of possible adducting species in the infusion solution may yield precursor ions not otherwise observed.
The fragmentation of molecules may enhance the specificity of the assay, depending on the selection of the neutral losses. Known facile and/or common fragments should be avoided to ameliorate concerns of interfering peaks; indeed, sufficient care should be taken to account for all the forms of interferences that may be generated via MS. However, from a development standpoint, all possible transitions should be retained until sufficiently ample data are available to exclude a transition, with the decision being based mostly on specificity as opposed to raw response. Transition ratios/ion ratios or comparison of the peak areas between two distinct product ions from a single precursor are informative in identifying specificity issues. In some cases, a molecule can only form a single product ion. A collision energy offset on the exact same neutral loss can provide a suitable evaluation of transition ratios for selectivity determination. However, this approach must be implemented with caution as it may not be sufficient for every analyte. In other cases, the compound may not fragment into a product ion at all or may yield such low efficiency dissociation/transmission that the product ion
Aspects of chromatography development for a wide array of applications have been previously discussed [35-37]. In most cases, these guidelines are for industries and do not match the needs and challenges of clinical laboratory testing. Applications related to the assessment of active pharmaceutical ingredients may benefit from access to large sample volumes/masses, making higher response functions through greater analyte load possible. This is not feasible for most clinical assays. While urine may seem plentiful, neonatal urine is both short in supply and difficult to capture. Blood draining, while an accepted clinical practice in the past, is currently discouraged. In general, sample availability is limited. In MS, there are few opportunities to safely garner higher analyte responses. Generally, creating a higher ion yield in the source is the primary opportunity. The use of larger patient sample volumes is less desirable.
Chromatographic development in bioanalysis for clinical trials may not be sufficient to manage the challenges associated with the measurement of endogenous compounds in an uncontrolled test population. Additionally, the sample types accepted into the laboratory for testing may be infrequently observed in a trial environment. Opaque yellow plasma from severe lipemia or near-neon oranges of icteric serum are not uncommon in the diagnostic environment. While bioanalysis is perhaps the field closest to the diagnostic industry, bioanalysis-based recommendations should be read with an understanding of their fit-for-purpose nature.
Some strategies based on established assays may inform scientists on the selection of chromatographic development starting points. While initial determination of column dimensions may seem trivial, there are certain rationales for prioritizing this item over stationary phase selection. First, a column should be chosen such that the dimensions and expected flow rate do not exceed the capabilities of the LC system. For example, a 100 mm long column with a 1 mm internal diameter packed with 1.8 μm particles may provide excellent separation efficiency. However, when utilized with a 400 bar-max (5,800 psi/40 MPa) LC pump and a water/methanol gradient, the flow rate to prevent reaching the pressure limit would be close to 75 μL/min. Such a flow rate can result in lower laboratory efficiency as washing and re-equilibration of the column is a function of flow rate and volume. A column with a larger internal diameter and particle size may be more appropriate for the described equipment.
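The pressure constraint in this example can be approximated with a back-of-the-envelope calculation. The sketch below is illustrative only and rests on several assumptions that do not come from the text: a Darcy-type relationship ΔP = φ·η·L·u/dp² with u taken as the superficial velocity (flow divided by the empty-tube cross-section), a lumped flow-resistance factor φ of 500, and a viscosity of 1.6 mPa·s (roughly the peak viscosity of a water/methanol mixture near room temperature). Pressure contributed by tubing, frits, and the rest of the system is ignored, so real limits will be lower.

```python
import math


def max_flow_ul_per_min(pressure_limit_bar, length_mm, inner_diameter_mm,
                        particle_um, viscosity_mpa_s=1.6,
                        flow_resistance=500.0):
    """Estimate the flow rate at which column back pressure reaches the limit.

    Rearranges dP = phi * eta * L * u / dp**2 for the superficial velocity u.
    phi (flow_resistance) and eta (viscosity) are assumed values.
    """
    pressure_pa = pressure_limit_bar * 1e5
    length_m = length_mm * 1e-3
    particle_m = particle_um * 1e-6
    eta_pa_s = viscosity_mpa_s * 1e-3
    area_m2 = math.pi * (inner_diameter_mm * 1e-3 / 2) ** 2

    velocity_m_s = pressure_pa * particle_m ** 2 / (
        flow_resistance * eta_pa_s * length_m)
    flow_m3_s = velocity_m_s * area_m2
    return flow_m3_s * 1e9 * 60  # m^3/s -> uL/min


# Column from the example: 100 mm x 1 mm, 1.8 um particles, 400 bar pump
narrow = max_flow_ul_per_min(400, 100, 1.0, 1.8)
# A larger-bore, larger-particle alternative (hypothetical dimensions)
wide = max_flow_ul_per_min(400, 100, 2.1, 3.5)
print(round(narrow), round(wide))
```

Under these assumptions the estimate for the 1 mm column lands in the tens of μL/min, the same order as the figure quoted above, while the wider column with larger particles tolerates a flow rate more than an order of magnitude higher on the same pump.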
An additional consideration is the use of columns with a relatively small internal diameter to increase the sensitivity at the μL/min flow level in electrospray ionization [38-40]. There are undeniable sensitivity gains; however, these must be balanced against the expectation of system ruggedness and robustness. Small internal diameters yield small cross-sections of particles at the head of the column. A 75 μm internal diameter tube packed with 5 μm particles can only contain approximately 200 particles in the first cross-section. The normal distribution of particle features (such as size, shape, % functionalization, and carbon load) for the bulk packing material, both within and between lots of stationary phase, may not translate into a 200-particle cross-section when assessed between columns [41, 42]. This can influence the separation for not just the analyte(s) of interest but also possible interferences, resulting in column-to-column differences. In comparison, a 2.1 mm internal diameter column can have many thousands of particles within the first cross-section. This increases the likelihood that the distribution of activities and particles’ physical parameters remains consistent across many years of purchasing the same column from the same vendor. The data in Table 1 indicate the preference for >2 mm internal diameter columns across clinical laboratories running LC-MS/MS [43-53].
Additionally, the utilization of guard columns is rather infrequent or at least not often disclosed. Readers can extrapolate the relationships between the maximum pressure, column dimensions, and the maximum flow rate, which, to the author, appear poorly correlated with any feature. Indeed, the systems with the highest available pressure limits utilize the lowest described flow rate, without significant liabilities indicated in the column dimensions. Considered solely on this basis, it may be surmised that some throughput efficiency is lost; however, this should be balanced against a number of factors. The ionization cross-section may be greater on the mass spectrometer model used at that flow rate. Perhaps the laboratory is attempting to leverage the association between lower flow rates and more efficient separations to achieve resolution for a near-eluting isobar. Alternatively, the entire separation was perhaps performed on a previous-generation model and revalidation of a new separation was low priority.
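The particle counts quoted above follow from simple geometry. As a rough sketch, the number of particles in one cross-sectional layer can be estimated from the ratio of the column and particle cross-sectional areas multiplied by a packing fraction. The function name is hypothetical, and the default packing fraction (about 0.907, the value for ideal hexagonal packing of equal circles) is an idealized assumption; real packed beds are less ordered.

```python
def particles_in_cross_section(inner_diameter_um, particle_um,
                               packing_fraction=0.9069):
    """Approximate number of particles in a single cross-sectional layer.

    (column area * packing fraction) / particle area simplifies to
    packing_fraction * (D / d) ** 2. The packing fraction is an assumption.
    """
    return packing_fraction * (inner_diameter_um / particle_um) ** 2


capillary = particles_in_cross_section(75, 5)      # 75 um ID, 5 um particles
analytical = particles_in_cross_section(2100, 5)   # 2.1 mm ID, 5 um particles
print(round(capillary), round(analytical))
```

With these inputs the capillary column holds on the order of 200 particles per layer, whereas the 2.1 mm column holds well over one hundred thousand, which is why lot-to-lot particle variation averages out far more effectively in the wider format.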
Apart from the hardware of LC, the mobile phases remain another concern. The purity of solvents may affect the ionization cross-section of molecules, yet purity is generally expressed as a colloquial term. A search for specific requirements to be labeled as “LC-MS grade” within documents from the International Union of Pure and Applied Chemistry, the International Organization for Standardization, and the US Food and Drug Administration (FDA) returned zero meaningful results. In mobile phase selection, certainly, higher quality is preferred, though not required. Materials labeled as “HPLC grade” have been shown to be adequate for the intended analysis [43, 46, 48]. Other assays utilize the claimed “LC-MS grade” for successful analysis. For simplification of ordering, price reduction based on purchase volume, and ease of material storage, laboratories may prefer to order all mobile phases from a single vendor. This would limit the diversity of manufacturers until back-order or supply issues require purchasing from a different manufacturer.
As stated previously, the utilization of data to inform decisions as opposed to historical processes defining a method is important in method development. It is not uncommon that a laboratory adopts a single weak and strong mobile phase for all analyses or utilizes the same stationary phase for all separations. While perhaps beneficial to operational consistency, such constraints may not be optimal for the selectivity or response of a specific compound. Interrogation of ionization efficiency and retention are worthwhile investigations, particularly if the development is fully
For chromatography development, real data are an important guide in determining the most appropriate column. Experiments should be designed so as to introduce some of the issues for which LC-MS/MS analyses may struggle, including ion suppression and interfering species. First, certain molecules, such as endogenous phospholipids or exogenous phthalates, can disrupt atmospheric pressure ionization. To observe such phenomena, strategies such as post-column infusion or direct MS detection have been developed [58, 59]. Interfering species can be determined a
Chromatographic screening utilizing solvents as determined in an ionization efficiency screening has been discussed elsewhere [56, 63]. The process of chromatographic screening can be informed by important metrics, such as absolute abundance, retention time, cycle time, peak width, noise/baseline height, and relative retention time/resolution. In our experience, it can be tremendously difficult to attempt to force a molecule, pre-determined mobile phase composition, and pre-selected column to work in concert. Exploration of as much experimental space as possible affords confidence in the developed assay. An additional benefit is the availability of detailed information for alternative separation techniques should an insurmountable difficulty appear later in development. This is especially important in hydrophilic-interaction LC, as the number and strength of interactions vastly differ from those in reversed-phase LC.
Sample preparation falls into a few broad categories, and each style serves distinct purposes. The decision-making path should start with the determination of the instrument’s lowest precise response versus the necessary reference interval. Take for example a test where the measurand is intended to achieve a 1 ng/mL lower limit of quantification (LLOQ). If with optimal solvents and separation, the assay is capable of detecting a 1 pg/mL concentration with significant signal-to-noise, concentration via SPE or liquid-liquid extraction is unnecessary from a response perspective. However, these preparative techniques may be utilized to achieve a degree of selectivity not possible in the LC setup. In that light, it should be recognized that LC is a relatively high-resolution technique and affords near-full automation. Generally, LC is the most useful approach to effect specificity of analysis prior to MS detection; however, specific sample extraction can yield excellent selectivity when properly designed for the target compound(s).
After the need for extraction to concentrate or dilute is determined, the laboratory’s physical setup and available equipment should be evaluated in the context of the assay to be executed. For a laboratory with a single 24-position SPE vacuum manifold intending to run 1,000 samples per day through tube-based SPE, additional vacuum ports and manifolds will need to be procured. Alternatively, it may be just as cost-efficient to look at positive-pressure manifolds for 96-well plates. At many steps during method development, ensuring that quality science can be delivered in a production/industrial setting may take precedence over research desires.
It is always helpful to draw multiple conclusions from a single experiment in extraction development. For example, recovery studies on neat solutions are important; however, recovery should be co-executed in the intended matrix as well. Note that many sample matrices are buffered because of the presence of preservatives. What works in Type I water may not work in citrated plasma. Experimentally, evaluation criteria including absolute analyte response as well as IS recovery, peak shape deviations, retention time drifts for multiple injections, and visually observed debris or other macro-confounders within the extracts are highly informative. Thoughtful experimental design, which includes expectations for reduced data (e.g., precision of IS response across multiple samples within 20%), is critical to efficient method development.
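Stating the expectation before running the experiment makes the data reduction mechanical. The sketch below shows one way an extraction experiment might be summarized: IS response precision across extracted matrix samples is compared against the 20% example criterion, and apparent IS recovery is expressed relative to unextracted neat injections. The function name and all peak areas are hypothetical, and "apparent recovery" as computed here combines extraction loss and matrix effects rather than separating them.

```python
from statistics import mean, stdev


def evaluate_is_response(matrix_areas, neat_areas, max_cv=20.0):
    """Summarize IS behavior in an extraction experiment.

    matrix_areas: IS peak areas from extracted matrix samples.
    neat_areas:   IS peak areas from unextracted neat solutions.
    max_cv:       pre-defined precision expectation (%).
    """
    matrix_mean = mean(matrix_areas)
    cv = 100.0 * stdev(matrix_areas) / matrix_mean
    recovery = 100.0 * matrix_mean / mean(neat_areas)
    return {"recovery_pct": recovery, "cv_pct": cv, "cv_pass": cv <= max_cv}


# Hypothetical peak areas
neat = [10000, 10200, 9800]
plasma = [8000, 8400, 7600, 8200, 7800]
summary = evaluate_is_response(plasma, neat)
print(summary)
```

The same summary can be generated for each candidate sorbent, solvent, or wash condition, so that protocols are compared on identical, pre-declared criteria.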
SPE has been quite successful in implementation [65-67]. In particular, the shift away from silica-based phases toward polymers has increased the ruggedness of the technology in industrial environments. Scientists interested in SPE should note that many generic protocols exist. They are drafted to capture a reasonable percentage of many different compounds but are not universally ideal for all compounds. Some experimentation related to the degree of sample pre-manipulation (i.e., pH adjustment, pre-precipitating), washing solution strength, number of washes, elution characteristics, etc. is necessary to develop an optimal protocol. Given the breadth of available SPE products, testing different sorbents from different manufacturers (even if similar in interaction) may provide important information. Additionally, the SPE modality should be considered in light of the LC separation mode. When a hydrophobic SPE process is coupled to a reversed-phase LC separation, sample preparation can be just a lower-resolution analog of the LC. This provides only concentration, not isolation of the target compound from other, similar molecules. Consequently, compounds most likely to co-elute in the LC portion are also concentrated.
Liquid-liquid extraction is quite popular in certain segments of clinical assays. Moderate- and low-polarity compounds can be separated from a bulk matrix in a facile manner, and certain polar compounds have even been separated from an aqueous phase using salt assistance [71-73]. Tube-based liquid-liquid extraction has proven to be largely manual, mainly because of the need for pipetting of low-viscosity high-vapor-pressure solvents. However, supported liquid extraction media of both natural and synthetic origins have been introduced to provide a sorbent for the liquid-liquid extraction process to occur in an SPE-like manner [74-76]. Novel pipetting systems have been developed to support formats typical for liquid handling footprints [77, 78]. In either liquid or supported form, conditions should be optimized for the compound(s) of interest. This can include testing individual solvents as well as solvent mixtures to achieve the most appropriate extract.
Immunocapture is used in clinical assays for both small and large molecules. The development of this approach is largely empirical, and a number of additional aspects must be accounted for, especially for proteins. In protein analysis, the immunoassay approach of “bind very strongly” generally does not apply; the antigen should be released without much effort to measure it via MS. Yet, the affinity must be sufficient to enable adequate recovery, which is generally addressed via experimentation [81, 82]. Peptide capture following digestion also offers an opportunity for the utilization of immunoaffinity-capture preparations. In protein analysis, it is important to attempt to control the variability of digestion, which may or may not be adequately controlled for, even with a labeled protein IS (as opposed to a labeled or winged peptide) [14, 84]. Specific method development considerations in the clinical context have been recently discussed. Of importance in recommendations for protein analysis by bottom-up approaches is the need for post-digestion peptide stoichiometry. Correlation of the absolute recoveries of multiple peptides after digestion is paramount in demonstrating assay quality [86, 87].
Protein precipitation and simple dilution have also been applied to many clinical assays, either as a stand-alone preparation or combined with other techniques. Precipitants may include organic solvents, such as methanol, ethanol, or acetonitrile, or may be aqueous-based, such as zinc sulfate, sulfosalicylate, or trichloroacetic acid [88-91]. Some variation in recovery due to precipitants has been noted and should be accounted for in assay development. Despite some rather pointed statements on the superiority of a solution for precipitation, experimentation is necessary to determine the optimal conditions for the intended analyte.
Modification of the measurand(s) can offer solutions to some challenges in assay development. Highly polar molecules can be derivatized to more lipophilic products, allowing for increased retention in reversed-phase LC. Some compounds with poor ionization cross-sections, low collisional dissociation yields, or minimal transmission efficiency may benefit from derivatization.
Notable drawbacks exist in derivatization workflows. In some cases, diagnostic stereoisomers can be merged into a single molecule. This is observed with the derivatization of allo-isoleucine and isoleucine; therefore, the confirmatory test for maple syrup urine disease requires a distinct preparation. In other cases, derivatization can result in the formation of epimers from a single molecule, such as 4-phenyl-1,2,4-triazoline-3,5-dione derivatives of 1,25-dihydroxy vitamin D. A third concern for chemical modifications is that MS/MS fragmentation may result in the product ion being part (or whole) of the derivative. In this case, all derivatized compounds can produce the same product ion, significantly limiting the specificity ascribed to MS/MS dissociation. Experimentation in the development of chemical-modification workflows must also include a thorough evaluation of process stoichiometry with consideration for elevations of compounds in the matrix, some of which may be unrelated to the pathology being evaluated. For example, to develop an amine derivatization procedure for detecting urea cycle disorders, the recovery of the derivatives must be stressed over supra-physiological amounts of amines in the sample. Testing only ideal, “normal” samples during development can lead to incorrect test results in the intended population.
For the extraction modes discussed above, it is possible to combine certain techniques to achieve adequate specificity and response. Certain assays are difficult to perform in a single extraction mode that provides for a high-quality test result. Utilizing multiple steps to reach that quality answer may be indicated during assay development. Similar to selection of the SPE modality in contrast to the LC modality, consideration of orthogonal techniques when combining extraction procedures is recommended.
The use of a human matrix in early method development is not required. Much of an assay can be fully developed using commercially available neat materials. However, early introduction of the intended sample matrix is important. This is particularly true in the assessment of calibration materials. The Clinical and Laboratory Standards Institute (CLSI) has detailed a hierarchy of matrices for test articles used for calibration. A calibration matrix preferably is commutable between assays, exhibits the same analytical properties as the matrix of interest, and is a readily available material. For many LC-MS/MS assays, it is difficult to achieve the most preferred solution (patient pool) with a known concentration of the analyte(s). In general, MS assays for endogenous targets in routine analysis will utilize an analyte-depleted matrix, synthetic matrix, or solvent-based standard schemes. Additives can be included in any of these materials to provide advantageous outcomes, such as preservatives for extended stability or binding partners for preventing adsorptive loss [14, 99, 100]. For exogenous analytes, a compound-free matrix is readily available from commercial sources or in-house with the intended preservative of the test sample. This may be important for compounds that metabolize quickly, such as cocaine in whole blood, which requires constrained conditions for collection, transport, and storage for the analysis to be meaningful.
The calibration matrix considerably influences assay quality. In an LC-MS/MS experiment, the only moment an absolute accuracy of preparation is required is during calibration. The addition of an IS to sample aliquots (generally the first step in a procedure) is entirely about precision. The relative volumes of those two components (IS aliquot and sample aliquot) must precisely match the ratio of the calibration standards. After the IS and sample are combined, even imprecision in absolute recovery during sample preparation, injection, and ionization is controlled. Only the calibration standards must be prepared with accuracy.
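The distinction between precision and accuracy in this argument can be shown with a toy model. The sketch below assumes an idealized case in which the analyte and its IS share the same recovery and response factor (a perfect physicochemical mimic); the function names and numbers are hypothetical. Under that assumption, losses after the IS is added cancel in the area ratio, whereas a change in the relative amount of IS added does not.

```python
import math


def peak_area(amount, recovery, response_factor=1000.0):
    """Idealized detector response: proportional to the amount recovered."""
    return amount * recovery * response_factor


def area_ratio(analyte_amount, is_amount, recovery):
    """Analyte/IS area ratio when both experience the same recovery."""
    return (peak_area(analyte_amount, recovery)
            / peak_area(is_amount, recovery))


# Same sample, very different absolute recoveries: the ratio is unchanged
good_prep = area_ratio(analyte_amount=10, is_amount=50, recovery=0.90)
poor_prep = area_ratio(analyte_amount=10, is_amount=50, recovery=0.45)
print(math.isclose(good_prep, poor_prep))  # True

# A 10% over-delivery of IS shifts the ratio, and therefore the result
mispipetted = area_ratio(analyte_amount=10, is_amount=55, recovery=0.90)
print(round(100 * (mispipetted / good_prep - 1), 1))  # about -9.1 (% change)
```

Real analyte/IS pairs only approximate this behavior, which is why IS equilibration and differential matrix effects are still evaluated experimentally.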
Particular care should be taken in determining any differences between the calibration matrix and the samples to be analyzed. Charcoal-stripped sera and especially, fully delipidized and double/triple-stripped materials, do not behave like human serum as many of the lipophilic molecules have been depleted. Similarly, a dialyzed calibration matrix may have a grossly reduced concentration of endogenous materials. Deviations of the calibration matrix from the human matrix may result in differential adsorptive loss, variable matrix effects observed in sample preparation, changes in IS equilibration time, and disparate ionization suppression between calibrators and specimens [102, 103].
Adsorptive loss can be problematic and can influence the accuracy of a calibration scheme and quality controls (QCs) for some compounds. Generally, adsorptive losses are a function of equilibrium such that accuracy may deviate from trueness at the same percentage across the range of concentrations. Depending on the protocol and compound, various solutions have been established, ranging from the addition of albumin (as a non-specific binding partner) and increased solubility through alternative solvents (e.g., dimethyl sulfoxide) to pH modification or sonication during extraction steps to re-solubilize the compound into the liquid [105, 106]. In a fine example of creativity, Lame,
There are few prescriptive guidelines for the determination of the concentrations to be used to generate a calibration curve. The European Medicines Agency (EMEA) and US FDA have made recommendations for MS assays [108, 109]. The minimum number of distinct standards to utilize (six according to both EMEA and US FDA) and the acceptance criteria for back-calculated accuracy (15% at non-LLOQ values, 20% at the LLOQ according to both EMEA and US FDA) are explicit. However, these directives are designed for clinical trials involving new drug entities and may not be suitable for the intended use of diagnostic assays.
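As a rough illustration of the explicit criteria quoted above (at least six distinct standards; back-calculated accuracy within 15%, or 20% at the LLOQ), the following sketch checks a set of back-calculated calibrator values. The function names and the example concentrations are hypothetical, and the sketch covers only the criteria stated in this section, not the full content of either guidance document.

```python
def back_calc_bias_pct(nominal: float, measured: float) -> float:
    """Percent deviation of the back-calculated value from the nominal value."""
    return 100.0 * (measured - nominal) / nominal


def evaluate_curve(standards, lloq, tol_pct=15.0, lloq_tol_pct=20.0, min_levels=6):
    """Check each (nominal, back_calculated) pair against its tolerance.

    Returns a list of (nominal, bias_pct, passed) tuples and a flag stating
    whether the minimum number of distinct levels is present.
    """
    results = []
    for nominal, measured in standards:
        limit = lloq_tol_pct if nominal == lloq else tol_pct
        bias = back_calc_bias_pct(nominal, measured)
        results.append((nominal, bias, abs(bias) <= limit))
    enough_levels = len({nominal for nominal, _ in standards}) >= min_levels
    return results, enough_levels


# Hypothetical six-point curve (arbitrary concentration units), LLOQ = 1.0.
standards = [(1.0, 1.18), (2.0, 2.10), (5.0, 4.90),
             (20.0, 23.5), (50.0, 49.0), (100.0, 101.0)]
results, enough_levels = evaluate_curve(standards, lloq=1.0)
for nominal, bias, passed in results:
    print(f"{nominal:>6g}  bias {bias:+6.1f}%  {'pass' if passed else 'FAIL'}")
```

In this example the LLOQ standard passes because of the wider 20% allowance, while the 20-unit standard fails the 15% limit.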
These guidance documents indicate a number of components. First, the calibration range should always include bracketed standards: one at the LLOQ and another at the upper limit of quantification (ULOQ). Values below the LLOQ (without a validated concentration factor) or above the ULOQ (without a validated dilution) should not be extrapolated as they would fall outside the quantitative limits. Additionally, the expanded allowable imprecision at the low end of measure acknowledges the heteroscedastic nature of LC-MS/MS analysis. It is recognized that such guidance is inconsistent with the approaches for the establishment of lower limits and upper limits of measurement intervals (LLMI and ULMI, respectively) defined by the CLSI EP17-A2. The approaches for limit of blank and limit of detection require the determination of values often less than the lowest calibrator, yet these values would intrinsically have greater error. This discrepancy is attributed to the CLSI guidance documents being drafted for various types of assay technologies (e.g., LC-Ultraviolet detection, nephelometry, turbidity, and PCR), while the EMEA and US FDA documents provide specific recommendations for LC-MS/MS. The specific recommendations are preferred as they account for intrinsic capabilities and liabilities of the platform. Consequently, the LLOQ is equivalent to the LLMI, and the ULOQ is equivalent to the ULMI.
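The rule against extrapolating beyond the bracketing standards can be sketched as a simple reporting step. The function name and the output strings are assumptions made for illustration; a laboratory's own reporting conventions (and any validated dilution or concentration step) would take precedence.

```python
def reportable_result(value: float, lloq: float, uloq: float) -> str:
    """Report a numeric value only inside the calibrated interval.

    Values outside [lloq, uloq] are reported as bounds rather than as
    extrapolated numbers (assumes no validated dilution or concentration).
    """
    if value < lloq:
        return f"<{lloq:g}"
    if value > uloq:
        return f">{uloq:g}"
    return f"{value:g}"


# Hypothetical assay calibrated from 1 to 500 (arbitrary units).
for measured in (0.4, 42.5, 750.0):
    print(measured, "->", reportable_result(measured, lloq=1.0, uloq=500.0))
```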
One additional consideration in calibration curve point selection is the acknowledgement of non-linear ionization and detection in LC-MS/MS protocols. While the dynamic range of modern MS systems is substantial, it is not infinite. There are constraints of ionization and detection non-linearity, which differ between vendor and models of instruments and may differ on the same instrument over time. Additionally, carryover can be a limiting factor in measuring many orders of magnitude. A reliable working range beyond a 2,000-fold difference between the LLMI and the ULMI is quite rare for LC-MS/MS.
Table 2 includes examples of assays used in clinical diagnostics with their measurands and calibration schemes [43-49]. Several pertinent observations can be made from the collected references. Not all assays require a large breadth of reportable values, as demonstrated by azathioprine metabolite analysis. The relatively limited number of calibrators in the plasma metanephrine assay may also reflect the clinical utility, where gross elevations generally represent a catecholamine-expressing tumor. Including more standard curve points in that assay may improve the perception of quality, with minimal change in its function.
Of interest is the relative placement of calibration points. No discernible relationship exists between the width of the measurement range and how the points are distributed within the range. Some points within a range are serially diluted (2-fold), while others differ 2.5-fold within a single standard curve. In another example, there is an order of magnitude between the LLMI and the next calibrator, while all following points are 5-fold different. One assay modifies the IS concentration while maintaining a single analyte concentration. This practice is mathematically acceptable; however, there may be significant noise/interference differences for lower analyte concentrations that are not accounted for in the regression equation. Finally, one candidate reference method has but a single calibration point! Remarkably, all these assays have demonstrated validity, meaning that all are acceptable calibration mechanisms for the assays’ intended purposes. Each assay yielded precision and accuracy across its claimed measurement range. However, for ease, some expectations for calibration points could be set forth. For LC-MS/MS assays, a calibration curve should:
1) define exactly the lowest reliable quantity of measurement (i.e., the LLMI/LLOQ);
2) define exactly the highest reliable quantity of measurement (i.e., the ULMI/ULOQ);
3) identify regions of non-linearity due to source or detector saturation;
4) improve calibration accuracy in regions of increased imprecision.
Therefore, four points would be necessary: the LLMI, the ULMI, a point in a region just below the ULMI (i.e., 90% of the ULMI) to assist in describing non-linearity, and at least one point near the LLMI to improve the accuracy of the fit at the low end of measure.
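A minimal sketch of this four-point scheme follows. The 90% fraction comes from the text; the placement of the near-LLMI point at twice the LLMI is an assumption, since the text says only "near the LLMI", and the 2,000-fold flag reflects the working-range remark made earlier rather than a hard limit. The function name and example range are hypothetical.

```python
def minimal_calibration_levels(llmi, ulmi, low_factor=2.0,
                               high_fraction=0.9, max_fold=2000.0):
    """Return (levels, fold_range, within_typical_range) for a 4-point scheme.

    low_factor    : multiple of the LLMI for the low-end support point
                    (assumed value; the text only says "near the LLMI").
    high_fraction : fraction of the ULMI for the non-linearity check point.
    max_fold      : ULMI/LLMI ratio beyond which a working range is rare.
    """
    if not 0 < llmi < ulmi:
        raise ValueError("expected 0 < LLMI < ULMI")
    near_llmi = llmi * low_factor
    near_ulmi = ulmi * high_fraction
    if not llmi < near_llmi < near_ulmi < ulmi:
        raise ValueError("range too narrow for four distinct levels")
    fold_range = ulmi / llmi
    levels = [llmi, near_llmi, near_ulmi, ulmi]
    return levels, fold_range, fold_range <= max_fold


# Hypothetical assay spanning 1 to 1000 (arbitrary units).
levels, fold_range, within_typical_range = minimal_calibration_levels(1.0, 1000.0)
print(levels, fold_range, within_typical_range)
```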
In this calibration scheme, points of regression influence have been reduced. Notably, calibration standards in the middle of the calibration curve are not included. Should the calibration curve for an LC-MS/MS assay not generally be linear throughout the middle of the measurement range, the appropriate undertaking would be in troubleshooting, not assay calibration. Additionally, it is recognized that this simple model is just that: a sufficient calibration that can be executed with maximal analytical value and understood by most laboratory staff. Finally, it should be recognized that error determinations are not addressed in this model; QCs are the primary drivers for batch acceptance based on error in back-calculated concentrations.
The utility of QCs in a clinical assay is prescriptively to assess performance. More specifically, it is to decide whether a batch passes or fails based on a quantitative recovery of an expected amount of molecule. In an acute sense, the pass/fail assessment is important. However, QCs are also used to monitor drift in results over time. QCs can offer some additional benefit as a function of quality assurance monitoring in LC-MS/MS assays, but otherwise serve the same general purpose as in other diagnostic assays. However, the capability of LC-MS/MS assays to frequently provide matrix tolerance (in which variations from the intended test matrix can be utilized) provides some additional prospects in QC selection.
The QC matrix should be the same as that of the intended analyte as much as possible for the assay in development. This refers to the unmodified matrix, not one that is charcoal-stripped or of synthetic origin. However, in certain cases, modifications are necessary to support the longitudinal aspect of QCs due to instability, insolubility, or endogenous presence of a molecule. As a function of method development, stability studies for the analyte of interest should be conducted as soon as possible. Time is the limiting factor for such studies. Attempts at “accelerated stability” studies or exposure of stability samples to elevated temperatures to gauge possible liabilities have been made [124-126]. However, accelerated stability analysis should be considered informative, not definitive, particularly in blood-based matrices. For complex, enzyme-rich samples, assumptions of the Arrhenius equation may be confounded by factors such as temperature-dependent enzyme activities, the presence or generation of co-factors, and pH changes in the sample during storage .
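For orientation, the acceleration predicted by the Arrhenius equation, k = A·exp(−Ea/RT), can be computed as the ratio of rate constants at two temperatures. The sketch below assumes a single degradation pathway with a temperature-independent activation energy, which is precisely the assumption the text cautions may fail in enzyme-rich matrices; the activation energy used is purely illustrative.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def arrhenius_acceleration(ea_j_per_mol: float,
                           storage_temp_c: float,
                           stress_temp_c: float) -> float:
    """Ratio k(stress)/k(storage) predicted by the Arrhenius equation.

    Assumes one degradation mechanism with a constant activation energy.
    """
    t_storage = storage_temp_c + 273.15
    t_stress = stress_temp_c + 273.15
    return math.exp(ea_j_per_mol / R * (1.0 / t_storage - 1.0 / t_stress))


# Illustrative activation energy of 80 kJ/mol (hypothetical value).
factor = arrhenius_acceleration(80_000.0, storage_temp_c=4.0, stress_temp_c=37.0)
print(f"Predicted acceleration from 4 °C to 37 °C: {factor:.1f}-fold")
```

Such a factor is a planning aid at best; real-time stability data remain the definitive evidence.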
The determination of the optimal stability of QC materials is a matter of research and experimenting. For established assays, literature searches and exploration of sample shipping/storage constraints from other laboratories can be enlightening. For novel assays, it is recommended that a broad approach be taken in the initial stability evaluation. Various storage conditions (-70°C, -15°C, 2-8°C, ambient laboratory temperature) as well as additives (antioxidants, pH modifiers, enzymatic activity inhibitors, preservatives, buffers, and solvents) should be evaluated. The fortification of pooled matrix and neat solvents to the same concentration, the addition of probable (and perhaps improbable) stabilizers to the samples, and allowing for the passage of time followed by comparative analysis is the only conclusive method for demonstrating the conditions for measurand stability.
The numbers of QCs and their concentrations vary broadly across the clinical assay landscape. Table 3 shows a comparison of selected LC-MS/MS assays with reported QC levels [12, 50, 111-123]. Although the numbers and target concentrations for QCs are highly disparate, even for the same measurand, there are some similarities. All assays have at least two QCs. Except for testosterone and vitamin D assays, at least one QC within five times the LLOQ/LLMI exists. Fig. 1 shows the frequency of QCs to be placed in the quartiles of a calibration range after transformation of the analytical measurement range and QC values into a 100-point scale. For LC-MS/MS assays, it seems preferable to place QCs (and in some cases, multiple QCs) at the low end of the measure. As a performance metric, this is quite reasonable because of the heteroscedastic nature of MS/MS detection; systematic errors, which would affect assay performance, may be more pronounced at the low end than at higher response functions. Interestingly, one testosterone assay utilizes a separate QC when used for samples from women. It can be surmised that LC-MS/MS QCs should stress both analytical and clinical assay components. This might be taken as a tacit recommendation to place QCs near the LLOQ and at identified medical decision points. However, this may not be a meaningful approach because of diminishing returns in efficiency. The use of excessive numbers of QCs can prevent the analysis of actual patient samples. Indeed, to assess just the possible number of QCs for matching medical decision points for testosterone in serum, the variety of diagnostic and treatment-associated cutoffs would indicate more than 30 QCs to be utilized, spanning indications for osteoporosis, age-related hypogonadism, polycystic ovarian syndrome, androgen-secreting tumors, pathological hyperandrogenism, hirsutism, ovarian hyperthecosis, McCune-Albright syndrome, precocious puberty, exposure to testosterone-containing medications, etc. [128-132].
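The 100-point transformation described for Fig. 1 can be sketched as follows. The text does not state the exact transformation, so a linear rescaling between the LLMI and the ULMI is assumed here; the function names and the example QC values are hypothetical.

```python
def scaled_position(qc: float, llmi: float, ulmi: float) -> float:
    """Map a QC concentration onto a 0-100 scale of the measurement range.

    Assumes a linear rescaling; the source does not specify the transform.
    """
    return 100.0 * (qc - llmi) / (ulmi - llmi)


def quartile(position: float) -> int:
    """Quartile (1-4) of a 0-100 position; boundaries go to the upper quartile."""
    if not 0.0 <= position <= 100.0:
        raise ValueError("QC lies outside the measurement range")
    return min(int(position // 25) + 1, 4)


# Hypothetical assay with a range of 1 to 101 and three QCs.
for qc in (3.0, 51.0, 101.0):
    pos = scaled_position(qc, llmi=1.0, ulmi=101.0)
    print(f"QC {qc:g}: position {pos:.0f}, quartile {quartile(pos)}")
```

On a linear scale, a QC at three times the LLMI of a 100-fold range sits at only 2 of 100 points, which is consistent with low-end QCs clustering in the first quartile.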
Concentrations of QCs deserve some thought. Certain assays have prescribed concentration targets. For example, the National Laboratory Certification Program (NLCP), which oversees the Substance Abuse and Mental Health Services Administration drug testing program in the USA, dictates that QCs, at a minimum, shall be at 40% and 125% of the cutoff used to define drug presence in urine for quantitative assays. This approach focuses the QCs in an analytical region where the concentrations are fit for purpose. In the NLCP example, the presence of controlled substances in urine is, in practice, a qualitative determination. It matters little whether the concentration of 6-mono-acetyl morphine, a definitive metabolite of heroin, in urine is 600 ng/mL or 900 ng/mL, as the purpose of the test is to confirm substances. With the lack of explicit guidance from regulatory authorities or best practice recommendations from qualified organizations, when selecting where to place QCs, one should consider the ideals set by precedent. A QC that challenges the low end of measure, such as one at 3-fold the LLOQ as described in the US FDA Bioanalytical Method Validation Guidance, is highly valuable in low-end error detection. At least one other QC, preferably in a region where clinical delineation occurs, is important. Discussions with clinicians as to what levels are meaningful are highly appropriate.
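The two precedents mentioned above translate into simple arithmetic, sketched here with hypothetical function names. The cutoff and LLOQ values in the example are placeholders, not values taken from any program or guidance.

```python
def cutoff_based_qc_targets(cutoff: float) -> dict:
    """QC targets at 40% and 125% of a decision cutoff (NLCP-style minimum)."""
    return {"below_cutoff": 0.40 * cutoff, "above_cutoff": 1.25 * cutoff}


def low_end_qc_target(lloq: float, fold: float = 3.0) -> float:
    """Low QC at a multiple of the LLOQ (3-fold, per the FDA precedent cited)."""
    return fold * lloq


# Placeholder values in ng/mL, chosen only to show the arithmetic.
targets = cutoff_based_qc_targets(cutoff=10.0)
low_qc = low_end_qc_target(lloq=1.0)
print(targets, low_qc)
```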
Lastly, special-use cases for QCs do often occur. One of the most broadly used MS/MS assays is newborn screening based on dried blood spots. These assays are typically semi-quantitative at best, particularly for acyl-carnitine/amino-acid species that are isobaric and/or for which standard materials are not available. In this case, as well as other multi-index analyses, a QC scheme may benefit from the addition of clinical controls. These controls have known diagnoses and the samples are interpreted for both analytical and clinical reproducibility. This approach has been adopted for proficiency testing schemes .
“Matrix effects” is a term used for a number of different phenomena in LC-MS/MS workflows. Commonly, it refers to ionization suppression or enhancement in patient samples relative to the solvent matrix. It should also encompass all matrix-induced variations in measured response that may occur during sample preparation or chromatography. Ionization matrix effects have been extensively reviewed. Solutions to resolve ionization matrix effects include switching from electrospray to atmospheric pressure chemical ionization or electron ionization, modification of the extraction procedure, introduction of an in-line trapping column for lipidaceous ionization suppressors, or preparative depletion techniques [136-140]. Each solution has its drawbacks. For example, a molecule may not sufficiently ionize in an alternative modality to generate appropriate signal (if any at all). Phospholipid depletion may result in the loss of the target compound(s). The experimental design should account for unintended consequences while trying to manage matrix effects.
From our experience, the most reliable way to resolve ionization matrix effects is to increase chromatographic resolution. Provided the analytes are sufficiently retained (≥three void volumes), modifying the amount of strong solvent delivered over time is a simple experiment. This is achieved by lowering the pitch of the LC gradient and extending the run-time, though additional isocratic steps prior to the gradient may also serve to resolve suppressing analyte species. When optimal solvent conditions have already been determined based on response function, it is not recommended that broad pH changes be utilized to effect the necessary resolution. Rather, additional column screening or an orthogonal sample preparation technique is preferred.
Of interest is the degree to which ionization suppression/enhancement is acceptable in an assay. We have heard various claims, ranging from 15% maximum allowable change from solvent to matrix to 80% allowable suppression or 200% enhancement. Some component of expectation should be derived from the laboratory’s standard operating procedure for the assays in use. If the laboratory typically accepts samples that have an IS recovery of 50%-150% compared with the calibrators/QCs in the same batch, that would be a bare minimum in method development. It should be noted that method development is often performed in a different manner than routine analysis, and tighter allowance criteria in the developmental stage may cultivate ruggedness in an operational assay.
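The quantities discussed above can be made concrete with a short sketch: the percent change in response from solvent to matrix, and an IS recovery check against the 50%-150% window used as an example in the text. Function names and peak areas are hypothetical, and the acceptance window should come from the laboratory's own procedure.

```python
def matrix_effect_pct(area_in_matrix: float, area_in_solvent: float) -> float:
    """Percent change in response, matrix vs. neat solvent.

    Negative values indicate suppression; positive values indicate enhancement.
    """
    return 100.0 * (area_in_matrix / area_in_solvent - 1.0)


def is_recovery_pct(sample_is_area: float, reference_is_areas) -> float:
    """Sample IS area as a percentage of the mean calibrator/QC IS area."""
    mean_reference = sum(reference_is_areas) / len(reference_is_areas)
    return 100.0 * sample_is_area / mean_reference


def is_recovery_acceptable(recovery_pct: float,
                           low: float = 50.0, high: float = 150.0) -> bool:
    """Apply the example 50%-150% window; tighter limits may suit development."""
    return low <= recovery_pct <= high


# Hypothetical peak areas.
effect = matrix_effect_pct(area_in_matrix=60_000.0, area_in_solvent=100_000.0)
recovery = is_recovery_pct(30_000.0, reference_is_areas=[90_000.0, 110_000.0])
print(f"matrix effect {effect:+.0f}%, IS recovery {recovery:.0f}%, "
      f"acceptable: {is_recovery_acceptable(recovery)}")
```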
The other analytical steps in which matrix effects are to be considered are sample preparation and chromatography. In sample preparation, a primary concern is the effect of binding partners for the analyte and the IS. Such binding may induce recovery differences in an extraction where the IS is not allowed to equilibrate fully prior to extraction, leading to an over-recovery of the IS relative to the analyte [16, 17]. Analytes with known specific and high-avidity binding partners are susceptible, as are compounds with non-specific binding. To determine this effect in sample preparation, time-course studies are utilized. For exogenous measurands, the analyte can be fortified to a meaningful concentration and allowed to equilibrate with the matrix for some hours or days, depending on stability. For endogenous compounds, a sample with a measurable amount of compound is sequestered. The IS is added to replicates of these samples, and the samples are extracted at timed intervals. Area ratios are assessed for reproducibility across the time points, with careful review of the absolute response of the analyte and taking the IS into account. Temperature deviations from laboratory normal (either warmer or colder) may be utilized to accelerate the equilibration of the IS with the analyte. It should never be assumed that an extraction approach fully releases bound analyte or that full equilibrium between the analyte and IS has occurred. Additionally, there is substantial diversity of possible unrelated pathologies affecting non-specific binding in a patient sample. Unless the extraction process has been experimentally proven to free any bound fraction and/or reach equilibrium with the IS, few assumptions should be made.
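One way to summarize such a time-course study is to find the earliest time point from which the mean area ratio stays close to its final value. The sketch below assumes the last time point represents equilibrium and uses a 5% tolerance; both are assumptions for illustration, as are the function name and the example data.

```python
def earliest_stable_time(mean_ratio_by_time: dict, tol_pct: float = 5.0):
    """Earliest time from which all mean area ratios are within tol_pct
    of the final time point (assumed to represent equilibrium)."""
    times = sorted(mean_ratio_by_time)
    final = mean_ratio_by_time[times[-1]]
    stable_from = times[-1]
    # Walk backwards from the final time point until a ratio deviates.
    for t in reversed(times):
        deviation = 100.0 * abs(mean_ratio_by_time[t] - final) / final
        if deviation > tol_pct:
            break
        stable_from = t
    return stable_from


# Hypothetical mean analyte/IS area ratios by minutes of IS equilibration.
# Early extractions over-recover the IS, so the ratio starts low.
mean_ratios = {0: 0.70, 15: 0.85, 30: 0.97, 60: 1.00, 120: 1.00}
stable_minutes = earliest_stable_time(mean_ratios)
print(f"Area ratio stable from {stable_minutes} min onward")
```

A stable ratio alone does not prove that bound analyte is fully released, so the absolute analyte and IS responses still warrant review, as noted above.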
Chromatographic observations of matrix effects are commonly associated with deviations in observed peak shape due to mass overload [142, 143]. Mass overload is simply having too many solutes for the available capacity of the stationary phase. Notably, deformations in peak shape can result from not only an excess of the analyte being detected, but also other analytes that are in competition for access to the stationary phases. It is not uncommon to observe reduced retention times, severe tailing, and excessive peak broadening as a function of mass overload when compared with the measurand(s) in neat solution. In cases where sample dilution is limited (e.g., <4-5-fold) and the sample is not purified (e.g., in protein precipitation), testing of numerous samples is essential to understand the frequency and effects of mass overload.
All three sources of matrix effects (ionization, sample preparation, and chromatography) should be evaluated prior to the completion of method development. Assessment of ionization suppression is typically the first and longest experiment, though easily achieved through measurement of IS response differences between crude matrix extracts and neat solvents during LC development. IS equilibration studies should be performed at the beginning of sample preparation development. Chromatographic matrix effects can be stressed as a function of LC development, provided that a sample of sufficient complexity and concentration of non-measurands is available.
The optimization of MS assays in the context of LC-MS/MS assays can be difficult in that it applies to numerous features, some of which are diametrically opposed. Take for example the optimization of an assay with a drug and its metabolite. For philosophical purposes, assume that the drug has a relatively poor ionization cross-section and a very short half-life, while the metabolite has a very high ionization cross-section and is very long-lived. The assay may require that the drug be analyzed at a range 100-fold lower than the metabolite. Optimization of the drug may imply that a maximum signal is generated, whereas “optimization” of the metabolite may require sub-optimal settings to prevent source saturation or detector blinding such that both compounds can be measured from a single preparation/injection.
Thus, MS optimization is more than “achieving the highest signal,” although that can be a target goal. Optimization may also include addressing process efficiency, reducing possible error rates, removing noise in raw data, increasing ruggedness, or improving throughput. Given the breadth of instrument models and intended uses, specific recommendations based on the literature cannot be made. However, some approaches in the literature are consistent.
MS parameters are the focus of many optimization workflows; sample preparation and LC optimization references are included in the above sections. Gas flow rates/pressures, temperature, collision energies, ion optics energies, and probe voltage/positions are all modifiable components [144-146]. Both single-variable (one variable at a time) and more complex approaches have been reported [147, 148]. There are pros and cons to both approaches, and either approach may yield meaningful changes in the response of noise or analyte. Care should be applied to whichever approach is taken, as there are variables that are highly coordinated in MS analysis. For example, the most appropriate collision energy at one collision cell pressure can be very different at another collision cell pressure. When planning optimization experiments, one should consider interactions between the variables being assessed.
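The interaction described above can be illustrated with a toy model. The response surface below is entirely hypothetical (it is not derived from any instrument) and is constructed so that the best collision energy shifts with collision cell pressure; units are arbitrary.

```python
def simulated_response(collision_energy: float, cell_pressure: float) -> float:
    """Hypothetical response surface with a pressure-dependent optimum.

    The optimum collision energy is modeled as 20 + 10 * pressure; this
    relationship is invented for illustration only.
    """
    best_energy = 20.0 + 10.0 * cell_pressure
    return 1000.0 - (collision_energy - best_energy) ** 2


collision_energies = [20, 25, 30, 35, 40]  # arbitrary units


def best_collision_energy(cell_pressure: float) -> int:
    """Scan collision energy at a fixed pressure and return the best value."""
    return max(collision_energies,
               key=lambda ce: simulated_response(ce, cell_pressure))


# A single-variable scan gives a different answer at each pressure.
for pressure in (1.0, 1.5, 2.0):
    print(f"pressure {pressure}: best CE = {best_collision_energy(pressure)}")
```

A collision energy optimized at one pressure would be carried forward incorrectly if the pressure were later changed, which is the risk of optimizing coordinated variables one at a time.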
It may be appropriate to offer one small tip based on our own optimization protocols. For each transition of interest, it is possible to assign multiple values of an MS parameter and perform a single injection. Fig. 2 demonstrates the optimization of lysine in human plasma as a component of a broader panel. The same neutral loss is being explored as a function of this analysis; the mass difference in the product ion scan forces the software to discretely analyze each transition with the assigned compound-dependent parameter. The goal of this optimization experiment was to lower the response of lysine without affecting other analytes co-measured. A reduction in the collision energy from the optimal provides a response that is within the linear range of the detector.
Optimization of compound-specific parameters in this manner can be highly useful for compound/transition-dependent parameters. First, it resolves concerns about injection variance contributing to response modifications. Second, objective conclusions can be reached visually without further data reduction. Third, it is an efficient use of both sample and instrument time relative to replicate injections with parameter variations. Note, however, that global parameters (e.g., collision cell base pressure or temperature) do not respond with the same switching speed as electronics but require slower evaluations.
We discussed aspects of sequestering test materials, establishing MS parameters, developing a chromatographic separation, implementing sample preparation, selecting calibration and QC materials/concentrations, and managing matrix effects in the development of LC-MS/MS assays. These items have been listed in Table 4 as a method development checklist, approximating to a high degree what goes into the development of an assay. There may be distinct challenges that are not mentioned in this review and may well go unmentioned in all of the literature. Perhaps the novelty of discovering and mitigating those challenges is among the most rewarding scientific ventures. Knowledgeable experimental design informed by all the variables in an LC-MS/MS system can lead to better testing for patients when such challenges are observed and subsequently mitigated.
We did not address the components of pre-validation or validation studies. We also did not discuss the actual process of how to institute an assay for production and keep it going for many years. These are essential components to the execution of LC-MS/MS in clinical laboratories and will be introduced in part 2 of this review series.
In closing, the canvas on which LC-MS/MS assays can be painted is constrained by very few imposed limitations. As such, the opportunity to design highly efficient and high-quality assays is only within the boundaries of scientific creativity and, importantly, data to support the application of such assays on patient samples. The papers referenced herein offer only a brief description of the possible path to achieve quality work; much is left to the scientists performing the development. To that end, we offer the wish and advice of “good luck and work hard” to the reader.
The author would like to thank the thousands of attendees of clinical mass spectrometry short courses who have spent hours listening to me and my colleagues discuss these topics. The conversations, questions, and observations provided by these laboratorians, technicians, clinicians, and scientists continue to inspire me immensely.
Rappold BA prepared and wrote the manuscript.
The author declares no conflict of interest.