Source of bias | Current state | Desired state | Ongoing initiatives | Gaps/opportunities |
---|---|---|---|---|
Application of cutoff values to non-harmonized test methodologies | Methods used to derive cutoff values in clinical practice guidelines are seldom reported | Methods used to derive cutoff values are disclosed and transparent, allowing applicability to be determined | | Journals should require reporting of the instrumentation and methods used in clinical trials and clinical practice guidelines |
Lack of harmonization of laboratory tests | Only a subset of laboratory tests has been harmonized by manufacturers to produce comparable results across platforms | The majority of in vitro diagnostic tests have undergone harmonization by the manufacturer | International Consortium for Harmonization of Clinical Laboratory Results | Increase the availability of reference materials for harmonization efforts |
Instrument and method not reported with laboratory results | Laboratory results do not include the method or instrument used to derive them, limiting the ability to evaluate result comparability for non-harmonized tests | Laboratory results are encoded with instrument and reagent kit identifiers | SHIELD | Dissemination and uptake of standard ontology recommendations from SHIELD; EHR and laboratory system functionality to support standard ontology recommendations |
Lack of standardization in the digital representation of laboratory results | Variability in how tests are named and results are reported | Accurate representation of laboratory test results using standard ontologies | SHIELD | Dissemination and uptake of standard ontology recommendations from SHIELD; EHR and laboratory system functionality to support standard ontology recommendations |
Degradation of artificial intelligence models when applied to different datasets and changing data representations | AI and ML models may lack generalizability when applied to a setting with different data representations and different result values due to the use of non-harmonized tests | Local evaluations are conducted to ensure that the model performs as expected, with retraining or fine-tuning on local data as needed and performance monitoring over time | CHAI | Dissemination and uptake of recommendations |
Insufficient representation of women and minority populations in datasets and clinical trial results | Clinical trials do not routinely report results by sex, race, or ethnicity | Datasets include sex, race, and ethnicity in outcome reports and as covariates in statistical analyses | NIH Revitalization Act of 1993 | Increased compliance with National Institutes of Health policies |
Bias in AI and ML models arising from erroneously collected race data and inaccurately inferred conclusions in the healthcare literature | Predictive models that use race are vulnerable to bias that may perpetuate health disparities because of inaccurate race representation and insufficient data on minority populations in datasets | Developers are mindful of potentially erroneous race data collection and inaccurately inferred conclusions and employ the concept of “counterfactual fairness” to ensure models do not unfairly disadvantage minority populations | The Alan Turing Institute Counterfactual Fairness Project | Dissemination and uptake of recommendations |
Abbreviations: AI, artificial intelligence; ML, machine learning; SHIELD, Systemic Harmonization and Interoperability Enhancement for Laboratory Data; CHAI, Coalition for Health AI; NIH, National Institutes of Health; EHR, electronic health record.
© Ann Lab Med
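The table's row on AI model degradation calls for local evaluation before deployment, with retraining triggered when performance falls short of expectations. A minimal sketch of such a check is shown below; all function names, the accuracy metric, and the `max_drop` threshold are illustrative assumptions, not prescribed by the source.

```python
# Hypothetical sketch of a local pre-deployment evaluation: compare a model's
# accuracy on local data against the accuracy reported on its development
# dataset, and flag the model for retraining or fine-tuning when the drop
# exceeds a site-defined tolerance. Names and thresholds are illustrative.

def accuracy(predictions, labels):
    """Fraction of predictions that match the reference labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def local_performance_check(dev_acc, local_preds, local_labels, max_drop=0.05):
    """Return local accuracy and whether the drop from the development-set
    accuracy exceeds the allowed tolerance (triggering retraining)."""
    local_acc = accuracy(local_preds, local_labels)
    needs_retraining = (dev_acc - local_acc) > max_drop
    return local_acc, needs_retraining

# Example: a model reported at 0.90 accuracy performs at 0.75 locally,
# e.g. because a non-harmonized assay shifts the input value distribution.
local_acc, retrain = local_performance_check(
    dev_acc=0.90,
    local_preds=[1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0],
    local_labels=[1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0],
)
```

The same comparison, re-run on a schedule against recent local cases, would serve as the ongoing performance monitoring the row describes.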