Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations

Matthew B. A. McDermott Haoran Zhang Laleh Seyyed-Kalantari + 2 penulis

Abstrak

Artificial intelligence (AI) systems have increasingly achieved expert-level performance in medical imaging applications. However, there is growing concern that such AI systems may reflect and amplify human bias, and reduce the quality of their performance in historically under-served populations such as female patients, Black patients, or patients of low socioeconomic status. Such biases are especially troubling in the context of underdiagnosis, whereby the AI algorithm would inaccurately label an individual with a disease as healthy, potentially delaying access to care. Here, we examine algorithmic underdiagnosis in chest X-ray pathology classification across three large chest X-ray datasets, as well as one multi-source dataset. We find that classifiers produced using state-of-the-art computer vision techniques consistently and selectively underdiagnosed under-served patient populations and that the underdiagnosis rate was higher for intersectional under-served subpopulations, for example, Hispanic female patients. Deployment of AI systems using medical imaging for disease diagnosis with such biases risks exacerbation of existing care biases and can potentially lead to unequal access to medical treatment, thereby raising ethical concerns for the use of these models in the clinic. Artificial intelligence algorithms trained using chest X-rays consistently underdiagnose pulmonary abnormalities or diseases in historically under-served patient populations, raising ethical concerns about the clinical use of such algorithms.

Artikel Ilmiah Terkait

Toward fairness in artificial intelligence for medical image analysis: identification and mitigation of potential biases in the roadmap from data collection to model deployment

Heather M. Whitney Sanmi Koyejo K. Drukker + 9 lainnya

26 April 2023

Abstract. Purpose To recognize and address various sources of bias essential for algorithmic fairness and trustworthiness and to contribute to a just and equitable deployment of AI in medical imaging, there is an increasing interest in developing medical imaging-based machine learning methods, also known as medical imaging artificial intelligence (AI), for the detection, diagnosis, prognosis, and risk assessment of disease with the goal of clinical implementation. These tools are intended to help improve traditional human decision-making in medical imaging. However, biases introduced in the steps toward clinical deployment may impede their intended function, potentially exacerbating inequities. Specifically, medical imaging AI can propagate or amplify biases introduced in the many steps from model inception to deployment, resulting in a systematic difference in the treatment of different groups. Approach Our multi-institutional team included medical physicists, medical imaging artificial intelligence/machine learning (AI/ML) researchers, experts in AI/ML bias, statisticians, physicians, and scientists from regulatory bodies. We identified sources of bias in AI/ML, mitigation strategies for these biases, and developed recommendations for best practices in medical imaging AI/ML development. Results Five main steps along the roadmap of medical imaging AI/ML were identified: (1) data collection, (2) data preparation and annotation, (3) model development, (4) model evaluation, and (5) model deployment. Within these steps, or bias categories, we identified 29 sources of potential bias, many of which can impact multiple steps, as well as mitigation strategies. Conclusions Our findings provide a valuable resource to researchers, clinicians, and the public at large.

Fairness of artificial intelligence in healthcare: review and recommendations

T. Tsuboyama Y. Fushimi T. Nakaura + 15 lainnya

4 Agustus 2023

In this review, we address the issue of fairness in the clinical integration of artificial intelligence (AI) in the medical field. As the clinical adoption of deep learning algorithms, a subfield of AI, progresses, concerns have arisen regarding the impact of AI biases and discrimination on patient health. This review aims to provide a comprehensive overview of concerns associated with AI fairness; discuss strategies to mitigate AI biases; and emphasize the need for cooperation among physicians, AI researchers, AI developers, policymakers, and patients to ensure equitable AI integration. First, we define and introduce the concept of fairness in AI applications in healthcare and radiology, emphasizing the benefits and challenges of incorporating AI into clinical practice. Next, we delve into concerns regarding fairness in healthcare, addressing the various causes of biases in AI and potential concerns such as misdiagnosis, unequal access to treatment, and ethical considerations. We then outline strategies for addressing fairness, such as the importance of diverse and representative data and algorithm audits. Additionally, we discuss ethical and legal considerations such as data privacy, responsibility, accountability, transparency, and explainability in AI. Finally, we present the Fairness of Artificial Intelligence Recommendations in healthcare (FAIR) statement to offer best practices. Through these efforts, we aim to provide a foundation for discussing the responsible and equitable implementation and deployment of AI in healthcare.

Redefining Radiology: A Review of Artificial Intelligence Integration in Medical Imaging

Reabal Najjar

25 Agustus 2023

This comprehensive review unfolds a detailed narrative of Artificial Intelligence (AI) making its foray into radiology, a move that is catalysing transformational shifts in the healthcare landscape. It traces the evolution of radiology, from the initial discovery of X-rays to the application of machine learning and deep learning in modern medical image analysis. The primary focus of this review is to shed light on AI applications in radiology, elucidating their seminal roles in image segmentation, computer-aided diagnosis, predictive analytics, and workflow optimisation. A spotlight is cast on the profound impact of AI on diagnostic processes, personalised medicine, and clinical workflows, with empirical evidence derived from a series of case studies across multiple medical disciplines. However, the integration of AI in radiology is not devoid of challenges. The review ventures into the labyrinth of obstacles that are inherent to AI-driven radiology—data quality, the ’black box’ enigma, infrastructural and technical complexities, as well as ethical implications. Peering into the future, the review contends that the road ahead for AI in radiology is paved with promising opportunities. It advocates for continuous research, embracing avant-garde imaging technologies, and fostering robust collaborations between radiologists and AI developers. The conclusion underlines the role of AI as a catalyst for change in radiology, a stance that is firmly rooted in sustained innovation, dynamic partnerships, and a steadfast commitment to ethical responsibility.

Bias in artificial intelligence algorithms and recommendations for mitigation

Dana Moukheiber Haobo Ma Razan Zatarah + 8 lainnya

1 Juni 2023

The adoption of artificial intelligence (AI) algorithms is rapidly increasing in healthcare. Such algorithms may be shaped by various factors such as social determinants of health that can influence health outcomes. While AI algorithms have been proposed as a tool to expand the reach of quality healthcare to underserved communities and improve health equity, recent literature has raised concerns about the propagation of biases and healthcare disparities through implementation of these algorithms. Thus, it is critical to understand the sources of bias inherent in AI-based algorithms. This review aims to highlight the potential sources of bias within each step of developing AI algorithms in healthcare, starting from framing the problem, data collection, preprocessing, development, and validation, as well as their full implementation. For each of these steps, we also discuss strategies to mitigate the bias and disparities. A checklist was developed with recommendations for reducing bias during the development and implementation stages. It is important for developers and users of AI-based algorithms to keep these important considerations in mind to advance health equity for all populations.

Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review

Julia Situ J. Paguio Joel Park + 11 lainnya

1 Maret 2022

Background While artificial intelligence (AI) offers possibilities of advanced clinical prediction and decision-making in healthcare, models trained on relatively homogeneous datasets, and populations poorly-representative of underlying diversity, limits generalisability and risks biased AI-based decisions. Here, we describe the landscape of AI in clinical medicine to delineate population and data-source disparities. Methods We performed a scoping review of clinical papers published in PubMed in 2019 using AI techniques. We assessed differences in dataset country source, clinical specialty, and author nationality, sex, and expertise. A manually tagged subsample of PubMed articles was used to train a model, leveraging transfer-learning techniques (building upon an existing BioBERT model) to predict eligibility for inclusion (original, human, clinical AI literature). Of all eligible articles, database country source and clinical specialty were manually labelled. A BioBERT-based model predicted first/last author expertise. Author nationality was determined using corresponding affiliated institution information using Entrez Direct. And first/last author sex was evaluated using the Gendarize.io API. Results Our search yielded 30,576 articles, of which 7,314 (23.9%) were eligible for further analysis. Most databases came from the US (40.8%) and China (13.7%). Radiology was the most represented clinical specialty (40.4%), followed by pathology (9.1%). Authors were primarily from either China (24.0%) or the US (18.4%). First and last authors were predominately data experts (i.e., statisticians) (59.6% and 53.9% respectively) rather than clinicians. And the majority of first/last authors were male (74.1%). Interpretation U.S. and Chinese datasets and authors were disproportionately overrepresented in clinical AI, and almost all of the top 10 databases and author nationalities were from high income countries (HICs). AI techniques were most commonly employed for image-rich specialties, and authors were predominantly male, with non-clinical backgrounds. Development of technological infrastructure in data-poor regions, and diligence in external validation and model re-calibration prior to clinical implementation in the short-term, are crucial in ensuring clinical AI is meaningful for broader populations, and to avoid perpetuating global health inequity.

Daftar Referensi

0 referensi

Tidak ada referensi ditemukan.

Artikel yang Mensitasi

1 sitasi

Measuring algorithmic bias to analyze the reliability of AI tools that predict depression risk using smartphone sensed-behavioral data

Tanzeem Choudhury Fei Wang + 6 lainnya

22 April 2024

AI tools intend to transform mental healthcare by providing remote estimates of depression risk using behavioral data collected by sensors embedded in smartphones. While these tools accurately predict elevated depression symptoms in small, homogenous populations, recent studies show that these tools are less accurate in larger, more diverse populations. In this work, we show that accuracy is reduced because sensed-behaviors are unreliable predictors of depression across individuals: sensed-behaviors that predict depression risk are inconsistent across demographic and socioeconomic subgroups. We first identified subgroups where a developed AI tool underperformed by measuring algorithmic bias, where subgroups with depression were incorrectly predicted to be at lower risk than healthier subgroups. We then found inconsistencies between sensed-behaviors predictive of depression across these subgroups. Our findings suggest that researchers developing AI tools predicting mental health from sensed-behaviors should think critically about the generalizability of these tools, and consider tailored solutions for targeted populations.