Revising human-systems engineering principles for embedded AI applications

Jason Scott Metcalfe Laura Freeman M. Cummings

Abstract

The recent shift from predominantly hardware-based systems in complex settings to systems that heavily leverage non-deterministic artificial intelligence (AI) reasoning means that typical systems engineering processes must also adapt, especially when humans are direct or indirect users. Systems with embedded AI rely on probabilistic reasoning, which can fail in unexpected ways, and any overestimation of AI capabilities can result in systems with latent functionality gaps. This is especially true when humans oversee such systems, and such oversight has the potential to be deadly, but there is little to no consensus on how such systems should be tested to ensure they can fail gracefully. To this end, this work outlines a roadmap for emerging research areas for complex human-centric systems with embedded AI. Fourteen new functional and task requirement considerations are proposed that highlight the interconnectedness between uncertainty and AI, as well as the role humans might need to play in the supervision and secure operation of such systems. In addition, 11 new and modified non-functional requirements, i.e., “ilities,” are provided, and two new “ilities,” auditability and passive vulnerability, are also introduced. Ten problem areas with AI test, evaluation, verification and validation are noted, along with the need to determine reasonable risk estimates and acceptable thresholds for system performance. Lastly, multidisciplinary teams are needed for the design of effective and safe systems with embedded AI, and a new AI maintenance workforce should be developed for quality assurance of both underlying data and models.

Related Scientific Articles

Human-Centered Artificial Intelligence: Reliable, Safe & Trustworthy

B. Shneiderman

10 February 2020

Well-designed technologies that offer high levels of human control and high levels of computer automation can increase human performance, leading to wider adoption. The Human-Centered Artificial Intelligence (HCAI) framework clarifies how to (1) design for high levels of human control and high levels of computer automation so as to increase human performance, (2) understand the situations in which full human control or full computer control are necessary, and (3) avoid the dangers of excessive human control or excessive computer control. The methods of HCAI are more likely to produce designs that are Reliable, Safe & Trustworthy (RST). Achieving these goals will dramatically increase human performance, while supporting human self-efficacy, mastery, creativity, and responsibility.

Trustworthy Artificial Intelligence: A Review

Davinder Kaur Suleyman Uslu A. Durresi + 1 other

18 January 2022

Artificial intelligence (AI) and algorithmic decision making are having a profound impact on our daily lives. These systems are vastly used in different high-stakes applications like healthcare, business, government, education, and justice, moving us toward a more algorithmic society. However, despite so many advantages of these systems, they sometimes directly or indirectly cause harm to the users and society. Therefore, it has become essential to make these systems safe, reliable, and trustworthy. Several requirements, such as fairness, explainability, accountability, reliability, and acceptance, have been proposed in this direction to make these systems trustworthy. This survey analyzes all of these different requirements through the lens of the literature. It provides an overview of different approaches that can help mitigate AI risks and increase trust and acceptance of the systems by utilizing the users and society. It also discusses existing strategies for validating and verifying these systems and the current standardization efforts for trustworthy AI. Finally, we present a holistic view of the recent advancements in trustworthy AI to help the interested researchers grasp the crucial facets of the topic efficiently and offer possible future research directions.

Towards a Roadmap on Software Engineering for Responsible AI

Xiwei Xu Liming Zhu J. Whittle + 2 others

9 March 2022

Although AI is transforming the world, there are serious concerns about its ability to behave and make decisions responsibly. Many ethical regulations, principles, and frameworks for responsible AI have been issued recently. However, they are high level and difficult to put into practice. On the other hand, most AI researchers focus on algorithmic solutions, while the responsible AI challenges actually crosscut the entire engineering lifecycle and components of AI systems. To close the gap in operationalizing responsible AI, this paper aims to develop a roadmap on software engineering for responsible AI. The roadmap focuses on (i) establishing multi-level governance for responsible AI systems, (ii) setting up the development processes incorporating process-oriented practices for responsible AI systems, and (iii) building responsible-AI-by-design into AI systems through system-level architectural style, patterns and techniques.

Software Engineering for AI-Based Systems: A Survey

Xavier Franch Julien Siebert Anna Maria Vollmer + 5 others

5 May 2021

AI-based systems are software systems with functionalities enabled by at least one AI component (e.g., for image-, speech-recognition, and autonomous driving). AI-based systems are becoming pervasive in society due to advances in AI. However, there is limited synthesized knowledge on Software Engineering (SE) approaches for building, operating, and maintaining AI-based systems. To collect and analyze state-of-the-art knowledge about SE for AI-based systems, we conducted a systematic mapping study. We considered 248 studies published between January 2010 and March 2020. SE for AI-based systems is an emerging research area, where more than 2/3 of the studies have been published since 2018. The most studied properties of AI-based systems are dependability and safety. We identified multiple SE approaches for AI-based systems, which we classified according to the SWEBOK areas. Studies related to software testing and software quality are very prevalent, while areas like software maintenance seem neglected. Data-related issues are the most recurrent challenges. Our results are valuable for: researchers, to quickly understand the state-of-the-art and learn which topics need more research; practitioners, to learn about the approaches and challenges that SE entails for AI-based systems; and, educators, to bridge the gap among SE and AI in their curricula.

Responsible-AI-by-Design: A Pattern Collection for Designing Responsible Artificial Intelligence Systems

Q. Lu J. Whittle Liming Zhu + 1 other

1 May 2023

Responsible artificial intelligence (AI) issues often occur at the system level, crosscutting many system components and the entire software engineering lifecycle. We summarize design patterns that can be embedded into AI systems as product features to contribute to responsible-AI-by-design.

Reference List

1 reference

Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks

Jonas W. Mueller Curtis G. Northcutt + 1 other

26 March 2021

We identify label errors in the test sets of 10 of the most commonly-used computer vision, natural language, and audio datasets, and subsequently study the potential for these label errors to affect benchmark results. Errors in test sets are numerous and widespread: we estimate an average of at least 3.3% errors across the 10 datasets, where for example label errors comprise at least 6% of the ImageNet validation set. Putative label errors are identified using confident learning algorithms and then human-validated via crowdsourcing (51% of the algorithmically-flagged candidates are indeed erroneously labeled, on average across the datasets). Traditionally, machine learning practitioners choose which model to deploy based on test accuracy - our findings advise caution here, proposing that judging models over correctly labeled test sets may be more useful, especially for noisy real-world datasets. Surprisingly, we find that lower capacity models may be practically more useful than higher capacity models in real-world datasets with high proportions of erroneously labeled data. For example, on ImageNet with corrected labels: ResNet-18 outperforms ResNet-50 if the prevalence of originally mislabeled test examples increases by just 6%. On CIFAR-10 with corrected labels: VGG-11 outperforms VGG-19 if the prevalence of originally mislabeled test examples increases by just 5%. Test set errors across the 10 datasets can be viewed at https://labelerrors.com and all label errors can be reproduced by https://github.com/cleanlab/label-errors.

Citing Articles

0 citations

No citing articles.