DOI: 10.1145/3613904.3642777
Published 3 October 2023 at the International Conference on Human Factors in Computing Systems

AXNav: Replaying Accessibility Tests from Natural Language

Maryam Taeb, Ruijia Cheng, E. Schoop + 3 authors

Abstract

Developers and quality assurance testers often rely on manual testing to test accessibility features throughout the product lifecycle. Unfortunately, manual testing can be tedious, often has an overwhelming scope, and can be difficult to schedule amongst other development milestones. Recently, Large Language Models (LLMs) have been used for a variety of tasks including automation of UIs. However, to our knowledge, no one has yet explored the use of LLMs in controlling assistive technologies for the purposes of supporting accessibility testing. In this paper, we explore the requirements of a natural language based accessibility testing workflow, starting with a formative study. From this we build a system that takes a manual accessibility test instruction in natural language (e.g., “Search for a show in VoiceOver”) as input and uses an LLM combined with pixel-based UI Understanding models to execute the test and produce a chaptered, navigable video. In each video, to help QA testers, we apply heuristics to detect and flag accessibility issues (e.g., Text size not increasing with Large Text enabled, VoiceOver navigation loops). We evaluate this system through a 10-participant user study with accessibility QA professionals who indicated that the tool would be very useful in their current work and performed tests similarly to how they would manually test the features. The study also reveals insights for future work on using LLMs for accessibility testing.
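One of the flagging heuristics mentioned in the abstract, a VoiceOver navigation loop, can be sketched as a simple check over the sequence of elements that receive screen-reader focus during a test run. This is an illustrative sketch, not AXNav's actual implementation; the element identifiers are hypothetical, and a real detector would also have to distinguish a loop from the normal wrap-around at the end of a full traversal.

```python
def detect_navigation_loop(focus_sequence):
    """Return the element id at which VoiceOver focus revisits an element
    it has already landed on, or None if the traversal never loops."""
    seen = set()
    for element_id in focus_sequence:
        if element_id in seen:
            return element_id  # focus wrapped back early: possible loop
        seen.add(element_id)
    return None

# A traversal that cycles among three controls without ever reaching the
# rest of the screen would be flagged at the first revisited element:
print(detect_navigation_loop(["play", "next", "prev", "play"]))    # play
print(detect_navigation_loop(["play", "next", "prev", "search"]))  # None
```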

Related Scientific Articles

Assistive-Technology Aided Manual Accessibility Testing in Mobile Apps, Powered by Record-and-Replay

Navid Salehnamadi, Ziyao He, S. Malek

19 April 2023

Billions of people use smartphones on a daily basis, including 15% of the world’s population with disabilities. Mobile platforms encourage developers to manually assess their apps’ accessibility in the way disabled users interact with phones, i.e., through Assistive Technologies (AT) like screen readers. However, most developers only test their apps with touch gestures and do not have enough knowledge to use AT properly. Moreover, automated accessibility testing tools typically do not consider AT. This paper introduces a record-and-replay technique that records the developers’ touch interactions, replays the same actions with an AT, and generates a visualized report of various ways of interacting with the app using ATs. Empirical evaluation of this technique on real-world apps revealed that while user study is the most reliable way of assessing accessibility, our technique can aid developers in detecting complex accessibility issues at different stages of development.
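The core record-and-replay idea in this abstract — recording a developer's touch interactions and re-executing them through an assistive technology — can be sketched as a translation table from touch actions to screen-reader gestures. The action names and the touch-to-AT mapping below are hypothetical stand-ins, not the paper's actual tooling.

```python
# Hypothetical mapping: a plain tap becomes "focus the element, then
# double-tap", and a navigational swipe becomes linear AT navigation.
TOUCH_TO_AT = {
    "tap": "double_tap_with_focus",
    "swipe": "swipe_next_element",
}

def replay_with_at(recorded_touches):
    """Translate a recorded touch trace into an AT-driven action trace."""
    return [(TOUCH_TO_AT.get(action, action), target)
            for action, target in recorded_touches]

trace = [("tap", "login_button"), ("swipe", None)]
print(replay_with_at(trace))
```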

Latte: Use-Case and Assistive-Service Driven Automated Accessibility Testing Framework for Android

S. Malek, Navid Salehnamadi, S. Branham + 3 others

6 May 2021

For 15% of the world population with disabilities, accessibility is arguably the most critical software quality attribute. The ever-growing reliance of users with disability on mobile apps further underscores the need for accessible software in this domain. Existing automated accessibility assessment techniques primarily aim to detect violations of predefined guidelines, thereby producing a massive amount of accessibility warnings that often overlook the way software is actually used by users with disability. This paper presents a novel, high-fidelity form of accessibility testing for Android apps, called Latte, that automatically reuses tests written to evaluate an app’s functional correctness to assess its accessibility as well. Latte first extracts the use case corresponding to each test, and then executes each use case in the way disabled users would, i.e., using assistive services. Our empirical evaluation on real-world Android apps demonstrates Latte’s effectiveness in detecting substantially more useful defects than prior techniques.
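Latte's reuse of functional tests, as described above, can be sketched in two steps: treat the test's interactions as a use case (dropping assertions), then execute each step through an assistive service rather than direct touch. The step and locator names below are hypothetical, not Latte's actual representation.

```python
def extract_use_case(functional_test):
    """Keep only the user-visible interactions; test assertions are dropped."""
    return [step for step in functional_test if step[0] != "assert"]

def execute_via_assistive_service(use_case):
    """Each step becomes: navigate AT focus to the target, then activate it."""
    at_actions = []
    for action, target in use_case:
        at_actions.append(("focus", target))
        at_actions.append(("perform", action))
    return at_actions

test = [("click", "search"), ("type", "query_field"), ("assert", "results")]
print(execute_via_assistive_service(extract_use_case(test)))
```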

ALL: Accessibility Learning Labs for Computing Accessibility Education

Daniel E. Krutz, Samuel A. Malachowsky, Saad Khan + 1 other

26 June 2021

Our Accessibility Learning Labs not only inform participants about the need for accessible software, but also how to properly create and implement accessible software. These experiential browser-based labs enable participants, instructors and practitioners to engage in our material using only their browser. In the following document, we will provide a brief overview of our labs, how they may be adopted, and some of their preliminary results. Complete project material is publicly available on our project website: http://all.rit.edu

Automating GUI-based Software Testing with GPT-3

Daniel Zimmermann, A. Koziolek

1 April 2023

This paper introduces a new method for GUI-based software testing that utilizes GPT-3, a state-of-the-art language model. The approach uses GPT-3’s transformer architecture to interpret natural language test cases and programmatically navigate through the application under test. To overcome the memory limitations of the transformer architecture, we propose incorporating the current state of all GUI elements into the input prompt at each time step. Additionally, we suggest using a test automation framework to interact with the GUI elements and provide GPT-3 with information about the application’s current state. To simplify the process of acquiring training data, we also present a tool for this purpose. The proposed approach has the potential to improve the efficiency of software testing by eliminating the need for manual input and allowing non-technical users to easily input test cases for both desktop and mobile applications.
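The prompting scheme described above — working around the model's lack of persistent memory by serializing the current state of all GUI elements into the prompt at every step — can be sketched as a prompt-builder function. The prompt wording and element fields here are assumptions for illustration, not the paper's exact format.

```python
def build_prompt(test_case, gui_elements, history):
    """Compose a per-step prompt: the test goal, the actions taken so far,
    and a fresh serialization of every GUI element's current state."""
    state = "\n".join(
        f"- {e['type']} '{e['label']}' "
        f"({'enabled' if e['enabled'] else 'disabled'})"
        for e in gui_elements
    )
    past = "\n".join(history) or "(none)"
    return (
        f"Test case: {test_case}\n"
        f"Actions taken so far:\n{past}\n"
        f"Current GUI elements:\n{state}\n"
        f"Next action:"
    )

prompt = build_prompt(
    "Log in with a valid account",
    [{"type": "button", "label": "Login", "enabled": True}],
    ["typed username", "typed password"],
)
print(prompt)
```

At each time step the test driver would send this prompt to the model, apply the returned action via the automation framework, and rebuild the prompt from the resulting GUI state.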

A Probabilistic Model and Metrics for Estimating Perceived Accessibility of Desktop Applications in Keystroke-Based Non-Visual Interactions

Syed Masum Billah, Md. Touhidul Islam, Donald E. Porter

19 April 2023

Perceived accessibility of an application is a subjective measure of how well an individual with a particular disability, skills, and goals experiences the application via assistive technology. This paper first presents a study with 11 blind users to report how they perceive the accessibility of desktop applications while interacting via assistive technology such as screen readers and a keyboard. The study identifies the low navigational complexity of the user interface (UI) elements as the primary contributor to higher perceived accessibility of different applications. Informed by this study, we develop a probabilistic model that accounts for the number of user actions needed to navigate between any two arbitrary UI elements within an application. This model contributes to the area of computational interaction for non-visual interaction. Next, we derive three metrics from this model: complexity, coverage, and reachability, which reveal important statistical characteristics of an application indicative of its perceived accessibility. The proposed metrics are appropriate for comparing similar applications and can be fine-tuned for individual users to cater to their skills and goals. Finally, we present five use cases, demonstrating how blind users, application developers, and accessibility practitioners can benefit from our model and metrics.
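The graph view behind the metrics above can be sketched directly: UI elements are nodes, single keystroke moves are edges, and the number of user actions between two elements is a shortest-path length. The metric definitions below are simplified paraphrases from a single root element, not the paper's exact formulas.

```python
from collections import deque

def shortest_hops(graph, start):
    """BFS distances (number of keystroke actions) from a start element."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in graph.get(node, []):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def metrics(graph, root):
    """Reachability: share of elements reachable from the root.
    Complexity: mean number of actions to reach a reachable element."""
    dist = shortest_hops(graph, root)
    nodes = set(graph) | {n for nbrs in graph.values() for n in nbrs}
    reachability = len(dist) / len(nodes)
    complexity = sum(dist.values()) / max(len(dist) - 1, 1)
    return reachability, complexity

ui = {"menu": ["file", "edit"], "file": ["open"], "edit": [], "island": []}
print(metrics(ui, "menu"))  # "island" is unreachable from the root
```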

References

4 references

Assistive-Technology Aided Manual Accessibility Testing in Mobile Apps, Powered by Record-and-Replay

Navid Salehnamadi, Ziyao He + 1 other

19 April 2023


A Large-Scale Longitudinal Analysis of Missing Label Accessibility Failures in Android Apps

A. S. Ross, Mingyuan Zhong + 3 others

29 April 2022

We present the first large-scale longitudinal analysis of missing label accessibility failures in Android apps. We developed a crawler and collected monthly snapshots of 312 apps over 16 months. We use this unique dataset in empirical examinations of accessibility not possible in prior datasets. Key large-scale findings include missing label failures in 55.6% of unique image-based elements, longitudinal improvement in ImageButton elements but not in more prevalent ImageView elements, that 8.8% of unique screens are unreachable without navigating at least one missing label failure, that app failure rate does not improve with number of downloads, and that effective labeling is neither limited to nor guaranteed by large software organizations. We then examine longitudinal data in individual apps, presenting illustrative examples of accessibility impacts of systematic improvements, incomplete improvements, interface redesigns, and accessibility regressions. We discuss these findings and potential opportunities for tools and practices to improve label-based accessibility.
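The failure measure in this study — an image-based element that exposes no label to a screen reader — can be sketched as a rate over a screen snapshot. The field names below mirror Android conventions (`class`, contentDescription), but the record format is an assumption for illustration.

```python
def missing_label_rate(elements, image_classes=("ImageView", "ImageButton")):
    """Fraction of image-based elements with no screen-reader label."""
    image_based = [e for e in elements if e["class"] in image_classes]
    if not image_based:
        return 0.0
    failing = [e for e in image_based if not e.get("content_description")]
    return len(failing) / len(image_based)

snapshot = [
    {"class": "ImageButton", "content_description": "Share"},
    {"class": "ImageView", "content_description": ""},   # missing label
    {"class": "TextView", "content_description": None},  # not image-based
]
print(missing_label_rate(snapshot))  # 0.5
```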

Latte: Use-Case and Assistive-Service Driven Automated Accessibility Testing Framework for Android

S. Malek, Navid Salehnamadi + 4 others

6 May 2021


Citing Articles

0 citations

No citing articles.