DOI: 10.1145/3491102.3502143
Published on 29 April 2022 at the International Conference on Human Factors in Computing Systems

A Large-Scale Longitudinal Analysis of Missing Label Accessibility Failures in Android Apps

A. S. Ross, Mingyuan Zhong, J. Wobbrock + 2 authors

Abstract

We present the first large-scale longitudinal analysis of missing label accessibility failures in Android apps. We developed a crawler and collected monthly snapshots of 312 apps over 16 months. We use this unique dataset in empirical examinations of accessibility not possible in prior datasets. Key large-scale findings include missing label failures in 55.6% of unique image-based elements, longitudinal improvement in ImageButton elements but not in more prevalent ImageView elements, that 8.8% of unique screens are unreachable without navigating at least one missing label failure, that app failure rate does not improve with number of downloads, and that effective labeling is neither limited to nor guaranteed by large software organizations. We then examine longitudinal data in individual apps, presenting illustrative examples of accessibility impacts of systematic improvements, incomplete improvements, interface redesigns, and accessibility regressions. We discuss these findings and potential opportunities for tools and practices to improve label-based accessibility.
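For readers less familiar with Android accessibility, a "missing label" failure occurs when an image-based element such as an ImageButton or ImageView exposes no text alternative (typically a contentDescription), so a screen reader like TalkBack has nothing meaningful to announce. The Kotlin sketch below is a minimal illustration of how such failures can be flagged in a view hierarchy; it is not the paper's crawler, and the traversal and filtering rules are simplifying assumptions.

import android.view.View
import android.view.ViewGroup
import android.widget.ImageView

// Minimal sketch (not the paper's crawler): walk a view hierarchy and collect
// image-based elements that a screen reader would announce without a label.
fun findMissingLabelFailures(
    root: View,
    failures: MutableList<View> = mutableListOf()
): List<View> {
    // ImageButton is a subclass of ImageView, so one check covers both.
    val isImageBased = root is ImageView
    val hasLabel = !root.contentDescription.isNullOrBlank()
    // Assumption: views explicitly hidden from accessibility are treated as decorative.
    val isAnnounced = root.importantForAccessibility != View.IMPORTANT_FOR_ACCESSIBILITY_NO
    if (isImageBased && isAnnounced && !hasLabel) {
        failures.add(root) // e.g., an unlabeled ImageButton that TalkBack reads as just "Button"
    }
    if (root is ViewGroup) {
        for (i in 0 until root.childCount) {
            findMissingLabelFailures(root.getChildAt(i), failures)
        }
    }
    return failures
}

Checks of this kind underlie existing scanners such as Google's Accessibility Scanner; the paper's contribution is examining such failures at large scale and over time rather than in single snapshots.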

Related Research Articles

Empirical Investigation of Accessibility Bug Reports in Mobile Platforms: A Chromium Case Study

Marouane Kessentini, Wajdi Aljedaani, Mohamed Wiem Mkaouer + 1 more

11 May 2024

Accessibility is an important quality factor of mobile applications. Many studies have shown that, despite the availability of many resources to guide the development of accessible software, most apps and web applications contain many accessibility issues. Some researchers surveyed professionals and organizations to understand the lack of accessibility during software development, but few studies have investigated how developers and organizations respond to accessibility bug reports. Therefore, this paper analyzes accessibility bug reports posted in the Chromium repository to understand how developers and organizations handle them. More specifically, we want to determine the frequency of accessibility bug reports over time, the time-to-fix compared to traditional bug reports (e.g., functional bugs), and the types of accessibility barriers reported. Results show that the frequency of accessibility reports has increased over the years, and accessibility bugs take longer to be fixed, as they tend to be given low priority.

Latte: Use-Case and Assistive-Service Driven Automated Accessibility Testing Framework for Android

S. Malek, Navid Salehnamadi, S. Branham + 3 more

6 May 2021

For 15% of the world population with disabilities, accessibility is arguably the most critical software quality attribute. The ever-growing reliance of users with disability on mobile apps further underscores the need for accessible software in this domain. Existing automated accessibility assessment techniques primarily aim to detect violations of predefined guidelines, thereby producing a massive amount of accessibility warnings that often overlook the way software is actually used by users with disability. This paper presents a novel, high-fidelity form of accessibility testing for Android apps, called Latte, that automatically reuses tests written to evaluate an app’s functional correctness to assess its accessibility as well. Latte first extracts the use case corresponding to each test, and then executes each use case in the way disabled users would, i.e., using assistive services. Our empirical evaluation on real-world Android apps demonstrates Latte’s effectiveness in detecting substantially more useful defects than prior techniques.
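To make concrete what "tests written to evaluate an app's functional correctness" look like, here is a hedged Kotlin sketch of an ordinary Espresso UI test of the kind such a framework could reuse; the activity class and view IDs (MainActivity, R.id.search_field, etc.) are hypothetical, and nothing below is Latte's actual API.

import androidx.test.espresso.Espresso.onView
import androidx.test.espresso.action.ViewActions.click
import androidx.test.espresso.action.ViewActions.typeText
import androidx.test.espresso.assertion.ViewAssertions.matches
import androidx.test.espresso.matcher.ViewMatchers.isDisplayed
import androidx.test.espresso.matcher.ViewMatchers.withId
import androidx.test.ext.junit.rules.ActivityScenarioRule
import org.junit.Rule
import org.junit.Test

class SearchUseCaseTest {
    // Hypothetical activity under test, for illustration only.
    @get:Rule
    val activityRule = ActivityScenarioRule(MainActivity::class.java)

    @Test
    fun searchForItem_showsResults() {
        // A functional use case expressed as touch-level interactions; the idea in
        // Latte is to re-execute the same use case through an assistive service.
        onView(withId(R.id.search_field)).perform(typeText("coffee"))
        onView(withId(R.id.search_button)).perform(click())
        onView(withId(R.id.results_list)).check(matches(isDisplayed()))
    }
}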

Assistive-Technology Aided Manual Accessibility Testing in Mobile Apps, Powered by Record-and-Replay

Navid Salehnamadi, Ziyao He, S. Malek

19 April 2023

Billions of people use smartphones on a daily basis, including 15% of the world’s population with disabilities. Mobile platforms encourage developers to manually assess their apps’ accessibility in the way disabled users interact with phones, i.e., through Assistive Technologies (AT) like screen readers. However, most developers only test their apps with touch gestures and do not have enough knowledge to use AT properly. Moreover, automated accessibility testing tools typically do not consider AT. This paper introduces a record-and-replay technique that records the developers’ touch interactions, replays the same actions with an AT, and generates a visualized report of various ways of interacting with the app using ATs. Empirical evaluation of this technique on real-world apps revealed that while a user study is the most reliable way of assessing accessibility, our technique can aid developers in detecting complex accessibility issues at different stages of development.

Screen Recognition: Creating Accessibility Metadata for Mobile Applications from Pixels

Qi Shan, Samuel White, Lilian de Greef + 9 more

13 January 2021

Many accessibility features available on mobile platforms require applications (apps) to provide complete and accurate metadata describing user interface (UI) components. Unfortunately, many apps do not provide sufficient metadata for accessibility features to work as expected. In this paper, we explore inferring accessibility metadata for mobile apps from their pixels, as the visual interfaces often best reflect an app’s full functionality. We trained a robust, fast, memory-efficient, on-device model to detect UI elements using a dataset of 77,637 screens (from 4,068 iPhone apps) that we collected and annotated. To further improve UI detections and add semantic information, we introduced heuristics (e.g., UI grouping and ordering) and additional models (e.g., to recognize UI content, state, and interactivity). We built Screen Recognition to generate accessibility metadata to augment iOS VoiceOver. In a study with 9 screen reader users, we validated that our approach improves the accessibility of existing mobile apps, enabling even previously inaccessible apps to be used.
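As one simplified picture of what "accessibility metadata from pixels" can mean, the Kotlin sketch below represents each detected element with a bounding box and an inferred label, and orders elements into a rough top-to-bottom, left-to-right reading order, similar in spirit to the grouping and ordering heuristics the abstract mentions. The data fields, tolerance value, and ordering rule are assumptions for illustration, not the paper's implementation.

// Illustrative only: a UI element as a pixel-based detector might report it.
data class DetectedElement(
    val type: String,                   // e.g., "button", "icon", "text"
    val label: String?,                 // recognized or inferred text alternative
    val left: Int, val top: Int,
    val right: Int, val bottom: Int
)

// Rough reading order: bucket elements into horizontal bands (rows) by their top edge,
// then sort within each band left to right. rowTolerance is an assumed pixel threshold.
fun readingOrder(elements: List<DetectedElement>, rowTolerance: Int = 24): List<DetectedElement> =
    elements
        .sortedBy { it.top }
        .fold(mutableListOf<MutableList<DetectedElement>>()) { rows, e ->
            val row = rows.lastOrNull()
            if (row != null && e.top - row.first().top <= rowTolerance) row.add(e)
            else rows.add(mutableListOf(e))
            rows
        }
        .flatMap { row -> row.sortedBy { it.left } }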

Analyzing Accessibility Reviews Associated with Visual Disabilities or Eye Conditions

Alberto Dumont Alves Oliveira, D. M. Eler, Wajdi Aljedaani + 3 more

19 April 2023

Accessibility reviews collected from app stores may contain valuable information for improving app accessibility. Recent studies have presented insightful information on accessibility reviews, but they were based on small datasets and focused on general accessibility concerns. In this paper, we analyzed accessibility reviews that report issues affecting users with visual disabilities or conditions. Such reviews were identified based on selection criteria applied over 179,519,598 reviews of popular apps on the Google Play Store. Our results show that only 0.003% of user reviews mention visual disabilities or conditions; accessibility reviews are associated with 36 visual disabilities or eye conditions; many users do not give precise feedback and refer to their disability using generic terms; accessibility reviews can be grouped into general topics of concerns related to different types of disabilities; and positive reviews are generally associated with high scores and negative feedback with lower scores.

References

4 references

Finding the Needle in a Haystack: On the Automatic Identification of Accessibility User Reviews

E. Alomar, Wajdi Aljedaani + 3 more

6 May 2021

In recent years, mobile accessibility has become an important trend with the goal of allowing all users the possibility of using any app without many limitations. User reviews include insights that are useful for app evolution. However, with the increase in the amount of received reviews, manually analyzing them is tedious and time-consuming, especially when searching for accessibility reviews. The goal of this paper is to support the automated identification of accessibility in user reviews, to help technology professionals in prioritizing their handling, and thus, creating more inclusive apps. Particularly, we design a model that takes as input accessibility user reviews, learns their keyword-based features in order to make a binary decision, for a given review, on whether it is about accessibility or not. The model is evaluated using a total of 5,326 mobile app reviews. The findings show that (1) our model can accurately identify accessibility reviews, outperforming two baselines, namely a keyword-based detector and a random classifier; (2) our model achieves an accuracy of 85% with a relatively small training dataset; however, the accuracy improves as we increase the size of the training dataset.
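For context, the "keyword-based detector" baseline mentioned above can be pictured with a minimal Kotlin sketch that flags a review when it contains any term from a hand-picked list; the keyword list here is an assumption, and the paper's model learns keyword-based features from data rather than relying on a fixed list.

// Minimal keyword-matching baseline (assumed keyword list, not the paper's model):
// flag a review as accessibility-related if it mentions any keyword.
val accessibilityKeywords = listOf(
    "accessibility", "screen reader", "talkback", "voiceover",
    "blind", "low vision", "contrast", "font size", "hard of hearing"
)

fun isAccessibilityReview(review: String): Boolean {
    val text = review.lowercase()
    return accessibilityKeywords.any { it in text }
}

fun main() {
    println(isAccessibilityReview("TalkBack skips the checkout button"))    // true
    println(isAccessibilityReview("Crashes when I open the settings page")) // false
}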

Accessibility of High-Fidelity Prototyping Tools

Garreth W. Tigwell, Kristen Shinohara + 1 more

6 May 2021

High-fidelity prototyping tools are used by software designers and developers to iron out interface details without full implementation. However, the lack of visual accessibility in these tools creates a barrier for designers who may use screen readers, such as those who are vision impaired. We assessed conformance of four prototyping tools (Sketch, Adobe XD, Balsamiq, UXPin) with accessibility guidelines, using two screen readers (Narrator and VoiceOver), focusing our analysis on GUI element accessibility and critical workflows used to create prototypes. We found few tools were fully accessible, with 45.9% of GUI elements meeting accessibility criteria (34.2% partially supported accessibility, 19.9% not supporting accessibility). Accessibility issues stymied efforts to create prototypes using screen readers. Though no screen reader-tool pairs were completely accessible, the most accessible pairs were VoiceOver-Sketch, VoiceOver-Balsamiq, and Narrator-Balsamiq. We recommend prioritizing improved accessibility for input and control instruction, alternative text, focus order, canvas element properties, and keyboard operations.

Latte: Use-Case and Assistive-Service Driven Automated Accessibility Testing Framework for Android

S. Malek, Navid Salehnamadi + 4 more

6 May 2021

For 15% of the world population with disabilities, accessibility is arguably the most critical software quality attribute. The ever-growing reliance of users with disability on mobile apps further underscores the need for accessible software in this domain. Existing automated accessibility assessment techniques primarily aim to detect violations of predefined guidelines, thereby producing a massive amount of accessibility warnings that often overlook the way software is actually used by users with disability. This paper presents a novel, high-fidelity form of accessibility testing for Android apps, called Latte, that automatically reuses tests written to evaluate an app’s functional correctness to assess its accessibility as well. Latte first extracts the use case corresponding to each test, and then executes each use case in the way disabled users would, i.e., using assistive services. Our empirical evaluation on real-world Android apps demonstrates Latte’s effectiveness in detecting substantially more useful defects than prior techniques.

Citing Articles

2 citations

Integrating Accessibility in a Mobile App Development Course

J. Bhatia, Dhruv Nagpal + 3 more

12 October 2022

The growing interest in accessible software is reflected in computing educators' and education researchers' efforts to include accessibility in core computing education. We integrated accessibility in a junior/senior-level Android app development course at a large private university in India. The course introduced three accessibility-related topics using various interventions: Accessibility Awareness (a guest lecture by a legal expert), Technical Knowledge (lectures on Android accessibility guidelines and testing practices and graded components for implementing accessibility in programming assignments), and Empathy (an activity that required students to blindfold themselves and interact with their phones using a screen-reader). We evaluated their impact on student learning using three instruments: (A) A pre/post-course questionnaire, (B) Reflective questions on each of the four programming assignments, and (C) Midterm and Final exam questions. Our findings demonstrate that: (A) significantly more (p < .05) students considered disabilities when designing an app after taking this course, (B) many students developed empathy towards the challenges persons with disabilities face while using inaccessible apps, and (C) all students could correctly identify at least one accessibility issue in the user interface of a real-world app given its screenshot and 90% of them could provide a correct solution to fix it.

AXNav: Replaying Accessibility Tests from Natural Language

Maryam Taeb, Ruijia Cheng + 4 more

3 October 2023

Developers and quality assurance testers often rely on manual testing to test accessibility features throughout the product lifecycle. Unfortunately, manual testing can be tedious, often has an overwhelming scope, and can be difficult to schedule amongst other development milestones. Recently, Large Language Models (LLMs) have been used for a variety of tasks including automation of UIs. However, to our knowledge, no one has yet explored the use of LLMs in controlling assistive technologies for the purposes of supporting accessibility testing. In this paper, we explore the requirements of a natural language based accessibility testing workflow, starting with a formative study. From this we build a system that takes a manual accessibility test instruction in natural language (e.g., “Search for a show in VoiceOver”) as input and uses an LLM combined with pixel-based UI Understanding models to execute the test and produce a chaptered, navigable video. In each video, to help QA testers, we apply heuristics to detect and flag accessibility issues (e.g., Text size not increasing with Large Text enabled, VoiceOver navigation loops). We evaluate this system through a 10-participant user study with accessibility QA professionals who indicated that the tool would be very useful in their current work and performed tests similarly to how they would manually test the features. The study also reveals insights for future work on using LLMs for accessibility testing.