Automating GUI-based Software Testing with GPT-3
Abstract
This paper introduces a new method for GUI-based software testing that utilizes GPT-3, a state-of-the-art language model. The approach uses GPT-3’s transformer architecture to interpret natural language test cases and programmatically navigate through the application under test. To overcome the memory limitations of the transformer architecture, we propose incorporating the current state of all GUI elements into the input prompt at each time step. Additionally, we suggest using a test automation framework to interact with the GUI elements and to provide GPT-3 with information about the application’s current state. We also present a tool that simplifies the acquisition of training data. The proposed approach has the potential to improve the efficiency of software testing by eliminating the need for manual input and by allowing non-technical users to easily write test cases for both desktop and mobile applications.
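A minimal sketch of the loop the abstract describes, assuming Selenium as the test automation framework and OpenAI's completion API as the model interface; the prompt layout, the action vocabulary (CLICK/TYPE/DONE), the model name, and the URL are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of the proposed loop: at every step the full GUI state is
# serialized into the prompt (working around the model's limited memory), the
# model proposes one action, and Selenium executes it.
from openai import OpenAI
from selenium import webdriver
from selenium.webdriver.common.by import By

client = OpenAI()  # reads OPENAI_API_KEY from the environment
driver = webdriver.Chrome()
driver.get("https://example.com/app-under-test")  # placeholder URL

def serialize_gui_state(driver) -> str:
    """Describe every interactive element so the model sees the current state."""
    lines = []
    for el in driver.find_elements(By.CSS_SELECTOR, "input, button, a, select"):
        lines.append(f"{el.tag_name} id={el.get_attribute('id')!r} text={el.text!r}")
    return "\n".join(lines)

def next_action(test_case: str, history: list[str]) -> str:
    """Ask the model for the next step; the prompt format is an assumption."""
    prompt = (
        f"Test case: {test_case}\n"
        f"Actions so far: {'; '.join(history)}\n"
        f"Current GUI elements:\n{serialize_gui_state(driver)}\n"
        "Next action (CLICK <id> | TYPE <id> <text> | DONE):"
    )
    resp = client.completions.create(
        model="gpt-3.5-turbo-instruct",  # stand-in for the GPT-3 models used
        prompt=prompt,
        max_tokens=32,
    )
    return resp.choices[0].text.strip()

def run_test(test_case: str, max_steps: int = 20) -> None:
    history: list[str] = []
    for _ in range(max_steps):
        action = next_action(test_case, history)
        if action.startswith("DONE"):
            break
        kind, _, rest = action.partition(" ")
        if kind == "CLICK":
            driver.find_element(By.ID, rest).click()
        elif kind == "TYPE":
            element_id, _, text = rest.partition(" ")
            driver.find_element(By.ID, element_id).send_keys(text)
        history.append(action)

run_test("Log in with user 'alice' and open the settings page")
```

Re-serializing the complete GUI state on every turn is the paper's workaround for the transformer's fixed context window: the model never needs to remember earlier screens, because everything it needs is in the current prompt.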
Related Scientific Articles
Daniel Zimmermann, A. Koziolek
11 September 2023
This paper presents a novel method for GUI testing in web applications that largely automates the process by integrating the advanced language model GPT-4 with Selenium, a popular web application testing framework. Unlike traditional deep learning approaches, which require extensive training data, GPT-4 is pre-trained on a large corpus, giving it significant generalisation and inference capabilities. These capabilities allow testing without the need for recorded data from human testers, significantly reducing the time and effort required for the testing process. We also compare the efficiency of our integrated GPT-4 approach with monkey testing, a widely used technique for automated GUI testing where user input is randomly generated. To evaluate our approach, we implemented a web calculator with an integrated code coverage system. The results show that our integrated GPT-4 approach provides significantly better branch coverage compared to monkey testing. These results highlight the significant potential of integrating specific AI models such as GPT-4 and automated testing tools to improve the accuracy and efficiency of GUI testing in web applications.
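For contrast, the monkey-testing baseline the comparison mentions reduces to a few lines; the local calculator URL and the fixed click budget below are assumptions for illustration, and branch coverage would be read from the instrumented application rather than from this script.

```python
# Hypothetical monkey-testing baseline: interactive elements are clicked
# uniformly at random, with no model guiding the exploration.
import random
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("http://localhost:8000/calculator")  # assumed app under test

for _ in range(500):  # fixed budget of random interactions
    buttons = driver.find_elements(By.TAG_NAME, "button")
    if buttons:
        random.choice(buttons).click()
```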
Chunyang Chen, Junjie Wang, Zhe Liu + 3 others
14 July 2023
Pre-trained large language models (LLMs) have recently emerged as a breakthrough technology in natural language processing and artificial intelligence, with the ability to handle large-scale datasets and exhibit remarkable performance across a wide range of tasks. Meanwhile, software testing is a crucial undertaking that serves as a cornerstone for ensuring the quality and reliability of software products. As the scope and complexity of software systems continue to grow, the need for more effective software testing techniques becomes increasingly urgent, making it an area ripe for innovative approaches such as the use of LLMs. This paper provides a comprehensive review of the utilization of LLMs in software testing. It analyzes 102 relevant studies that have used LLMs for software testing, from both the software testing and the LLM perspective. The paper presents a detailed discussion of the software testing tasks for which LLMs are commonly used, among which test case preparation and program repair are the most representative. It also analyzes the commonly used LLMs, the types of prompt engineering that are employed, and the techniques that accompany these LLMs. Finally, it summarizes the key challenges and potential opportunities in this direction. This work can serve as a roadmap for future research in this area, highlighting potential avenues for exploration and identifying gaps in our current understanding of the use of LLMs in software testing.
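As one concrete instance of the test-case-preparation use this survey highlights, a chat LLM can be prompted to draft unit tests for a given function; the model name and the `clamp` example below are assumptions for illustration only.

```python
# Illustrative only: prompting a chat LLM to draft pytest tests for a function.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

source = """
def clamp(x, lo, hi):
    return max(lo, min(x, hi))
"""

resp = client.chat.completions.create(
    model="gpt-4",  # model choice is an assumption
    messages=[
        {"role": "system", "content": "You write concise pytest unit tests."},
        {"role": "user", "content": f"Write pytest tests for:\n{source}"},
    ],
)
print(resp.choices[0].message.content)
```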
Sajed Jalil, Suzzana Rafi, Kevin Moran + 2 others
7 February 2023
Over the past decade, predictive language modeling for code has proven to be a valuable tool for enabling new forms of automation for developers. More recently, we have seen the advent of general purpose "large language models", based on neural transformer architectures, that have been trained on massive datasets of human written text, which includes code and natural language. However, despite the demonstrated representational power of such models, interacting with them has historically been constrained to specific task settings, limiting their general applicability. Many of these limitations were recently overcome with the introduction of ChatGPT, a language model created by OpenAI and trained to operate as a conversational agent, enabling it to answer questions and respond to a wide variety of commands from end users. The introduction of models, such as ChatGPT, has already spurred fervent discussion from educators, ranging from fear that students could use these AI tools to circumvent learning, to excitement about the new types of learning opportunities that they might unlock. However, given the nascent nature of these tools, we currently lack fundamental knowledge related to how well they perform in different educational settings, and the potential promise (or danger) that they might pose to traditional forms of instruction. As such, in this paper, we examine how well ChatGPT performs when tasked with answering common questions in a popular software testing curriculum. We found that given its current capabilities, ChatGPT is able to respond to 77.5% of the questions we examined and that, of these questions, it is able to provide correct or partially correct answers in 55.6% of cases, provide correct or partially correct explanations of answers in 53.0% of cases, and that prompting the tool in a shared question context leads to a marginally higher rate of correct answers and explanations. Based on these findings, we discuss the potential promises and perils related to the use of ChatGPT by students and instructors.
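The "shared question context" finding corresponds to asking related questions within one conversation rather than independently; a minimal sketch, with the questions and model name as illustrative assumptions:

```python
# Sketch of shared-context prompting: earlier turns stay in the message list,
# so each answer conditions on the questions and answers that came before it.
from openai import OpenAI

client = OpenAI()
messages = [{"role": "system", "content": "Answer software testing exam questions."}]

for question in [
    "What is the difference between black-box and white-box testing?",
    "Which of the two does branch coverage belong to, and why?",
]:
    messages.append({"role": "user", "content": question})
    resp = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
    answer = resp.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})  # keep shared context
    print(answer)
```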
Maryam Taeb, Ruijia Cheng, E. Schoop + 3 others
3 October 2023
Developers and quality assurance testers often rely on manual testing to test accessibility features throughout the product lifecycle. Unfortunately, manual testing can be tedious, often has an overwhelming scope, and can be difficult to schedule amongst other development milestones. Recently, Large Language Models (LLMs) have been used for a variety of tasks including automation of UIs. However, to our knowledge, no one has yet explored the use of LLMs in controlling assistive technologies for the purposes of supporting accessibility testing. In this paper, we explore the requirements of a natural language based accessibility testing workflow, starting with a formative study. From this we build a system that takes a manual accessibility test instruction in natural language (e.g., “Search for a show in VoiceOver”) as input and uses an LLM combined with pixel-based UI Understanding models to execute the test and produce a chaptered, navigable video. In each video, to help QA testers, we apply heuristics to detect and flag accessibility issues (e.g., Text size not increasing with Large Text enabled, VoiceOver navigation loops). We evaluate this system through a 10-participant user study with accessibility QA professionals who indicated that the tool would be very useful in their current work and performed tests similarly to how they would manually test the features. The study also reveals insights for future work on using LLMs for accessibility testing.
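One of the heuristics this paper mentions can be sketched as a simple comparison of rendered text heights before and after Large Text is enabled; the measurements below are hypothetical stand-ins for output from the paper's pixel-based UI understanding models.

```python
# Hedged sketch: flag elements whose rendered text height does not grow once
# Large Text is enabled. Heights (in pixels) would come from a pixel-based
# UI understanding model; the values below are made up for illustration.
def flag_large_text_issues(before: dict[str, float], after: dict[str, float],
                           min_ratio: float = 1.1) -> list[str]:
    """Return ids of elements whose text grew by less than min_ratio."""
    return [
        element_id
        for element_id, height in before.items()
        if after.get(element_id, height) < height * min_ratio
    ]

issues = flag_large_text_issues({"title": 20.0, "caption": 12.0},
                                {"title": 28.0, "caption": 12.0})
print(issues)  # ['caption'] -- did not scale with Large Text enabled
```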
Ítalo Santos, Ronnie E. S. Santos, C. Magalhães + 1 other
8 December 2023
A Large Language Model (LLM) represents a cutting-edge artificial intelligence model that generates content, including grammatical sentences, human-like paragraphs, and syntactically correct code snippets. LLMs can play a pivotal role in software development, including software testing. LLMs go beyond traditional roles such as requirement analysis and documentation and can support test case generation, making them valuable tools that significantly enhance testing practices within the field. Hence, we explore the practical application of LLMs in software testing within an industrial setting, focusing on their current use by professional testers. In this context, rather than relying on existing data, we conducted a cross-sectional survey and collected data within real working contexts, specifically by engaging with practitioners in industrial settings. We applied quantitative and qualitative techniques to analyze and synthesize our collected data. Our findings demonstrate that LLMs effectively enhance testing documents and significantly assist testing professionals in programming tasks like debugging and test case automation. LLMs can support individuals engaged in manual testing who need to code. However, it is crucial to emphasize that, at this early stage, software testing professionals should use LLMs with caution while well-defined methods and guidelines are being built for the secure adoption of these tools.