Accelerating Software Development Using Generative AI: ChatGPT Case Study
Abstrak
The Software Development Life Cycle (SDLC) comprises multiple phases, each requiring Subject Matter Experts (SMEs) with phase-specific skills. The efficacy and quality of deliverables of each phase are skill dependent. In recent times, Generative AI techniques, including Large-scale Language Models (LLMs) like GPT, have become significant players in software engineering. These models, trained on extensive text data, can offer valuable contributions to software development. Interacting with LLMs involves feeding prompts with the context information and guiding the generation of textual responses. The quality of the response is dependent on the quality of the prompt given. This paper proposes a systematic prompting approach based on meta-model concepts for SDLC phases. The approach is validated using ChatGPT for small but complex business application development. We share the approach and our experience, learnings, benefits obtained, and the challenges encountered while applying the approach using ChatGPT. Our experience indicates that Generative AI techniques, such as ChatGPT, have the potential to reduce the skills barrier and accelerate software development substantially.
Artikel Ilmiah Terkait
Maleknaz Nayebi Song Wang Clark Tang + 3 lainnya
2023
The advancements in Large Language Models (LLMs) have opened up new opportunities for Automated Software Engineering (ASE). Two orthogonal approaches have been widely used to customize LLMs for ASE tasks, i.e., prompt engineering and fine-tuning. Prompt engineering-based approaches leverage different prompting strategies to query LLMs (e.g., ChatGPT ) to automate a task such as code generation, while fine-tuning-based approaches further train the pre-trained models (e.g., CodeBERT ) on customized data related to the down-stream task (in this example, code generation datasets). However, to date, there is no comprehensive and in-depth analysis of these two orthogonal approaches, in the ASE literature. In this paper, we investigate the effectiveness of state-of-the-art LLM, i.e., GPT-4 , with three different prompting engineering techniques (i.e., basic prompting, in-context learning, and task-specific prompting) against 18 fine-tuned LLMs on three typical ASE tasks, i.e., code generation, code summarization, and code translation. Our quantitative analysis of these prompting strategies suggests that prompt engineering GPT-4 cannot necessarily and significantly outperform fine-tuning smaller/older LLMs in all three tasks. For comment generation, GPT-4 with the best prompting strategy (i.e., task-specific prompt) had outperformed the first-ranked fine-tuned model by 8.33% points on average in BLEU. However, for code generation, the first-ranked fine-tuned model outperforms GPT-4 with best prompting by 16.61% and 28.3% points, on average in BLEU. For code translation, GPT-4 and fine-tuned baselines tie as they outperform each other on different translation tasks. To explore the impact of different prompting strategies, we conducted a user study with 27 graduate students and 10 industry practitioners. From our qualitative analysis, we find that the GPT-4 with conversational prompts (i
Sourav Mazumdar G. Sridhara Ranjani H.G.
26 Mei 2023
ChatGPT (Chat Generative Pre-trained Transformer) is a chatbot launched by OpenAI on November 30, 2022. OpenAI's GPT-3 family of large language models serve as the foundation for ChatGPT. ChatGPT is fine-tuned with both supervised and reinforcement learning techniques and has received widespread attention for its articulate responses across diverse domains of knowledge. In this study, we explore how ChatGPT can be used to help with common software engineering tasks. Many of the ubiquitous tasks covering the breadth of software engineering such as ambiguity resolution in software requirements, method name suggestion, test case prioritization, code review, log summarization can potentially be performed using ChatGPT. In this study, we explore fifteen common software engineering tasks using ChatGPT. We juxtapose and analyze ChatGPT's answers with the respective state of the art outputs (where available) and/or human expert ground truth. Our experiments suggest that for many tasks, ChatGPT does perform credibly and the response from it is detailed and often better than the human expert output or the state of the art output. However, for a few other tasks, ChatGPT in its present form provides incorrect answers and hence is not suited for such tasks.
Costain Nachuma A. I. Champa Md. Fazle Rabbi + 1 lainnya
15 April 2024
The emergence of AI tools such as ChatGPT is being used to assist with software development, but little is known of how developers utilize these tools as well as the capabilities of these tools in software engineering tasks. Using the DevGPT dataset, we conduct quantitative analyses of the tasks developers seek assistance from ChatGPT and how effectively ChatGPT addresses them. We also examine the impact of initial prompt quality on conversation length. The findings reveal where ChatGPT is most and least suited to assist in the identified 12 software development tasks. The insights from this research would guide the software developers, researchers, and AI tool providers in optimizing these tools for more effective programming aid.
Kuntal Dey Sankar Narayan Das Samdyuti Suri + 3 lainnya
11 September 2023
Autonomous agents equipped with Large Language Models (LLMs) are rapidly gaining prominence as a revolutionary technology within the realm of Software Engineering. These intelligent and autonomous systems demonstrate the capacity to perform tasks and make independent decisions, leveraging their intrinsic reasoning and decision-making abilities. This paper delves into the current state of autonomous agents, their capabilities, challenges, and opportunities in Software Engineering practices. By employing different prompts (with or without context), we conclude the advantages of contextrich prompts for autonomous agents. Prompts with context enhance user requirement understanding, avoiding irrelevant details that could hinder task comprehension and degrade model performance, particularly when dealing with complex frameworks such as Spring Boot, Django, Flask, etc. This exploration is conducted using Auto-GPT (v0.3.0), an open-source application powered by GPT-3.5 and GPT-4 which intelligently connects the “thoughts” of Large Language Models (LLMs) to independently accomplish the assigned goals or tasks.
B. Chen Yanlin Wang Dongmei Zhang + 6 lainnya
25 Agustus 2023
Software development plays a crucial role in driving innovation and efficiency across modern societies. To meet the demands of this dynamic field, there is a growing need for an effective software development assistant. However, existing large language models represented by ChatGPT suffer from limited accessibility, including training data and model weights. Although other large open-source models like LLaMA have shown promise, they still struggle with understanding human intent. In this paper, we present SoTaNa, an open-source software development assistant. SoTaNa utilizes ChatGPT to generate high-quality instruction-based data for the domain of software engineering and employs a parameter-efficient fine-tuning approach to enhance the open-source foundation model, LLaMA. We evaluate the effectiveness of \our{} in answering Stack Overflow questions and demonstrate its capabilities. Additionally, we discuss its capabilities in code summarization and generation, as well as the impact of varying the volume of generated data on model performance. Notably, SoTaNa can run on a single GPU, making it accessible to a broader range of researchers. Our code, model weights, and data are public at \url{https://github.com/DeepSoftwareAnalytics/SoTaNa}.
Daftar Referensi
0 referensiTidak ada referensi ditemukan.
Artikel yang Mensitasi
0 sitasiTidak ada artikel yang mensitasi.