Using Large Language Models to Enhance Programming Error Messages
Abstract
A key part of learning to program is learning to understand programming error messages. They can be hard to interpret, and identifying the cause of errors can be time-consuming. One factor in this challenge is that the messages are typically intended for an audience that already knows how to program, or even for programming environments that then use the information to highlight areas in code. Researchers have been working on making these errors more novice-friendly since the 1960s; however, progress has been slow. The present work contributes to this stream of research by using large language models to enhance programming error messages with explanations of the errors and suggestions on how to fix them. Large language models can be used to create useful and novice-friendly enhancements to programming error messages that sometimes surpass the original programming error messages in interpretability and actionability. These results provide further evidence of the benefits of large language models for computing educators, highlighting their use in areas known to be challenging for students. We further discuss the benefits and downsides of large language models and highlight future streams of research for enhancing programming error messages.
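In practice, the approach described above amounts to prompting a model with the student's code and the raw error message and asking for a plain-language explanation plus a suggested fix. The sketch below illustrates that idea only; the paper's experiments used OpenAI Codex, so the chat completions client, the model name, and the prompt wording here are assumptions chosen for readability, not the authors' exact setup.

# Illustrative sketch only: the client library, model name, and prompt wording
# are assumptions, not the paper's original Codex-based setup.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def enhance_error_message(source_code: str, error_message: str) -> str:
    """Ask an LLM for a novice-friendly explanation of an error and a fix suggestion."""
    prompt = (
        "A student's Python program produced the error message below.\n"
        "Explain in plain language what the error means and suggest how to fix it.\n\n"
        f"Code:\n{source_code}\n\nError message:\n{error_message}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name; any capable chat model would do
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Example: a classic novice syntax error
buggy_code = "for i in range(10)\n    print(i)"
raw_error = "SyntaxError: expected ':'"
print(enhance_error_message(buggy_code, raw_error))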
Related Scientific Articles
Ioannis Karvelas Joe Dillane Brett A. Becker
25 February 2020
Improving the feedback that novices receive from programming environments is an important and often overlooked aspect of computing education research. This work in progress examines the effects of various mechanisms by which environments deliver feedback to users. By providing insights on the effects of these mechanisms, we aim to inform designers, developers and educators about more effective design and use of such environments for students.
Dominic Lohr H. Keuning Natalie Kiesler
31 August 2023
Ever since the emergence of large language models (LLMs) and related applications, such as ChatGPT, their performance and error analysis for programming tasks have been subject to research. In this work-in-progress paper, we explore the potential of such LLMs for computing educators and learners, as we analyze the feedback they generate for a given input containing program code. In particular, we aim at (1) exploring how an LLM like ChatGPT responds to students seeking help with their introductory programming tasks, and (2) identifying feedback types in its responses. To achieve these goals, we used students' programming sequences from a dataset gathered within a CS1 course as input for ChatGPT, along with questions required to elicit feedback and correct solutions. The results show that ChatGPT performs reasonably well for some of the introductory programming tasks and student errors, which means that students can potentially benefit. However, educators should provide guidance on how to use the provided feedback, as it can contain misleading information for novices.
Jaromír Šavelka Paul Denny Brad E. Sheese + 1 more
14 August 2023
Computing educators face significant challenges in providing timely support to students, especially in large class settings. Large language models (LLMs) have emerged recently and show great promise for providing on-demand help at a large scale, but there are concerns that students may over-rely on the outputs produced by these models. In this paper, we introduce CodeHelp, a novel LLM-powered tool designed with guardrails to provide on-demand assistance to programming students without directly revealing solutions. We detail the design of the tool, which incorporates a number of useful features for instructors, and elaborate on the pipeline of prompting strategies we use to ensure generated outputs are suitable for students. To evaluate CodeHelp, we deployed it in a first-year computer and data science course with 52 students and collected student interactions over a 12-week period. We examine students’ usage patterns and perceptions of the tool, and we report reflections from the course instructor and a series of recommendations for classroom use. Our findings suggest that CodeHelp is well-received by students who especially value its availability and help with resolving errors, and that for instructors it is easy to deploy and complements, rather than replaces, the support that they provide to students.
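One way to read the "guardrails" idea in the CodeHelp abstract is as a system prompt that constrains the model's behaviour before any student input is seen. The following is a hypothetical sketch of such a prompt and the surrounding message structure; the wording and helper function are assumptions, not CodeHelp's actual prompting pipeline.

# Hypothetical guardrail prompt in the spirit of CodeHelp; the wording and the
# helper below are assumptions, not the tool's actual prompt pipeline.
GUARDRAIL_SYSTEM_PROMPT = (
    "You are a teaching assistant for an introductory programming course. "
    "Help the student understand their error and point them toward a fix, "
    "but never write out a complete corrected solution. "
    "If asked for a full solution, respond with hints and guiding questions instead."
)

def build_messages(student_question: str, code_snippet: str) -> list[dict]:
    """Assemble a chat request with the guardrail instructions placed first."""
    return [
        {"role": "system", "content": GUARDRAIL_SYSTEM_PROMPT},
        {"role": "user", "content": f"{student_question}\n\nCode:\n{code_snippet}"},
    ]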
Sumit Gulwani Tung Phung Tobias Kohn + 4 more
24 January 2023
Large language models (LLMs), such as Codex, hold great promise in enhancing programming education by automatically generating feedback for students. We investigate using LLMs to generate feedback for fixing syntax errors in Python programs, a key scenario in introductory programming. More concretely, given a student's buggy program, our goal is to generate feedback comprising a fixed program along with a natural language explanation describing the errors/fixes, inspired by how a human tutor would give feedback. While using LLMs is promising, the critical challenge is to ensure high precision in the generated feedback, which is imperative before deploying such technology in classrooms. The main research question we study is: Can we develop LLM-based feedback generation techniques with a tunable precision parameter, giving educators quality control over the feedback that students receive? To this end, we introduce PyFiXV, our technique to generate high-precision feedback powered by Codex. The key idea behind PyFiXV is to use a novel run-time validation mechanism to decide whether the generated feedback is suitable for sharing with the student; notably, this validation mechanism also provides a precision knob to educators. We perform an extensive evaluation using two real-world datasets of Python programs with syntax errors and show the efficacy of PyFiXV in generating high-precision feedback.
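The abstract does not spell out how the run-time validation works, so the sketch below shows only one plausible ingredient under stated assumptions: sample several candidate fixes, show feedback only if enough of them are syntactically valid Python, and expose the required fraction as a crude precision knob. This is an illustration of the general idea, not PyFiXV's actual mechanism.

# Minimal sketch of a run-time validation idea, assuming the student's code is
# Python; the threshold acts as a simple precision knob. Not PyFiXV's mechanism.
import ast

def parses(program: str) -> bool:
    """Return True if the program contains no Python syntax errors."""
    try:
        ast.parse(program)
        return True
    except SyntaxError:
        return False

def validate_feedback(candidate_fixes: list[str], threshold: float = 0.5) -> bool:
    """Share feedback only if enough sampled candidate fixes are syntactically valid."""
    if not candidate_fixes:
        return False
    valid = sum(parses(fix) for fix in candidate_fixes)
    return valid / len(candidate_fixes) >= threshold

# Example: two of the three sampled fixes parse, so feedback passes at threshold 0.5
print(validate_feedback(["print(1)", "print(2)", "print(3"]))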
F. Hermans
1 August 2020
One of the aspects of programming that learners often struggle with is the syntax of programming languages: remembering the right commands to use and combining those into a working program. Prior research demonstrated that students submit source code with syntax errors in 73% of cases and even the best students do so in 50% of cases. An analysis of 37 million compilations by 250,000 students found that the most common error was a syntax error, which occurred in almost 800,000 compilations. It was also found that Java and Perl are not easier to understand than a programming language with randomly generated keywords, stressing the difficulties that novices face in understanding syntax. This paper presents Hedy: a new way of teaching the syntax of a programming language to novices, inspired by educational methods by which punctuation is taught to children. Hedy starts as a simple programming language without any syntactic elements such as brackets, colons or indentation. The rules slowly and gradually change until the novices are programming in Python. Hedy is evaluated on 9,714 programs.