Automated DevOps Pipeline Generation for Code Repositories using Large Language Models

Kartik Rawool Bowen Xu Deep Mehta + 1 penulis

Abstrak

Automating software development processes through the orchestration of GitHub Action workflows has revolutionized the efficiency and agility of software delivery pipelines. This paper presents a detailed investigation into the use of Large Language Models (LLMs) specifically, GPT 3.5 and GPT 4 to generate and evaluate GitHub Action workflows for DevOps tasks. Our methodology involves data collection from public GitHub repositories, prompt engineering for LLM utilization, and evaluation metrics encompassing exact match scores, BLEU scores, and a novel DevOps Aware score. The research scrutinizes the proficiency of GPT 3.5 and GPT 4 in generating GitHub workflows, while assessing the influence of various prompt elements in constructing the most efficient pipeline. Results indicate substantial advancements in GPT 4, particularly in DevOps awareness and syntax correctness. The research introduces a GitHub App built on Probot, empowering users to automate workflow generation within GitHub ecosystem. This study contributes insights into the evolving landscape of AI-driven automation in DevOps practices.

Artikel Ilmiah Terkait

On the Use of GitHub Actions in Software Development Repositories

Alexandre Decan T. Mens Pooya Rostami Mazrae + 1 lainnya

1 Oktober 2022

GitHub Actions was introduced in 2019 and constitutes an integrated alternative to CI/CD services for GitHub repositories. The deep integration with GitHub allows repositories to easily automate software development workflows. This paper empirically studies the use of GitHub Actions on a dataset comprising 68K repositories on GitHub, of which 43.9% are using GitHub Actions workflows. We analyse which workflows are automated and identify the most frequent automation practices. We show that reuse of actions is a common practice, even if this reuse is concentrated in a limited number of actions. We study which actions are most frequently used and how workflows refer to them. Furthermore, we discuss the related security and versioning aspects. As such, we provide an overview of the use of GitHub Actions, constituting a necessary first step towards a better understanding of this emerging ecosystem and its implications on collaborative software development in the GitHub social coding platform.

The State of the ML-universe: 10 Years of Artificial Intelligence & Machine Learning Software Development on GitHub

Thomas Zimmermann Danielle Gonzalez Nachiappan Nagappan

1 Mei 2020

In the last few years, artificial intelligence (AI) and machine learning (ML) have become ubiquitous terms. These powerful techniques have escaped obscurity in academic communities with the recent onslaught of AI & ML tools, frameworks, and libraries that make these techniques accessible to a wider audience of developers. As a result, applying AI & ML to solve existing and emergent problems is an increasingly popular practice. However, little is known about this domain from the software engineering perspective. Many AI & ML tools and applications are open source, hosted on platforms such as GitHub that provide rich tools for large-scale distributed software development. Despite widespread use and popularity, these repositories have never been examined as a community to identify unique properties, development patterns, and trends. In this paper, we conducted a large-scale empirical study of AI & ML Tool (700) and Application (4,524) repositories hosted on GitHub to develop such a characterization. While not the only platform hosting AI & ML development, GitHub facilitates collecting a rich data set for each repository with high traceability between issues, commits, pull requests and users. To compare the AI & ML community to the wider population of repositories, we also analyzed a set of 4,101 unrelated repositories. We enhance this characterization with an elaborate study of developer workflow that measures collaboration and autonomy within a repository. We've captured key insights of this community's 10 year history such as it's primary language (Python) and most popular repositories (Tensorflow, Tesseract). Our findings show the AI & ML community has unique characteristics that should be accounted for in future research.

Software Engineering Using Autonomous Agents: Are We There Yet?

Kuntal Dey Sankar Narayan Das Samdyuti Suri + 3 lainnya

11 September 2023

Autonomous agents equipped with Large Language Models (LLMs) are rapidly gaining prominence as a revolutionary technology within the realm of Software Engineering. These intelligent and autonomous systems demonstrate the capacity to perform tasks and make independent decisions, leveraging their intrinsic reasoning and decision-making abilities. This paper delves into the current state of autonomous agents, their capabilities, challenges, and opportunities in Software Engineering practices. By employing different prompts (with or without context), we conclude the advantages of contextrich prompts for autonomous agents. Prompts with context enhance user requirement understanding, avoiding irrelevant details that could hinder task comprehension and degrade model performance, particularly when dealing with complex frameworks such as Spring Boot, Django, Flask, etc. This exploration is conducted using Auto-GPT (v0.3.0), an open-source application powered by GPT-3.5 and GPT-4 which intelligently connects the “thoughts” of Large Language Models (LLMs) to independently accomplish the assigned goals or tasks.

Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review

Shangxin Guo C. Hang C. Tan + 2 lainnya

1 Juni 2023

This paper provides a comprehensive review of the literature concerning the utilization of Natural Language Processing (NLP) techniques, with a particular focus on transformer-based large language models (LLMs) trained using Big Code, within the domain of AI-assisted programming tasks. LLMs, augmented with software naturalness, have played a crucial role in facilitating AI-assisted programming applications, including code generation, code completion, code translation, code refinement, code summarization, defect detection, and clone detection. Notable examples of such applications include the GitHub Copilot powered by OpenAI’s Codex and DeepMind AlphaCode. This paper presents an overview of the major LLMs and their applications in downstream tasks related to AI-assisted programming. Furthermore, it explores the challenges and opportunities associated with incorporating NLP techniques with software naturalness in these applications, with a discussion on extending AI-assisted programming capabilities to Apple’s Xcode for mobile software development. This paper also presents the challenges of and opportunities for incorporating NLP techniques with software naturalness, empowering developers with advanced coding assistance and streamlining the software development process.

A Benchmarking Proposal for DevOps Practices on Open Source Software Projects

David Benavides José Francisco Crespo Jos'e Manuel S'anchez Ruiz + 3 lainnya

28 April 2023

The popularity of open-source software (OSS) projects has grown significantly over the last few years with more organizations relying on them. As these projects become larger, the need for higher quality also increases. DevOps practices have been shown to improve quality and performance. The DORA benchmarking reports provide useful information to compare DevOps practices performance between organizations, but they focus on continuous deployment and delivery to production, while OSS projects focus on the continuous release of code and its impact on third parties. The DORA reports mention the increasing presence of OSS projects as they are widely used in the industry, but they have never been used to measure OSS projects performance levels. This study reveals that the DORA benchmark cannot be applied to OSS projects and proposes benchmarking metrics for OSS projects, being the first one that adapts the DORA metrics and applies them in OSS projects. The metrics proposed in this study for benchmarking OSS projects include Release Frequency and Lead Time For Released Changes to measure throughput, and Time To Repair Code and Bug Issues Rate to assess stability. In contrast to the DORA reports, where data is collected through manual surveys, in our proposal, data is collected automatically by a tool we developed that retrieves information from public GitHub repositories. This reduces the risk of survey-based data collection. Our study also shows the benchmark feasibility by applying it to four popular OSS projects: Angular, Kubernetes, Tensorflow, and VS Code. In addition, we proposed challenges that address the topics and future works to expand the knowledge and findings of this study. Overall, the findings of the study can help to improve future research on OSS projects and provide a better understanding and challenges of the role of DevOps practices in OSS projects.

Daftar Referensi

0 referensi

Tidak ada referensi ditemukan.

Artikel yang Mensitasi

0 sitasi

Tidak ada artikel yang mensitasi.