DOI: 10.1145/3489465
Published on 24 February 2021 in ACM Trans. Comput. Hum. Interact.

Documentation Matters: Human-Centered AI System to Assist Data Science Code Documentation in Computational Notebooks

Xuye Liu Jaimie Drozdal Justin D. Weisz + 6 authors

Abstract

Computational notebooks allow data scientists to express their ideas through a combination of code and documentation. However, data scientists often pay attention only to the code, and neglect creating or updating their documentation during quick iterations. Inspired by human documentation practices learned from 80 highly-voted Kaggle notebooks, we design and implement Themisto, an automated documentation generation system to explore how human-centered AI systems can support human data scientists in the machine learning code documentation scenario. Themisto facilitates the creation of documentation via three approaches: a deep-learning-based approach to generate documentation for source code, a query-based approach to retrieve online API documentation for source code, and a user prompt approach to nudge users to write documentation. We evaluated Themisto in a within-subjects experiment with 24 data science practitioners, and found that automated documentation generation techniques reduced the time for writing documentation, reminded participants to document code they would have ignored, and improved participants’ satisfaction with their computational notebook.
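
To make the abstract's three documentation strategies concrete, here is a small illustrative sketch in the spirit of Themisto rather than its actual implementation: the function names, the stubbed summary model, and the API-documentation lookup table are all assumptions introduced for this example.

```python
# Hypothetical sketch of combining the three documentation strategies described
# in the abstract; names and logic are illustrative, not Themisto's real code.
import re

def retrieve_api_docs(code_cell: str, api_docs: dict) -> list[str]:
    """Query-based approach: look up API documentation for calls found in the cell."""
    calls = re.findall(r"(\w+(?:\.\w+)+)\(", code_cell)
    return [api_docs[c] for c in calls if c in api_docs]

def generate_summary(code_cell: str) -> str:
    """Deep-learning-based approach (stubbed): a seq2seq model would summarize the code."""
    first_line = code_cell.strip().splitlines()[0]
    return f"TODO (model suggestion): describe what `{first_line}` does."

def suggest_documentation(code_cell: str, api_docs: dict) -> str:
    """Combine generation, retrieval, and a user prompt into one Markdown suggestion."""
    pieces = [generate_summary(code_cell)]
    pieces += retrieve_api_docs(code_cell, api_docs)
    pieces.append("Prompt: please note why this step matters for your analysis.")
    return "\n\n".join(pieces)

if __name__ == "__main__":
    docs = {"df.dropna": "`DataFrame.dropna` removes rows with missing values."}
    cell = "df = df.dropna()\nprint(df.shape)"
    print(suggest_documentation(cell, docs))
```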

Related Research Articles

Telling Stories from Computational Notebooks: AI-Assisted Presentation Slides Creation for Presenting Data Science Work

Dakuo Wang Xiaojuan Ma Chengbo Zheng + 1 more

21 March 2022

Creating presentation slides is a critical but time-consuming task for data scientists. While researchers have proposed many AI techniques to lift data scientists’ burden on data preparation and model selection, few have targeted the presentation creation task. Based on the needs identified from a formative study, this paper presents NB2Slides, an AI system that helps users compose presentations of their data science work. NB2Slides uses deep learning methods as well as example-based prompts to generate slides from computational notebooks, and takes users’ input (e.g., audience background) to structure the slides. NB2Slides also provides an interactive visualization that links the slides with the notebook to help users further edit the slides. A follow-up user evaluation with 12 data scientists shows that participants believed NB2Slides can improve efficiency and reduce the complexity of creating slides. Yet, participants questioned the future of full automation and suggested a human-AI collaboration paradigm.

Natural Language to Code Generation in Interactive Data Science Notebooks

Wen-Ding Li H. Michalewski A. Rao + 9 more

19 December 2022

Computational notebooks, such as Jupyter notebooks, are interactive computing environments that are ubiquitous among data scientists for performing data wrangling and analytic tasks. To measure the performance of AI pair programmers that automatically synthesize programs for those tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1078 code generation problems using the pandas data analysis framework in data science notebooks. ARCADE features multiple rounds of NL-to-code problems from the same notebook. It requires a model to understand rich multi-modal contexts, such as existing notebook cells and their execution states as well as previous turns of interaction. To establish a strong baseline on this challenging task, we develop PaChiNCo, a 62B code language model (LM) for Python computational notebooks, which significantly outperforms public code LMs. Finally, we explore few-shot prompting strategies to elicit better code with step-by-step decomposition and NL explanation, showing the potential to improve the diversity and explainability of model predictions. ARCADE is publicly available at https://github.com/google-research/arcade-nl2code/.
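
The multi-turn, notebook-grounded setup that ARCADE evaluates can be illustrated with a toy exchange like the one below; the DataFrame, intents, and code are invented for this sketch and are not drawn from the benchmark itself.

```python
# Illustrative sketch (not an actual ARCADE problem): a multi-turn NL-to-code
# exchange over a pandas DataFrame, of the kind the benchmark evaluates.
import pandas as pd

# Prior notebook context the model is expected to read.
sales = pd.DataFrame({
    "region":  ["EU", "EU", "US", "US"],
    "year":    [2020, 2021, 2020, 2021],
    "revenue": [1.2, 1.5, 2.0, 2.4],
})

# Turn 1 intent: "Total revenue per region."
per_region = sales.groupby("region")["revenue"].sum()

# Turn 2 intent, building on the previous turn: "Which region grew the most
# from 2020 to 2021?"
growth = (sales.pivot(index="region", columns="year", values="revenue")
               .assign(delta=lambda d: d[2021] - d[2020]))
best_region = growth["delta"].idxmax()
print(per_region, best_region, sep="\n")
```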

AutoDS: Towards Human-Centered Automation of Data Science

Dakuo Wang Erick Oduor Justin D. Weisz + 2 more

13 January 2021

Data science (DS) projects often follow a lifecycle that consists of laborious tasks for data scientists and domain experts (e.g., data exploration, model training, etc.). Only recently have machine learning (ML) researchers developed promising automation techniques to aid data workers in these tasks. This paper introduces AutoDS, an automated machine learning (AutoML) system that aims to leverage the latest ML automation techniques to support data science projects. Data workers only need to upload their dataset; the system can then automatically suggest ML configurations, preprocess the data, select an algorithm, and train the model. These suggestions are presented to the user via a web-based graphical user interface and a notebook-based programming user interface. Our goal is to offer a systematic investigation of user interaction and perceptions of using an AutoDS system in solving a data science task. We studied AutoDS with 30 professional data scientists, where one group used AutoDS and the other did not, to complete a data science project. As expected, AutoDS improves productivity; yet surprisingly, we find that the models produced by the AutoDS group have higher quality and fewer errors, but lower human confidence scores. We reflect on the findings by presenting design implications for incorporating automation techniques into human work in the data science lifecycle.
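
The upload-then-train flow the abstract describes is, at its core, an AutoML-style pipeline. The sketch below approximates such a flow with scikit-learn as a stand-in; it is not AutoDS's interface, and the function name and defaults are assumptions made for illustration.

```python
# Minimal sketch of an upload-and-train flow like the one the abstract
# describes, using scikit-learn as a stand-in; AutoDS's real pipeline differs.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

def auto_train(df: pd.DataFrame, target: str):
    """Preprocess the data, pick a default algorithm, and report CV accuracy."""
    X, y = df.drop(columns=[target]), df[target]
    cat = X.select_dtypes(include="object").columns
    num = X.select_dtypes(exclude="object").columns
    pre = ColumnTransformer([
        ("num", SimpleImputer(strategy="median"), num),
        ("cat", OneHotEncoder(handle_unknown="ignore"), cat),
    ])
    model = Pipeline([("pre", pre), ("clf", RandomForestClassifier(random_state=0))])
    scores = cross_val_score(model, X, y, cv=3)
    return model.fit(X, y), scores.mean()
```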

The State of the ML-universe: 10 Years of Artificial Intelligence & Machine Learning Software Development on GitHub

Thomas Zimmermann Danielle Gonzalez Nachiappan Nagappan

1 May 2020

In the last few years, artificial intelligence (AI) and machine learning (ML) have become ubiquitous terms. These powerful techniques have escaped obscurity in academic communities with the recent onslaught of AI & ML tools, frameworks, and libraries that make these techniques accessible to a wider audience of developers. As a result, applying AI & ML to solve existing and emergent problems is an increasingly popular practice. However, little is known about this domain from the software engineering perspective. Many AI & ML tools and applications are open source, hosted on platforms such as GitHub that provide rich tools for large-scale distributed software development. Despite widespread use and popularity, these repositories have never been examined as a community to identify unique properties, development patterns, and trends. In this paper, we conducted a large-scale empirical study of AI & ML Tool (700) and Application (4,524) repositories hosted on GitHub to develop such a characterization. While not the only platform hosting AI & ML development, GitHub facilitates collecting a rich data set for each repository with high traceability between issues, commits, pull requests, and users. To compare the AI & ML community to the wider population of repositories, we also analyzed a set of 4,101 unrelated repositories. We enhance this characterization with an elaborate study of developer workflow that measures collaboration and autonomy within a repository. We've captured key insights of this community's 10-year history, such as its primary language (Python) and most popular repositories (Tensorflow, Tesseract). Our findings show the AI & ML community has unique characteristics that should be accounted for in future research.

Training and Evaluating a Jupyter Notebook Data Science Assistant

Neel Sundaresan Guillermo Serrato Shubham Chandel + 1 more

30 January 2022

We study the feasibility of a Data Science assistant powered by a sequence-to-sequence transformer by training a new model, JuPyT5, on all publicly available Jupyter Notebook GitHub repositories and developing a new metric: Data Science Problems (DSP). DSP is a collection of 1119 problems curated from 306 pedagogical notebooks with 92 dataset dependencies, natural language and Markdown problem descriptions, and assert-based unit tests. These notebooks were designed to test university students' mastery of various Python implementations of Math and Data Science, and we now leverage them to study the ability of JuPyT5 to understand and pass the tests. We analyze the content of DSP, validate its quality, and find that given 100 sampling attempts, JuPyT5 is able to solve 77.5% of the DSP problems. We further present various ablation and statistical analyses and compare DSP to other recent natural language to code benchmarks.
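
A DSP-style task pairs a natural-language problem statement with an assert-based unit test that a model-generated completion must pass. The example below is invented to illustrate that shape and is not taken from the DSP collection.

```python
# Invented example in the style of an assert-based DSP problem: a Markdown
# prompt states the task, the model fills in the function body, and the
# notebook's unit test checks the result.
import numpy as np

# Problem statement (paraphrased Markdown): "Return the z-score normalization
# of a 1-D array."
def zscore(x: np.ndarray) -> np.ndarray:
    return (x - x.mean()) / x.std()

# Assert-based test the completion must pass.
out = zscore(np.array([1.0, 2.0, 3.0]))
assert np.allclose(out.mean(), 0.0) and np.allclose(out.std(), 1.0)
```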
