DOI: 10.1145/3512905
Published 30 March 2022 in Proc. ACM Hum. Comput. Interact.

How Domain Experts Work with Data: Situating Data Science in the Practices and Settings of Craftwork

Tom Steinberger, J. King, M. Ackerman + 1 more author

Abstract

Domain experts play an essential role in data science by helping data scientists situate their technical work beyond the statistical analysis of large datasets. How domain experts themselves may engage with data science tools as a type of end-user remains largely invisible. Understanding data science as domain expert-driven depends on understanding how domain experts use data. Drawing on an ethnographic study of a craft brewery in Korea, we show how craft brewers worked with data by situating otherwise abstract data within their brewing practices and settings. We contribute theoretical insight into how domain experts use data distinctly from technical data scientists in terms of their view of data (situated vs. abstract), purposes for engaging with data (guiding processes over predicting outcomes), and overall goals of using data (flexible control vs. precision). We propose four ways in which working with data can be supported through the design of data science tools, and discuss how craftwork can be a useful lens for integrating domain expert-driven understandings of data science into CSCW and HCI research.

Related Scholarly Articles

In the Backrooms of Data Science

P. Almklov, Thomas Østerlie, Elena Parmiggiani

2022

Much information systems research on data science treats data as preexisting objects and focuses on how these objects are analyzed. Such a view, however, overlooks the work involved in finding and preparing the data in the first place, such that they are available to be analyzed. In this paper, we draw on a longitudinal study of data management in the oil and gas industry to shed light on this backroom data work. We find that this type of work is qualitatively different from the front-stage data analytics in the realm of data science but is also deeply interwoven with it. We show that this work is unstable and bidirectional. That is, the work practices are constantly changing and must simultaneously take into account what data might be possible to access as well as the potential future uses of the data. It is also a collaborative endeavor involving cross-disciplinary expertise that seeks to establish control over data and is shaped by the epistemological orientation of the oil and gas domain.

The Craft and Coordination of Data Curation: Complicating Workflow Views of Data Science

Faye O. Polasek, E. Yakel, Sara Lafia + 5 others

9 February 2022

Data curation is the process of making a dataset fit-for-use and archivable. It is critical to data-intensive science because it makes complex data pipelines possible, studies reproducible, and data reusable. Yet the complexities of the hands-on, technical, and intellectual work of data curation are frequently overlooked or downplayed. Obscuring the work of data curation not only renders the labor and contributions of data curators invisible but also hides the impact that curators' work has on the later usability, reliability, and reproducibility of data. To better understand the work and impact of data curation, we conducted a close examination of data curation at a large social science data repository, the Inter-university Consortium for Political and Social Research (ICPSR). We asked: What does curatorial work entail at ICPSR, and what work is more or less visible to different stakeholders and in different contexts? And, how is that curatorial work coordinated across the organization? We triangulated accounts of data curation from interviews and records of curation in Jira tickets to develop a rich and detailed account of curatorial work. While we identified numerous curatorial actions performed by ICPSR curators, we also found that curators rely on a number of craft practices to perform their jobs. The reality of their work practices defies the rote sequence of events implied by many life cycle or workflow models. Further, we show that craft practices are needed to enact data curation best practices and standards. The craft that goes into data curation is often invisible to end users, but it is well recognized by ICPSR curators and their supervisors. Explicitly acknowledging and supporting data curators as craftspeople is important in creating sustainable and successful curatorial infrastructures.

Understanding Data Visualization Design Practice

Paul C. Parsons

17 August 2021

Professional roles for data visualization designers are growing in popularity, and interest in relationships between the academic research and professional practice communities is gaining traction. However, despite the potential for knowledge sharing between these communities, we have little understanding of the ways in which practitioners design in real-world, professional settings. Inquiry in numerous design disciplines indicates that practitioners approach complex situations in ways that are fundamentally different from those of researchers. In this work, I take a practice-led approach to understanding visualization design practice on its own terms. Twenty data visualization practitioners were interviewed and asked about their design process, including the steps they take, how they make decisions, and the methods they use. Findings suggest that practitioners do not follow highly systematic processes, but instead rely on situated forms of knowing and acting in which they draw from precedent and use methods and principles that are determined appropriate in the moment. These findings have implications for how visualization researchers understand and engage with practitioners, and how educators approach the training of future data visualization designers.

Excavating awareness and power in data science: A manifesto for trustworthy pervasive data research

Michael Zimmer, Katie Shilton, Matthew J. Bietz + 5 others

1 July 2021

Frequent public uproar over forms of data science that rely on information about people demonstrates the challenges of defining and demonstrating trustworthy digital data research practices. This paper reviews problems of trustworthiness in what we term pervasive data research: scholarship that relies on the rich information generated about people through digital interaction. We highlight the entwined problems of participant unawareness of such research and the relationship of pervasive data research to corporate datafication and surveillance. We suggest a way forward by drawing from the history of a different methodological approach in which researchers have struggled with trustworthy practice: ethnography. To grapple with the colonial legacy of their methods, ethnographers have developed analytic lenses and researcher practices that foreground relations of awareness and power. These lenses are inspiring but also challenging for pervasive data research, given the flattening of contexts inherent in digital data collection. We propose ways that pervasive data researchers can incorporate reflection on awareness and power within their research to support the development of trustworthy data science.

How do Data Science Workers Collaborate? Roles, Workflows, and Tools

Michael J. Muller, Dakuo Wang, Amy X. Zhang

18 January 2020

Today, the prominence of data science within organizations has given rise to teams of data science workers collaborating on extracting insights from data, as opposed to individual data scientists working alone. However, we still lack a deep understanding of how data science workers collaborate in practice. In this work, we conducted an online survey with 183 participants who work in various aspects of data science. We focused on their reported interactions with each other (e.g., managers with engineers) and with different tools (e.g., Jupyter Notebook). We found that data science teams are extremely collaborative and work with a variety of stakeholders and tools during the six common steps of a data science workflow (e.g., clean data and train model). We also found that the collaborative practices workers employ, such as documentation, vary according to the kinds of tools they use. Based on these findings, we discuss design implications for supporting data science team collaborations and future research directions.
