A Deeper Look into the EU Text and Data Mining Exceptions: Harmonisation, Data Ownership, and the Future of Technology
Abstract
This paper focuses on the two exceptions for text and data mining (TDM) introduced in the Directive on Copyright in the Digital Single Market (CDSM). While both are mandatory for Member States to implement, Art. 3 is also imperative, i.e. it cannot be overridden by contract, and applies to text and data mining for the purpose of scientific research by research and cultural heritage institutions; Art. 4, on the other hand, permits text and data mining by anyone, but allows rightholders to ‘contract out’. We trace the policy context of using the lever of copyright law to enable emerging technologies such as AI and to support innovation. Within the EU copyright intervention, we identify elements that may underpin a transparent legal framework for AI, such as the possibility of retaining permanent copies for further verification. On the other hand, we identify several pitfalls, including an excessively broad definition of TDM, which makes the entire field of data-driven AI development dependent on an exception. We analyse the implications of limiting the scope of the exceptions to the right of reproduction; we argue that the limitation of Art. 3 to certain beneficiaries remains problematic, and that the requirement of lawful access is difficult to operationalise. In conclusion, we argue that there should be no need for a TDM exception for the act of extracting informational value from protected works. Paradoxically, the EU’s CDSM provisions may favour the development of biased AI systems, because the price and accessibility conditions for training data offer the wrong incentives: to avoid licensing, it may be economically attractive for EU-based developers to train their algorithms on older, less accurate, biased data, or to import AI models already trained abroad on unverifiable data.
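To make the technical premise concrete, the following is a minimal, hypothetical sketch of a TDM pipeline (the corpus path and the choice of statistic are illustrative, not drawn from the paper). It shows why TDM implicates the reproduction right at all: the work must be copied before any information can be extracted from it, even though the mining output consists of facts about the work rather than the work itself.

```python
# A minimal sketch of a TDM pipeline. The corpus path is hypothetical.
from collections import Counter
import re

def mine_term_frequencies(path: str) -> Counter:
    # Step 1: a temporary reproduction of the protected work is made
    # (here, the text is read into memory); this is the act that
    # implicates the reproduction right.
    with open(path, encoding="utf-8") as f:
        text = f.read()
    # Step 2: informational value is extracted (here, term frequencies);
    # the output consists of facts about the work, not the work itself.
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)

frequencies = mine_term_frequencies("corpus/novel.txt")
print(frequencies.most_common(10))
# Step 3: once the function returns, the in-memory copy is discarded;
# only the extracted, unprotectable information remains.
```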
Related Articles
David Donoho
2 October 2023
A purported ‘AI Singularity’ has been in the public eye recently. Mass media and US national political attention have focused on ‘AI Doom’ narratives hawked by social media influencers. The European Commission is announcing initiatives to forestall ‘AI Extinction’. In my opinion, ‘AI Singularity’ is the wrong narrative for what’s happening now; recent happenings signal something else entirely. Something fundamental to computation-based research really changed in the last ten years. In certain fields, progress is dramatically more rapid than previously, as the fields undergo a transition to frictionless reproducibility (FR). This transition markedly changes the rate of spread of ideas and practices, affects mindsets, and erases memories of much that came before. The emergence of frictionless reproducibility follows from the maturation of three data science principles in the last decade. Those principles involve data sharing, code sharing, and competitive challenges, implemented, however, in the particularly strong form of frictionless open services. Empirical Machine Learning (EML) is today’s leading adherent field, and its consequent rapid changes are responsible for the AI progress we see. Still, other fields can and do benefit when they adhere to the same principles. Many rapid changes from this maturation are misidentified. The advent of FR in EML generates a steady flow of innovations; this flow stimulates outsider intuitions that there’s an emergent superpower somewhere in AI. This opens the way for PR to push worrying narratives: not only ‘AI Extinction’, but also the supposed monopoly of big tech on AI research. The helpful narrative observes that the superpower of EML is adherence to frictionless reproducibility practices; these practices are responsible for the striking progress in AI that we see everywhere.
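As a hypothetical illustration of what ‘frictionless’ means in practice (the dataset, library, and metric below are our own illustrative choices, not Donoho’s), the three principles reduce re-running someone else’s experiment to a few lines: the data are one call away, the reference implementation is a library import, and a single agreed metric makes results comparable.

```python
# A hypothetical illustration of frictionless reproducibility: open data,
# shared reference code, and a common benchmark metric. The dataset and
# model choices are illustrative, not taken from the article.
from sklearn.datasets import fetch_openml
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Data sharing: the dataset is one call away, served by an open repository.
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X_train, X_test, y_train, y_test = train_test_split(
    X / 255.0, y, test_size=10_000, random_state=0
)

# Code sharing: the reference implementation is a library import away.
model = LogisticRegression(max_iter=200).fit(X_train, y_train)

# Challenge/benchmark: one agreed metric makes results directly comparable.
print(f"test accuracy: {model.score(X_test, y_test):.3f}")
```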
Rainer Mühlhoff
1 January 2023
Big data and artificial intelligence pose a new challenge for data protection, as these techniques allow predictions to be made about third parties based on the anonymous data of many people. Examples of predicted information include purchasing power, gender, age, health, sexual orientation, ethnicity, etc. The basis for such applications of “predictive analytics” is the comparison between the behavioral data (e.g. usage, tracking, or activity data) of the individual in question and the potentially anonymously processed data of many others, using machine learning models or simpler statistical methods. The article starts by noting that predictive analytics has a significant potential for abuse, which manifests itself in the form of social inequality, discrimination, and exclusion. This potential is not regulated by current data protection law in the EU; indeed, the use of anonymized mass data takes place in a largely unregulated space. Under the term “predictive privacy,” a data protection approach is presented that counters the risks of abuse of predictive analytics. A person’s predictive privacy is violated when personal information about them is predicted without their knowledge and against their will, based on the data of many other people. Predictive privacy is then formulated as a protected good, and improvements to data protection with regard to the regulation of predictive analytics are proposed. Finally, the article points out that the goal of data protection in the context of predictive analytics is the regulation of “prediction power,” a new manifestation of the informational power asymmetry between platform companies and society.
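A minimal sketch of the mechanism behind predictive analytics may help (the features, model, and attribute below are illustrative, not taken from the article): a model fitted on the behavioral data of many people who disclosed an attribute is then used to infer that attribute for a third party who never disclosed it.

```python
# A minimal sketch (features, model, and attribute are illustrative) of
# the mechanism behind predictive analytics: fit a model on the data of
# many people who disclosed an attribute, then infer that attribute for
# someone who never disclosed it.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Training population: behavioural features (e.g. usage or tracking
# signals) plus a voluntarily disclosed attribute acting as the label.
n_people, n_features = 5_000, 20
X_many = rng.normal(size=(n_people, n_features))
w_true = rng.normal(size=n_features)
y_many = (X_many @ w_true + rng.normal(size=n_people) > 0).astype(int)

model = LogisticRegression().fit(X_many, y_many)

# The target individual only ever produced behavioural data; the
# sensitive attribute is predicted, not collected.
x_target = rng.normal(size=(1, n_features))
print("inferred attribute probability:", model.predict_proba(x_target)[0, 1])
```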
Chris Russell, Sandra Wachter, B. Mittelstadt
3 March 2020
In recent years a substantial literature has emerged concerning bias, discrimination, and fairness in AI and machine learning. Connecting this work to existing legal non-discrimination frameworks is essential to create tools and methods that are practically useful across divergent legal regimes. While much work has been undertaken from an American legal perspective, comparatively little has mapped the effects and requirements of EU law. This Article addresses this critical gap between legal, technical, and organisational notions of algorithmic fairness. Through analysis of EU non-discrimination law and the jurisprudence of the European Court of Justice (ECJ) and national courts, we identify a critical incompatibility between European notions of discrimination and existing work on algorithmic and automated fairness. A clear gap exists between the statistical measures of fairness embedded in myriad fairness toolkits and governance mechanisms and the context-sensitive, often intuitive and ambiguous discrimination metrics and evidential requirements used by the ECJ; we refer to this approach as “contextual equality.”

This Article makes three contributions. First, we review the evidential requirements to bring a claim under EU non-discrimination law. Due to the disparate nature of algorithmic and human discrimination, the EU’s current requirements are too contextual, reliant on intuition, and open to judicial interpretation to be automated. Many of the concepts fundamental to bringing a claim, such as the composition of the disadvantaged and advantaged group, the severity and type of harm suffered, and requirements for the relevance and admissibility of evidence, require normative or political choices to be made by the judiciary on a case-by-case basis. We show that automating fairness or non-discrimination in Europe may be impossible because the law, by design, does not provide a static or homogeneous framework suited to testing for discrimination in AI systems.

Second, we show how the legal protection offered by non-discrimination law is challenged when AI, not humans, discriminates. Humans discriminate due to negative attitudes (e.g. stereotypes, prejudice) and unintentional biases (e.g. organisational practices or internalised stereotypes) which can act as a signal to victims that discrimination has occurred. Equivalent signalling mechanisms and agency do not exist in algorithmic systems. Compared to traditional forms of discrimination, automated discrimination is more abstract and unintuitive, subtle, intangible, and difficult to detect. The increasing use of algorithms disrupts traditional legal remedies and procedures for the detection, investigation, prevention, and correction of discrimination, which have predominantly relied upon intuition. Consistent assessment procedures that define a common standard for statistical evidence to detect and assess prima facie automated discrimination are urgently needed to support judges, regulators, system controllers and developers, and claimants.

Finally, we examine how existing work on fairness in machine learning lines up with procedures for assessing cases under EU non-discrimination law. A ‘gold standard’ for the assessment of prima facie discrimination has been advanced by the European Court of Justice but not yet translated into standard assessment procedures for automated discrimination. We propose ‘conditional demographic disparity’ (CDD) as a standard baseline statistical measurement that aligns with the Court’s ‘gold standard’. Establishing a standard set of statistical evidence for automated discrimination cases can help ensure consistent procedures for the assessment, but not the judicial interpretation, of cases involving AI and automated systems. Through this proposal for procedural regularity in the identification and assessment of automated discrimination, we clarify how to build considerations of fairness into automated systems as far as possible while still respecting and enabling the contextual approach to judicial interpretation practised under EU non-discrimination law.
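As a sketch of how CDD can be computed in practice (column names and the grouping attribute below are illustrative; this follows the common formalisation in which demographic disparity is calculated per stratum of a legitimate explanatory attribute and then averaged with weights proportional to stratum size):

```python
# A sketch of conditional demographic disparity (CDD). Demographic
# disparity (DD) for a protected group is its share among rejected
# cases minus its share among accepted cases; CDD averages DD across
# strata of a legitimate explanatory attribute, weighted by stratum size.
# Assumes each stratum contains both accepted and rejected cases.
import pandas as pd

def demographic_disparity(df: pd.DataFrame, group: str, outcome: str) -> float:
    # `group` is a 0/1 protected-group indicator; `outcome` is 0/1
    # (1 = favourable decision, 0 = unfavourable decision).
    rejected = df[df[outcome] == 0]
    accepted = df[df[outcome] == 1]
    return rejected[group].mean() - accepted[group].mean()

def conditional_demographic_disparity(
    df: pd.DataFrame, group: str, outcome: str, stratum: str
) -> float:
    # Size-weighted average of per-stratum demographic disparity.
    parts = [
        (len(part), demographic_disparity(part, group, outcome))
        for _, part in df.groupby(stratum)
    ]
    total = sum(n for n, _ in parts)
    return sum(n * dd for n, dd in parts) / total

# Example with hypothetical column names: a 0/1 protected-group flag,
# a 0/1 admission outcome, and the department applied to as the stratum.
# cdd = conditional_demographic_disparity(df, "is_female", "admitted", "department")
```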
Ádám Nagy, Hannah Hilligoss, Nele Achten + 2 others
15 January 2020
The rapid spread of artificial intelligence (AI) systems has precipitated a rise in ethical and human rights-based frameworks intended to guide the development and use of these technologies. Despite the proliferation of these “AI principles,” there has been little scholarly focus on understanding these efforts either individually or as contextualized within an expanding universe of principles with discernible trends. To that end, this white paper and its associated data visualization compare the contents of thirty-six prominent AI principles documents side-by-side. This effort uncovered a growing consensus around eight key thematic trends: privacy, accountability, safety and security, transparency and explainability, fairness and non-discrimination, human control of technology, professional responsibility, and promotion of human values. Looking beneath this “normative core,” our analysis examined the forty-seven individual principles that make up the themes, detailing notable similarities and differences in interpretation found across the documents. In sharing these observations, it is our hope that policymakers, advocates, scholars, and others working to maximize the benefits and minimize the harms of AI will be better positioned to build on existing efforts and to push the fractured, global conversation on the future of AI toward consensus.
Ju Yeon Lee
16 February 2023
At the end of 2022, the appearance of ChatGPT, an artificial intelligence (AI) chatbot with an amazing writing ability, caused a great sensation in academia. The chatbot turned out to be very capable, but also capable of deception, and the news broke that several researchers had listed the chatbot (including its earlier version) as a co-author of their academic papers. In response, Nature and Science expressed the position that this chatbot cannot be listed as an author in the papers they publish. Since an AI chatbot is not a human being, under the current legal system the text automatically generated by an AI chatbot cannot be a copyrighted work; thus, an AI chatbot cannot be the author of a copyrighted work. Current AI chatbots such as ChatGPT are much more advanced than search engines in that they produce original text, but they still remain at the level of a search engine in that they cannot take responsibility for their writing. For this reason, they also cannot be authors from the perspective of research ethics.