Adobe Sued for Allegedly Using Pirated Books in AI Training
Adobe, a leading software firm, has been hit with a proposed class-action lawsuit alleging that it misused authors’ work in AI training. The lawsuit claims that Adobe used pirated versions of numerous books, including those of author Elizabeth Lyon, to train its SlimLM program.
Adobe describes SlimLM as a small language model series optimized for document assistance tasks on mobile devices. However, Lyon’s lawsuit states that her writing was included in a processed subset of a manipulated dataset that was the basis of Adobe’s program.
The lawsuit claims that Adobe used a dataset called SlimPajama, which was created by copying and manipulating the RedPajama dataset. RedPajama has been cited in several other lawsuits, including suits against Apple and Salesforce.
AI Training Data Controversy
The case is not an isolated incident: AI models are often trained on massive datasets that may include pirated material. In September, Anthropic agreed to pay $1.5 billion to authors who had sued it for using pirated versions of their work to train its chatbot, Claude.
The controversy surrounding AI training data has become a growing concern in the tech industry. With the increasing use of AI, companies are facing lawsuits for allegedly using copyrighted material without consent or compensation.
It remains to be seen how companies will address the copyright questions raised by AI training data.
Implications for the Tech Industry
The lawsuit against Adobe highlights the need for companies to ensure that their AI training data is sourced from legitimate and licensed materials. Failing to do so can result in costly lawsuits and damage to their reputation.
Moreover, the case raises questions about the ownership and control of AI training data. As AI becomes increasingly integrated into our lives, it is essential to establish clear guidelines and regulations to prevent copyright infringement.
Consequently, the tech industry must take a proactive approach to AI training data and copyright. This includes investing in legitimate data sources, developing robust copyright policies, and educating employees about the importance of respecting intellectual property rights.