Software giant Adobe is facing a proposed class action lawsuit accusing it of illegally using pirated copies of copyrighted books to train its artificial intelligence models.

Oregon author Elizabeth Lyon filed the lawsuit on behalf of herself and other affected authors, claiming that Adobe used a dataset of pirated books, including her own works, when developing its lightweight language model, SlimLM.

The lawsuit states that SlimLM was pretrained on the open-source dataset SlimPajama-627B. It alleges that the dataset contains the notorious Books3 subset, which includes approximately 191,000 e-books used without authorization.

Adobe is not the only tech giant caught up in the controversy. Companies such as Apple, Salesforce, and Anthropic have previously been embroiled in legal disputes over their use of RedPajama or similar datasets containing Books3 content.

The SlimLM model named in the suit is mainly used to optimize document-assistance tasks on mobile devices.

Adobe has not yet issued a formal comment on the lawsuit. As AI technology is deployed at scale, legal battles over training-data compliance are becoming a pivotal issue for the industry.