Written by 6:05 pm AI, OpenAI

### Legal Action: NY Times Files Lawsuit Against Microsoft and OpenAI for Copyright Infringement

The suit shows evidence that GPT-based systems will reproduce Times articles if asked.

According to a leak in August, The New York Times was contemplating joining the expanding cohort of creators who are taking legal action against AI companies for appropriating their content. The Times had engaged in discussions with OpenAI about licensing its content, but the negotiations did not go smoothly. A lawsuit has now been filed, eight months after the organization reportedly first pondered the idea.

The Times’ legal action targets various entities affiliated with OpenAI, as well as Microsoft, an OpenAI partner that uses its services to power the Copilot platform and helped provide the infrastructure for training the GPT Large Language Model. The lawsuit goes beyond the mere use of copyrighted material for training, alleging that OpenAI-powered software circumvents The Times’ paywall and attributes inaccurate information to the publication.

The lawsuit underscores how much The Times spends on news production. That spending allows it to assign reporters to diverse beats and to fund in-depth investigative journalism, among other endeavors. These substantial investments contribute to the newspaper’s reputation as a trustworthy information source.

The Times’ revenue model relies heavily on a strict paywall that controls access to its reporting. By restricting the copying and use of published content, placing copyright notices on print editions, and licensing its work carefully, the newspaper not only protects its income streams but also safeguards its authoritative voice by controlling how its work is disseminated.

The lawsuit contends that the OpenAI-developed tools undermine The Times’ relationship with its readers by disseminating its content without authorization, thereby depriving the newspaper of revenue from subscriptions, licensing fees, advertising, and affiliate income.

The unauthorized use allegedly occurred during the training of various GPT models, including GPT-3.5. The lawsuit highlights a large web dataset known as “Common Crawl,” which it alleges contains 16 million unique records of content from sites published by The Times. By the suit’s account, that makes The Times the third most referenced source in the dataset, behind only Wikipedia and a database of US patents.

OpenAI no longer discloses much about the datasets used to train its latest GPT models, so access to training data is likely to become a contested issue during legal discovery.

Beyond the copyright concerns raised by AI training, The Times’ lawsuit sets out to demonstrate that material ingested during training can be reproduced in the outputs of AI models. It provides examples in which GPT-powered tools produce content closely resembling The Times’ articles, mimicking their style and repeating passages verbatim.

The lawsuit challenges any attempt to justify this unauthorized use as fair use, arguing that the defendants’ actions do not qualify as transformative under copyright law. It asserts that the defendants divert readers from The Times by producing material that substitutes for its content, thereby undermining the newspaper’s position and revenue.

The lawsuit also raises concerns about the potential reputational damage caused by AI-generated misinformation, highlighting instances where AI models fabricated articles attributed to The Times, potentially impacting public health and the credibility of the publication.

The legal action targets multiple OpenAI entities involved in software development, as well as Microsoft for facilitating the consumption of copyrighted materials during training and offering OpenAI-powered services. Allegations include copyright infringement, DMCA violations, and unfair competition practices.

The lawsuit seeks the cessation of dataset usage for training, the deletion of GPT models trained using The Times’ data, and a permanent injunction against similar activities in the future. Additionally, The Times demands substantial financial compensation, including statutory damages, punitive measures, and restitution, among other remedies permitted by law.

Last modified: December 28, 2023