Written by zgiaonews • January 9, 2024 • 9:33 am • ConceptualAI

### Challenges in Training ConceptualAI Without Copyrighted Components


OpenAI said it’s “impossible to train today’s leading AI models without using cop…

OpenAI and Microsoft, one of its most prominent backers, are facing multiple lawsuits alleging that they used copyrighted content without authorization to train their large language models (LLMs). OpenAI has told the House of Lords Communications and Digital Select Committee that it anticipates a rise in legal challenges against both companies. In its written submission to the committee, which focuses on LLMs, OpenAI stressed that training advanced AI models requires copyrighted material and that developing cutting-edge AI without it would be infeasible.

The company points out that copyright law now covers nearly every form of creative work, including blog posts, images, forum contributions, code snippets, and official records. It argues that relying solely on public domain content created more than a century ago would not produce AI systems that meet contemporary needs. OpenAI nonetheless maintains that it complies with intellectual property law when training its models. In a recent blog post responding to the lawsuit filed by The New York Times, it cited the fair use doctrine as the basis for training on publicly available web content.

While affirming its commitment to supporting and incentivizing content creators, OpenAI addresses concerns about its GPTBot web crawler's access to creators' content. The company says it is working with rights holders to reach mutually beneficial agreements and to provide mechanisms for opting out of having their content used as training data.

Plaintiffs in some of the lawsuits accuse OpenAI and Microsoft of profiting from copyrighted materials without compensating the original authors, despite the significant revenue the AI industry generates. In a recent case brought by non-fiction authors, the plaintiffs suggested that alternative funding models such as revenue sharing were available but that the companies did not pursue them.

Regarding specific allegations, OpenAI disputes claims that it used The New York Times' articles without authorization, saying that the two sides had been discussing a partnership for access to the publication's content. The company expresses surprise at the newspaper's legal action and says negotiations were still ongoing as late as mid-December. OpenAI also contests the allegation that ChatGPT reproduces verbatim excerpts of paywalled articles, attributing such output to users who manipulated prompts to extract lengthy passages. Despite the legal dispute, OpenAI says it remains optimistic about building a productive relationship with The New York Times.


Last modified: January 9, 2024