As scrutiny of AI companies and their data practices intensifies, OpenAI has stressed the importance of access to copyrighted material for developing innovative AI tools such as its ChatGPT chatbot.
The training of AI models such as ChatGPT and image-generating programs like Stable Diffusion heavily depends on a vast array of data from the internet, a significant portion of which is safeguarded by copyright laws to prevent unauthorized use of intellectual property.
Recently, The New York Times filed a legal complaint against OpenAI and Microsoft, alleging the unauthorized use of their intellectual property in advancing AI technologies. It is noteworthy that Microsoft is a key investor in OpenAI.
OpenAI told the House of Lords Communications and Digital Select Committee that access to copyrighted content is necessary for developing and training large language models such as GPT-4, the model that powers ChatGPT.
The organization highlighted the difficulty of training state-of-the-art AI models without utilizing copyrighted materials, pointing out that copyright regulations now cover a wide range of creative works, including blog posts, images, software code snippets, and official documents.
OpenAI argued that relying solely on non-copyrighted resources for training AI models would lead to inadequately trained systems that cannot meet the needs of modern society.
In response to The New York Times’ legal action, OpenAI reaffirmed its commitment to respecting the rights of content creators and owners. The concept of “fair use,” which allows certain uses of copyrighted material without explicit permission, is often cited by AI companies to justify their use of copyrighted content.
From a legal perspective, OpenAI stated that copyright laws do not hinder the training of AI systems.
Legal challenges against OpenAI have been mounting, with prominent authors including John Grisham, Jodi Picoult, and George R.R. Martin among 17 writers suing the company over alleged copyright infringement, which they have described as systematic theft on a mass scale.
Furthermore, Stability AI, the company behind Stable Diffusion, is being sued in the US and in England and Wales by Getty Images, a leading global photo agency, over alleged copyright and trademark infringement. Anthropic, which is backed by Amazon and develops the Claude chatbot, is being sued by a group of music publishers, including Universal Music, for the alleged unauthorized use of copyrighted song lyrics in training its models.
OpenAI has expressed readiness to undergo independent scrutiny of its safety protocols, endorsing the practice of “red-teaming” AI systems where impartial researchers simulate malicious actors to assess the systems’ security measures.
In line with an agreement reached at the global AI safety summit in the UK, OpenAI is one of the companies that have committed to working with governments to subject their most advanced models to rigorous safety testing both before and after deployment.