A new report claims that Apple's generative AI, specifically its "Ajax" large language model (LLM), may be one of the few AIs trained both legally and ethically. Copyright is something of a minefield in this area, and Apple is reportedly attempting to uphold copyright protections and legal standards through the use of innovative training methods.
In the era of generative AI, copyright law is difficult to navigate, and it's becoming more and more important as AI tools grow more widespread. One of the most obvious and recurring problems is that many companies train their large language models (LLMs) on copyrighted material, usually without disclosing whether they have licensed that training data. Oftentimes, the outputs of these models include entire sections of copyright-protected works.
The current rationale for using copyrighted material to train these LLMs is that, not unlike humans, the models need vast amounts of information (known as training data) to learn and generate coherent, compelling responses. And as far as these companies are concerned, copyrighted material is fair game.
Many critics of generative AI consider it a rights violation when technology companies use copyrighted works to train and build their LLMs without any explicit agreement with copyright holders or their representatives. So far, that criticism hasn't dissuaded tech firms from doing exactly that, and it's assumed most AI tools will continue to be built this way, fueling growing resentment toward generative AI businesses.
The generative AI thicket of legal battles and ethical dilemmas
Apple's approach to ethical AI training (as far as we currently know)
At least one major tech player appears to be trying to sidestep as many of these legal (and ethical!) challenges as possible, and somewhat surprisingly, it's Apple. According to Apple Insider, Apple has been diligently pursuing licensing deals with major news outlets in its search for AI training material. Back in December, Apple reportedly sought to license the archives of several major publishers as training material for its own LLM, known as Ajax.
Ajax is rumored to power basic on-device functionality in upcoming Apple products, while Apple may license other models, such as Google's Gemini, for more sophisticated features, like those that require an internet connection. According to Apple Insider, this also lets Apple sidestep some copyright infringement exposure, since it wouldn't be held accountable for copyright violations by, say, Google Gemini.
A March paper described how Apple plans to train its in-house LLM on a carefully curated mix of image, image-text, and text-based data. Apple's methods prioritize better image captioning and multi-step reasoning while also paying close attention to privacy preservation. Privacy is further helped by the Ajax LLM running entirely on-device, with no internet connection required. There is a trade-off, however: without access to online databases that catalog copyrighted material, Ajax can't check its own output for copied content and plagiarism.
There is one more caveat, which Apple Insider surfaced by speaking to people familiar with Apple's AI testing environments: there currently appear to be few, if any, restrictions on users feeding copyrighted material into the on-device test environments themselves. It's also worth noting that Apple isn't the only company taking a rights-first approach: Adobe Firefly, a generative art AI tool, is also claimed to be fully copyright-compliant, so hopefully more AI firms will be wise enough to follow suit.
I personally applaud Apple for taking this approach, because I believe human creativity is one of our most incredible abilities and should be celebrated rather than simply handed over to an AI. We'll have to wait to learn more about Apple's policies on copyright and AI training, but I agree with Apple Insider's assessment that this definitely sounds like a step up, especially given that some AIs have been documented reproducing copyrighted material word for word. We should hear more about Apple's generative AI initiatives very soon; they're expected to be a major focus of its developer-oriented software conference, WWDC 2024.