Google used publicly accessible Google Docs to train its artificial intelligence (AI).
Google trained its AI on publicly available Google Docs, which raises the question of what exactly counts as a “publicly accessible” document. Google has assured users that their content is not scraped when a document's sharing setting is “anyone with the link.”
For me, using Google Docs is a habitual and somewhat indulgent way of organizing and categorizing information. Roughly 75% of my Google Drive consists of documents labeled “Untitled.” Whether I am taking notes during a visit, starting an article, jotting down fragmented thoughts, or saving words or links for future reference, I always start a new document.
Consequently, my Google Drive is cluttered with haphazardly written documents. I revisit some of them often, and most are never shared. Work-related documents, however, are often shared with readers or colleagues, usually by setting the sharing option so that anyone with the link can view the document.
The recent revelation that Google was leveraging publicly available Google Docs for AI training prompted concerns about the privacy of personal content. Does this practice encompass my documents as well?
Google Docs gives users two primary sharing options: sharing with specific individuals by email address, which restricts access to those people, or enabling access for anyone with the link. The enterprise edition offers a third option, which allows sharing within the company.
Sharing a document with anyone who has the link does not make it “publicly accessible,” a Google representative told Business Insider. A document becomes publicly accessible only when it is published on a website or shared on social media. Google emphasized that web crawlers do not index files that are shared only by link.
To illustrate, a Google Doc that has been posted publicly online would officially count as publicly accessible.
Therefore, unless documents are actively shared on public platforms such as Twitter or Facebook, they are not considered publicly available for AI training purposes.
On February 28, Axel Springer, the parent company of Business Insider, alongside 31 other media organizations, initiated a $2.3 billion lawsuit against Google in a Dutch court, citing losses attributed to the company’s advertising practices.