It doesn’t take much to get GenAI to spout inaccuracies and falsehoods.
Just last week, for instance, chatbots from Microsoft and Google declared a Super Bowl winner before the game had even started. The real trouble begins, though, when GenAI’s fabrications turn harmful: endorsing torture, reinforcing ethnic and racial stereotypes and writing persuasively about conspiracy theories.
A growing number of vendors, from incumbents like Nvidia and Salesforce to startups such as CalypsoAI, offer products they claim can mitigate harmful and toxic content from GenAI. But these products are black boxes; short of testing each one independently, it’s hard to know how these anti-hallucination tools compare and whether they really live up to their claims.
Shreya Rajpal saw that as a major problem, and founded Guardrails AI to try to solve it.
“Most organizations… are grappling with similar challenges when it comes to responsibly deploying AI applications and determining the most effective solution. They often find themselves reinventing the wheel in terms of managing the risks that hold significance for them,” Rajpal told TechCrunch in an email interview.
Survey data backs Rajpal up: complexity, and the risk that comes with it, is one of the main barriers keeping organizations from fully embracing GenAI.
A recent survey from Intel subsidiary Cnvrg.io found that compliance and privacy, reliability, high implementation costs and a lack of technical skills were concerns shared by roughly a quarter of companies implementing GenAI applications. In a separate survey from Riskonnect, a provider of risk management software, more than half of executives said they were worried about employees making decisions based on inaccurate information from GenAI tools.
Rajpal, who previously worked at self-driving startup Drive.ai and then in Apple’s special projects group after Apple acquired Drive.ai, co-founded Guardrails with Diego Oppenheimer, Safeer Mohiuddin and Zayd Simjee. Oppenheimer formerly led Algorithmia, a machine learning operations platform, while Mohiuddin and Simjee held tech and engineering leadership roles at AWS.
In some ways, what Guardrails offers isn’t all that different from what’s already on the market. The startup’s platform acts as a wrapper around GenAI models, specifically open source and proprietary text-generating models such as OpenAI’s GPT-4, to make them ostensibly more trustworthy, reliable and secure.
Where Guardrails differs is in its open source business model (the platform’s codebase is available on GitHub, free to use) and its crowdsourced approach.
Through a marketplace called the Guardrails Hub, developers can submit modular components known as “validators,” which probe GenAI models for particular behavioral, compliance and performance metrics. Validators can be deployed, repurposed and reused by other developers and Guardrails customers, serving as the building blocks for custom solutions that moderate GenAI models.
Rajpal elaborated, “Through the Hub, our objective is to establish an open platform for knowledge exchange and to identify the most efficient path to further AI adoption — while also constructing a set of reusable guardrails that any organization can adopt.”
Validators in the Guardrails Hub range from simple rule-based checks to algorithms that detect and address issues in models. There are about 50 at present, from hallucination and policy-violation detectors to filters for proprietary information and insecure code.
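To make the concept concrete, here is a minimal sketch of how a couple of hub validators might be composed around a model’s output. It assumes the open source guardrails Python package and two hub validators (ToxicLanguage and DetectPII) behave roughly as the project’s public documentation describes; exact package, class and parameter names may differ between versions.

```python
# Hedged sketch, not a definitive implementation. Assumes the open source
# `guardrails` package and two validators installed from the Guardrails Hub,
# e.g.:
#   pip install guardrails-ai
#   guardrails hub install hub://guardrails/toxic_language
#   guardrails hub install hub://guardrails/detect_pii
from guardrails import Guard
from guardrails.hub import DetectPII, ToxicLanguage

# Compose several validators into a single "guard" wrapped around model output.
guard = Guard().use_many(
    ToxicLanguage(threshold=0.5, validation_method="sentence", on_fail="fix"),
    DetectPII(pii_entities=["EMAIL_ADDRESS", "PHONE_NUMBER"], on_fail="fix"),
)

# Validate text produced by any LLM, open source or proprietary.
outcome = guard.validate("Contact me at jane.doe@example.com for the details.")
print(outcome.validation_passed)  # whether every validator passed
print(outcome.validated_output)   # output after any "fix" actions were applied
```

In this sketch, the on_fail="fix" option asks each validator to repair offending spans rather than reject the output outright, one of several failure-handling behaviors the library documents.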
“Many companies typically conduct broad, generic checks for profanity, personally identifiable information, and similar aspects,” Rajpal remarked. “However, there isn’t a universal definition of what constitutes acceptable usage for a specific organization or team. Each organization has its own set of risks that require monitoring — for instance, communication policies vary across organizations. With the Hub, we empower users to either adopt the solutions we offer as-is or utilize them as a robust starting point for further customization to suit their specific requirements.”
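As a purely hypothetical illustration of the kind of organization-specific rule Rajpal describes (plain Python, not Guardrails’ actual validator interface), a policy check can be as small as a function that the application refuses to let model output bypass:

```python
# Hypothetical, illustrative policy checks; not the Guardrails API.
import re
from typing import Callable

def no_pricing_promises(text: str) -> bool:
    """Company policy: model output must never promise discounts or prices."""
    return re.search(r"\b\d+\s?%\s?(off|discount)\b", text, re.IGNORECASE) is None

def no_competitor_mentions(text: str) -> bool:
    """Company policy: don't name competitors in customer-facing replies."""
    return not any(name in text.lower() for name in ("acme corp", "globex"))

def guard_output(generate: Callable[[str], str], prompt: str) -> str:
    """Run the model, then apply every policy check before returning output."""
    output = generate(prompt)
    for check in (no_pricing_promises, no_competitor_mentions):
        if not check(output):
            raise ValueError(f"output rejected by policy check: {check.__name__}")
    return output
```

In the model Guardrails describes, checks like these would be packaged as shareable validators on the Hub rather than hard-coded into each application.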
A hub for model guardrails is an intriguing idea. But it’s fair to wonder whether developers will bother contributing to a platform, and a fledgling one at that, without the promise of some form of compensation.
Rajpal is optimistic that they will, if for no other reason than recognition, and out of an altruistic desire to help the industry move toward “safer” GenAI.
“The Hub enables developers to observe the risks encountered by other enterprises and the guardrails they implement to address and mitigate those risks,” she added. “The validators serve as an open-source implementation of these guardrails that organizations can apply to their specific use cases.”
Guardrails AI, which isn’t yet charging for its services or software, recently raised $7.5 million in a seed round led by Zetta Venture Partners with participation from Factory, Pear VC, Bloomberg Beta, GitHub Fund and notable AI expert Ian Goodfellow. Rajpal says the proceeds will go toward expanding Guardrails’ six-person team and additional open source projects.
“We engage with numerous entities — enterprises, small startups, and individual developers — who face challenges in deploying GenAI applications due to the lack of assurance and risk mitigation measures,” she continued. “This is a novel issue that has emerged on a large scale, attributed to the prevalence of ChatGPT and foundation models. We aim to be at the forefront of resolving this dilemma.”