When Air Canada recently landed in a spot of trouble, it couldn’t really blame it on human error. A Canadian court held the aviation major liable for the wrong information doled out by its chatbot to a traveller. Similar was the plight of a US-based healthcare company whose generative AI-powered tool went rogue and leaked sensitive health information. Even tech giant Google had its moment of digital reckoning when its GenAI tool Gemini struggled to come up with unbiased responses.

As AI becomes mainstream, the issue of testing and regulating the technology gains prominence, and this burgeoning need inspired Gaurav Agarwal to give up his old career and launch RagaAI. The US- and Bengaluru-based start-up has developed an automated platform to detect AI-related troubles, diagnose them and fix them before they can do any damage.

“If a company like Google faces challenges with its AI, imagine the enormity of the issue,” says Agarwal, the company’s CEO. He was earlier part of Nvidia’s autonomous driving team in the US, and has had stints at Texas Instruments and, in India, at Ola Electric. Drawing comparisons with automobile testing, he says AI systems and large language models (LLMs) must undergo equally rigorous checks before they go live. “A company cannot afford to be reactive to AI/LLM failures; they need to be tested exhaustively in-house before deployment,” he says.

For instance, an e-commerce client used RagaAI to fine-tune an open-source LLM for its customer service chatbot. The chatbot was hallucinating (offering inaccurate or inappropriate responses based on nonexistent patterns or objects) 20 per cent of the time on average; Agarwal says RagaAI detected, diagnosed, and resolved the hallucinations, minimising costly errors.
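
In broad terms, a hallucination check compares what a chatbot says against the source material its answers should be grounded in. The sketch below is a deliberately simple illustration of that idea, not RagaAI’s method: the grounding_score heuristic, the 0.6 threshold and the flag_hallucinations helper are all assumptions standing in for far more sophisticated semantic checks.

```python
# Minimal sketch of a grounding-style hallucination check (illustrative
# only, not RagaAI's implementation): flag chatbot answers whose content
# words are not supported by the source documents they should draw on.

def grounding_score(answer: str, sources: list[str]) -> float:
    """Fraction of content words in the answer found in the sources."""
    stop = {"the", "a", "an", "is", "are", "to", "of", "and", "or", "in"}
    words = [w.strip(".,!?").lower() for w in answer.split()]
    content = [w for w in words if w and w not in stop]
    if not content:
        return 1.0
    pool = " ".join(sources).lower()
    supported = sum(1 for w in content if w in pool)
    return supported / len(content)

THRESHOLD = 0.6  # assumed cut-off; would be tuned per application

def flag_hallucinations(dialogues):
    """dialogues: iterable of (answer, sources) pairs."""
    for answer, sources in dialogues:
        score = grounding_score(answer, sources)
        if score < THRESHOLD:
            yield answer, score  # route to diagnosis or retraining

# Example: an answer citing a refund window absent from the sources.
sources = ["Orders ship within 3 business days. Returns accepted for 30 days."]
bad = "You can get a full refund within 90 days and free lifetime support."
print(grounding_score(bad, sources))  # low score -> flagged
```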

Simulated reality 

Apart from AI systems and LLMs, RagaAI’s tools also test computer vision, structured data, and audio models. It does this by simulating real-world scenarios to check the training data used in machine learning for biases and variations. Further, the errors detected are categorised by severity, and the root cause is pinpointed. “This could be faulty training data, flaws in the model’s design, or bugs in the code,” Agarwal explains. The start-up recommends solutions such as data correction or model retraining, and then closes the loop through iterative testing to confirm that the fixes work.
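
That detect-diagnose-fix cycle can be pictured as a loop: run simulated scenarios, classify the failures, apply the recommended remedy, and retest until the model passes. The sketch below is a hedged illustration of that loop only; the Failure record, the diagnose placeholder, the FIXES table and the apply_fix callback are invented for the example and do not describe RagaAI’s actual platform.

```python
# Illustrative sketch of an iterative detect-diagnose-fix loop; the
# severity labels, root-cause categories and retraining hook are
# assumptions, not RagaAI's implementation.

from dataclasses import dataclass

@dataclass
class Failure:
    scenario: str
    severity: str    # e.g. "critical" or "minor"
    root_cause: str  # "training_data", "model_design" or "code_bug"

def run_tests(model, scenarios):
    """Detect: replay simulated real-world scenarios, collect mismatches."""
    failures = []
    for prompt, expected, severity in scenarios:
        if model(prompt) != expected:
            failures.append(Failure(prompt, severity, diagnose(prompt)))
    return failures

def diagnose(prompt):
    """Diagnose: placeholder for real root-cause analysis."""
    return "training_data"

FIXES = {
    "training_data": "correct or augment the training set",
    "model_design": "adjust the architecture and retrain",
    "code_bug": "patch the inference code",
}

def test_fix_loop(model, scenarios, apply_fix, max_rounds=3):
    """Close the loop: fix, retest, repeat until clean or out of rounds."""
    for _ in range(max_rounds):
        failures = run_tests(model, scenarios)
        if not failures:
            return True  # every simulated scenario now passes
        for f in failures:
            print(f"{f.severity}: {f.scenario!r} -> {FIXES[f.root_cause]}")
        model = apply_fix(model, failures)  # e.g. retrain on corrected data
    return False
```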

Agarwal says the company has 10-15 customers and has begun generating revenue in the past few months. The clients range from retail and e-commerce to aerospace, among other sectors. In January the start-up raised $4.7 million in a funding round led by deeptech-focused investor Pi Ventures, with participation from Anorak Ventures, TenOneTen Ventures, Exfinity Venture Partners and others. In March, it expanded its testing platform by launching ‘RagaAI LLM Hub’, an open-source, enterprise-ready platform to monitor LLMs across 100 metrics.

“As AI adoption grows, regulations and compliance requirements will become stricter. We are already seeing a rise in enquiries and anticipate a surge in demand,” he says.
