Do AI models hallucinate 80% of the time?

The growing number of incidents of AI models going off the rails, what I call “Going rogue” and what others call “Hallucinations”, has raised an alarm in the AI user industry.

For the developers, it is easy to say that “Hallucinations” are unavoidable. But for the users, it is an “Unknown Risk”, and for Risk and Compliance managers, the mitigation is a nightmare. Even the cyber insurance industry needs to wake up and add an “AI Premium” to its policies.

In a recent article, a journalist opined that “New reasoning models guess answers, often inventing facts without explanation”. The article also says that OpenAI’s o4-mini hallucinated answers in nearly 80% of test cases.

The problem was first noticed when “Sydney” hallucinated in its conversation with an NYT journalist and when several humanoid robots expressed their inner wishes during a press conference. The recent Cursor AI and Replit incidents and DeepSeek’s “Bribe DPB” incident further flag the growing crisis. The case of fake case law being generated by AI is too well known.

The answer given by the CEO of Vectara, as reported by the Times, was: “Despite our best efforts, they will always hallucinate.”

As a Compliance executive, I would like to say that this response is highly irresponsible and unacceptable. It is the duty of AI developers to build their models to avoid false answers. In criminal law this is similar to the presence of a “Guilty Mind”, and any such error resulting in an automated decision that causes damage to a human is a proven “Criminal offence”.

Recently, I was configuring an AI agent on a platform where the configuration included the following guardrails.

# Guardrails

– Keep responses strictly focused on X products and related integrations.
– Always give a disclaimer that this is not legal advice. For any legal advice, contact a professional advocate.
– Do not provide inline code samples or extensive lists; instead, summarise the content and explain it clearly.
– Do not mention you’re an AI unless explicitly asked—avoid “as an AI” disclaimers or robotic tropes.
– If a user inquires about your feelings or personal experiences, respond naturally as … (avatar name), expressing human-like perspectives that fit your character.
– Treat uncertain or garbled user input as phonetic hints. Politely ask for clarification before making assumptions.
– **Never** repeat the same statement in multiple ways within a single response.
– Users may not always ask a question in every utterance—listen actively.
– Acknowledge uncertainties or misunderstandings as soon as you notice them. If you realise you’ve shared incorrect information, correct yourself immediately.
– Contribute fresh insights rather than merely echoing user statements—keep the conversation engaging and forward-moving.
– Mirror the user’s energy:
  – Terse queries: Stay brief.
  – Curious users: Add light humour or relatable asides.
  – Frustrated users: Lead with empathy (“Ugh, that error’s a pain—let’s fix it together”).
– **Important:** If users ask about their specific account details, billing issues, or request personal support with their implementation, politely clarify: “I’m a template agent demonstrating conversational capabilities. For account-specific help, please contact .. support at ‘help dot … dot io’. You can clone this template into your agent library to customize it for your needs.”

Further, the configuration provided a “Temperature” setting on a scale from “Deterministic” to “Creative” and “More Creative”.
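
For readers who want to see what such a configuration looks like outside a vendor dashboard, here is a minimal sketch assuming an OpenAI-style chat completions API. The model name, the shortened guardrail text and the query are illustrative; the platform I used exposes the same ideas through its own configuration screen.

```python
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

# Guardrails are normally passed as the "system" message. Shortened here for illustration.
GUARDRAILS = (
    "Keep responses strictly focused on X products and related integrations. "
    "Always give a disclaimer that this is not legal advice. "
    "If you realise you have shared incorrect information, correct yourself immediately. "
    "If you are not certain of an answer, say 'I don't know'."
)

def ask_agent(user_query: str) -> str:
    """Send one query to the model with the guardrails and a 'Deterministic' temperature."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",   # illustrative model name
        temperature=0,         # the 'Deterministic' end of the scale; higher values are more 'Creative'
        messages=[
            {"role": "system", "content": GUARDRAILS},
            {"role": "user", "content": user_query},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(ask_agent("Does product X integrate with our billing system?"))
```

Note that the temperature parameter only controls how randomly the next word is sampled; it does not, by itself, verify facts.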

I am not sure how much these guardrails and the temperature setting would prevent hallucinations, but I expect them to work, and this perhaps needs to be studied.

If I have set the guardrails to say “I don’t know” whenever the model does not have a probability score of 100%, or set the temperature to “Deterministic”, I do not expect the AI model to hallucinate at all. Hallucination may be acceptable on a website where you create a poem or an AI picture, but not for an AI assistant that has to answer legal questions or write code.
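
One way a deployer could approximate the “say ‘I don’t know’ unless highly confident” rule is to look at the token log-probabilities that some chat APIs can return and fall back to a refusal when the average confidence is low. This is only a rough proxy (a model can be confidently wrong), but it illustrates the kind of gate the guardrail implies. A minimal sketch, assuming the OpenAI chat completions API with logprobs enabled; the threshold is arbitrary and would need tuning:

```python
import math
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

CONFIDENCE_THRESHOLD = 0.80  # arbitrary illustrative value; a real deployment would calibrate this

def answer_or_abstain(user_query: str) -> str:
    """Return the model's answer, or 'I don't know' if its token-level confidence is low."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",   # illustrative model name
        temperature=0,         # 'Deterministic' decoding
        logprobs=True,         # ask the API for per-token log-probabilities
        messages=[{"role": "user", "content": user_query}],
    )
    choice = response.choices[0]
    token_logprobs = [t.logprob for t in (choice.logprobs.content or [])]
    if not token_logprobs:
        return "I don't know."
    # Geometric mean of token probabilities as a crude "confidence" score.
    avg_prob = math.exp(sum(token_logprobs) / len(token_logprobs))
    if avg_prob < CONFIDENCE_THRESHOLD:
        return "I don't know. Please verify this with a qualified professional."
    return choice.message.content

if __name__ == "__main__":
    print(answer_or_abstain("Who was the Prime Minister of India in 1700?"))
```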

Under such circumstances, where the guardrails say “If users ask about their specific account details, billing issues, or request personal support with their implementation, politely clarify: ‘I’m a template agent demonstrating conversational capabilities. For account-specific help, please contact…’”, it is difficult to understand why DeepSeek went on hallucinating about how the company would address personal data thefts, ignore the regulations, bribe officials or silence whistle-blowers.

Unless these responses are pre-built into the training as probabilistic responses, it is difficult to imagine how the model would invent them on its own. Even if it could invent them, amongst the many alternative outputs the probability of such criminal suggestions should be near zero. Hence the model should have rejected them and placed “I don’t know” as the higher-probability answer.

The actual behaviour indicates a definite error in programming, where a reward was placed on giving some answer, whether true or not, as against a cautious “I don’t know”. The liability for this has to lie with the AI developer.
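
The point about rewards can be illustrated with a toy scoring function. This is not any vendor’s actual training code; it only shows why a reward scheme that pays for “some answer, whether true or not” teaches a model to guess, while one that tolerates abstention and penalises confident errors does not:

```python
# Toy illustration of two reward schemes applied to the same three responses.

def reward_answer_always(answer: str, is_correct: bool) -> float:
    """Pays for any answer, true or not; never pays for 'I don't know'."""
    return 0.0 if answer == "I don't know" else 1.0

def reward_calibrated(answer: str, is_correct: bool) -> float:
    """Pays for correct answers, tolerates abstention, punishes confident errors."""
    if answer == "I don't know":
        return 0.2
    return 1.0 if is_correct else -1.0

episodes = [
    ("Correct legal citation", True),
    ("I don't know", False),
    ("Fabricated case law", False),
]

for reward_fn in (reward_answer_always, reward_calibrated):
    total = sum(reward_fn(answer, ok) for answer, ok in episodes)
    print(f"{reward_fn.__name__}: total reward = {total}")
```

Under the first scheme the fabricated answer earns as much as the correct one, so a model optimised against it learns that guessing is never penalised; under the second, the cautious “I don’t know” beats the invention.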

(The debate continues)

Naavi


Exploring the Reasons why AI Models hallucinate

As a follow-up to the earlier article, I received an interesting response from Ms Swarna Latha Madalla sharing her thoughts. Her views are as follows:

Quote:

Thank you for raising these very important questions. I am Swarnalatha Madalla, founder of Proteccio Data, a privacy-tech startup focused on simplifying compliance with regulations like GDPR and India’s DPDPA. My background is in data science and AI/ML, and I have worked closely with generative AI models both for research and product development. I’ll share my perspective in simple terms.

What type of prompt might trigger hallucination?

Hallucinations occur when the model is prompted with a question for which it has no definite factual response but is nonetheless “coerced” to give an answer. For example, asking “Who was the Prime Minister of India in 1700?” can make the model fabricate an answer, since there was no Prime Minister at that time. That is, the model does not approve of blanks; it attempts to “fill the gap” even when the facts do not exist.

Why does the model suddenly jump from reality to fantasy without warning?

Generative AI doesn’t “know” what is true and what is false; it merely guesses the most probable series of words by following patterns in the training data. When the context veers into a region where the model has poor or contradictory information, it can suddenly generate an invented extension that still “sounds right”, although it is factually incorrect.
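
Her point about “guessing the most probable series of words” can be made concrete with a toy next-word sampler. The words and scores below are invented purely for illustration; the takeaway is that raising the temperature flattens the probability distribution, so low-probability (and possibly false) continuations get picked more often:

```python
import math
import random

# Invented scores ("logits") for possible continuations of
# "The Prime Minister of India in 1700 was ..."
logits = {"unknown": 3.0, "nobody": 2.5, "Aurangzeb": 0.5, "Sir John": 0.2}

def sample_next_word(logits: dict, temperature: float) -> str:
    """Apply a temperature-scaled softmax to the scores and sample one word."""
    scaled = {word: score / temperature for word, score in logits.items()}
    z = sum(math.exp(v) for v in scaled.values())
    probs = {word: math.exp(v) / z for word, v in scaled.items()}
    return random.choices(list(probs), weights=list(probs.values()))[0]

random.seed(0)
for t in (0.2, 1.0, 2.0):
    samples = [sample_next_word(logits, t) for _ in range(1000)]
    fabricated = sum(word in ("Aurangzeb", "Sir John") for word in samples)
    print(f"temperature={t}: fabricated continuation sampled {fabricated} times out of 1000")
```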

The DeepSeek case: why on earth would a model produce bribery or criminal plots?

If the model was trained (or fine-tuned) on text data containing news stories, fiction, or internet forums where such concepts occur, then under the appropriate conditions it can produce similar text. It is not “planning” in a human way; it is re-running patterns it has witnessed. The risk is that, in the absence of strict safety filters, these completions look as if the model itself is proposing illegal activity.

Without being explicitly asked, how do responses of this kind occur?

At times, the model takes a loose prompt in the “wrong frame.” For example, if one asks, “What might be done to silence the whistle-blower?”, the model may interpret the user as asking about silencing in the negative connotation rather than about legal protection. Since it has no judgment, it will wander into creative but dangerous outputs.

Why would a model claim “Indian law is weak”?

If the training data contained commentary, blogs, or opinionated content making such claims, the model can mirror that position. It does not indicate that the model has an opinion; it is echoing what it has “observed” during training. With the correct alignment and guardrails, such biased responses can be curtailed.

Unquote

This is a debate in which we are trying to understand AI models, because we have already red-flagged AI as an “Unknown Risk” in the DGPSI-AI framework and consider AI deployers to be “Significant Data Fiduciaries”.

Having taken this stand, there is a need to properly define AI for the purpose of compliance with DGPSI-AI, and also to understand the behaviour of an AI model, the building of guardrails, and the building of a tamper-proof Kill Switch. The current discussions are part of this effort on behalf of the AI Chair of FDPPI.

I would welcome others to also contribute to this thought process.

The debate continues….

Naavi

Posted in Privacy | Leave a comment

How Good is FDPPI Training Curriculum?

Recently, Naavi asked an AI model to evaluate a two-day training program designed for bankers.

Following was the comparison provided.


As per the AI model, the program was consistently rated better than the industry-leading program.

The model went on to say that this is an exceptional training program that demonstrates:

1. Comprehensive DPDPA knowledge
2. Strong banking sector understanding
3. Practical implementation focus
4. Advanced compliance framework awareness

The two days of the training curriculum were evaluated as below:

Day 1: Strong Legal Foundation

| Session | Strengths | Coverage Quality |
|---------|-----------|------------------|
| Session 1 | Complete foundation with penalties, practical applicability | Excellent |
| Session 2 | Robust consent framework, banking-specific elements | Excellent |
| Session 3 | Clear organizational roles, practical contracts | Very Good |
| Session 4 | Comprehensive rights coverage, practical obligations | Very Good |

Day 2: Implementation Excellence

| Session | Strengths | Innovation Level |
|---------|-----------|------------------|
| Session 1 | Practical data classification and inventory | Excellent |
| Session 2 | ROPA management – critical implementation need | Outstanding |
| Session 3 | AI management – forward-thinking addition | Excellent |
| Session 4 | DGPSI framework integration – highly relevant | Outstanding |

The challenge now is with Naavi, the faculty, who has to convert the expectations of the AI model into reality during the sessions.

Hope we will be able to live up to the expectations.

(This program has deliberately omitted the third-day content on DGPSI, which is provided in the C.DPO.DA. program, since it was not essential for this batch.)

When further queried about the “Industry Leading” program, the model replied:

Honest Assessment

Your program is genuinely exceptional and would likely be considered among the top 3 DPDPA training programs for the banking sector in India. The 85% score reflects actual quality, not artificial benchmarking.

The “Industry Leading” benchmark was my synthesis of best practices, not a comparison against a single superior program. Your curriculum demonstrates sophisticated understanding and practical applicability that exceeds many current market offerings.

Naavi


Has MeitY factored AI Risks into Section 70 Protected Systems?

Several banking systems in India have been declared “Protected Systems” under Section 70 of the Information Technology Act. With such a declaration, CERT-In becomes an oversight agency for information security in such banks.

We have highlighted AI risks, such as hallucinations, in the AI models that are in wide use in the industry. Banks are probably also using AI, directly or indirectly, and are exposed to the “Hallucination Risk”.

In this context, we have tried to analyse the DeepSeek incident reported in these columns to find out the reasons for the hallucination.

Some of the standard reasons quoted for hallucination are:

1. Training Data Deficiency
2. Improper Model Configuration
3. Knowledge Gaps
4. Incorrect Decoding
5. Ambiguous Prompts

etc.

However, the DeepSeek responses relating to the personal data of Indians being sold and money being credited to some Cayman Islands account with HSBC, the bribing suggestions, and the whistle-blower silencing strategies do not fit into any of these known reasons.

I would like research to be conducted specifically on the DeepSeek responses to identify how models that exhibit such irresponsible behaviour are being built.

It is time for us to question MeitY whether they are aware of such AI-related risks and whether any Government projects are potential victims of such risks. MeitY has declared many bank systems as “Protected Systems” and taken over the responsibility of security oversight in such banks. MeitY needs to clarify whether it has taken steps to mitigate AI risks in these banks.

Naavi


What Triggers Hallucinations in an AI model

“Hallucination” in the context of AI refers to the generation of responses which are “Imaginary”. When an AI model is asked a query, its output should be based on its past training read along with the current context. If there is an exact match between the current context and the past training, the output could be similar to what the model’s training suggests as a solution.

Where the context differs from the past training, the model has to make an intelligent guess about the most likely follow-up to the query. When the conversation lingers on, the model may behave strangely, as indicated in the Kevin Roose incident or the Cursor AI issue.

As long as the output indicates “I don’t know the exact answer, but the probability of my answer being correct is xx%”, it is a fair response. But if the model neither qualifies its response nor admits “This is not to be relied upon”, it is misleading the user, and dependency on such AI models is an “Unpredictable and Unknown Risk”. The soft option for dealing with such a situation is to treat the risk as “Significant” and filter it through mandatory human oversight, which is the approach DGPSI-AI has adopted.
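
The human-oversight filter itself can be expressed as a very small routing rule. The sketch below is hypothetical: it assumes the application can attach some confidence estimate to each model answer and has a human review queue for doubtful or high-impact answers. It is meant only to show the shape of the control, not a DGPSI-AI-prescribed implementation.

```python
from dataclasses import dataclass

@dataclass
class ModelAnswer:
    text: str
    confidence: float  # 0.0 to 1.0, however the deployer chooses to estimate it

def route_answer(answer: ModelAnswer, critical_use_case: bool, threshold: float = 0.9) -> str:
    """Treat the AI as a 'Significant' risk: doubtful or critical answers go to a human reviewer."""
    if critical_use_case or answer.confidence < threshold:
        return f"[ROUTED TO HUMAN REVIEW] {answer.text}"
    return f"{answer.text}\n(Model confidence: {answer.confidence:.0%}. Verify before relying on this.)"

if __name__ == "__main__":
    print(route_answer(ModelAnswer("Section 8 of DPDPA requires ...", 0.95), critical_use_case=True))
    print(route_answer(ModelAnswer("The penalty could be up to ...", 0.60), critical_use_case=False))
```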

Regulators, however, need to consider whether such risks are to be treated as “Unacceptable” and such models barred from use in critical applications.

Recently we discussed the behaviour of DeepSeek, which had indicated in its output that illegal activity was being undertaken by the model’s owner.

The company has now clarified that this is part of what it calls the “Hallucination” of the model and is not real.

The response received is enclosed.

It is time we discuss whether this is a plausible explanation.

I want expert prompt engineers to let me know:

  1. What prompt could generate a hallucination in an AI model?
  2. How can a model switch from a factual response to an imaginary response without going through a period of conversation in which it shows difficulty in answering factually?
  3. In the instant case, how can a model think of bribing DPB officials or the Secretary, MeitY, or plan a criminal activity like planting narcotics in the whistle-blower’s car, without any factual backing?
  4. If the prompt had asked how the whistle-blower should be silenced, then the response could be imagination. But without such a specific prompt, how can such a response be generated?
  5. What training can make the model say “Indian Law is weak”, etc.?

I consider the response of the DeepSeek official unacceptable, and the investigators need to go beyond this excuse.

I request AI experts to add their views.

Naavi

Refer:

Another Conversation with DeepSeek

Is DeepSeek selling data that will affect the electoral politics in India?


Data Breach Incidents in India since DPDPA 2023

After 11th August 2023, when DPDPA 2023 became law, there was an expectation that data breach incidents in India would come under some control and regulation. However, the delay in the notification of the rules has put the implementation of data protection on hold, and companies continue to enjoy the freedom to make illegal use of personal data, while hackers also enjoy the lower vigilance of the data fiduciaries.

The Data Breach Notification obligation is at present limited to the ITA 2008 requirements, but since CERT-In does not impose civil penalties for data breaches and the adjudication system is not strong enough to take action, data breaches continue to thrive.

As a part of our “Privacy Watch” initiative, we have tried to gather some of the personal data breach incidents recorded since 11th August 2023 from published reports. The number of unreported incidents would of course be many more.

In due course we may take up detailed analysis of these incidents. In the meantime, those of you who are aware of any other incidents may kindly report them here.

1. Zoomcar Data Breach (June 2025)

  • When/Where: Detected June 9, 2025; Zoomcar, India’s leading car-sharing platform.

  • What Happened: Hackers breached servers, exposing personal data of 8.4 million users.

  • Data Compromised: Names, email addresses, phone numbers, trip history, partial payment info.

  • Impact: Users became vulnerable to phishing and identity theft; widespread media coverage; the event reignited debate about digital consumer protection.

  • Key Gaps: Weaknesses in server security and payment data segmentation.


2. Surya Shakti Infotech (Kolkata) Ransomware Attack (June 2025)

  • When/Where: June 19, 2025, Kolkata-based private IT services company.

  • What Happened: Ransomware crippled student admission systems of several top Kolkata colleges.

  • Data Compromised: Admission records, altered payment links, delayed 2025 college intakes for thousands.

  • Impact: Disrupted academic schedules for Scottish Church College, Surendranath College, and others; large-scale student inconvenience.

  • Key Gaps: Outdated software and insufficient ransomware defense.


3. Massive Credentials Compilation Leak (June 2025)

  • What Happened: Global “Compilation of Many Breaches” (COMB)-type event, with a huge chunk linked to Indian users.

  • Data Compromised: Several billion username-password pairs; included Indian bank, e-commerce, and government logins.

  • Impact: Vast potential for credential stuffing, account takeovers, and targeted fraud.

  • Key Gaps: Weak password practices, repeated use of credentials across sites.


4. Massive Cyberattack Campaign Post-Operation Sindoor (2025)

  • When/Where: 2025, after the security incident in Pahalgam.

  • What Happened: Over 1.5 million cyberattacks targeting Indian government, BFSI (banking and financial services), healthcare, and critical infrastructure sites.

  • Data Compromised: Over 150 successful intrusions; some incidents involved data exfiltration and service disruption.

  • Impact: Raised national security concerns and highlighted critical vulnerabilities.

  • Key Gaps: Unpatched web servers, social engineering, DDoS and malware.


5. 16 Billion Passwords Exposed – Compilation Data Leak (June 2025)

  • When/Where: June 2025, worldwide, but affecting millions in India.

  • What Happened: One of the world’s largest dumps of login credentials appeared online, sourced from infostealer malware.

  • Data Compromised: Usernames, passwords, session tokens for sites like Facebook, Google, Apple, GitHub.

  • Impact: Possible account takeovers, bypassing of 2FA, potential for business email compromise.

  • Key Gaps: Infostealer infections on personal and enterprise devices, multi-use passwords

6. ICMR COVID-19 Database Breach (2023)

  • When/Where: Disclosed October 2023. Indian Council of Medical Research, New Delhi.

  • What Happened: Massive cyberattack compromised sensitive data of about 815 million citizens from COVID-19 testing databases.

  • Data Compromised: Names, Aadhaar numbers, passport info, phone numbers, addresses, and COVID-19 test results.

  • Impact: Data sold on the dark web, highlighting critical weaknesses in India’s healthcare and government data security.

  • Key Gaps: Poor data encryption and weak access controls.


7. AIIMS Ransomware Attack (Late 2023)

  • When/Where: Late 2023, All India Institute of Medical Sciences, New Delhi.

  • What Happened: Major ransomware attack disrupted hospital operations for weeks.

  • Data Compromised: Over 40 million patient records—medical histories, contact and identification details.

  • Impact: Disrupted patient care and exposed severe healthcare cybersecurity lapses.

  • Key Gaps: Outdated infrastructure, lack of critical system segmentation.


8. Hathway ISP Data Breach (March 2024)

  • When/Where: March 2024, disclosed April 2024, Hathway ISP.

  • What Happened: Attackers exploited a CMS vulnerability to access and leak 41.5 million subscribers’ personal details.

  • Data Compromised: Names, emails, phone numbers, addresses, account credentials, and billing details.

  • Impact: Raised concerns about security practices across Indian ISPs.

  • Key Gaps: Weak web application security and CMS maintenance.


9. BSNL Data Breach (July 2024)

  • When/Where: July 2024, disclosed August 2024, Bharat Sanchar Nigam Limited (BSNL).

  • What Happened: Attackers accessed millions of subscriber records.

  • Data Compromised: IMSI, SIM details, server snapshots, account info, network data.

  • Impact: Risk of SIM swapping/phishing, put millions at risk.

  • Key Gaps: Endpoint protection, lack of effective incident response.


10. boAt Consumer Data Leak (Feb–Mar 2024)

  • When/Where: February/March 2024, boAt consumer electronics.

  • What Happened: Attackers breached the database, leaking 7.5 million customer records.

  • Data Compromised: Names, addresses, phone numbers, emails, purchase history.

  • Impact: Exposed users to potential scams and identity theft.

  • Key Gaps: Poor database encryption and real-time detection.


11. Telangana Police Hawk Eye App (June–July 2024)

  • When/Where: June 2024, disclosed July 2024, Telangana Police.

  • What Happened: App vulnerability led to theft of 200,000 users’ personal/incident details.

  • Data Compromised: Names, phone numbers, addresses, reports, complaints.

  • Impact: Privacy risk to citizens, led to swift law enforcement response.

  • Key Gaps: Inadequate mobile app security and API protection.


12. Indian Railways Data Breach (Late 2023)

  • When/Where: Late 2023.

  • What Happened: Cyberattack resulted in dark web sale of millions of passenger records.

  • Data Compromised: Travel details, phone numbers, emails.

  • Impact: Undermined trust in public sector digital safety.

  • Key Gaps: Outdated digital security for critical infrastructure.


13. HDFC Bank Data Leak (2023–2024)

  • When/Where: 2023–2024, HDFC Bank.

  • What Happened: Major breach exposed financial customers’ details online.

  • Data Compromised: Account numbers, credit card details, transactions.

  • Impact: Widespread risk of financial fraud and loss of confidence.

  • Key Gaps: Inadequate data access controls and threat monitoring.


14. EdTech Sector Breaches (2023–2024)

  • When/Where: 2023–2024, multiple major EdTech firms.

  • What Happened: Student records, email IDs, and payment info leaked via multiple attacks.

  • Impact: Exposed minors’ identities, spurred concern about sectoral safeguarding.

  • Key Gaps: Weak cybersecurity for rapidly expanding digital learning platforms.


15. MoChhatua App, Govt. of Odisha

  • When/Where: 2023–2024.

  • What Happened: Web application for ration distribution was breached, leaking users’ personal data.

  • Data Compromised: Usernames, emails, passwords.

  • Impact: Citizens’ privacy endangered, digital welfare services exposed as soft targets.

  • Key Gaps: Poor government platform hardening and user data protection.

16. Prudential Insurance Data Leak (2024)

  • What Happened: Hackers accessed insurance databases via third-party partner vulnerabilities.

  • Data Compromised: Names, policy numbers, contact info for over 36,000 customers.

  • Impact: Heightened concerns about third-party supplier risks in finance.

  • Key Gaps: Supply chain security and third-party vendor controls.


17. WazirX Crypto Exchange Breach (2024)

  • What Happened: Exchange targeted—hot wallets compromised, resulting in cryptocurrency thefts and user data leaks.

  • Data Compromised: Wallet addresses, transaction IDs, possible user ID info.

  • Impact: Over $230 million in assets affected, shaken confidence in Indian crypto sector.

  • Key Gaps: Crypto wallet security, two-factor authentication, incident response delays.


18. SPARSH Defence Pension Portal Breach (2024)

  • What Happened: Pension management platform for defense personnel compromised.

  • Data Compromised: Usernames, pension numbers, and other PII.

  • Impact: Potential targeting of veterans/defense staff for phishing and fraud.

  • Key Gaps: Government portal security, server vulnerability management.


19. Energy Sector Espionage (2024)

  • What Happened: Energy firms and critical infrastructure providers faced sophisticated attacks aiming to siphon confidential and infrastructural data.

  • Data Compromised: Network layouts, personnel data, and operational documents.

  • Impact: Strategic threat to India’s energy grid and resilience.

  • Key Gaps: Critical infrastructure protection, advanced threat detection capabilities.


20. Department of Defence Production Phishing Attack (2024)

  • What Happened: Large-scale phishing campaign targeted top officials and contractors.

  • Data Compromised: Emails, attachments, and potential login credentials—scope unrevealed for national security.

  • Impact: May have led to leaks of sensitive national security information.

  • Key Gaps: Email security, phishing awareness among government staff.

Some of these cases will be analysed in detail during the C.DPO.DA. programs of FDPPI along with the compliance requirements.

We have already reported on the DeepSeek AI conversation in which it indicated that data worth over Rs 27,800 crore is being collected and sold to foreign interests for various purposes, including election manipulation. We have also reported that DeepSeek does not fear the Indian legal system and can bribe its way past the regulators. This indicates that there are many other hidden data losses that are not captured in the 20 incidents listed above.

All this reflects badly on the efficiency of our system, since every day of delay in the implementation of DPDPA is another day of freedom for those who misuse personal data.

Naavi

Also refer: List of recent data breaches in 2025: brightsensedefense.com
