Posted in Privacy | Leave a comment

Why the Supreme Court’s AI Draft Needs to Rethink “Court Data” and “Sensitive Judicial Data”

(This is in further continuation of the previous article on definitions in the SC draft regulations)

The draft Supreme Court Regulations for the Use of Artificial Intelligence in Courts, 2026 (SCAIF) appears to treat “Sensitive Judicial Data” as a proxy for privacy risk and “Court Data” as a proxy for operational ownership.

These are two genuinely different ideas, and by mapping its protections onto them, the draft ends up conflating privacy-sensitivity with judicial-sensitivity.

The consequence is that some judicial information may be over-protected with measures designed for personal data, while other information — sensitive for reasons that have nothing to do with privacy — may be under-protected. Almost every inconsistency in the drafting flows from this single confusion.

What the draft actually says

Three definitions matter. “Data” carries the same meaning as in Section 2(h) of the Digital Personal Data Protection Act, 2023 — “a representation of information, facts, concepts, opinions or instructions in a manner suitable for communication, interpretation or processing by human beings or by automated means.”

“Court Data” is defined broadly and by source: any data generated by, or in the possession of, a Court. It is, in effect, an ownership concept.

“Sensitive Judicial Data” is defined differently. It covers (i) any personally identifiable information of parties, witnesses or legal representatives, and (ii) any information processed in connection with a Court process, the unauthorised disclosure of which may cause harm. “Court process” is itself drawn very widely, extending to filing, scheduling, hearing management, evidence handling, legal research, drafting, translation, transcription and record management. “Harm,” meanwhile, is defined to include damage both to the reputation or rights of an individual and to those of an institution.

The ambiguity is in the trigger, not in an omission

At first glance it appears that litigation content (pleadings, written arguments, judicial notes, draft orders, evidentiary records) has simply been left out of “Sensitive Judicial Data.” But part (ii) of the definition, read together with the expansive meaning of “Court process,” can be argued to include such content.

The real problem is subtler. Whether litigation content qualifies turns entirely on a vague and subjective test, namely whether its “unauthorised disclosure may cause harm”, and it is not even clear whether that harm qualifier attaches only to part (ii) or also governs the personal-information limb in part (i). So the question is not whether pleadings and draft orders are categorically excluded, but on what uncertain basis they are sometimes in and sometimes out. A definition whose reach depends on a case-by-case harm assessment, undefined in method, is a fragile foundation for a protection regime. That is the ambiguity that needs fixing.

The mismatch this produces

The drafting consequences confirm the diagnosis. As the draft is structured, Sensitive Judicial Data is accorded the highest standard of protection under Section 10. Section 46 brings both Sensitive Judicial Data and Court Data within purpose limitation, but does not subject either to sovereign-cloud deployment. And Section 48’s data-localisation requirement is applied only to Sensitive Judicial Data — which, on the privacy-led reading, may exclude precisely the non-personal litigation content one would most expect to keep on Indian soil.

The clearest way to see the gap is to imagine material that is highly sensitive yet contains no personal data at all, such as a draft judgment leaked before pronouncement, a judge’s deliberative notes, a sealed-cover national-security submission, or a trade secret disclosed in a patent dispute. None of these is obviously “Sensitive Judicial Data” under a privacy lens, yet each demands the very highest protection. If localisation and sovereign-cloud obligations track a personal-data concept, this category can slip through, protected, if at all, only by the discretionary harm test.
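The gap can be made concrete with a small sketch. It assumes, purely for illustration, that the privacy-led reading reduces to a single "contains personal data" check and that the wider reading adds the outcome of a harm assessment; the class, field and function names are hypothetical and do not come from the draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A hypothetical piece of court-held information (illustrative only)."""
    name: str
    contains_personal_data: bool
    harm_if_disclosed: bool  # assumed result of a case-by-case harm test


def is_sensitive_privacy_led(item: Item) -> bool:
    # Assumption: the narrow reading, where only personally identifiable
    # information reliably triggers "Sensitive Judicial Data".
    return item.contains_personal_data


def is_sensitive_with_harm_test(item: Item) -> bool:
    # Assumption: the wider reading, where part (ii) also applies and
    # someone has actually carried out the harm assessment.
    return item.contains_personal_data or item.harm_if_disclosed


EXAMPLES = [
    Item("draft judgment before pronouncement", False, True),
    Item("judge's deliberative notes", False, True),
    Item("sealed-cover national-security submission", False, True),
    Item("trade secret in a patent dispute", False, True),
    Item("witness's home address", True, True),
]

for item in EXAMPLES:
    print(f"{item.name}: privacy-led={is_sensitive_privacy_led(item)}, "
          f"harm-test={is_sensitive_with_harm_test(item)}")
```

On these assumptions, the four non-personal items are caught only when the harm test is actually run, which is the fragility described above.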

Why there is a problem

There may be a reason behind the choice. Personal data is where the DPDPA’s liability actually bites; tying the strongest safeguards to personal information aligns the regulation with the statute that will be enforced against the Courts as data fiduciaries. That is understandable. But judicial information is sensitive for reasons that extend well beyond privacy: evidentiary integrity, judicial confidentiality, institutional security and national interest among them. A framework that measures sensitivity primarily through a privacy lens will systematically misjudge the material whose sensitivity has a different source.

The harder questions the classification must answer

A coherent scheme also has to take a position on open justice, which a privacy-led definition tends to obscure. The open-justice principle, reflected for the Supreme Court in Article 145(4) of the Constitution, means that judicial records are not confidential by default merely because they contain personal information. Once such records are lawfully in the public domain, restricting access to them should ordinarily require a specific justification: privacy, security, victim protection or an overriding public interest.

It can fairly be argued that merely filing a petition does not place the petitioner’s data “in the public domain” in the full sense, and that an in-camera hearing changes the position entirely. But absent such circumstances, there is little logic in treating every petitioner as automatically entitled to confidentiality under privacy law. The masking of data relating to women and minors, including in matters concerning the commission of offences and not only where they are victims, is sometimes defended under a “right to be forgotten.” It is worth being candid that this right, in the context of court records, is unsettled in India and applied inconsistently across the High Courts; it is judge-made rather than clearly statutory. And it carries a real cost: indiscriminate masking degrades the accuracy of the judicial record for legitimate research. A more disciplined approach would confine masking to the data of victims and similarly vulnerable persons, rather than extending it to petitioners as a class.

Those vulnerable categories themselves deserve explicit treatment. The personal data handled by Courts is not uniform. The data of minors, of whistleblowers (who face retaliation risks that ordinary witnesses do not), and of foreign nationals (whose information raises cross-border-transfer and diplomatic dimensions under Section 16 of the DPDPA) all warrant differentiated handling. A flat “personal information” category cannot capture these gradations.

A more coherent model

The fix is not to abandon classification but to stop loading two different ideas onto one ladder. The draft’s instinct to separate ownership (“Court Data”) from sensitivity is correct; the trouble is that its sensitivity concept is really a privacy concept in disguise.

A cleaner model would classify information along clearly distinct dimensions:

First, ownership and source — the “Court Data” axis, identifying what the Court has generated or holds.

Second, confidentiality level — a graded scale running from Public, through Confidential and Restricted, to Highly Restricted. This is where draft judgments, deliberative notes and sealed-cover material belong, regardless of whether they contain personal data.

Third, and this is the dimension the present draft folds into the second, a personal-data attribute that cuts across the confidentiality scale, flagging information governed by the DPDPA and identifying its vulnerable subcategories (minors, victims, whistleblowers, foreign nationals and the like). A public judgment that happens to name a minor is still “Public” on the confidentiality axis, but carries a personal-data flag that triggers specific obligations. Treating personal-data status as a rung on the confidentiality ladder, as the current drafting implicitly does, simply repeats the original conflation.

Keeping these three apart lets the regime do what a single concept cannot: protect a non-personal draft judgment as fiercely as it protects a witness’s address, while still allowing a public judgment to remain public.
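As a rough sketch of how the three dimensions might sit side by side in a records system, the following keeps each axis as a separate field and derives obligations from each independently. The obligation names, and the choice of "Restricted" as the level at which localisation applies, are assumptions made for illustration and are not taken from the draft.

```python
from dataclasses import dataclass
from enum import IntEnum


class Confidentiality(IntEnum):
    # Order follows the scale in the text; a higher value means more protected.
    PUBLIC = 0
    CONFIDENTIAL = 1
    RESTRICTED = 2
    HIGHLY_RESTRICTED = 3


@dataclass(frozen=True)
class Record:
    title: str
    court_data: bool                      # axis 1: generated or held by a Court
    confidentiality: Confidentiality      # axis 2: graded scale
    personal_data: bool = False           # axis 3: DPDPA flag
    vulnerable: frozenset = frozenset()   # e.g. {"minor"}, {"whistleblower"}


def obligations(record: Record) -> set:
    """Derive hypothetical obligations from each axis independently."""
    duties = set()
    if record.confidentiality >= Confidentiality.RESTRICTED:
        duties |= {"localisation", "sovereign_cloud"}
    if record.personal_data:
        duties.add("dpdpa_safeguards")
    if record.vulnerable:
        duties.add("masking")
    return duties


draft_judgment = Record("draft judgment", True, Confidentiality.HIGHLY_RESTRICTED)
public_judgment = Record("public judgment naming a minor", True,
                         Confidentiality.PUBLIC, True, frozenset({"minor"}))

print(sorted(obligations(draft_judgment)))
print(sorted(obligations(public_judgment)))
```

In this sketch the draft judgment attracts the strongest infrastructure controls without carrying any personal-data flag, while the public judgment stays public and still picks up the obligations tied to the minor's data.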

In summary, the classification of judicial data should be harmonised with the principles emerging from the DPDPA, 2023 and the Information Technology Act, 2000, while recognising that judicial information governance extends beyond personal-data protection to encompass evidentiary integrity, judicial confidentiality, institutional security and national interest. The definitions of “Court Data” and “Sensitive Judicial Data”, and the protections in Sections 10, 46 and 48 that depend on them, would benefit from being revisited and re-anchored in a comprehensive judicial information governance framework, one that distinguishes clearly between privacy-sensitivity and judicial-sensitivity rather than blurring the two. The draft has identified the right problem. It now needs the right axes to solve it.

Naavi


Follow this website for AI related Risk identification

FDPPI’s AI Chair would like to draw the attention of readers to https://reglegbrief.com/, a website which provides much useful information for AI auditors.

RegLegBrief is the public research output of Verdus Technologies Pte. Ltd., a Singapore-incorporated regulatory technology firm.

The Company has been documenting failures of AI systems and identifying patterns of failure for the guidance of other professionals. It is documenting hallucinations so that a previously probabilistic outcome becomes a predictable one, and a predictable failure mode becomes a designable safeguard.

The Company releases press briefings, maintains a hallucination register and runs a briefings blog.

It would be interesting to conduct more detailed research on the briefings provided on the website. They may help us understand the risks of model drift and model collapse, which could occur over time.

Naavi

AI exposes the Big4 myth: Attention Board Directors!

I came across an interesting LinkedIn post today. I could have just given a link to the post and moved on. But I felt that it is better to reproduce the entire post just to ensure that readers do not miss its essence. It requires guts to write such posts and I appreciate Mr Nicholas P for his post. (I hope Mr Nicholas has no objection to my reproducing his post here.)

Quote:

When a Big 4’s own AI report is filled with AI hallucinations – it’s clear the standards need to catch up.
In October 2025, KPMG published a glossy report about the miracle of agentic AI, and last week was exposed for hallucinating the very thing it was selling.
Forty of forty-five citations were fake, half the claims were invented, and the report contradicted KPMG’s own survey numbers – even citing a 2019 East Japan Railway press release as evidence of agentic AI, blowing enough smoke to give a new definition to full steam ahead.
We’ve seen this movie before, and it ended with everyone else footing the bill. Deloitte fed AI hallucinations into a government report, cut a refund, and walked – except a refund doesn’t un-spend the taxpayer money, doesn’t un-make the decisions built on the fiction, and teaches every firm watching that the cost of getting caught is pocket change.
KPMG, in a way, sold trust. A global concrete icon – a hundred and thirty years old, a quarter-million people across a hundred and forty countries, four letters that meant a banker in Frankfurt could believe a ledger in Singapore without ever shaking the hand that wrote it.
Trust is a belief about a partner’s goodwill; assurance is what you sell when that belief is gone – and it only works if you can still trust the firm selling it, which is seemingly a loan the Big 4 are starting to default on.
Buried in the report, KPMG found that integrity was the number-one driver of “customer loyalty”. A belly-chuckle of a finding, dripping with irony, 8 months before its own integrity disintegrated in public. And somewhere down the line, everybody will be asking:
What exactly were we paying for when we said we were buying trust?

Unquote

What Mr Nicholas has written applies to all consultants, including Naavi and FDPPI. When we use AI for assistance, we should ensure that we keep our human control intact. If not, we will churn out decisions which look to our customers like human decisions but are actually AI output passed through a human zombie.

We have been discussing the use of AI in the Judiciary for the last week, and in the last article we debated how to identify AI before we regulate it. The biggest challenge here is to identify AI elements “embedded” in what appear to be permissible software tools, because we do not know what libraries are called in the background and where an AI hallucination can sneak in.

We have found that many companies have blind faith in the Big4 and pick them in preference to others because of a reputation which the Directors feel will cover up for their own inability to understand the task. They trust the Big4 as if whatever they say is the truth. Even Government agencies may have a similar inclination.

The article of Mr Nicholas exposes how hollow this trust is, since what they receive as consultancy may not be the brilliance of the persons who come to present their recommendations and carry IIT or IIM degrees, but hallucinated AI outputs.

When companies invest in such services and further depend on them for their business, the shareholders have the right to ask whether the Board of Directors is really doing its job or has to be replaced by an AI board (remember Mika of Dictador or Diella of Albania). We have often cited these innovative uses of humanoid robots, along with “Sophia”, the humanoid robot which got citizenship of Saudi Arabia, as questionable decisions.

But on second thoughts it appears that the Sophia decision is relatively less risky than Mika, which is less risky than Albania’s Diella. This is because the decision-making capacities differ across these models. Mika can hallucinate and ruin the company, and Diella can hallucinate and ruin a country. Maybe Sophia is less powerful. But all these are examples which we in India need to learn from before we reflect on the ET recommendation that companies should install an AI agent for DPDPA compliance.

There is a need to adopt a policy of “Restrained Innovation” and not pursue “Innovation over Restraint” which is advocated by technology companies and often endorsed by NASSCOM and MeitY.

Unless a user of AI is able to read and understand every line of text created by a generative AI system, he will be walking into a trap when he uses AI to write software code, to draft a pleading in a Court or to write a consultation report on a project.

Directors need to have the ability to ask the right questions of the consultants before accepting their reports. In a recent discussion at AIDAI (Association of Independent Data Auditors) we discussed the need for

a) The scope of a DPDPA audit to be written independently by somebody other than the auditor or the company.

b) Peer review of a DPDPA compliance audit report.

These are principles of “Independence” incorporated in the AIDAI Code of Conduct for its empanelled Auditors, which distinguish AIDAI from any of the Big4 or other auditors.

Companies who are looking for DPDPA auditors should therefore factor in the expertise available in AIDAI at least for a “Review of an Audit” already done even if it is by a Big 4.

Naavi is in the process of developing a “DGPSI Framework for Review of a DPDPA Audit”. Perhaps it will be discussed in the next CIDA (Certified Independent Data Auditor) training.

Companies that have had their present DPDPA gap assessment done by a Big4 firm should think of a “Review” with AIDAI-empanelled Independent Data Auditors, who are not aligned even with NASSCOM and are not under the influence of the Big Tech companies that control NASSCOM.

Ponder….

Naavi


DGPSI updates the AI definition

Following the comparison of the AI definition in the Supreme Court AI framework with the DGPSI definition, a need has arisen to refine the AI definition for DGPSI auditors.

Until now, the DGPSI definition of AI followed the description provided in the article “Defining of AI: DGPSI approach”. The revised thinking is as follows.

The revised DGPSI definition presented here is designed to keep the framework’s distinctive strength, namely that it triggers on the loss of human control rather than on a list of technical features, while curing the four ambiguities the earlier analysis identified: the undefined “acceptable threshold,” the mismatch between the control-based headline and the capability-based classes, the unstated relationship between the classes, and the “code-correcting” literalism.

Proposed definition

Core definition (the gate). An AI System is an automated data-processing system that, for a given input, produces decisions, predictions, recommendations or content which are not fully pre-determined by explicit human-authored instructions, because the system derives or adapts its own processing logic from data, models or probabilistic methods.

Accountability threshold (when the Standard applies). A system within the core definition is governed by this Standard where the degree of meaningful human intervention in either (a) the formation of its output, or (b) the application of that output to a decision affecting a person or a business or legal outcome, falls below the accountability threshold.

The deployer shall define and record the accountability threshold for each system in its risk documentation. In the absence of such a record, or where the system exhibits any characteristic in Classes 1 to 3 below, the threshold is presumed to be crossed and the system is treated as an AI System requiring governance.

Classes. The three classes are independent risk vectors, not an ascending scale, and are not mutually exclusive; a system may fall within more than one, in which case the obligations attaching to each apply cumulatively. Governance intensity (Low / Medium / High / Critical) is fixed by the risk tier, not by the class number.

  • Class 1 — Adaptive (self-learning) systems: a system that alters its own decision behaviour — by adjusting parameters, weights, rules, embeddings or operative prompts — without a human developer revising the underlying logic for each such change.
  • Class 2 — Autonomous-action (automated-decision) systems: a system whose output is implemented, or applied to a decision affecting a person or a business or legal outcome, without a human being able to review and override that specific output before it takes effect.
  • Class 3 — Generative and affective systems: a system that generates novel content (including text, images, audio, video or code), or that infers, simulates or responds to human emotional or behavioural states.

Interpretation. A system falls within this Standard on either of two independent grounds: because human control has dropped below the threshold (the control ground — Classes 1 and 2), or because the system possesses capabilities that generate unknown or emergent risk irrespective of human control (the capability ground — Class 3).

Either ground is sufficient on its own.
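The gate and the default presumption can be traced as a short decision procedure. The sketch below is one possible reading, not part of the Standard: it assumes the accountability threshold and the degree of human intervention are both expressed as scores between 0 and 1 (the definition does not prescribe any scale), and all class, field and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class SystemProfile:
    """Hypothetical description of a system under assessment."""
    output_fully_predetermined: bool        # True = ordinary rule-based software
    adaptive: bool = False                  # Class 1 characteristic
    autonomous_action: bool = False         # Class 2 characteristic
    generative_or_affective: bool = False   # Class 3 characteristic
    documented_threshold: Optional[float] = None  # None = nothing on record
    human_intervention: float = 0.0         # assumed score on the same 0..1 scale


def classes_of(p: SystemProfile) -> Set[int]:
    flags = ((1, p.adaptive), (2, p.autonomous_action),
             (3, p.generative_or_affective))
    return {number for number, present in flags if present}


def is_governed(p: SystemProfile) -> bool:
    if p.output_fully_predetermined:
        return False   # fails the core definition (the gate)
    if classes_of(p):
        return True    # presumption: a class characteristic is present
    if p.documented_threshold is None:
        return True    # presumption: no threshold on record
    return p.human_intervention < p.documented_threshold


# A generative drafting tool stays governed even when fully human-supervised.
supervised_drafting_tool = SystemProfile(
    False, generative_or_affective=True,
    documented_threshold=0.5, human_intervention=1.0)
print(is_governed(supervised_drafting_tool))  # True
```

The ordering of the checks reflects the text: the core definition is tested first, the class presumption next, and the documented threshold is consulted only for systems that show no class characteristic.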

How this corrects each ambiguity

  • Undefined “acceptable threshold.” The threshold is no longer left floating. It is made procedurally determinate — the deployer must define and document it per system — and is backed by a default presumption: if it is undocumented, or any class characteristic is present, the threshold is deemed crossed. This also dovetails with DGPSI’s existing Deviation Justification Document discipline.
  • Headline-versus-classes mismatch. The definition now openly rests on two grounds rather than pretending everything reduces to “human intervention below a threshold.” Classes 1 and 2 carry the control axis; Class 3 is expressly placed on a separate capability axis, with the note that its inclusion “does not depend on any reduction in human intervention.” The earlier slippage is resolved by acknowledging it rather than papering over it.
  • Relationship between classes. It is now stated that the classes are independent, overlapping and cumulative, and that the number denotes risk type, not severity — so an agentic generative tool that is Class 1 + 2 + 3 attracts the combined obligations, and severity is read off the separate risk tier.
  • “Code-correcting” literalism. Class 1 now refers to a system altering its “decision behaviour — by adjusting parameters, weights, rules, embeddings or operative prompts,” expressly not limited to rewriting source code. Conventional machine learning, which changes weights rather than code, is now plainly captured.
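The point about cumulative classes and a separate risk tier can be sketched as follows. The obligation names are placeholders invented for illustration; the Standard's actual obligations are not reproduced here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set

# Placeholder obligation names, invented for illustration only.
OBLIGATIONS_BY_CLASS = {
    1: {"log_each_adaptation"},
    2: {"provide_override_point"},
    3: {"review_generated_content"},
}


@dataclass(frozen=True)
class Assessment:
    classes: FrozenSet[int]   # any subset of {1, 2, 3}
    risk_tier: str            # "Low", "Medium", "High" or "Critical"


def obligations(a: Assessment) -> Set[str]:
    # Classes are cumulative: take the union of what each class attracts.
    duties: Set[str] = set()
    for number in a.classes:
        duties |= OBLIGATIONS_BY_CLASS[number]
    return duties


def intensity(a: Assessment) -> str:
    # Read off the separately assessed tier; the class numbers play no part.
    return a.risk_tier


agentic_tool = Assessment(frozenset({1, 2, 3}), "High")
narrow_tool = Assessment(frozenset({3}), "Critical")
print(sorted(obligations(agentic_tool)))
print(intensity(agentic_tool), intensity(narrow_tool))
```

Here a Class 3-only tool can carry a higher tier than a tool that falls in all three classes, which is what it means for the number to denote risk type rather than severity.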

Short-form version (for the body of the Standard)

An AI System is an automated data-processing system whose decisions, predictions, recommendations or content are not fully pre-determined by explicit human-authored instructions. It is governed by this Standard where meaningful human intervention in the formation or application of its output falls below a deployer-documented accountability threshold — which is presumed crossed where the system

(1) adapts its own decision behaviour without per-change human revision,

(2) applies a decision affecting a person without a human able to override the specific output, or

(3) generates novel content or infers or responds to human emotional or behavioural states.

A useful by-product: the core definition sentence now mirrors the OECD / EU AI Act / draft Supreme Court descriptive boundary, so the revised DGPSI definition is interoperable with the Court’s definition.

Comments welcome.

(P.S: This revised definition of AI does not affect the DGPSI-AI framework implementations. Kindly note that SCAIF defines Algorithmic Software, a term used in DGPSI rules, and hence there is a need for integration of the two definitions.)

Naavi


Definition of AI in the SCAIF

This is in further continuation of the discussion on the Supreme Court AI regulations (Draft)

In the Supreme Court draft regulations on AI usage, AI has been defined as

“a machine-based system that infers, learns, and generates decisions, predictions, and recommendations from data, with a varying degree of autonomy, such as, algorithms, computational processes, and software, deployed for court processes, excluding general-purpose software or digital tools, unless such software or tools are specifically embedded with, augmented by, or functionally dependent upon, artificial intelligence”.

DGPSI has used the definition as follows

“Definition of AI under DGPSI AI is a class of automated data processing system where the human intervention in decision output and application of decision to a business decision is below an acceptable threshold. In order to define the threshold, three classes of AI are recognized as part of the definition.

Class 1: Any software that has a Code correcting ability without the intervention of a human developer to generate an output is considered as an AI system-Class 1.

Class 2: Any AI system that automatically implements a decision affecting a human is considered as AI system- Class 2

Class 3: Any system that reacts to the human emotions, capable of creative outputs, including generative AI and is considered as AI system- Class 3″

If we try to analyse these two definitions, we find the following.

These two definitions are worth examining closely because they belong to two different definitional traditions, and the contrast explains a great deal about how each instrument intends to regulate.

The Supreme Court definition answers an ontological question — “what kind of thing is an AI system?” — and answers it descriptively, by listing capabilities.

The DGPSI definition answers a regulatory question — “at what point does a system require the safeguards that attach to AI?” — and answers it by reference to the displacement of human control.

The first draws a boundary around a category of technology; the second draws a boundary around a category of risk. This is the single most important difference, and almost every other contrast flows from it.

Supreme Court Definition

The Court’s wording is plainly modelled on the OECD’s revised AI definition and Article 3(1) of the EU AI Act — “a machine-based system,” operating “with a varying degree of autonomy,” that “infers, learns, and generates” outputs from data.

Adopting this lineage aligns Indian judicial practice with the emerging global consensus, makes the definition defensible against the charge of idiosyncrasy, and eases future interoperability and mutual recognition. The general-purpose carve-out is also understandable as it prevents ordinary word processors, spreadsheets and case-management software from being swept in, while re-capturing them once they are “embedded with, augmented by, or functionally dependent upon” AI.

This is functionally adequate for the purpose of defining AI for the regulation envisaged.

But the definition carries three drafting weaknesses.

First, it is circular: it defines artificial intelligence partly by reference to software being “functionally dependent upon artificial intelligence.” The term reappears inside its own definition, which gives the boundary no independent anchor at precisely the margin where disputes will arise (for example, a case-management system that calls an external AI translation API — is it “functionally dependent”?).

Second, the operative verbs are conjunctive — “infers, learns, and generates.” Read literally, a system would need to do all three to qualify, yet many narrow tools only infer, or only generate, without learning.

The OECD formulation avoids this by resting on a single operative verb, “infers”, and by introducing its list of outputs with “such as.”

Third, and most consequentially for a court, the listed outputs are “decisions, predictions, and recommendations”; the word “content” is absent. Generative systems that draft text, summaries or pleadings produce content, and a literal reading could leave the most common form of judicial-context generative AI sitting awkwardly outside the core verb list, to be rescued only by interpretation.
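The difference between the literal and the purposive reading of the verb list is the difference between "all" and "any". A minimal sketch, using a hypothetical tool described only by the capabilities it exhibits:

```python
REQUIRED_VERBS = ("infers", "learns", "generates")


def qualifies_conjunctive(capabilities):
    # Literal reading of "infers, learns, and generates": all three are needed.
    return all(verb in capabilities for verb in REQUIRED_VERBS)


def qualifies_disjunctive(capabilities):
    # Purposive reading: any one of the listed capabilities is enough.
    return any(verb in capabilities for verb in REQUIRED_VERBS)


# Hypothetical translation tool: a fixed model that does not learn in use.
translation_tool = {"infers", "generates"}

print(qualifies_conjunctive(translation_tool))  # False
print(qualifies_disjunctive(translation_tool))  # True
```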

The DGPSI definition: functional and accountability-anchored

DGPSI defines AI as automated data processing “where the human intervention in decision output and application of decision … is below an acceptable threshold.”

This is an elegant regulatory move because it ties the definition directly to the thing the law actually cares about — the point at which a human stops being meaningfully in control.

It coheres tightly with the human-primacy principle in Section 4 of the Court’s own draft and with DGPSI-AI’s second principle (one accountable human behind every algorithm). Where the Court must reach the same result through separate provisions on autonomy, risk tiers and the Regulation 20 prohibitions, DGPSI builds the accountability concern into the definition itself.

To be critical, the DGPSI definition poses difficulties of a different character. The phrase “acceptable threshold” is left to the discretion of the Auditor, much like the word “reasonable” often used in regulations. It presupposes a standard-setter to fix the threshold, failing which different deployers will draw the line differently.

The three classes are evidently meant to supply that content.

Class 1 (code-correcting without a human developer) and Class 2 (automatically implementing a decision affecting a human) sit naturally on the control axis.

Class 3, however (reacting to human emotions, creative and generative output), is a capability criterion that does not necessarily involve any reduction in human intervention; a generative tool can be fully human-supervised. So Class 3 quietly shifts the basis of the definition. It is included to guard against the future development of sentient AI systems.

There may be two other ambiguities. The classes are not obviously hierarchical or mutually exclusive. For example, an agentic generative system could be Class 1, 2 and 3 at once, and the framework does not say whether the classes are cumulative or alternative. Hence the highest class has to be adopted in such cases.

Also, Class 1’s reference to “code-correcting ability” invites a literalism trap: most machine learning does not rewrite its own code; it adjusts weights and parameters. Read strictly, Class 1 might miss conventional ML and catch only exotic self-modifying systems; read purposively (any self-adjustment without a human in the loop), it is very broad. The intended reading should be stated: the intention is to include any change in code or weights that can alter future decisions.

How the two map onto each other

The frameworks are complementary rather than contradictory, and they nest reasonably well. DGPSI Class 1 is a concrete instance of the Court’s “learns.” DGPSI Class 2 corresponds to “generates decisions” exercised with autonomy — but note that in the judicial setting this is largely the prohibited zone, since Regulation 20 bars algorithmic adjudication and automated outcomes; Class 2, in courts, mostly describes what is not allowed rather than what is approved. DGPSI Class 3 is precisely the generative/affective territory that the Court’s verb list under-specifies, so it usefully fills the “content” gap identified above.

What this means for the judicial context specifically

The practical test the Court needs is an approval gate: when a vendor seeks clearance under the draft, the committee must decide whether the tool is AI at all, and if so how intensively to regulate it.

For that purpose the Court’s descriptive definition is good at setting the outer boundary but poor at administrability: “functionally dependent upon AI” is hard to certify cleanly.

DGPSI’s class model is the opposite: easier to administer because a vendor can attest “this is a Class 2 system,” but weaker as an outer boundary because of the threshold’s relativity.

The natural synthesis, and this is exactly what the FDPPI submission recommends for Regulation 3(1)(m), is to keep a descriptive, OECD-aligned definition as the gate, and to layer a control-based classification (the Low/Medium/High/Critical tiers, informed by DGPSI’s classes) as the mechanism that sets the intensity of obligations once a system is inside the gate.

The Court defines what AI is; DGPSI explains how much to worry about a given instance.
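The two-stage approach can be sketched as a gate followed by a tiering step. The mapping from DGPSI classes to tiers below is entirely hypothetical: neither the draft nor DGPSI prescribes it, and a real committee would weigh context as well. It is included only to show the shape of the procedure.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"
    CRITICAL = "Critical"


def review(within_descriptive_definition: bool, dgpsi_classes) -> Optional[Tier]:
    """Stage 1: the descriptive gate. Stage 2: a class-informed tier."""
    if not within_descriptive_definition:
        return None              # not AI: outside the regulation altogether
    # Hypothetical mapping, for illustration only.
    if 2 in dgpsi_classes:
        return Tier.CRITICAL     # autonomous action on outcomes
    if {1, 3} <= set(dgpsi_classes):
        return Tier.HIGH         # adaptive and generative together
    if dgpsi_classes:
        return Tier.MEDIUM       # a single class characteristic
    return Tier.LOW              # inside the gate, no class characteristic


print(review(False, {2}))        # None
print(review(True, {3}))         # Tier.MEDIUM
```

The first argument stands in for the committee's finding under the descriptive definition, and the second for the vendor's class attestation; keeping them as separate inputs mirrors the division of labour between the two frameworks.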

P.S: This is an academic debate and comments are welcome.

Naavi
