Theory of Data

Naavi has been working around the concept of “Data” from Techno-legal, Behavioural and Managerial perspectives for some time. The objective is to bring better clarity on the nature of “Data”, and in particular “Personal Data”, so that regulations are clear from the point of view of compliance.

Unclear regulations are the bane of the industry because practitioners do not know how to comply or what to comply with. This leads to discretionary decisions which may go wrong. Even the judiciary and the regulators often unwittingly come to wrong decisions and have to face criticism from the industry.

In trying to present the Data and its implications in terms of the current laws and the continuously moving technology base, Naavi has been sporadically writing about “Cyber Laws for Quantum Computing Era”, “Impact of AI on Cyber Laws”, “Evidence under Section 65B of IEA”, “Dynamic Theory of Data”, “Nuclear Theory of Data” and so on.

Some of these thoughts have emerged over time and have been presented in the context of particular discussions.

The time has now come to bring more cogency to these thoughts and to consolidate the different explanations provided earlier into a single “Theory of Data”.

Since there is already a body of academicians who are working on the “Theory of Data” from the perspective of technology and my approach is basically from the “Legal, Behavioural Science and Managerial perspective” of Data, it is necessary to prefix this theory as “Naavi’s Theory of Data” or “LBM theory of Data”.

This is primarily an academic exercise.  However, some thoughts that may be presented in this theory may have the potential to be implemented in the new Data Governance Framework or Data Protection Frameworks that the industry may try to follow in the context of new regulations and new technology.

For some time this work may proceed in the background and will be released to the public from time to time.

Any thoughts and words of wisdom from readers are welcome.


Posted in Cyber Law | Leave a comment

The definition of “Personal data” in CCPA

The California Consumer Privacy Act 2018 (CCPA) has tried to address some aspects of Privacy law which we in India will be discussing both through the PDPA (Personal Data Protection Act, which is in draft Bill form) and the forthcoming, expected “Data Governance Act of India” (DGAI).

The PDPA focusses only on “Personal Data” which is defined as follows:

“Personal data” means data about or relating to a natural person who is directly or indirectly identifiable, having regard to any characteristic, trait, attribute or any other feature of the identity of such natural person, or any combination of such features, or any combination of such features with any other information;

In discussing the term “Anonymization”, the PDPA defines it as follows:

“Anonymisation” in relation to personal data, means the irreversible process of transforming or converting personal data to a form in which a data principal cannot be identified, meeting the standards specified by the Authority.

In other words, “Personally Identifiable Information” changes to anonymized information if the identity parameters are removed as per the standards that may be prescribed by the DPA.

By the definition of “Anonymization”, the DPA has been entrusted with the responsibility of defining when “identifiable” data becomes “Anonymized” data.

Under the CCPA it is interesting to observe the use of the term “Probabilistic identifier” to differentiate between Personal Data subject to the regulation and “Non Personal Data” outside the regulation.

In CCPA Personal Information is defined as follows:

 “Personal information” means information that identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household.

Personal information includes, but is not limited to, the following:

(A) Identifiers such as a real name, alias, postal address, unique personal identifier, online identifier, Internet Protocol address, email address, account name, social security number, driver’s license number, passport number, or other similar identifiers.

(B) Any categories of personal information described in subdivision (e) of Section 1798.80.

(C) Characteristics of protected classifications under California or federal law.

(D) Commercial information, including records of personal property, products or services purchased, obtained, or considered, or other purchasing or consuming histories or tendencies.

(E) Biometric information.

(F) Internet or other electronic network activity information, including, but not limited to, browsing history, search history, and information regarding a consumer’s interaction with an Internet Web site, application, or advertisement.

(G) Geolocation data.

(H) Audio, electronic, visual, thermal, olfactory, or similar information.

(I) Professional or employment-related information.

(J) Education information, defined as information that is not publicly available personally identifiable information as defined in the Family Educational Rights and Privacy Act (20 U.S.C. section 1232g, 34 C.F.R. Part 99).

(K) Inferences drawn from any of the information identified in this subdivision to create a profile about a consumer reflecting the consumer’s preferences, characteristics, psychological trends, preferences, predispositions, behavior, attitudes, intelligence, abilities, and aptitudes.

It is interesting to note that this definition lists many individual identifiers and treats all of them as included in “Personal Information”. As a result, each individual item shown here, for example “geolocation data”, can be considered “Personal Information” if it can be linked, directly or indirectly, with a particular consumer or household.

At the same time, while defining a “Probabilistic Identifier”, the Act says:

“Probabilistic identifier” means the identification of a consumer or a device to a degree of certainty of more probable than not based on any categories of personal information included in, or similar to, the categories enumerated in the definition of personal information.

Further, the “Unique Identifier” is defined as follows:

“Unique identifier” or “Unique personal identifier” means a persistent identifier that can be used to recognize a consumer, a family, or a device that is linked to a consumer or family, over time and across different services, including, but not limited to, a device identifier; an Internet Protocol address; cookies, beacons, pixel tags, mobile ad identifiers, or similar technology; customer number, unique pseudonym, or user alias; telephone numbers, or other forms of persistent or probabilistic identifiers that can be used to identify a particular consumer or device.

This means that either the identifier should be directly tagged to an individual consumer or a “household” (household relates to devices such as the IoT devices present at the household address), or there should be some probabilistic certainty that the information belongs to a particular consumer or device.

The Act is, however, vague on how the threshold probability is to be determined. It is not clear whether the authority will define a standard for this purpose, as provided in the PDPA, or leave it to businesses to address as a business challenge.
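The “more probable than not” wording quoted above can be read as a probability threshold of just over one half. The following is a minimal sketch under that reading, assuming some matching system already produces a probability estimate for each candidate link. The function name, the 0.5 cut-off and the sample scores are illustrative assumptions and not something the Act prescribes.

```python
# Hypothetical illustration: "more probable than not" read as probability > 0.5.
MORE_PROBABLE_THAN_NOT = 0.5  # assumed reading of the statutory phrase


def is_probabilistic_identifier(match_probability: float) -> bool:
    """Return True when a link to a consumer or device is more probable than not."""
    if not 0.0 <= match_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return match_probability > MORE_PROBABLE_THAN_NOT


# Assumed sample scores for linking a browser fingerprint to known devices.
candidate_links = {"device-A": 0.72, "device-B": 0.50, "device-C": 0.18}
linked = [name for name, p in candidate_links.items()
          if is_probabilistic_identifier(p)]
print(linked)
```

How the probability itself is estimated is exactly the open question raised above; the sketch only shows where a threshold would sit once an estimate exists.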

While the Indian law sticks to defining “Anonymization” and taking “Anonymized Personal Data” out of the regulation, CCPA defines “Aggregate Consumer Information” and brings it also under the regulation. In this context, including “Households” within “Consumer”, which brings IoT devices within the provisions of this law, makes sense.

While India is now looking for a separate legislation which we have called “Data Governance Act of India” and formed an expert committee for the purpose, CCPA has tried to accommodate the business interests of processing “Non Personal Data” within the CCPA regulation itself.

CCPA defines “Aggregate Consumer Information” as follows:

“Aggregate consumer information” means information that relates to a group or category of consumers, from which individual consumer identities have been removed, that is not linked or reasonably linkable to any consumer or household, including via a device. “Aggregate consumer information” does not mean one or more individual consumer records that have been deidentified.
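The distinction in this definition, between a group-level summary and individual records with the names stripped off, can be illustrated with a small sketch. The records, field names and the choice of grouping by city are assumptions made purely for illustration.

```python
from collections import Counter

# Assumed sample records; the field names are illustrative only.
consumer_records = [
    {"name": "A. Rao", "city": "Bengaluru", "purchases": 3},
    {"name": "B. Shah", "city": "Mumbai", "purchases": 1},
    {"name": "C. Iyer", "city": "Bengaluru", "purchases": 5},
]


def aggregate_by_city(records):
    """Summarise records per city, keeping no per-consumer fields."""
    consumers = Counter()
    purchases = Counter()
    for record in records:
        consumers[record["city"]] += 1
        purchases[record["city"]] += record["purchases"]
    return {city: {"consumers": consumers[city], "purchases": purchases[city]}
            for city in consumers}


aggregate = aggregate_by_city(consumer_records)
print(aggregate)
```

The output holds only group totals. Whether a very small group (such as the single consumer in one city here) is still “reasonably linkable” to an individual is a question the sketch does not answer.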

We can observe that this includes the “Community Privacy” which Justice Srikrishna mentioned in his report and which the Kris Gopalakrishnan committee has been asked to regulate as “Non Personal Data”.

Additionally, by including a reference to “households” and “Devices”, CCPA has extended the regulation to the “Non Personal Data”.

CCPA also incorporates a provision for “Selling of personal information” and considers the personal data as property that can be sold.

It is therefore reasonable to expect that part of the new regulation that the Kris Gopalakrishnan Committee may recommend will consist of measures which are included in the CCPA.

However, whether this committee will restrict itself to the addition of some CCPA provisions to the PDPA or go further and speak of a larger legislation for “Data Governance” will depend on the vision of the Committee members.

Since this is a good opportunity to bring the regulation of both personal and non personal data under an umbrella legislation on the “Data Governance Framework”, it would be interesting if the committee does consider this larger objective.





SIM hijacking vulnerabilities puncture Bank’s legal arguments

In all cases of Banking frauds involving “Unauthorized Access”, Banks have been putting across various arguments to shift the blame entirely on the customers.

We had reported a recent incident where HDFC Bank had tried to bully a customer for a credit card fraud in which the money had been drawn on a foreign website and the customer disputed the transaction forthwith.  After the customer brought the incident to the attention of RBI, immediate action was initiated by RBI which forwarded the complaint to the local Banking Ombudsman. Within hours, HDFC Bank confirmed that the transaction was reversed. But it was a fact that the Bank made multiple attempts to trap the customer and force him to file a police complaint. Had he filed a police complaint, the Bank would perhaps have taken it as an excuse to delay the payment.

When such bullying fails, Banks take up the argument that “the Bank’s security system is fail safe” and that if there is any compromise of credentials, it can happen only at the customer’s end and because of his negligence. They often hold out some old ISO27001 certificate and claim immunity from their responsibilities.

With its two judgements in the cases of ICICI Bank vs S Umashankar and IDBI Bank vs Sudhir S Dhupia, TDSAT has established the precedent of “Negligence by Bank as a Section 43(g) contravention”, making Banks liable in cases of Phishing.

Though the Banks may knock at the doors of the High Courts to continue their fight against customers, if the Courts are honest and informed it is unlikely that the Banks will succeed in overturning these judgements from a specialized tribunal with adequate technical understanding.

However, considering that not all High Courts are equipped to handle such techno-legal cases, and that the highly paid lawyers of the Banks may be influential enough to sway the Courts in their favour, there is a danger of some adverse decisions if the informed public does not monitor the Court proceedings in these cases.

Sometimes Banks also hold out the “OTP” as a security measure good enough to claim that the Bank is always right and the customer always wrong when OTP-based second factor authentication is used in a fraudulent transaction.

Naavi has always held that even when a customer mistakenly provides the OTP information to a fraudster, it is only because he thinks that he is interacting with the Bank and not otherwise. Banks make such deception possible first by not adopting the more secure authentication systems including the digital signature system or adaptive authentication using Artificial Intelligence which is now available without much of a cost.

Additionally, it has been now revealed that two critical vulnerabilities have been found in the SIM software which makes it possible for the fraudsters to hijack a mobile phone and thereby compromise the OTP system.

I would not like to go into the technicalities of SIMjacker vulnerability and WIBattack which have been explained in the articles linked above.

The summary however is that the authentication based on the OTP is not reliable enough to say that Banks should be absolved of their negligence in not adhering to the law of the land which mandates the use of “Digital Signature” or “E-Sign” for authentication or follow

a) the Internet Banking guidelines of June 2001 mandated by RBI or

b) the E Banking Security guidelines of G Gopalakrishna Working Group, or

c) the Cyber Security Framework of 2016 or

d) the Limited Liability Circular of July 2017 from RBI.

I hope that the Courts will check with these guidelines before accepting any arguments from the Banks against Customers in cases of internet or card related frauds.

I request all Bar Councils to create an awareness amongst the Advocates so that they can defend the rights of the customers against the powerful Banks. The Judicial academies should also ensure that sufficient awareness is created amongst the senior judges on why they should be resistant to the vocal arguments from senior counsels in favour of the negligent Banks who only want to exploit the less informed customers represented by less informed or junior advocates.



The atomic structure of Data

(This is a continuation of the previous article)

When we look at “Data” as a human experience, it consists of many elements. For example, some data may be as simple as the hydrogen atom with just one proton in the nucleus and an electron revolving around it. The charge is neutral, the mass is very light.

In comparison, a Helium atom is heavier with two protons and two neutrons in the nucleus with two electrons revolving around the nucleus.

Though Helium is heavier than Hydrogen, both are still light elements compared to Uranium or Lead.

Some elements when they combine with other elements become molecular compounds and exhibit properties different from that of the individual elements. When two gaseous elements like Hydrogen and oxygen combine together they may together form a liquid called water. Some of the molecules may be so tightly bound that they even become solids.

Thus, depending on the combination of protons and neutrons as well as the force that binds one atom with another, elements take different forms and different properties. Along with these properties they acquire different values.

Similarly, when we look at “Personal Data”, if there is only one element of data such as “Name” with a company, it is an “Unstable” atomic existence. If it is associated with some additional core element like the E-mail address, Mobile Number or a Social Security/Aadhaar/PAN number, the personal data acquires a complete atomic structure. If the number of electrons is not balanced with the number of protons, there could be an energy imbalance, such as what happens when the data security is weak.

When more data parameters get associated with personal data, it gains atomic mass. A Name becomes identifiable data. Identifiable personal data becomes health data or finance data and so on. Similarly, personal data may become sensitive personal data or critical personal data depending on what type of data elements are associated with the data set.

Further, when an atom is broken up and some elements are released, there can be a fission that may lead to the formation of other compounds and also the release of energy. If the energy release is controlled or moderated, then it may be used to the benefit of the society. But if the energy is released into the wild, as when a hacker places the data on the web just to prove that he has the hacking capability, then it may be destructive like an atomic bomb. Snowden’s release of data was perhaps one such uncontrolled fission of data.

Sometimes fusion of one data set with another data set creates a molecule. This could be an aggregation of data of multiple people creating community data. If conditions are ripe, the fusion may become a nuclear fusion releasing an enormous amount of energy which, if not regulated, can be destructive. The Big Data industry and the IoT are creating such a situation, where regulation is becoming essential to ensure that aggregation creates useful molecules instead of releasing destructive energies as in the case of Cambridge Analytica.

By appropriately removing select data elements, some personal data can be rendered “Anonymized Personal data”. What distinguishes an anonymous data from an identifiable data is the absence of certain “Identity elements”.
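The idea of removing select identity elements can be sketched as a simple filtering step. The set of “identity elements” below is an assumption for illustration; the actual list would depend on the standards a regulator specifies. Dropping fields in this way does not by itself make the process irreversible in the sense the PDPA definition requires, since the remaining fields may still identify a person in combination.

```python
# Assumed list of "identity elements"; a regulator's standard may differ.
IDENTITY_ELEMENTS = {"name", "pan", "email", "mobile"}


def strip_identity_elements(record: dict) -> dict:
    """Return a copy of the record without the assumed identity elements."""
    return {key: value for key, value in record.items()
            if key not in IDENTITY_ELEMENTS}


# Hypothetical record used only to show the effect of the filter.
record = {"name": "A. Rao", "pan": "ABCDE1234F",
          "city": "Bengaluru", "blood_group": "O+"}
stripped = strip_identity_elements(record)
print(stripped)
```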

While building a “Data Set” in an organization, it is necessary to identify what constitutes the “Nucleus” of the data and what constitutes the peripheral data elements which simply revolve around the nucleus. A parameter like the PAN number, which even in isolation can be matched against a publicly available database, can be considered a key nucleic particle. Even the Name cannot be called a uniquely identifiable sub element of data since there could be many persons who have the same name. The E-Mail address alone is also not an identifiable data element since, in the present regime, an e-mail address is allocated on a first come, first served basis and is associated with a domain only.

In some of the data protection regulations, there is a tendency to identify certain elements such as IP address, Street Name, Postal PIN or Date of Birth as “Personal Data”. As a result companies get worried about such data coming into their control. But unless such data is combined with a “Core data element” these individual elements are like unbound electrons that are floating around meaninglessly. When these start revolving around the nucleus of the core identity element, they together acquire the characteristic of a complete “Atom” or “Meaningful personal data”.
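The nucleus and electron picture can be expressed as a rough classification rule: a data set counts as “meaningful personal data” only when at least one core identity element is present. In the sketch below, which fields belong to the core set and which to the peripheral set is an assumption drawn loosely from the examples in this article, and the labels returned are hypothetical.

```python
# Assumed "nucleus" and "electron" fields, for illustration only.
CORE_ELEMENTS = {"pan", "aadhaar", "social_security_number"}
PERIPHERAL_ELEMENTS = {"ip_address", "street_name", "postal_pin", "date_of_birth"}


def classify(data_set: dict) -> str:
    """Label a data set by whether a core identity element is present."""
    fields = {key for key, value in data_set.items() if value}
    if fields & CORE_ELEMENTS:
        return "meaningful personal data"
    if fields & PERIPHERAL_ELEMENTS:
        return "unbound peripheral elements"
    return "no listed elements"


print(classify({"ip_address": "203.0.113.7", "postal_pin": "560001"}))
print(classify({"pan": "ABCDE1234F", "date_of_birth": "1990-01-01"}))
```

Several data protection regulations take the opposite view and treat some of the peripheral fields as personal data on their own, so this rule reflects the argument made here and not the legal position.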

Sometimes new elements of data come into the system and get attached to different nucleic data. Then the property of the core data may undergo a metamorphosis either on its own or when combined into a molecule along with other data.

The molecular bonding between different atomic particles can be manipulated by companies to discover “New Molecules”. This fusion can create a completely different product with hitherto unknown benefits. The algorithms created by business to add value to data aggregation are therefore probably like the “Chemical Molecules” created by the bio technology industry, on which Product and Process patents are recognized.

Thus, looking at “Personal Data Set” as an “Atom with Protons, Neutrons and electrons” and their ability to be combined or divided as creating value of different kinds gives a new perspective to the way we look at Data and how it has to be protected and regulated.

Protecting an atom from dissipating requires creation of a bonding force and a balance of charge. This is what data protection professionals need to do. If an atom lacks the required bonding, then with the slightest nudge from an outside element a chain reaction of fission can be triggered with damage all around.

The Security measures need to ensure that fission if any is within a controlled environment of a heavy water surrounding so that the chain reaction can be controlled and harnessed.

Data Governance professionals need to build such nuclear reactors that are capable of moderating the fission and fusion so that the energy created can be harnessed for the benefit of the society.

The upcoming Data Governance Act of India should therefore be like a design of a “Data Reactor” that helps in the creation of data aggregation for the benefit of the society and protection of data from a radiation leak.

…Maybe others can add further thoughts to this Nuclear Theory of Data.




Fission and Fusion of Data Elements..1

Evolution of the Term “Data”

Before the advent of ITA 2000, the normal understanding of the term Data was information in raw form. Data organized in a form in which it was of better use to the community was termed “Information”.

Data as Experience

With the advent of the ITA 2000, the meaning of the words Information and Data changed. Both were “Streams of binary states” of a magnetic or optical surface which was capable of being processed by a computing device, the purpose of such processing being converting the binary stream into a human intelligible experience.

Data was therefore an “Experience” delivered to a human by a computing device from a stream of binary notations contained in some device.

The device in which the binary stream was contained was called the “Container of Data”. The hard disk or the CD are examples of such “Containers of Data in binary form”. This binary form of data could be read only by machines which included the hardware and software like the computer, the application software etc. In order to enable the computer to render the binary stream into text or sound or image, the core element of the data had to be padded with header information that contained the instructions on how to “Experience” the data with the use of the appropriate computing device.

We may note that encrypted data, or data read through incompatible devices, may come up as unreadable gibberish or even appear corrupted, but it is still “Data” for the technical purpose though it is “Meaningless junk” for the human purpose.
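The relationship described above between the container, the header and the human experience can be sketched in a few lines. The `Container` class, the header key `encoding` and the choice of text encodings are hypothetical and chosen only to illustrate the point; real file formats carry far richer headers.

```python
from dataclasses import dataclass


@dataclass
class Container:
    header: dict    # instructions on how to "experience" the payload
    payload: bytes  # the binary stream itself


def render(container: Container) -> str:
    """Decode the payload using the encoding named in the header."""
    return container.payload.decode(container.header["encoding"])


text_file = Container(header={"encoding": "utf-16-le"},
                      payload="Data".encode("utf-16-le"))

# A compatible reader follows the header and recovers the text.
print(render(text_file))
# An incompatible reader ignores the header and gets something else.
print(repr(text_file.payload.decode("latin-1")))
```

The same eight bytes sit in the container in both cases. Only the reading that follows the header produces the experience a human would recognise as the word stored there.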

This “Data as Experience” was the central idea behind Section 65B of the Indian Evidence Act, which makes it mandatory for every piece of electronic evidence to be accompanied by a certificate from a human being for it to be admissible in a Court. The certificate states “What” and “How” he experienced the said data, which is the evidence in the incident in question.

Arrival of Data Protection Regulations

With the arrival of Data Protection Regulations in India, first with ITA 2000 and more specifically with the amendments of 2008, and later the GDPR and PDPA (Bill under consideration), the debate on defining the term “Data” in the context of “Personal Data” got more intense. The regulation focused on “Personal Data” (PD) and “Sensitive Personal Data” (SPD).

Today, whenever we speak of “Data” we are mostly discussing “Personal Data”, almost forgetting that there is data beyond “Personal Data” which also needs to be processed, protected and used productively. Those who called “Data as Oil” had this entire data set in mind, whereas those who called “Data is water” had the Personal Data in mind.

Debate on Non Personal Data

Privacy Laws are directed towards protecting the Privacy Rights of “Living Individuals” and hence data belonging to a Corporate/Organizational entity was outside the provisions of the data protection regulations. However, Data of Corporates always continued to be the subject of Cyber Crime regulations such as ITA 2000/8, Computer abuse regulations or Digital Contract related regulations.

Now the discussions on “Non Personal Data” have surfaced with greater intensity since there is a possibility of a new regulation in India exclusively for regulating “Non Personal Data”.

This law may be called “Data Governance Act of India” (DGAI) for our discussions.

Apart from covering “Non Personal Data” which is part of data covered under the ITA 2000/8, the new regulation is likely to introduce new terms such as “Joint Personal Data”, “Community Data”, “Anonymized Data”, “Business Data” etc.

For those who were fully engrossed in GDPR discussions without taking into account the “Data beyond the Personal Data”, the new terms and categories of data will not be easy to absorb.

While the Data Protection Professionals keep struggling with issues such as

-Whether Business E-Mail is personal data,
-Whether Big Data is regulated under PDPA or the Data Governance Act,
-Whether Community data is the property of the community as a whole or is a joint and several property of each of the community members,
-Whether the E Commerce data of an individual belongs to the individual or the E Commerce company,
-Whether an ML process generates a profiling algorithm which is the intellectual property of the Company or the individuals (or the community), etc

the new category of professionals namely the “Data Governance Managers” will emerge from the Business Schools and start demanding interpretations on

-What is Data?
-Whether Data is “Property” or “Right”?
-Whether the Data Fiduciary is a contractual servant of the Data Principal or a Trustee of the Data Principal,
-Whether Data is an Asset of the organization and has to be deployed productively to a higher yield in terms of revenue,
-How risks of non compliance can be transferred with “Cyber Insurance” after the Data Protection Professionals have mitigated them to a certain extent,
-Whether “Risk Mitigation” is a residual effort after Risk Avoidance, Risk Transfer and Risk Absorption, or Risk Absorption is the residue after Risk Avoidance, Risk Mitigation and Risk Transfer,
-Whether the Data industry would be able to meet Cyber Insurance requirements such as “Contract of utmost faith”, “Co-insurance” and “Expected Protective Action as if no insurance is available”, and whether Extortion and Zero day vulnerabilities are insurable, etc

Naturally the views of a management professional will be different from those of an IT professional or a legal professional. Hence there are bound to be some disagreements and heated debates within corporate circles on fundamental aspects of Data Protection and Data Governance.

It is therefore vital that we keep the academic debate boiling on the subject of Data, what is meant by Data Governance, how it differs from Data Security and so on. We may thereby contribute towards better clarity on the subject.

In the past, as an academician, Naavi has tried to describe Privacy and Data Protection from different perspectives. We have tried to look at Privacy Protection regulations through the Johari Window concept, looked at Data as a “Dynamic Concept” etc.

In this direction, let us today explore another perspective of “Data” and “Personal Data” from the vision of “Nuclear Physics” and see if we can get a better hang of the complex concept called “Personal Data” which we need to regulate as if our life depends on it.

Some of the issues that we need to discuss here are

1. Is “Data” actually a “Data Set” consisting of multiple sub elements, like an “Atom” consisting of “Protons, Neutrons and Electrons”?

2. What will be the impact on the human being in his experience of the “Data” when different “Elements of Data” come together, or when some elements of data are removed from a data set?

3. On such occasions, when a data set undergoes a fission or fusion, will the value of the data as an experience increase or decrease?

4. Will the ownership of the data change during this process of mutation?

5. How will the laws of data governance handle the concept of “Data as Experience undergoing mutation due to environmental aspects in which different stakeholders have contributed differently”?

We therefore need to take one more look at “Data” and see if we can understand its nature better.



[P.S: Why this “Nuclearization” of the “Data Concept”?…
Naavi was a student of Physics at one point of time and has always been fascinated by the fact that the World Wide Web was actually invented in a Physics Lab. We observe that the entire domain of Computers is driven by “Transistors” which in their miniature form become “Chips” and carriers of data and computing instructions. This is a derivative of the branch of science called Physics.
Quantum Computing has already taken the data holding space to nuclear and sub nuclear particles. The law of Electronic evidence can only be explained through data being represented by binary states, which are the states of the transistors or the electromagnetic orientation of particles in a container.
Therefore it seems logical that even the concept of “Data” and its classification into different forms for data protection should also be explained in terms of the concepts of nuclear physics. This should also, in a way, establish the role of the students of Physics who have strayed into the domain of “Data Protection” and who often have to face the question “What are you doing in this field of lawyers and engineers?”, and establish the “Oneness” of different fields of science.]


Personal Data Vs Business Data comes for discussion with Mr Modi

In his continued interaction with US business, Mr Narendra Modi, PM of India, met about 40 global CEOs from the US, and a discussion emerged about the new Data Protection law that India is contemplating.

The US businessmen have raised the issue of “Data Localization” which has been one of the contentious issues. Mr Modi appears to have provided a diplomatic answer that Data belongs to the Data Subjects and law will balance the interests of the individual and that of the business trying to commercialize the personal data.

Refer article in TOI

Indian law provides ample provisions to enable cross border transfer of data. When the information is not sensitive or critical, it can be transferred, though a working copy has to be maintained in India. Sensitive personal information, other than that declared as “Critical”, can be transferred subject to Standard Contractual Clauses, Explicit Consent, Adequacy of laws, Intra Group schemes approved by the DPA, besides health emergencies etc.

The Cross border transfer rules of India are not much different from those of GDPR but are stated differently. While the GDPR says data can be transferred subject to ……, Indian law says that data cannot be transferred unless……..

The issue behind this controversy is “Data Sovereignty” and there is need for US business to understand that Mr Modi stands as much for India as Mr Trump stands for US when it comes to the sovereign rights of the country.

In the process of the discussions, a point that has cropped up is the distinction between “Personal Data” and “Business Data”. Naavi has in the past highlighted the issue of how GDPR enthusiasts often consider Business E Mail as personal data and try to mount penal charges on the users of business e-mail for digital marketing purposes. The discussions in the US were perhaps centered around the “Transaction Data” related to an individual’s e-commerce transactions, which US business wants to exploit for commercial gains. Whether this is to be treated as “Business Data” or “Evolved Personal Data” in which the business has an intellectual property right, and whether an individual can provide consent for the use and transfer of such data for a consideration, are matters of further debate.

It must be noted that the Big Data industry, which deals with “Anonymized” data, has no problems with the Indian PDPA since such data is outside the regulatory ambit. It is only the identifiable data of an individual’s business transactions that needs to be recognized as a “Disputed Data Territory”.

A proper legal clarification on such data can perhaps be issued when the next set of regulations on the “Data Governance Framework” is considered in India. US business may therefore wait for this legislation to take shape before raising its voice, which was so meek when it came to opposing the provisions of the EU-GDPR but has become vocal in opposing the Indian PDPA.

Naavi has held that there is a solution for this issue too, one which is feasible as a commercial business proposition both under the current ITA 2000/8 and the forthcoming PDPA. The problem is that the so called “innovative” start up entrepreneurs are too busy replicating business ideas already in the market rather than investing in new ideas. We therefore have to wait a while until the start up entrepreneurs and the VCs mature in their thinking and are prepared to develop a business which has no precedent.

