What is the life cycle of Data?

“Life Cycle” starts with birth and ends with death. In the case of humans, the life form comes from the environment (in the form of the Pancha bhootas) and goes back to the environment. Human life has two components, namely the physical body and the “Life within it”. The human body is constantly aging and growing and is never static. It is popularly said that even after a person dies, some activities of the human body such as hair growth and nail growth continue for some time, though this appearance is generally attributed to the skin drying and retracting rather than to real growth. So, there is a specific distinction between “Body” and the “Soul”.

In non-living bodies, the structure may remain static unless otherwise affected by an external agency. A stone may remain static unless rain water flowing over it or dust accumulating over it keeps it either eroding or growing over a long time frame. This however is different from human growth which occurs from within with the intervention of what we identify as “life”.

It is interesting to ponder when a “Stone” takes birth, when it dies and what happens to it during its life cycle.

A “Stone” is born as “Soil”. It is compressed under the earth until it becomes hard. During this process fine particles may come together as a larger group of particles bound closely together. This may then come out of the earth and get exposed as stones and rocks. Sometimes it may come out as lava and thereafter solidify into rock form on cooling.

The death of a rock lies in its being powdered back into soil form. Hence the life cycle of a rock is from the soil back to the soil. In the intermediate part of its life cycle it can assume different forms such as sedimentary rocks, metamorphic rocks or igneous rocks. Some may be carved into “Statues”, “Idols”, “Slabs” etc.

The “Data” life cycle has some similarity. It is debatable if data has “Life”. It is also interesting to think about when “Data” takes birth, when “Data” dies and what forms it can take in between; these are difficult questions to answer. During its life cycle, where does “Data” reside? Is it in a hard disk? Is it in a tape? Is it in a memory card? How does it flow from one data holding device to another? Will it copy, or will it move? These are questions to which we believe we have an answer on a contextual basis. Most of the time we say it can be this and it can also be that.

If Data can be in different forms at different times, it is fine. But can somebody explain this? Physicists often say light is a stream of photons, but light also sometimes behaves like a wave. Somebody has to explain how it can be both, and under what circumstances photons behave like a wave. That explanation is what we may call the “Light Wave Theory”. Another set of physicists went further and said there is a “Matter Wave Theory” according to which all “Particles” have a “Wave like behaviour”.

Now, if we have doubts about “What is Data?” and find that “Data is many things to many people based on the context”, we cannot simply accept this “Context Based Excuse” as a “Definition”. Based on such notions, law makers create laws, Judges interpret them according to their own whims and fancies, and the industry is left to deal with the issue on an ad hoc basis.

Sometimes the Supreme Court thinks “Data Disclosure” is a freedom of expression and sometimes it thinks it is a “Privacy Right”. The same kind of uncertainty exists when Quantum Computing says a “Data State can be both zero and one at the same time”.

If the differences between “Data” in classical computing and “Data” in Quantum Computing have to be resolved, and the Classical/Quantum computing perspective of “Data” has to converge with the Judicial perspective, then we need a “New” “Theory of Data”.

Naavi is trying to explore this interesting theoretical concept of “Data” and trying to find a model of description which should fit into the different perceptions of “what data is”.

This is the “New Theory of Data” which will be revealed bit by bit here. Watch for more…

(P.S: This discussion is purely academic)

Naavi

 

Posted in Cyber Law | Leave a comment

The New Theory of Data

Can there be a single answer to “What is Data” which resolves the dilemma of the common man?

Watch these columns for more ….

Naavi

P.S: This concept was expanded through a series of articles culminating in a discussion in the book on the Personal Data Protection Act of India (PDPA 2020). Though this is an academic discussion, it is required to develop Data Protection Jurisprudence for the future.

The other articles can be found through a search on Theory of Data. Some of the key articles explaining the theory are also given below.

October 8, 2019: New Data Theory of Naavi built on three hypotheses

October 8, 2019: Theory of Data and Definition Hypothesis

October 10, 2019: Reversible Life Cycle hypothesis of the theory of Data

October 11, 2019: Additive value hypothesis of ownership of data

November 20, 2019: Will Personal Data Protection Act be compatible to the Theory of Data?

March 31, 2018: Theory of Dynamic Personal Data


Data Governance Framework as it exists in India now

With the formation of an expert committee titled “Data Governance Framework Committee”, Data Professionals in India are now wondering what is in store.

Some of the questions in the minds of Data Regulation Observers are:

Will this committee modify the Personal data protection Bill (PDPB)?

Will it give an excuse to the Government to push the PDPB to a standing committee so that the implementation can be indefinitely delayed?

Will the Data Localization requirement of PDPB be circumvented by re-defining “Non Personal Data” to include part of the “Personal Data”?

The answers to the above will depend on the integrity of the Committee, which consists mostly of representatives of the business interests that had produced a dissenting note to the Srikrishna committee report.

As is the tradition of Naavi.org, we will closely watch the developments and report our views, whether or not they are palatable to others.

In the meantime, it is essential to reflect on what the current “Data Governance Framework” in our legislation is, if any.

If we look back at the 2000 version of ITA 2000, the emphasis was mostly on E Commerce, and it introduced the important element of “Digital Signature based authentication” as part of data governance.

Additionally, some sections such as Section 43 of the Act and the mention of liability under Section 79 in the absence of “Due Diligence” gave some direction to the Corporate world on how data has to be governed in their environment to avoid liabilities.

This was the concept of “Cyber Law Compliance” first discussed by Naavi in December 2000 in a CII seminar in Chennai. The book “Cyber Law Compliance the Corporate Mantra for the Digital Era”, published at that time, was a first attempt to draw the attention of Corporates handling data to a recommended data governance framework under ITA 2000.

Industry however looked at ITA 2000 as a law which mattered only to the Police and Lawyers and paid scant attention to ITA 2000 compliance. The stakes became higher with the amendments of 2008 and the need for ITA 2008 compliance grew stronger. ITA 2008 also introduced the concept of Personal and Sensitive personal information along with “Intermediary guidelines under Section 79” and “Reasonable Security Guidelines under Section 43A”. (Amendment Act notified on 27th October 2009 and Rules notified on 11th April 2011)

Further, Sections 67C, 69, 69A, 69B, 70B, 72A etc. all covered different aspects of Data Governance.

Most of the industry observers failed to recognize the data governance elements contained in the ITA 2008 and its notifications but did make efforts to comply with Section 43A. The concept of ITA 2000/8 compliance was to some extent recognized in the post 2011 time and some Techno Legal professionals emerged advising the Companies how to remain compliant with ITA 2000/8 mainly from the perspective of Section 43A.

Naavi was in the forefront of this Compliance brigade and highlighted the compliance requirements under ITA 2000/8 through the following Risk identification model.

A Comprehensive Information Security Framework IISF 309 was also recommended indicating the following responsibilities.

As a quick glance at this framework indicates, out of the 30 different requirements listed here, 23 referred to Non IT Governance. In a way this was a “Data Governance Framework” recommended under ITA 2000/8.

The focus however was on “Meeting Due Diligence” to avoid vicarious liabilities under Section 79 and Section 85 of ITA 2000/8. To that extent, it was not projected as a “Data Governance Framework”.

However after the PDPA came into broader discussion, Naavi introduced the “PDPSI” (Personal Data Protection Standard of India)  where more aspects of Data Governance were added. In particular, the Data Classification system indicating 16 different types of data and the suggested system of Personal Data Keepers and Internal data controllers etc., indicated the Governance requirements though this was in the context of the “Personal Data”. The discussion on DPSI (Data Protection Standard of India) was deferred since it was not a priority at that time.

These discussions, extended by ideas like the DTS, laid the groundwork for a Data Governance Model. Though these efforts were focussed more on “Data Protection”, they also created the early framework in India for Data Governance.

I therefore consider that a “Data Governance Framework”  does exist in India as a reference and the Data Governance Framework committee can take some ideas from these suggestions scattered through this website. Probably when I am able to collate these ideas in the New theory of Data being developed, there will be a better reference book on how to develop the Data Governance Framework.

Let us see if a working draft of the Theory would be available in time to be presented to the Committee before it arrives at its final recommendations.

Naavi

The Journey to the development of a New “Theory of Data” begins

Yesterday, I made an announcement that I will be working on a “Theory of Data”. I consider www.naavi.org a global publication platform. All my work, most of which is the product of my own research, has been published here. Hence the objective of codifying a “Theory of Data” is also being elaborated here.

Naavi


Why this exercise?

“Data” has become a topic of wide interest in our world, from Mark Zuckerberg to Mr Narendra Modi, from Justice Srikrishna to Mr Mukesh Ambani. Everybody is speaking about Data.

But different people speak of Data from different perspectives. Mr Modi says “Data is cheap” in India and that international business should therefore find it attractive to do business in India using Data as a raw material. Elsewhere, Data Protection professionals say “Data” is “Gold” and very valuable. Industry 4.0 says “Data is the New Oil” and can be harnessed for prosperity.

On a previous occasion, Naavi has used the “Samudra Manthana” story in the Indian Puranas to describe how Data can be churned with the right tools to extract useful outputs, as long as there is a “Visha Kanta” to gulp the poison that may come out and a “Mohini” to ensure that the “Amrutha” or nectar does not fall into wrong hands.

The Data Scientists and the Big Data Industry are concerned that, with so much interest being shown in what is their raw material, the day is not far off when their business will become a playground for people of every kind. This may actually land the industry in trouble sooner or later, since different people come with different perspectives, and unless a common understanding emerges, industry cannot have a turbulence free ecosystem to operate in.

Data Protection Vs Governance

For some, “Data” is the key to Privacy Protection. For others “Data” is the key to enrichment. For some others “Data” is like the air we breathe, access to which should be a fundamental right. For some “Data” is an asset which can be converted into Cash either lawfully or unlawfully through “Data harnessing” or “Data Exploitation” or “Data Laundering”.

Amid the different perspectives that exist about “Data”, the law makers are trying to make laws to regulate the stages of the Life Cycle of Data such as collection, use, storage, disclosure and destruction of data. These regulations are to be implemented by organizations and are expected to be followed by the entire global population at all times. Failure to comply with the laws results in heavy penalties for business managers, often leaving them facing the prospect of imprisonment.

Those who want to govern business in a lawful manner are exploring the ways of “Data Governance” for better productivity while the regulators are watching every one of their steps to ensure that they remain within the boundaries of law.

Data Protection professionals are developing a framework for protection while Data Governance Professionals are going beyond Data Protection Framework to develop a Data Governance Framework.

Already, we see conflicts emerging from the multitude of Data Protection Laws that make the life of a Data Manager miserable. Data moves freely across geographical boundaries, but when this data consists of Personal Data of Indians, EU Citizens, Californians, Canadians, Australians etc., the Data Governance official is immediately alerted to the fact that each of these data types is subject to different regulations and the Governance model needs to implement them in such a manner that there is no contravention of any of these laws.
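As a rough illustration of this problem, the sketch below tags each record with the jurisdiction of its data principal and looks up which regulations would have to be checked. The mapping table, the `jurisdiction` field and the function name are hypothetical, introduced only for illustration; the list of laws is indicative and is not legal advice or a description of any particular compliance framework.

```python
# Hypothetical mapping from a data principal's jurisdiction to the
# regulations a governance team would need to check (illustrative only).
REGULATIONS_BY_JURISDICTION = {
    "IN": ["PDPA (India, draft Bill)"],
    "EU": ["GDPR"],
    "US-CA": ["CCPA"],
}


def applicable_regulations(records):
    """Return (regulations, unmapped jurisdictions) for a batch of records.

    Each record is a dict with an assumed 'jurisdiction' key. Records from
    jurisdictions missing from the table are reported separately so that
    they are reviewed rather than silently ignored.
    """
    regulations = set()
    unmapped = set()
    for record in records:
        code = record.get("jurisdiction")
        if code in REGULATIONS_BY_JURISDICTION:
            regulations.update(REGULATIONS_BY_JURISDICTION[code])
        else:
            unmapped.add(code)
    # key=str lets a missing jurisdiction (None) sort alongside strings.
    return sorted(regulations), sorted(unmapped, key=str)
```

Even this toy version shows why a single batch of data can pull in several laws at once, and why records with an unknown or missing jurisdiction need a deliberate review path.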

These are conflicts arising out of overlapping regulations and we need a solution to overcome the challenges. Naavi has suggested a technical solution within the framework of a Personal Data Protection Standard of India (PDPSI) to address this issue.

But as we go along, there will be conflicts arising out of Data Protection Professionals taking a stand different from the Data Governance Professionals when implementing certain operational decisions of business. This conflict is dangerous since it is internal to an organization and will expose all the behavioural challenges that confront all Man Managers.

If this internal conflict is not handled with finesse, an organization can simply collapse, not because its business environment is negative, but because the internal management teams do not see eye to eye, each of them thinking that it is correct and the other is wrong.

It is this concern of the problem of the future generation of corporate managers that has prompted me to start work on developing a “Theory of Data” that tries to develop an understanding of Data that all types of professionals whether they are Lawyers or Computer Engineers or Data Analysts or Corporate Managers appreciate with mutual respect and empathy.

This Theory of Data should be consistent with the present and future regulations related to data. Since some regulations are already in place, it is possible that they may not be in sync with this theory. It is however expected that “Jurisprudence” will enable syncing of the present regulations with this theory, while new regulations may adopt the concepts propounded in this theory during the construction of the regulation itself.

This “Naavi’s Theory of Data” does not explore the “Technical aspects” but addresses the “Legal”, “Behavioural Science” and “Management” aspects of how “Data” comes into existence and lives through its life until its death. If it is necessary to distinguish this theory from the existing theories, then it may be necessary to identify this theory as “Naavi’s theory of Data” or “LBM Theory of Data”. For the rest of our discussion here, we shall however refer to this simply as the “Theory of Data”.

Theory and Hypothesis

The hallmark of a “Theory” is supposed to be the establishment of a principle through experimentation and testing. It may start with a hypothesis, which is a statement of what a situation is as per an educated guess. Then, through various experiments, the hypothesis is either proved or disproved or further refined until it becomes the theory.

Naavi’s Theory of Data will also adopt a similar approach of first pronouncing some hypotheses and then subjecting them to observations that either support or refine them.

If other readers have a hypothesis of their own, they are free to submit it to me and I will try to incorporate a discussion on such hypotheses also as part of the development of this theory.

This is therefore a journey which we are starting now.

Naavi


Theory of Data

Naavi has been working around the concept of “Data” from Techno-legal, Behavioural and Managerial perspectives for some time. The objective is to bring better clarity on the nature of “Data” and in particular “Personal Data” so that regulations are clear from the point of view of compliance.

Unclear regulations are the bane of the industry because the practitioners do not know how to comply and what to comply with. This leads to discretionary decisions which may go wrong. Even the judiciary and the regulator often unwittingly come to wrong decisions and have to face criticism from the industry.

In trying to present the Data and its implications in terms of the current laws and the continuously moving technology base, Naavi has been sporadically writing about “Cyber Laws for Quantum Computing Era”, “Impact of AI on Cyber Laws”, “Evidence under Section 65B of IEA”, “Dynamic Theory of Data”, “Nuclear Theory of Data” and so on.

Some of these thoughts have emerged over time and have been presented based on the contextual discussions.

Time has now come to bring some more cogency to these thoughts and to bring the different explanations provided earlier together into a consolidated “Theory of Data”.

Since there is already a body of academicians who are working on the “Theory of Data” from the perspective of technology and my approach is basically from the “Legal, Behavioural Science and Managerial perspective” of Data, it is necessary to prefix this theory as “Naavi’s Theory of Data” or “LBM theory of Data”.

This is primarily an academic exercise.  However, some thoughts that may be presented in this theory may have the potential to be implemented in the new Data Governance Framework or Data Protection Frameworks that the industry may try to follow in the context of new regulations and new technology.

For some time this work may proceed in the background and would be released in public from time to time.

Any thoughts and words of wisdom from readers are welcome.

Naavi


The definition of “Personal data” in CCPA

The California Consumer Privacy Act 2018 (CCPA) has tried to address some aspects of Privacy law which we in India will be discussing both through the PDPA (Personal Data Protection Act, which is in draft Bill form) and the expected “Data Governance Act of India” (DGAI).

The PDPA focusses only on “Personal Data” which is defined as follows:

“Personal data” means data about or relating to a natural person who is directly or indirectly identifiable, having regard to any characteristic, trait, attribute or any other feature of the identity of such natural person, or any combination of such features, or any combination of such features with any other information;

The PDPA defines the term “Anonymisation” as follows:

“Anonymisation” in relation to personal data, means the irreversible process of transforming or converting personal data to a form in which a data principal cannot be identified, meeting the standards specified by the Authority.

In other words, “Personally Identifiable Information” changes to anonymized information if the identity parameters are removed as per the standards that may be prescribed by the DPA.

By the definition of “Anonymization”, the DPA has been entrusted with the responsibility to define when “identifiable” data becomes “Anonymized” data.
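The idea of removing identity parameters can be pictured with a minimal sketch. The field names and the function below are assumptions made for illustration; the actual standard would be whatever the Authority specifies, and simply dropping direct identifiers may not by itself meet the “irreversible” requirement in the definition, since the remaining fields can sometimes be combined to re-identify a person.

```python
# Assumed set of direct identifier fields; the real list and standard
# would be whatever the Authority specifies.
IDENTITY_FIELDS = frozenset({"name", "email", "phone", "address"})


def strip_identity_fields(record, identity_fields=IDENTITY_FIELDS):
    """Return a copy of the record without the listed identity fields.

    The input record is not modified. This illustrates only the removal
    step, not a complete anonymisation process.
    """
    return {
        key: value
        for key, value in record.items()
        if key not in identity_fields
    }
```

For example, a record holding a name, an email and an age group would come back with only the age group, while the original record stays untouched.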

Under the CCPA it is interesting to observe the use of the term “Probabilistic identifier” to differentiate between Personal Data subject to the regulation and “Non Personal Data” outside the regulation.

In CCPA Personal Information is defined as follows:

 “Personal information” means information that identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household.

Personal information includes, but is not limited to, the following:

(A) Identifiers such as a real name, alias, postal address, unique personal identifier, online identifier, Internet Protocol address, email address, account name, social security number, driver’s license number, passport number, or other similar identifiers.

(B) Any categories of personal information described in subdivision (e) of Section 1798.80.

(C) Characteristics of protected classifications under California or federal law.

(D) Commercial information, including records of personal property, products or services purchased, obtained, or considered, or other purchasing or consuming histories or tendencies.

(E) Biometric information.

(F) Internet or other electronic network activity information, including, but not limited to, browsing history, search history, and information regarding a consumer’s interaction with an Internet Web site, application, or advertisement.

(G) Geolocation data.

(H) Audio, electronic, visual, thermal, olfactory, or similar information.

(I) Professional or employment-related information.

(J) Education information, defined as information that is not publicly available personally identifiable information as defined in the Family Educational Rights and Privacy Act (20 U.S.C. section 1232g, 34 C.F.R. Part 99).

(K) Inferences drawn from any of the information identified in this subdivision to create a profile about a consumer reflecting the consumer’s preferences, characteristics, psychological trends, preferences, predispositions, behavior, attitudes, intelligence, abilities, and aptitudes.

It is interesting to note that this definition lists many individual identifiers and categories and treats all of them as included in “Personal Information”. As a result, each individual item shown here, for example “geolocation data”, can be considered “Personal Information” when it is linked, directly or indirectly, with a particular consumer or household.

At the same time, while defining a “Probabilistic Identifier”, the Act says

“Probabilistic identifier” means the identification of a consumer or a device to a degree of certainty of more probable than not based on any categories of personal information included in, or similar to, the categories enumerated in the definition of personal information.

Further, the “Unique Identifier” is defined as follows:

“Unique identifier” or “Unique personal identifier” means a persistent identifier that can be used to recognize a consumer, a family, or a device that is linked to a consumer or family, over time and across different services, including, but not limited to, a device identifier; an Internet Protocol address; cookies, beacons, pixel tags, mobile ad identifiers, or similar technology; customer number, unique pseudonym, or user alias; telephone numbers, or other forms of persistent or probabilistic identifiers that can be used to identify a particular consumer or device.

This means that either the identifier should be directly tagged with an individual consumer or a “household” (household relates to devices such as the IoT devices present at the household address), or there should be some probabilistic certainty that the information belongs to a particular consumer or device.

The Act is however vague on how we determine the threshold probability. It is not clear if the authority will define a standard for this purpose as stated in PDPA or leave it to the business to address this as a business challenge.
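One possible reading of “more probable than not” is a probability strictly above 0.5. The sketch below applies that reading; both the 0.5 threshold and the function name are assumptions for illustration, and the Act does not say how the probability itself would be estimated.

```python
def is_probabilistic_identifier(match_probability, threshold=0.5):
    """Apply an assumed 'more probable than not' test.

    match_probability is the estimated chance (0 to 1) that the
    information identifies a particular consumer or device. The default
    threshold of 0.5 is one reading of the phrase, not a value taken
    from the Act.
    """
    if not 0.0 <= match_probability <= 1.0:
        raise ValueError("match_probability must be between 0 and 1")
    return match_probability > threshold
```

Under this reading an estimate of exactly 0.5 would not qualify, which shows how much depends on where the threshold is set and on who sets it.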

While the Indian law sticks to defining “Anonymization”  and taking  “Anonymized Personal Data” out of the regulation, CCPA defines “Aggregate Consumer Information” and brings it also under the regulation. In this context including “Households” as “Consumer” which brings the IoT devices within the provisions of this law makes sense.

While India is now looking for a separate legislation which we have called “Data Governance Act of India” and formed an expert committee for the purpose, CCPA has tried to accommodate the business interests of processing “Non Personal Data” within the CCPA regulation itself.

CCPA defines “Aggregate Consumer Information” as follows

“Aggregate consumer information” means information that relates to a group or category of consumers, from which individual consumer identities have been removed, that is not linked or reasonably linkable to any consumer or household, including via a device. “Aggregate consumer information” does not mean one or more individual consumer records that have been deidentified.

We can observe that this includes the “Community Privacy” which Justice Srikrishna mentioned in his report and what the Kris Gopalakrishnan committee has been asked to regulate as “Non Personal Data”.

Additionally, by including a reference to “households” and “Devices”, CCPA has extended the regulation to the “Non Personal Data”.

CCPA also incorporates a provision for “Selling of personal information” and considers the personal data as property that can be sold.

It is therefore reasonable to expect that part of the new regulation that the Kris Gopalakrishnan Committee may recommend will consist of measures which are included in the CCPA.

However, whether this committee will restrict itself to the addition of some CCPA provisions to the PDPA or go further and speak of a larger legislation for “Data Governance” will depend on the vision of the Committee members.

Since this is a good opportunity to bring in regulation of both personal and non personal data into an umbrella legislation on the “Data Governance Framework” , it would be interesting if the committee does consider this larger objective.

Naavi