New Data Theory of Naavi built on three hypotheses

In searching for a proper expression and articulation of the Theory of Data, Naavi has decided to adopt a set of three hypotheses, which are the pillars of this New Theory of Data.

The three hypotheses are:

a) A hypothesis that tries to explain the definition of data

b) A hypothesis that tries to explain the life cycle of data

c) A hypothesis that tries to explain the ownership of data

The three hypotheses combine to develop a comprehensive theory of data.

Watch out for more …

Naavi

Posted in Cyber Law | Leave a comment

Six amendments proposed to California Consumer Privacy Act

The California Consumer Privacy Act (CCPA), which is applicable to the collection and processing of the personal data of California residents, is set to become effective from 1st January 2020.

CCPA has already distinguished itself by its honest approach to privacy protection by specifically admitting the possibility of “Sale of Personal Data”. Unlike GDPR, which does not provide clarity on whether personal data may be “Sold” even when there is an “Explicit Consent” and leaves the data processing companies in doubt, CCPA is clear in its prescriptions.

CCPA also recognizes a “Financial Value” for personal data and recognizes the data subject’s right of ownership to deal with it even in commercial terms. While privacy activists may debate the ethics of “Trading of Personal Data”, the fact is that this provision gives some breathing space for data dependent businesses.

Now, before the Act becomes effective, some amendments have been proposed and are likely to be discussed and probably passed before the January 1, 2020 deadline for implementation.

The six amendments are as follows.

1. Reasonable Authentication

CCPA shall allow a consumer to submit requests through a “Consumer Account”, if the customer maintains an account with the business.

Information collected about a natural person in the course of that person acting as a job applicant, employee, owner, director, officer, medical staff member or contractor is exempted from the definition of personal information for one year (until January 2021).

The exemption also covers employee emergency contact information and information used to administer benefits, but it does not apply to a business’s obligation to provide notice to employees about its collection practices or employees’ eligibility for the data breach provision’s private right of action.

2. Classification of Personal Information

This amendment adds the phrase “reasonably capable of being associated with . . . a particular consumer or household” to the definition that determines when data is identified as personal data.

The bill also clarifies that any information made available by federal, state or local government is “publicly available” and is not personal information.

The amendment also eliminates the provision of the CCPA stating that publicly available information that a company uses in a manner incompatible with the purpose for which it was originally collected by the government is considered covered personal information.

It also clarifies that personal information does not include de-identified or aggregate information.

3. Right to Forget

The amendment adds a new exception to a consumer deletion request that allows a business to deny the request if the information is needed to “fulfill the terms of a written warranty or product recall conducted in accordance with federal law.”

It also creates an industry-specific exemption from the right to opt out of the sale of personal information for vehicle or ownership information maintained or shared between an automobile dealer and a manufacturer if it is maintained or shared for certain purposes.

4. Data Brokers

This amendment requires “data brokers” – defined as a “business that knowingly collects and sells to third parties the personal information of a consumer with whom the business does not have a direct relationship” – to register with the California attorney general.

5. Miscellaneous amendments

a) A one-year exemption is to be provided for personal information exchanged in certain business-to-business communications.

b) A covered business does not have to collect or retain consumer information for CCPA purposes that it would not otherwise collect or retain in its ordinary course of business.

c) Businesses must disclose to consumers their right to request specific pieces of information a business has collected about them. The amendment also includes some changes to the CCPA’s exception for consumer-credit information covered by the Fair Credit Reporting Act (FCRA).

6. Exemption from Toll-free Phone Number

An exclusively online business with a direct relationship with a consumer need not provide a toll-free phone number to which consumers can submit a request for disclosure of information. It need only provide consumers with an email address.

Additional clarification in the form of draft regulations is expected from the California attorney general in late October or early November.

It is also expected that California may pass a State Privacy Legislation soon. Since many other states (16 by last count) are following the steps of CCPA, the changes in CCPA are likely to have a wide impact on the privacy protection regime in the USA.

Indian businesses need to closely watch the developments in the privacy regime sweeping the USA in order to structure their compliance measures.

Naavi


Data Is Always Evolving

One of the myths being perpetuated by Data Protection Regulations is that there is something called “Personal Data” and something called “Sensitive personal data” which companies collect and which need to be protected.

The regulators, however, forget that in a corporate environment several kinds of data keep flowing in and out. A “Data Set” such as Name, Address, E Mail, Mobile, Health Data, Financial Data etc. does not always arrive at one single point of time so that it can be immediately tagged and protected as required. That happens only in cases where a company puts out a Web Form and collects some designated information from a source. In such cases a “Consent” can be obtained and data protection compliance can be achieved.

However, in most cases, data flows in through different contexts and different channels, often in an unstructured format. A company could have received the name and E Mail address a year ago, and today the same person’s further data may just land within the data environment of the organization. When the new information is fused with the earlier information, the simple data grows into a bigger and more sensitive form.
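The fusion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the two tiers of fields and the `classify` and `fuse` helpers are hypothetical and are not taken from any regulation.

```python
# Hypothetical field groupings, chosen only for illustration.
IDENTIFIER_FIELDS = {"name", "email", "mobile"}
SENSITIVE_FIELDS = {"health", "financial"}


def classify(record: dict) -> str:
    """Return a rough status for a data set based on the fields it holds."""
    keys = set(record)
    if not keys & IDENTIFIER_FIELDS:
        # No element ties the set to a living individual.
        return "non-personal"
    if keys & SENSITIVE_FIELDS:
        return "sensitive personal"
    return "personal"


def fuse(existing: dict, incoming: dict) -> dict:
    """Merge newly arrived elements into the data set already held."""
    merged = dict(existing)
    merged.update(incoming)
    return merged


# Data received a year ago through one channel.
year_old = {"name": "A. Example", "email": "a@example.com"}
# An element that lands today through another channel, with no identifier.
new_arrival = {"health": "diabetic"}

status_before = classify(year_old)
status_of_arrival = classify(new_arrival)
status_after = classify(fuse(year_old, new_arrival))

print(status_before, "|", status_of_arrival, "|", status_after)
```

In this sketch neither input is “sensitive personal” on its own; the status changes only when the two are fused, which is the point at which a consent taken for the original set may no longer fit.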

Similarly, it is possible for a set of available data to be disintegrated, so that sensitive data is converted into non-sensitive data or even anonymized data.

The fact that personal data is always a “Set” of elements, one of which is the core identifier of a living natural person, and that the data grows organically into different forms, is not adequately captured in the data protection regulations. Some data protection regulations define individual identifiers themselves as “Personal Data”, missing the point that an identifier which cannot be linked to another “identity” of a living individual cannot be called “Personal Information”.

For example, we often hear that an IP address is “Personal Information” or that a physical address is “Personal Information”. Though data protection practitioners try to enable their processes to identify the conversion of the status of data from one state to the other through manual intervention or with the use of AI, this remains a lacuna in the regulatory definition of data.

The New Theory of Data therefore has to capture in its Data Definition that “Data is Dynamic”, that “It evolves over time” and that “Consent” obtained when the data is in its Zero day status fails when a new data element comes within the radar of the Company.

An example could be a Company that has a group photo of people, many of whom are not known to it. Suddenly, one of the persons becomes identifiable because he sends in a job application with a photo. Now the group photo, which has been in the data system since a past date, becomes “Identifiable” data. This dynamic nature also affects the Data Portability and Data Erasure requests.
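The group photo case can be pictured as a store that is re-scanned every time something new arrives. The sketch below is a toy model under stated assumptions: the `Item` class, the `ingest` function and the opaque “face tokens” are hypothetical stand-ins and do not represent any real face matching library.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    label: str
    face_tokens: set  # hypothetical opaque tokens, one per face in the image
    identified_people: set = field(default_factory=set)


def ingest(store: list, new_item: Item, known_faces: dict) -> None:
    """Add an item, then re-evaluate every item already held in the store."""
    store.append(new_item)
    for item in store:
        item.identified_people = {
            known_faces[token] for token in item.face_tokens if token in known_faces
        }


store: list = []
known_faces: dict = {}  # maps a face token to a named individual

# Zero day: a group photo of people the company cannot name.
ingest(store, Item("group photo", {"f1", "f2", "f3"}), known_faces)
was_identifiable = bool(store[0].identified_people)

# Later: a job application arrives with a name and a photo of face "f2".
known_faces["f2"] = "Job Applicant"
ingest(store, Item("job application photo", {"f2"}), known_faces)
now_identifiable = bool(store[0].identified_people)

print(was_identifiable, now_identifiable)
```

The old group photo is never modified, yet its status flips once the second item is ingested, which is why a portability or erasure request from the applicant would now reach back to it.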

The New Theory of Data needs to recognize these anomalies and ensure that there is a valid explanation of these special instances of data within the theory of data.

Similarly “Data as a Property” of the Data Subject or Data as a productive asset of an organization is not properly captured by the present technical or legal approach to data.

Thus the current system of understanding data from the perspective of technology and law appears to pose contradictions, because each group of stakeholders has at different points in time tried to describe the term “Data” for its own convenience. If these differences are not amicably resolved, corporate managements will find it difficult to balance the differing demands of the technologists, lawyers and the business managers.

The need for a new approach to understanding data is therefore critical and this new theory should be capable of creating a proper definition for the term data so that all seemingly contradictory views converge under the new theory.

Watch out for more…

Naavi


Data Science Has to Evolve From Technical Perspective…

Data Science is an important area of study today, when “Data” is considered an important economic asset which can be harnessed like “Oil” or “Mined” like Gold.

According to Wikipedia,

“Data science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data.”

The view of most data scientists however is limited to disciplines such as “Statistics”, “Mathematics”, “Computer Science” and “Information Science”.

Data science is considered a “concept to unify statistics, data analysis, machine learning and their related methods” in order to “understand and analyze actual phenomena” with data. The term “Data Science” is often used interchangeably with concepts like business analytics, business intelligence, predictive modeling, and statistics.

The business of “Big Data”, “Machine Learning”, “AI Algorithms” all depend on “Data Scientists”.

For those of us who have watched the growth of “Information Security” as a professional domain, the evolution of “Technical Aspects of Information Security” to “Techno Legal Aspects of Information Security” was clear. Today, the legal aspects of Information Security have taken a firm grip on the Information Security domain. With “Information Security” migrating to “Data Security” and the emergence of stringent laws such as GDPR, the future of “Information Security” has slipped out of the hands of the CISOs into those of the DPOs (Data Protection Officers).

While the IS domain soon realized the importance of “People” along with “Processes”, the transition was fully extended into the third dimension of “Behavioural Science” by practitioners like Naavi.

At present the Data Security domain has taken a further step ahead towards “Data Governance” bringing in the “Business Management” professionals closer to the group of Data Security Professionals.

A similar graduated evolution is now also required in the field of Data Science. The various theories of Data that support the Data Science field as of today work around the technical aspects of Data. Statistical tools to segregate and detect correlations in a heap of data, drawing predictive conclusions, and creating self learning and self improving algorithms are all based on the technical perspective of data.

The technical perspective of data is as a “Stream of Binary Notations” as can be read by a “Binary Reading Device”. The “Binary Stream” rests on some surface like the platter of a hard disk or a Compact Disk or a Memory Card. The reader reads the binary stream, passes it through an application that assigns meaning to the data stream and thereafter sends it out into another data processing step or to a data delivery device like the “Monitor” or a “Speaker” for the human being to “Experience” the data.

To enable this conversion of data from a binary stream to a human experience, computer engineers have developed protocols such as splitting a data stream into finite bit packs, separating them with delimiters and adding meta data to instruct the devices on processing. How efficiently data can be read, how multiple data can be aggregated, how profiling can be achieved etc. are the problems that the Data Scientists try to examine. But in this entire process they deal with “data as a binary stream”.

In “Technical Perspective”, Encrypted data is also a binary stream though it is different from the binary stream of the parent data itself.

The moment we start recognizing the binary stream as a word, sentence, picture, sound etc., we are adding human interpretations of the binary stream. Then Data is no longer in the technical domain only. It has crossed over to the human domain.
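A small Python sketch can make this crossing over concrete. The two byte values below are an arbitrary choice made for illustration; the point is only that one and the same binary stream yields different “data” depending on the rule an application uses to assign meaning to it.

```python
# The same two bytes resting on the "surface", before any meaning is assigned.
raw = bytes([0xC3, 0xA9])

# Technical view: just a stream of bits.
bit_stream = "".join(format(byte, "08b") for byte in raw)

# Application A reads the stream as UTF-8 text: one accented letter.
as_utf8 = raw.decode("utf-8")

# Application B reads the very same stream as Latin-1 text: two characters.
as_latin1 = raw.decode("latin-1")

# Application C reads it as an unsigned big-endian integer.
as_number = int.from_bytes(raw, "big")

print(bit_stream)
print(repr(as_utf8), repr(as_latin1), as_number)
```

None of the three readings is more “true” to the bits than the others; the difference lies entirely in the interpretation layered on top of the stream.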

If the human observing the data is blind, he will not see any data. If he is colour blind, he may see some data but miss some other parts of the data. If he is deaf, he may miss some sound. If his ear/brain cannot respond to some frequencies then he will hear sounds which are different from the sound which his neighbor is hearing from the same speaker. Even in texts, if the person does not know the language of the text, he will not understand the data.

Thus “Data” is not what the “Binary Stream” suggests. Data is what the human perceives. It is for this reason we say “Data is in the beholder’s eyes”.

Do the Data Scientists of the day factor in this possibility that Data may be different for different people?

Similarly, the same issue confronts law enforcement when they look at “Data” as “Evidence”. What data a person sees is dependent on the technology that converts the binary stream into a human experience. If the devices (hardware and software) used for the purpose do not do their job as expected, then even a person who is not colour blind or deaf will not see the same data that somebody else with another device may see.

For example, a Word document created in MS Word can also be read in Libre Office or some other word processor. But whether the reproduction will be exactly the same as what one can see in MS Office is doubtful. Similarly, a web document may look different in different browsers and with different configurations. Hence the data as seen by a human, even if he has no disabilities, is still dependent on several aspects and is not the faithful rendition of the binary bits which the data scientists recognize as the data.

The principle is similar to Einstein’s theory of relativity. Data is not absolute. It is relative to the devices used to convert the binary stream into text, sound or image and further to the ability of the observer to observe faithfully what is rendered.

An ideal theory of data therefore cannot stop at studying the data only from the perspective of technology without fully absorbing how other factors affect the human experience of data.

Perhaps there is a need to think differently and develop a “New Theory of Data”.

Watch out for more…

Naavi


What is the life cycle of Data?

“Life Cycle” starts from birth and ends with death. In the case of humans, the life form comes from the environment (in the form of the Pancha bhootas) and goes back to the environment. Human life has two components, namely the physical body and the “Life within it”. The human body is constantly aging and growing and is never static. According to wise men, even after a person dies, some activities of the human body such as hair growth and nail growth continue for some time. So, there is a specific distinction between “Body” and the “Soul”.

In non-living bodies, the structure may remain static unless otherwise affected by an external agency. A stone may remain static unless rain water flowing over it or dust accumulating over it keeps it either eroding or growing over a long time frame. This however is different from human growth which occurs from within with the intervention of what we identify as “life”.

It is interesting to ponder when a “Stone” takes birth, when it dies and what happens to it during its life cycle.

A “Stone” is born as “Soil”. It is compressed under the earth until it becomes hard. During this process fine particles may come together as a larger group of particles bound together closely. This may then come out of the earth and get exposed as stones and rocks. Sometimes it may come out like Lava and thereafter solidify into a rock form on cooling.

The death of a rock is in it being powdered back into soil form. Hence the life cycle of a rock is from the soil back to the soil. In the intermediate life cycle it can assume different forms such as sedimentary rocks, metamorphic rocks or igneous rocks etc. Some may be carved as “Statues”, “Idols”, “Slabs” etc.

“Data” life cycle has some similarity. It is debatable if it has “Life”. It is also interesting to think about when “Data” takes birth, when “Data” dies and what forms it can take in between; these are difficult to answer. During its life cycle, where does “Data” reside? Is it in a hard disk? Is it in a tape? Is it in a memory card? How does it flow from one data holding device to another? Will it copy, or will it move? These are questions to which we believe we have an answer on a contextual basis. We say it can be this, and also that, most of the time.

If Data can be in different forms at different times, it is fine. But can somebody explain this? Physicists often say Light is a stream of Photons, but Light also sometimes behaves like a wave. Somebody has to explain how it can be both, and under what circumstances the Light Photons behave like a wave. That explanation is what we may call the “Light Wave Theory”. Another set of physicists went further and said there is a “Matter Wave Theory” according to which all “Particles” have a “Wave like behaviour”.

Now, if we have doubts about “What is Data?” and find that “Data is many things to many people based on the context”, we cannot simply accept this “Context Based Excuse” as a “Definition”. Based on such notions, law makers create laws, Judges interpret them according to their own whims and fancies, and the industry is left to deal with the issue on an ad hoc basis.

Sometimes the Supreme Court thinks “Data Disclosure” is a freedom of expression and sometimes it thinks it is a “Privacy Right”. The same kind of uncertainty exists when Quantum Computing says a “Data State can be both zero and one at the same time”.

If the differences between “Data” from classical computing and “Data” from Quantum Computing have to be resolved, and “Data” from the Classical/Quantum computing perspective has to converge with the Judicial perspective, then we need a “New” “Theory of Data”.

Naavi is trying to explore this interesting theoretical concept of “Data” and trying to find a model of description which should fit into the different perceptions of “what data is”.

This is the “New Theory of Data” which will be revealed bit by bit here. Watch for more…

(P.S: This discussion is purely academic)

Naavi

 


The New Theory of Data

Can there be a single answer to “What is Data” which resolves the dilemma of the common man?

Watch these columns for more ….

Naavi

P.S: This concept was expanded through a series of articles culminating in a discussion in the book on Personal Data Protection Act of India (PDPA 2020). Though this is an academic discussion, it is required to develop Data Protection Jurisprudence for the future.

The other articles can be found through a search on Theory of Data. Some of the key articles explaining the theory are also given below.

October 8, 2019: New Data Theory of Naavi built on three hypotheses

October 8, 2019: Theory of Data and Definition Hypothesis

October 10, 2019: Reversible Life Cycle hypothesis of the theory of Data

October 11, 2019: Additive value hypothesis of ownership of data

November 20, 2019: Will Personal Data Protection Act be compatible to the Theory of Data?

March 31, 2018: Theory of Dynamic Personal Data
