The Shape of Things to Come: The New Data Protection Act of India-6 (Clarifications-Binary)

(Continued from the previous article)

P.S.: This series of articles is an attempt to place some issues before the Government of India, which promises to bring in a new Data Protection Law that is futuristic, comprehensive and perfect.

In our previous article, we discussed the definition of Privacy. One of the new concepts we tried to bring out is that “Sharing” should be recognized only when identified personal data is made accessible to a human being.

In other words, if personally identified data is visible only to an algorithm and not to a human, it is not considered sharing of identified data, provided that after the algorithm processes the personal data, the identity is killed within the algorithm and the output contains only anonymised information.

Typically, such a situation arises when a CCTV camera captures a video. Obviously, the video captures the face of a person and therefore captures critical personal data. However, if the algorithm does not have access to a database of faces against which the captured picture can be compared and identified, the captured picture is only “Orphan Data”, which has an “Identity parameter” but is not “Identifiable”. The output, let us say a report of how many people passed a particular point as captured by the camera, is devoid of identity and is therefore not personal information.

The algorithm may have an AI element where the captured data is compared to a database of known criminals. If there is a match, the data is escalated to a human being, whereas if there is no match, it is discarded. In such a case the discarded information does not constitute personal data access, while the smaller set of identified data passed on for human attention alone constitutes “Data Access” or “Data Sharing”.
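The screening flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of any real CCTV system: the `Capture` class, the `face_signature` string (a stand-in for whatever a face matching model would produce) and the `screen` function are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capture:
    camera_id: str
    face_signature: str  # hypothetical stand-in for a face template


def screen(captures, watchlist):
    """Return (escalated, people_count).

    Only captures whose signature is on the watchlist are kept for
    human attention. Everything else contributes to the count and is
    not retained in the output.
    """
    escalated = []
    people_count = 0
    for capture in captures:
        people_count += 1
        if capture.face_signature in watchlist:
            escalated.append(capture)  # passed on to a human being
        # no match: nothing about this capture is stored
    return escalated, people_count


captures = [
    Capture("cam-1", "sig-a"),
    Capture("cam-1", "sig-b"),
    Capture("cam-1", "sig-c"),
]
escalated, people_count = screen(captures, watchlist={"sig-b"})
print(people_count, len(escalated))
```

In this sketch, only the single escalated capture would count as “Data Access” or “Data Sharing” in the sense used in this article. The count of three is the anonymised output.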

Further, the definition provided yesterday used a somewhat strange-looking explanation of “Sharing”:

“..making the information available to another human being in such form that it can be experienced by the receiver through any of the senses of seeing, hearing, touching, smelling or tasting of a human..”

This goes with my proposition that “Data is in the beholder’s eyes” and “Data” is “Data” only when a human being is able to perceive it through his senses.

For example, let us see the adjoining document which represents a binary stream.

A normal human being cannot make any meaning out of this binary expression. Therefore, if it is accessed by a human being, it is “Un-identifiable” information.

A computing device may, however, be able to make meaning out of this.

For example, if the device uses a binary-to-ASCII converter, it will read the binary stream as “Data is in the beholder’s eyes”. Alternatively, if the device uses a binary-to-decimal converter, the stream could be read as one huge number. If the AI decides to treat each set of bits separated by a space as a separate binary stream, it will read this as a series of numbers.
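The three readings can be tried out with a short Python sketch. The bit stream used here is an assumption for illustration (the 8-bit ASCII codes for the word “Data”, separated by spaces) and is not the stream from the document referred to above. The function names are likewise hypothetical.

```python
# Hypothetical bit stream: 8-bit ASCII codes for "Data", space separated.
stream = "01000100 01100001 01110100 01100001"


def as_ascii(bits: str) -> str:
    """Read each space-separated group as one ASCII character."""
    return "".join(chr(int(group, 2)) for group in bits.split())


def as_single_number(bits: str) -> int:
    """Ignore the spaces and read the whole stream as one binary number."""
    return int(bits.replace(" ", ""), 2)


def as_number_series(bits: str) -> list:
    """Read each space-separated group as a separate decimal number."""
    return [int(group, 2) for group in bits.split()]


print(as_ascii(stream))          # the text reading
print(as_single_number(stream))  # the "huge number" reading
print(as_number_series(stream))  # the series-of-numbers reading
```

The same bits give a word, one large number or a list of small numbers, depending only on the converter chosen by the reader.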

Similarly, if the binary stream was a name, the human cannot “Experience” it as a name because he is not a binary reader. Hence whether a binary stream is “Simple data”, “a Name” or a “Number” is determined by the human being to whom it becomes visible. In this context, “visibility” means the form in which the data reaches the human, such as a sentence in English or a number in decimal form. If the reader is illiterate, even the converted information may be considered “Not identifiable”. At the same time, if the person receiving the information is a “Binary expert who can visualize the binary values”, he is effectively a computer in himself and may consider the information “Identifiable”.

It is for these reasons that in Naavi’s Theory of Data, one of the hypotheses is that “Data is in the beholder’s eyes”.

The “Experience” in this case is “Readability” through the sensory perception of “Sight”. A similar “Experience” can be recognized if the data can be converted into “Sound” through an appropriate processing and output device. Theoretically, it can also be converted into a sense of touch, smell or taste if there are appropriate devices to convert it into such forms.

If a “Neuro input device” is attached, the binary stream can be input directly into the human brain as a thought, and it can be perceived as either a sentence or a number as the person decides.

These thoughts have been incorporated in the definitions of “Privacy” and “Sharing” which were briefly set out in the previous article.

The thought is definitely beyond the “GDPR limits” and requires some deep thinking before the scope of the definition can be understood.

In summary, the thought process is:

If an AI algorithm can be designed so that identifiable data is processed in such a manner that the identity is killed within the algorithm, then there is no privacy concern. In fact, a normal “Anonymizing” algorithm is one such device, which takes in identifiable information and spits out anonymous information. In this school of thought, such processing does not require consent and does not constitute viewing of identifiable data even by the owner of the algorithm (as long as there is no admin override).
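As a rough illustration of such an “Anonymizing” device, the following Python sketch takes in identified records and returns only aggregate counts. The record fields (`name`, `city`) and the function name `anonymise` are assumptions made for this example. It is not a complete anonymisation technique and does not address whether a particular output could be re-identified.

```python
from collections import Counter


def anonymise(records):
    """Consume identified records and return only counts per city.

    The names are read inside the function but never appear in the
    return value, so the identity does not leave the algorithm.
    """
    return dict(Counter(record["city"] for record in records))


# Hypothetical identified input
records = [
    {"name": "Person A", "city": "Bangalore"},
    {"name": "Person B", "city": "Mumbai"},
    {"name": "Person C", "city": "Bangalore"},
]
summary = anonymise(records)
print(summary)
```

The caller of `anonymise` sees only the counts. An “admin override” in this picture would be any extra path, such as logging the input records, that lets a human see the names.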

I request all of you to read this article and the previous article once again and send me your feedback.

P.S.: These discussions are presently for debate and are a work in progress awaiting more inputs for further refinement. It is understood that the Government may already have a draft and may completely ignore all these recommendations. However, it is considered that these suggestions will assist in the development of “Jurisprudence” in the field of Data Governance in India, and hence these discussions will continue until the Government releases its own version for further debate. Other professionals who are interested in this exercise, particularly Research and Academic organizations, are invited to participate. Since this exercise is too complex to institutionalize, it is being presented at this stage only as the thoughts of Naavi. Views expressed here may be considered the personal views of Naavi and not those of FDPPI or any other organization that Naavi may be associated with.


I am adding a reply to one of the comments received on LinkedIn:


Consider the situation of Google processing your personal data from cookies or servers and serving you a specific ad. Google claims this automatic processing and output is anonymous.
So is your suggestion to allow this?


It is a good question. It may require a long answer.
In such cases we first need to check through a DPIA what harm is caused to the individual and then arrive at a decision.
In the example cited, there are three levels of processing.
At the first level, there is collection of personal information. If the cookies are persistent cookies and linked to a known customer, the data could be personal data and consent is required. If the entire cookie data collected is anonymous and the collector is not reasonably capable of identifying the individual with other data on hand, it is processing of non-personal data.
At the second level, profiling occurs and the users are categorised into different market segments, possibly without individual identity.
For example, if we say category A users would be interested in buying a computer, this analysis does not cause harm to the user. Usually this is done by an intermediary market research company. This company need not know the identity of the user, and hence it only processes anonymised personal data, which is outside the privacy protection regime.
At the third level, the advertisement is served. If the ad server is aware of the identity of the recipient and targets the ads accordingly, then it is an activity which could cause harm to privacy.
Let us also take the example of TV ads or hoardings on the street. They are not specifically targeted ads and hence don’t infringe privacy.
Similarly, if there are ads on the web space which are not targeted, it would be difficult to call them an infringement. If the ads are targeted by identity, it would without doubt be an infringement.
What you are indicating is a case which falls between the two extremes above: ads targeted at identified individuals, and generic ad serving, like the hoarding on the street which is open to everybody.
The privacy impact is determined more by the platform where the advertisement is placed. If it is a private space like your email inbox, you may say that there was an infringement. But if it is on a website which you choose to visit, the ads may be treated like hoardings and not as infringing.
Hence the platform on which the ads are served may determine whether there is harm or not.
What I have suggested would basically apply to intermediaries who only process data without any idea of the data subject and give out the results to another recipient. This is what an “Algorithm” would do.
But if Google is able to identify who has provided the data and who is getting the ads, it may not have the status of an “Intermediary” and there could be an infringement of privacy.
Hence we may have to take a view based on the context. 

1. Introduction
2. Preamble
3. Regulators
4. Chapterization
5. Privacy Definition
6. Clarifications-Binary
7. Clarifications-Privacy
8. Definitions-Data
9. Definitions-Roles
10. Exemptions-Privacy
11. Advertising
12. Dropping of Central Regulatory authority
13. Regulation of Monetization of Data
14. Automated means ..


About Vijayashankar Na

Naavi is a veteran Cyber Law specialist in India and is presently working from Bangalore as an Information Assurance Consultant. Having pioneered concepts such as ITA 2008 compliance, Naavi is also the founder of Cyber Law College, a virtual Cyber Law Education institution. He has lately been focusing on projects such as Secure Digital India and Cyber Insurance.