In the four previous Series of these articles, we tried to establish the meaning of Privacy and its relation to Data & Protection as per the Indian PDPB2019 (Series I & II). We discussed Data Privacy in some detail, as well as the concept of Personal & Sensitive Personal information (Series III). In the last one (Series IV), we touched upon Data Classification.
In this Fifth (& Sixth) Series of articles, we look at how Data is to be treated as per the Indian PDPB2019.
The applicability of the Indian PDPB2019 is summarised in Section 2 of the Act, which states as under:
Application of Act to processing of personal data
The provisions of this Act shall apply to,
(a) the processing of personal data where such data has been collected, disclosed, shared or otherwise processed within the territory of India;
(b) the processing of personal data by the State, any Indian company, any citizen of India or any person or body of persons incorporated or created under Indian law;
(c) the processing of personal data by data fiduciaries or data processors not present within the territory of India, if such processing is
(i) in connection with any business carried on in India, or any systematic activity of offering goods or services to data principals within the territory of India; or
(ii) in connection with any activity which involves profiling of data principals within the territory of India.
The provisions of this Act shall not apply to the processing of anonymised data, other than the anonymised data referred to in the Bill.
Among the host of security techniques available, pseudonymisation and anonymisation are highly recommended by privacy regulations. Such techniques minimise risk and help Data Fiduciaries & Data Processors fulfil their data compliance obligations.
Anonymized Data
Section 3(2) defines anonymisation as follows:
"anonymisation" in relation to personal data, means such an irreversible process of transforming or converting personal data to a form in which a data principal cannot be identified, which meets the standards of irreversibility specified by the Authority.
Accordingly, anonymised data is data about living individuals which has been transformed such that it is not possible to identify the data principal, either from the data alone or from the data together with other information.
Another crucial aspect to be noted is that the provisions of this Act will not be applicable to Anonymized data.
Therefore, to protect data, anonymise personal data and make sure that re-identification by combining the anonymised data with other population data is impossible. Statistical packages may offer tooling for anonymisation. Anonymisation may also be described as a type of information sanitisation whose intent is to protect privacy. It is the process of either encrypting or removing personally identifiable information from data sets so that the people whom the data describe remain anonymous. Identifiers can relate to any natural or legal person, living or dead, including their dependents, ascendants, and descendants, as well as other persons related to them directly or through interaction.
Anonymised data is always unrecognisable, even to the data owner.
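As an illustration of removing personally identifiable information from a data set, here is a minimal Python sketch. The field names, the choice of which fields count as direct identifiers, and the extra step of coarsening age and postal code are all assumptions made for this example; they are not taken from the Bill. A sketch like this does not by itself meet any standard of irreversibility specified by the Authority.

```python
# Hypothetical field names: which fields are direct identifiers is an assumption.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def generalise_age(age: int, width: int = 10) -> str:
    """Replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymise(records: list[dict]) -> list[dict]:
    """Drop direct identifiers and coarsen the remaining indirect ones."""
    result = []
    for record in records:
        # Remove every field that names or directly contacts the person.
        kept = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        # Coarsen fields that could identify someone in combination.
        kept["age"] = generalise_age(kept["age"])
        kept["pincode"] = kept["pincode"][:3] + "XXX"
        result.append(kept)
    return result


records = [
    {"name": "Asha", "email": "asha@example.com", "phone": "5550001",
     "age": 34, "pincode": "110001", "diagnosis": "asthma"},
    {"name": "Ravi", "email": "ravi@example.com", "phone": "5550002",
     "age": 37, "pincode": "110045", "diagnosis": "diabetes"},
]
anonymised = anonymise(records)
```

No lookup table is kept here, so the output cannot be mapped back to a person by the data owner. Whether the remaining fields could still single someone out when combined with other data has to be assessed separately.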
On the other hand, Pseudonymisation is a procedure in which identifying fields in a data record are replaced by artificial identifiers (pseudonyms). In other words, "Pseudonymisation" of data means replacing any identifying characteristics of the data with a pseudonym, that is, a value which does not allow the data subject to be directly identified. There can be a single pseudonym for a collection of replaced fields or a pseudonym per replaced field. The purpose is to make it harder to identify individuals from the data record and thus to lower respondent or patient objections to its use. Data in this form are suitable for extensive analytics and processing.
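The two options described above, one pseudonym per replaced field or a single pseudonym for a collection of replaced fields, can be sketched in Python as follows. The field names, the `P-` prefix, and the in-memory `lookup` dictionary are hypothetical choices for illustration; in practice the lookup would need to be stored separately and securely.

```python
import secrets

# Hypothetical list of identifying fields for this example.
IDENTIFYING_FIELDS = ("name", "email")


def new_pseudonym() -> str:
    """Generate a random artificial identifier."""
    return "P-" + secrets.token_hex(8)


def pseudonymise_per_field(record: dict, lookup: dict) -> dict:
    """Replace each identifying field with its own pseudonym."""
    out = dict(record)
    for field in IDENTIFYING_FIELDS:
        token = new_pseudonym()
        lookup[token] = record[field]  # kept so the value can be restored
        out[field] = token
    return out


def pseudonymise_as_group(record: dict, lookup: dict) -> dict:
    """Replace all identifying fields with one shared pseudonym."""
    out = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    token = new_pseudonym()
    lookup[token] = {f: record[f] for f in IDENTIFYING_FIELDS}
    out["subject_id"] = token
    return out


lookup = {}
record = {"name": "Asha", "email": "asha@example.com", "diagnosis": "asthma"}
per_field = pseudonymise_per_field(record, lookup)
grouped = pseudonymise_as_group(record, lookup)
```

Because the lookup retains the original values, the process is reversible for whoever holds it, which is what separates this from anonymisation.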
As an example, pseudonymisation is useful in the following scenario.
So, when is it necessary not to fully anonymise your data? When data subjects have the right to withdraw their data from a study. Here, the data controller has to be able to identify the data of a specific subject in order to delete it from the dataset.
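The withdrawal scenario can be sketched in Python under one assumption that the article does not prescribe: the pseudonym is derived from the subject's identifier with a keyed hash (HMAC), so the controller, who holds the key, can recompute it on request. The key value, the `subject_id` field, and the function names are hypothetical.

```python
import hashlib
import hmac

# Placeholder key for illustration only; a real key must be stored securely.
SECRET_KEY = b"replace-with-a-securely-stored-key"


def pseudonym_for(subject_identifier: str) -> str:
    """Derive a stable pseudonym that only the key holder can recompute."""
    digest = hmac.new(SECRET_KEY, subject_identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def withdraw(dataset: list[dict], subject_identifier: str) -> list[dict]:
    """Return the dataset without the rows belonging to the withdrawing subject."""
    target = pseudonym_for(subject_identifier)
    return [row for row in dataset if row["subject_id"] != target]


dataset = [
    {"subject_id": pseudonym_for("asha@example.com"), "response": "yes"},
    {"subject_id": pseudonym_for("ravi@example.com"), "response": "no"},
    {"subject_id": pseudonym_for("asha@example.com"), "response": "maybe"},
]
remaining = withdraw(dataset, "asha@example.com")
```

With fully anonymised data this deletion would not be possible, because nothing would link the rows back to the subject making the request.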
Although pseudonymisation has many uses, it should be distinguished from anonymisation. In many cases it provides only limited protection for the identity of data subjects, because it still allows identification by indirect means. Where a pseudonym is used, it is often possible to identify the data subject by analysing the underlying or related data.
The legal distinction between anonymised and pseudonymised data lies in whether the data is categorised as personal data. Pseudonymised data still allows for some form of re-identification (even if indirect and remote) and therefore remains personal data, while anonymised data cannot be re-identified.
In general terms, a natural person can be considered "identified" when, within a group of persons, he or she is "distinguished" from all other members of the group. Accordingly, a natural person is "identifiable" when, although the person has not yet been identified, it is possible to do so. Thus, a person does not have to be named in order to be identified. If there is other information enabling an individual to be connected to data about them, which could not be about someone else in the group, they may still "be identified".
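One way to make the idea of being "distinguished" within a group concrete is to count how many records share the same combination of indirect attributes. The Python sketch below does this; it is a simple uniqueness check in the spirit of k-anonymity, a technique this article does not name, and the attribute names are hypothetical.

```python
from collections import Counter

# Hypothetical indirect attributes that could single a person out in combination.
QUASI_IDENTIFIERS = ("age_band", "pincode_prefix", "gender")


def key_of(record: dict) -> tuple:
    """Combination of indirect attributes for one record."""
    return tuple(record[q] for q in QUASI_IDENTIFIERS)


def group_sizes(records: list[dict]) -> Counter:
    """Count how many records share each combination."""
    return Counter(key_of(r) for r in records)


def distinguishable(records: list[dict]) -> list[dict]:
    """Records whose combination is unique, so the person stands out in the group."""
    sizes = group_sizes(records)
    return [r for r in records if sizes[key_of(r)] == 1]


records = [
    {"age_band": "30-39", "pincode_prefix": "110", "gender": "F", "diagnosis": "asthma"},
    {"age_band": "30-39", "pincode_prefix": "110", "gender": "F", "diagnosis": "diabetes"},
    {"age_band": "60-69", "pincode_prefix": "560", "gender": "M", "diagnosis": "arthritis"},
]
at_risk = distinguishable(records)
smallest_group = min(group_sizes(records).values())
```

Here the third record is the only one in its group, so that person may be identifiable without being named, while the first two cannot be told apart on these attributes alone.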
By Sameer Mathur, Founder and CEO, SM Consulting
President, Delhi-NCR Chapter of the Foundation of Data Protection Professionals in India
With inputs from Vijayashankar Nagaraj Rao