expert reaction to study looking at data anonymisation and privacy

A study, published in Nature Communications, reports that current levels of data anonymisation may be inadequate to protect privacy.

 

Prof Stephen Evans, Professor of Pharmacoepidemiology, London School of Hygiene & Tropical Medicine, said:

“This paper shows that if a single dataset has very many attributes recorded then individuals in them will have unique combinations of those characteristics, even if a person’s name and personal details are not included.  If there are other datasets that do contain the personal information and sufficient information in common, then individuals could be identified if someone has access to both or several datasets.

“The implications of this for the type of medical research using electronic records in the UK are not strong.  Almost all the examples used by the authors, the ‘anonymised’ datasets, include exact date of birth and individual postcode.  In the UK medical research databases, exact dates of birth and postcodes are not available to researchers.  Those who administer these databases are very well aware of the methods by which identification could take place and take careful steps to prevent it.  They also place restrictions on publications that give tables in which a small number of individuals are reported in any cell of a table.

“In addition, these datasets are not ‘publicly available’.  They can only be used by bona fide researchers for well-defined research purposes.

“The implications for the use of various social media datasets which are publicly available are much greater, and the possibility of identification from them is obviously much higher.

“Public health will suffer if research cannot be carried out on large-scale medical data by medical professionals, but it is true that vigilance is required to protect the anonymity of patients’ data.  Theft or loss of paper records which have no attempts at anonymisation generally represent a greater threat to loss of personal freedom.”
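
The linkage mechanism Prof Evans describes can be illustrated with a minimal sketch. It is not the generative-model method used in the paper; it is only a toy demonstration, on invented records, of how a combination of attributes can be unique within a dataset and how a second dataset that holds names and shares those attributes could then be matched against it. All records, field names and the helper functions `unique_fraction` and `link` are hypothetical.

```python
from collections import Counter

# Hypothetical toy records: no names, but several attributes per person.
anonymised = [
    {"birth_year": 1980, "sex": "F", "region": "North", "diagnosis": "asthma"},
    {"birth_year": 1980, "sex": "F", "region": "North", "diagnosis": "diabetes"},
    {"birth_year": 1975, "sex": "M", "region": "South", "diagnosis": "migraine"},
    {"birth_year": 1990, "sex": "F", "region": "South", "diagnosis": "eczema"},
]

# Hypothetical second dataset that does carry names and shares some attributes.
named = [
    {"name": "Person A", "birth_year": 1975, "sex": "M", "region": "South"},
    {"name": "Person B", "birth_year": 1980, "sex": "F", "region": "North"},
]


def key(record, attrs):
    """Return the combination of attribute values used for matching."""
    return tuple(record[a] for a in attrs)


def unique_fraction(records, attrs):
    """Fraction of records whose attribute combination occurs exactly once."""
    counts = Counter(key(r, attrs) for r in records)
    unique = sum(1 for r in records if counts[key(r, attrs)] == 1)
    return unique / len(records)


def link(anon, named_records, attrs):
    """Match named records to anonymised ones, but only where the
    attribute combination is unique in the anonymised dataset."""
    counts = Counter(key(r, attrs) for r in anon)
    by_key = {key(r, attrs): r for r in anon if counts[key(r, attrs)] == 1}
    return {
        n["name"]: by_key[key(n, attrs)]
        for n in named_records
        if key(n, attrs) in by_key
    }


shared = ["birth_year", "sex", "region"]
print(unique_fraction(anonymised, ["sex"]))  # one attribute
print(unique_fraction(anonymised, shared))   # three attributes
print(link(anonymised, named, shared))
```

In this toy data, adding attributes raises the share of unique records, and only the person whose combination is unique can be matched. Coarsening or withholding attributes such as exact date of birth and postcode, as described for the UK research databases, reduces that uniqueness.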

 

‘Estimating the success of re-identifications in incomplete datasets using generative models’ by Luc Rocher et al. was published in Nature Communications at 16:00 UK time on Tuesday 23 July 2019. 

DOI: 10.1038/s41467-019-10933-3

 

Declared interests

Prof Stephen Evans: “I have carried out research using large databases of patient data for research purposes, including providing analyses of death rates for the Bristol Royal Infirmary Inquiry.  I have no financial interests in any of this research.”
