New amendments to the Privacy Act, announced yesterday by Attorney-General George Brandis to protect the security of anonymised data, could have the perverse effect of making it harder to uncover flaws in anonymisation and encryption techniques.
Brandis yesterday announced that the government would be amending the Privacy Act to “create a new criminal offence of re-identifying de-identified government data. It will also be an offence to counsel, procure, facilitate, or encourage anyone to do this, and to publish or communicate any re-identified dataset.”
Re-identification is the process of matching anonymised data released by public authorities back to the individuals it describes — either by exploiting the linkage key used to structure the data, or by using other data to narrow down the likelihood that a single data point belongs to an identified individual.
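The second technique — joining "de-identified" records to an outside dataset on shared quasi-identifiers — can be sketched in a few lines. This is a hypothetical illustration, not the actual Medicare attack: the datasets, names and fields below are invented for the example.

```python
# A "de-identified" health dataset often still carries quasi-identifiers
# (birth year, postcode, sex). Joining those against a public dataset,
# such as an electoral roll, can narrow a record to a single named person.

deidentified_health = [
    {"birth_year": 1975, "postcode": "3052", "sex": "F", "condition": "diabetes"},
    {"birth_year": 1988, "postcode": "2000", "sex": "M", "condition": "asthma"},
]

public_roll = [
    {"name": "A. Citizen", "birth_year": 1975, "postcode": "3052", "sex": "F"},
    {"name": "B. Resident", "birth_year": 1990, "postcode": "2000", "sex": "M"},
]

def reidentify(record, roll):
    """Return the names of everyone on the roll whose quasi-identifiers
    match the record; exactly one match is a re-identification."""
    return [p["name"] for p in roll
            if all(p[k] == record[k] for k in ("birth_year", "postcode", "sex"))]

for rec in deidentified_health:
    matches = reidentify(rec, public_roll)
    if len(matches) == 1:
        print(f"{matches[0]} -> {rec['condition']}")  # prints "A. Citizen -> diabetes"
```

The point is that no field in the health dataset is a "name", yet the combination of ordinary attributes is unique enough to identify someone — which is why researchers describe some re-identification as trivially easy.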
While Brandis said the prompt for the change was a Senate health committee report, it appears to have been due to the uncovering of serious flaws in a recently released Medicare and Pharmaceutical Benefits Scheme dataset that allowed re-identification of individuals. The flaws were discovered by Dr Vanessa Teague, an IT security specialist in the University of Melbourne's Department of Computing and Information Systems.
De-identification has also been controversial given the Australian Bureau of Statistics’ decision to transform the census into an ongoing personalised longitudinal document for every citizen using names, addresses and data linkage keys. As Crikey and others explained, it can be trivially easy to re-identify data that has been de-identified, making the ABS’ new approach very risky from a privacy point of view. Privacy Commissioner Timothy Pilgrim has also warned of the need to ensure notionally de-identified data is treated as carefully as private information.
Brandis’ amendments would outlaw such re-identification efforts against Commonwealth datasets — such as the census, or health records. So all good? Well, to an extent, yes. A legislative prohibition is backed by public health experts in the US as one way of addressing concerns about the risk of de-anonymisation.
But there’s a very real risk that any prohibition will also prevent people — such as Teague and her colleagues — from identifying flaws in de-identification methods, or communicating those flaws once they discover them. This already happens to some white-hat hackers who subject commercially available software and corporate and government systems to penetration testing.
While some major companies offer rewards for people who spot and pass on security vulnerabilities, some hackers find themselves prosecuted for revealing potentially highly damaging flaws. The definition of “counsel, procure, facilitate, or encourage” would accordingly need to be drafted to exclude legitimate testing and sharing of, for example, flaws in the statistical linkage keys employed by the ABS. As online rights group Digital Rights Watch said this morning:
“The specific wording of ‘counsel, procure, facilitate or encourage’ will need to be framed carefully to exclude innocent acts, such as rigorous penetration testing of encryption software. Likewise, the whole area of research into de-identification research, such as that undertaken by the CSIRO, could be jeopardised through heavy-handed legislation … Criminalising security testing is the wrong way to increase security. The Government should instead focus on ensuring that data is not collected or stored in forms that allow re-identification.”
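The statistical linkage keys at issue show why such testing matters. A simplified sketch of an SLK-581-style key (the convention used by Australian health agencies: selected letters of the family and given names, date of birth, and a sex code — the exact field details here are illustrative) makes the weakness visible: the key is a deterministic function of commonly known facts, so anyone holding a person's name, date of birth and sex can regenerate their key and look them up.

```python
# Simplified, illustrative SLK-581-style linkage key: letters 2, 3 and 5
# of the family name plus letters 2 and 3 of the given name (padded with
# "2" when the name is too short), then date of birth and a sex code.

def pick(name, positions):
    """Take the letters at the given 1-based positions, padding with '2'."""
    letters = "".join(c for c in name.upper() if c.isalpha())
    return "".join(letters[p - 1] if p <= len(letters) else "2" for p in positions)

def slk(family, given, dob_ddmmyyyy, sex_code):
    return pick(family, (2, 3, 5)) + pick(given, (2, 3)) + dob_ddmmyyyy + sex_code

key = slk("Citizen", "Alice", "01011975", "2")
print(key)  # prints "ITZLI010119752"

# Anyone with the same publicly knowable facts derives the identical key:
assert key == slk("Citizen", "Alice", "01011975", "2")
```

Researchers probing keys like this are doing exactly the "facilitating" the draft wording could capture — which is Digital Rights Watch's point.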
But the legislation will be drafted by the Attorney-General’s Department, which is openly contemptuous of data security issues, not to mention basic rights. AGD failed to respond to Crikey’s request for clarification on the issue.
This is a complex issue because it is at the intersection of public health ethics and the IT industry mindset that is highly sceptical of any claims to security, and eager to subject them to rigorous testing. As Daniel C. Barth-Jones said in his magisterial look at the area in 2013:
“… if you’re a cryptographer, it is not surprising that you might be more inclined to suspect that everyone’s a spy — it’s just part of your training to do so … it should not be too surprising then that white hat hackers conducting ‘penetration testing’ likely think that other researchers are just fooling themselves if they rely on social and cultural norms, data use contracts and other legal protections, and ‘security by obscurity’ as part of the total package which prevents the occurrence of re-identification attempts.”
In Barth-Jones’ view, the risks of de-identification can be overstated (particularly when re-identification targets atypical people who are more easily identified than most of the population) and security experts rely too heavily on absolute guarantees rather than cost-benefit analyses about privacy-health benefit trade-offs. Nonetheless, he recommends “a carefully designed prohibition on re-identification attempts could still allow research involving re-identification of specific individuals to be conducted under the approval of Institutional Review Boards”.
The idea of the people who gave us the data retention debacle “carefully designing” anything, however, remains laughable. The net result could be that it becomes illegal to discover how badly protected supposedly de-identified datasets are, or to let anyone know.