Talk Abstract: The risk of unintended information disclosure in data publishing.
Sensitive information about individuals can be recovered from many types of data release. This presentation will explore the privacy risks of publishing data in different formats and introduce techniques to defend against them. From low-dimensional microdata files and raw location traces to aggregate statistics and machine learning models, we will look at real-world examples of unintended information disclosure, highlight different attack models, and discuss principles and techniques to protect the privacy of individuals present in the data.
Some of the takeaways of the session include:
– Common pitfalls of anonymising datasets
– How linkage attacks can be used to re-identify individuals using quasi-identifiers
– Privacy attacks on machine learning models and how they can be used to recover sensitive information about individuals in the training data
– An introduction to the differential privacy framework and how it can be used to mitigate privacy risks
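As a rough illustration of two of these takeaways (not material from the talk itself), the sketch below shows a linkage attack on quasi-identifiers and a differentially private count using the Laplace mechanism. The records, the field names (zip, birth_year, sex, diagnosis) and the helper functions link and noisy_count are all hypothetical, chosen only for the example.

```python
import random

# Hypothetical "anonymised" release: names removed, quasi-identifiers kept.
released = [
    {"zip": "02138", "birth_year": 1954, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1987, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_year": 1987, "sex": "F", "diagnosis": "flu"},
]

# Hypothetical public auxiliary data (for example, a public register).
public = [
    {"name": "Alice Example", "zip": "02138", "birth_year": 1954, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1987, "sex": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link(released, public, keys=QUASI_IDENTIFIERS):
    """Linkage attack: map each named person to a diagnosis when their
    quasi-identifiers match exactly one released record."""
    matches = {}
    for person in public:
        candidates = [
            r for r in released if all(r[k] == person[k] for k in keys)
        ]
        if len(candidates) == 1:  # a unique match re-identifies the record
            matches[person["name"]] = candidates[0]["diagnosis"]
    return matches


def noisy_count(records, predicate, epsilon, rng=random):
    """Laplace mechanism for a counting query (sensitivity 1).

    The difference of two exponential draws with rate epsilon follows a
    Laplace distribution with scale 1/epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


if __name__ == "__main__":
    print(link(released, public))
    print(noisy_count(released, lambda r: r["zip"] == "02139", epsilon=1.0))
```

Smaller values of epsilon add more noise and give stronger protection at the cost of accuracy. A real deployment would also need to track the privacy budget spent across queries, which this sketch leaves out.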
Bio: Thom is a software engineer at Privitar, focused on creating privacy-enhancing technologies that help get the maximum value from data while preserving the privacy of individuals. He has contributed to products that enable tokenisation at scale, dataset watermarking and differential privacy.