Many analyses require results only about aggregate data, not individual personal data. Such analyses can in principle be done over anonymized data. If user data is strongly anonymized in the GDPR sense, then it is by law and in fact not personal data. Personal data management systems should therefore exploit the benefits of anonymization whenever and wherever possible.
In the past, data anonymization has been extremely difficult for a variety of reasons. There have been no general-purpose, easy-to-use tools for anonymization. Rather, anonymizing data has required a deep understanding of a set of complex anonymization mechanisms, and the process has been difficult, time-consuming, and unfortunately prone to error.
This presentation describes Diffix, a new approach to database anonymization that offers a substantially better utility/privacy trade-off than existing approaches such as k-anonymity or differential privacy. Diffix acts as an SQL proxy between the analyst and an unmodified live database.
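To make the proxy placement concrete, the following toy Python sketch puts a thin layer between an analyst and an unmodified SQLite database. It is not Diffix's algorithm, which this abstract does not describe: the class name `ToyAggregateProxy`, the `count_by` method, the `MIN_GROUP_SIZE` threshold, and the small-group suppression rule are all assumptions made purely for illustration.

```python
import sqlite3

MIN_GROUP_SIZE = 5  # hypothetical threshold, chosen for illustration only


class ToyAggregateProxy:
    """Hypothetical stand-in for a proxy between analyst and database."""

    def __init__(self, conn):
        self.conn = conn  # the underlying database is left unmodified

    def count_by(self, table, column):
        # Only an aggregate query (COUNT ... GROUP BY) reaches the database.
        # In this toy, identifiers are assumed to come from trusted code.
        sql = f'SELECT "{column}", COUNT(*) FROM "{table}" GROUP BY "{column}"'
        rows = self.conn.execute(sql).fetchall()
        # Assumed rule: drop groups too small to hide an individual.
        return {value: n for value, n in rows if n >= MIN_GROUP_SIZE}


# A small in-memory database standing in for the live one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (user_id INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [(i, "Berlin") for i in range(8)] + [(100, "Kaiserslautern")],
)

proxy = ToyAggregateProxy(conn)
result = proxy.count_by("visits", "city")
print(result)
```

The analyst only ever sees the proxy's filtered aggregates, while the stored rows remain exactly as the service wrote them. A real system would need far stronger defenses than a single count threshold.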
Diffix works with any type of data, and configuration is simple and data-independent: the administrator does not need to consider the identifiability or sensitivity of the data itself. In short, a service that gathers personal data can in most cases use Diffix for aggregate analytics as it would use any other database. Diffix was developed through a research partnership between the Max Planck Institute for Software Systems and Aircloak.