Part VII


For years, data protection was viewed as a chore that companies paid lip service to but often overlooked or underfunded. With the advent of the EU General Data Protection Regulation (GDPR), all that changed. GDPR is one of the most wide-ranging pieces of EU legislation to date, and it carries such significant penalties that no company can afford to ignore it.

Building a privacy-preserving data analytics stack

In this paper we have shown you how to build a modern data analytics stack that embeds data privacy at its heart. We started with a conceptual model that teases out the core functionality and elements needed in any analytics stack.

The core elements are the Data Loader, the Data Warehouse and the Data Consumer. We saw how each of these elements requires certain functionality ranging from Ingestion and ETL to Analysis and Presentation. The idea here is to give you a better understanding of how all the parts of the analytics stack fit together.

We then looked in detail at the data security and data privacy functions. These are the only functions that are present in every element of the analytics stack. We explained what data security and privacy are and showed why they are the key functions of any analytics stack.

Later in the paper we explored how GDPR affects your data analyses. We saw that it affects anyone who handles the personal data of EU residents and explained the stringent penalties for companies that breach the regulation. We then showed how anonymization provides a simple way to meet these requirements: if data is properly anonymized, it is no longer covered by the regulation and so can be used freely. To this end we looked at some of the concepts behind anonymization and compared them with pseudonymization.
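As a reminder of that distinction, here is a minimal Python sketch. The record fields, the `SECRET_KEY` value and both function names are hypothetical and chosen only for illustration. Pseudonymization swaps a direct identifier for a token, so rows belonging to one person remain linkable and the data is still personal data under GDPR. The aggregate function releases only group counts. Small-group suppression of this kind is just one simple ingredient and is not, on its own, sufficient to anonymize data in general.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key; in practice it would be stored separately from the data.
SECRET_KEY = b"hypothetical-key"


def pseudonymise(records):
    """Replace the email with a keyed hash; rows stay linkable per person."""
    out = []
    for r in records:
        token = hmac.new(
            SECRET_KEY, r["email"].encode("utf-8"), hashlib.sha256
        ).hexdigest()[:12]
        out.append({"id": token, "city": r["city"], "diagnosis": r["diagnosis"]})
    return out


def aggregate_counts(records, min_group_size=2):
    """Release only per-city record counts, dropping groups below the threshold."""
    counts = Counter(r["city"] for r in records)
    return {city: n for city, n in counts.items() if n >= min_group_size}


# Made-up example records (one row per visit).
records = [
    {"email": "a@example.com", "city": "Berlin", "diagnosis": "flu"},
    {"email": "a@example.com", "city": "Berlin", "diagnosis": "asthma"},
    {"email": "b@example.com", "city": "Berlin", "diagnosis": "flu"},
    {"email": "c@example.com", "city": "Kaiserslautern", "diagnosis": "flu"},
]

pseudonymised = pseudonymise(records)
city_counts = aggregate_counts(records)
```

In the pseudonymized output the first two rows share the same `id`, so anyone holding the table can still follow one individual across records. The aggregate output contains no row-level data at all, and the single-record city is suppressed.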

Finally, we introduced you to Aircloak Insights, our turnkey solution that allows you to upgrade any existing or future analytics stack and make it fully GDPR-compliant. Aircloak Insights is the first technology that applies the Diffix anonymization approach. Diffix works by applying controlled amounts of pseudo-random noise to the query results. The key benefit of this approach is that it avoids the query-budget problem that affects many other dynamic privacy mechanisms. As a result, the system can perform dynamic anonymization on data queries and pass the results straight to the analyst for further processing and visualization.

In conclusion, privacy regulations have often been viewed negatively by companies. But as data-driven business models emerge in more and more industries that deal with highly sensitive data, such as healthcare and finance, modern technology and the right planning mean the regulations needn't be a burden. They may even help you build and retain the trust of an increasingly wary public.
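To make the idea of pseudo-random noise on query results more concrete, here is a minimal Python sketch. It is not Aircloak's implementation: the function `noisy_count`, the `SECRET_SALT` value, the Gaussian noise and the default standard deviation are all assumptions made for illustration. The sketch derives the noise deterministically from the query text, so repeating the same query returns the same answer and the noise cannot simply be averaged away. This is one way a system could avoid having to ration queries with a budget.

```python
import hashlib
import random

# Hypothetical secret held by the anonymizing layer, never shown to analysts.
SECRET_SALT = b"hypothetical-secret"


def noisy_count(true_count: int, query_text: str, sd: float = 2.0) -> int:
    """Return the count plus noise seeded from the (normalised) query text."""
    normalised = query_text.strip().lower().encode("utf-8")
    digest = hashlib.sha256(SECRET_SALT + normalised).digest()
    # Same query -> same seed -> same noise value on every call.
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    noise = rng.gauss(0.0, sd)
    # Counts cannot be negative, so clamp at zero.
    return max(0, round(true_count + noise))


query = "SELECT count(*) FROM patients WHERE city = 'Berlin'"
first = noisy_count(120, query)
second = noisy_count(120, query)  # identical to `first`
```

Because the seed depends only on the salt and the query, an analyst who reruns the query a thousand times learns nothing more than from running it once.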

Ready to see what Aircloak can do for you?