May 29, 2018. Just in time for the new EU General Data Protection Regulation (GDPR) coming into effect, Berlin-based startup Statice launches a product that offers companies big and small a cure for their GDPR headaches.
The notion of an open data economy has been popularized over the last several years. A recently published IDC study estimates that by 2025 over 20% of the world’s digital data will be utilized by both public and private sector entities to manage, protect and improve our daily lives through increased investments in a wide range of open data initiatives.
With the GDPR in effect, it is virtually impossible for companies to freely share sensitive customer data or collaborate on it with partners without either asking customers for consent or anonymizing the data before sharing. If companies do not comply, they may face fines of up to 4% of their global annual revenue or €20 million, whichever is higher.
However, even anonymizing data might not be secure enough in the post-GDPR world. The big problem with sharing and opening up data is the risk of re-identification of personal information - even when the data has been anonymized with traditional methodologies. Rapid advances in technology make it easier to combine data from multiple sources, which could compromise privacy rights. The most famous example of this is the Netflix case.
In 2006, Netflix released over 100 million movie ratings made by roughly 500,000 of its subscribers as part of a competition to improve the company's recommendation system. Netflix had anonymized the data set by removing personal details. Nevertheless, researchers were able to re-identify some of the subscribers - simply by comparing the Netflix data against publicly available ratings on the Internet Movie Database.
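To make the re-identification risk concrete, the following is a minimal sketch of such a linkage attack in Python. It is not the method the researchers used, which was more tolerant of noise; it only matches exact (movie, rating) pairs. All names, data, and the `min_overlap` threshold are hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Hypothetical "anonymized" release: (pseudonym, movie, rating)
anonymized = [
    ("user_17", "Movie A", 5),
    ("user_17", "Movie B", 2),
    ("user_17", "Movie C", 4),
    ("user_42", "Movie A", 3),
    ("user_42", "Movie D", 1),
]

# Hypothetical public reviews posted under real names: (name, movie, rating)
public = [
    ("Alice", "Movie A", 5),
    ("Alice", "Movie C", 4),
    ("Bob", "Movie D", 1),
]


def profiles(rows):
    """Group rows into one set of (movie, rating) pairs per person."""
    out = defaultdict(set)
    for who, movie, rating in rows:
        out[who].add((movie, rating))
    return dict(out)


def link(anonymized_rows, public_rows, min_overlap=2):
    """Map each public name to the pseudonym sharing the most ratings.

    A match is only reported if it has at least `min_overlap` shared
    ratings and is strictly better than the runner-up.
    """
    anon = profiles(anonymized_rows)
    pub = profiles(public_rows)
    matches = {}
    for name, pub_set in pub.items():
        scored = sorted(
            ((len(pub_set & anon_set), pid) for pid, anon_set in anon.items()),
            reverse=True,
        )
        best_score, best_id = scored[0]
        runner_up = scored[1][0] if len(scored) > 1 else 0
        if best_score >= min_overlap and best_score > runner_up:
            matches[name] = best_id
    return matches


matches = link(anonymized, public)
```

In this toy example, two shared ratings are enough to tie "Alice" to the pseudonym `user_17`, which also exposes her rating of "Movie B" that she never published.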
This is where young startups like Statice come into play. Statice’s goal is to solve this pain and enable companies to share their data freely to drive collaboration - all while protecting customers’ privacy.
Statice is automatic data anonymization software that leverages deep generative models to create privacy-preserving synthetic data. The synthetic dataset preserves the statistical properties of the original data and thus most of its informational value. It can then be used for all secondary data use cases, such as training machine learning models or sharing data with partners.
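The idea of synthetic data that keeps statistical properties can be illustrated with a much simpler stand-in than a deep generative model. The sketch below is not Statice's technology: it fits a two-dimensional Gaussian to a hypothetical toy table of (age, blood pressure) pairs and samples new rows from it. It preserves means, spreads, and the correlation, but it offers no formal privacy guarantee on its own.

```python
import math
import random
import statistics

# Hypothetical original records: (age, systolic blood pressure)
original = [
    (34, 118), (45, 125), (52, 135), (61, 140),
    (29, 115), (70, 150), (48, 128), (57, 138),
]


def fit(rows):
    """Estimate means, standard deviations, and correlation of two columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / len(rows)
    rho = cov / (sx * sy)
    return mx, my, sx, sy, rho


def sample(params, n, seed=0):
    """Draw n synthetic rows from the fitted bivariate Gaussian."""
    mx, my, sx, sy, rho = params
    rng = random.Random(seed)
    # Guard against tiny negative values caused by rounding
    residual = math.sqrt(max(0.0, 1.0 - rho * rho))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mx + sx * z1
        y = my + sy * (rho * z1 + residual * z2)
        rows.append((x, y))
    return rows


params = fit(original)
synthetic = sample(params, 5000)
synthetic_params = fit(synthetic)
```

Comparing `params` with `synthetic_params` shows that the sampled rows reproduce the original statistics closely, even though no synthetic row is copied from a real record.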
The underlying technology has only been made possible by advances in deep learning over the last two years and thus marks a major new advancement in the data privacy field.
One of the use cases the company is focusing on is the sharing of medical data between research partners as well as with application developers to advance predictive medical products. Here, Statice’s technology goes beyond traditional anonymization technologies, which were either not secure or led to a major loss of information.
Statice announced the launch of its product last week at TechCrunch Startup Battlefield in Paris.