Privacy-aware Data Analysis

Our work in this area aims to support the analysis of data while maintaining personal privacy. Our current focus is on data sanitization techniques (such as anonymization) and privacy-preserving data analysis techniques (such as differential privacy). Our research and development address the following topics:

  1. Data sanitization (releasing anonymized data sets, e.g., patient records). In this area, we consider different k-anonymity-based algorithms to allow for:
    • Efficient anonymization of large data sets (those that do not fit in memory).
    • Support of heterogeneous and semi-structured data sets.
    • Support for user-guided (targeted) anonymization to improve utility.
  2. Privacy-preserving data analysis (e.g., releasing anonymized statistics based on the data set). In this area, we develop methods to produce statistics that provide differential privacy guarantees.
  3. Developing practical, deployable solutions including:
    • Standard REST APIs
    • Support for Big Data
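
As a rough illustration of topics 1 and 2 above, the following minimal Python sketch checks the k-anonymity level of a small table and releases a count with the Laplace mechanism, a standard way to obtain a differential privacy guarantee. It is an illustration only and not the algorithms used in our projects: the function names, the sample records and the choice of quasi-identifiers are all hypothetical.

```python
import random
from collections import Counter


def k_anonymity_level(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    quasi-identifier values. The table is k-anonymous for any k up to this."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


def laplace_noise(scale, rng):
    # The difference of two independent exponential samples with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(records, predicate, epsilon, rng=None):
    """Release a count with epsilon-differential privacy (Laplace mechanism).
    A counting query has sensitivity 1, so the noise scale is 1 / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical patient records whose age and zip code are already generalized.
patients = [
    {"age": "30-39", "zip": "130**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "130**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "148**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "148**", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "148**", "diagnosis": "flu"},
]

k = k_anonymity_level(patients, ["age", "zip"])
noisy_flu_count = dp_count(patients, lambda r: r["diagnosis"] == "flu", epsilon=1.0)
```

Here `k` reports how many records, at minimum, are indistinguishable on the chosen quasi-identifiers, while `noisy_flu_count` differs from run to run. A smaller `epsilon` adds more noise and gives a stronger privacy guarantee at the cost of accuracy.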

Much of our work is conducted as part of EU projects such as SUNFISH and PRISMACLOUD, where we also have access to real use cases and data.

Our goal is to integrate state-of-the-art anonymization solutions into leading IBM offerings.


Micha Moffie, IBM Research - Haifa