Based on the latest innovations in differential privacy research and best practices from real-world applications.

Designed for both statistical analysis and machine learning applications.

Open source framework to create and test new algorithms and techniques.

Why this toolkit?

Based on cutting-edge Differential Privacy algorithms

Data scientists, analysts, scientific researchers and policy makers often need to analyze data that contains sensitive personal information that must remain private.

Commonly used privacy techniques are limited and can result in leaks of sensitive information.

Differential Privacy is a technique that offers strong, mathematically quantifiable privacy assurances. It limits how much any result can reveal about an individual in a dataset, guarding against data leaks and re-identification.

Learn More

Theory Behind Differential Privacy

Flexible native runtime

A native runtime library for generating and validating differentially private results, usable from C, C++, Python, R, and other languages.

Built-in Connectivity to Data Sources

Access data from data lakes, SQL Server, Postgres, Apache Spark, Presto, and CSV files.

Granular Privacy Risk Controls

Track cumulative privacy risk across multiple requests on the same data. Use privacy budgets to limit the number of queries each user can run.
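As a rough illustration of per-user privacy budgets, here is a minimal Python sketch. It is not the toolkit's API: the `BudgetLedger` class and its method names are hypothetical, and it assumes basic sequential composition, where the epsilon values of successive queries simply add up. It also treats each user's budget independently, which only holds if users do not pool their results.

```python
class BudgetLedger:
    """Hypothetical per-user privacy budget tracker (not the toolkit's API)."""

    def __init__(self, per_user_epsilon):
        # Assumption: every user gets the same total epsilon allowance.
        self.per_user_epsilon = per_user_epsilon
        self.spent = {}  # user name -> epsilon consumed so far

    def remaining(self, user):
        """Epsilon the user can still spend."""
        return self.per_user_epsilon - self.spent.get(user, 0.0)

    def charge(self, user, epsilon):
        """Record a query costing `epsilon`; return False if it does not fit."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining(user):
            return False  # request denied, nothing is recorded
        # Sequential composition: costs of successive queries add up.
        self.spent[user] = self.spent.get(user, 0.0) + epsilon
        return True
```

With a ledger created as `BudgetLedger(per_user_epsilon=1.0)`, a user could run four queries at epsilon 0.25 each before `charge` starts returning `False`, while other users' allowances are unaffected.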

Privacy Loss Tester

An evaluator that automatically stress-tests existing differential privacy algorithms.

How It Works

The toolkit is designed as a layer between queries and data systems that protects sensitive data.

When a user queries the data or trains a model, the system:

Adds statistical noise to the results.

Calculates the privacy risk metric, or information budget, used by the query.

Subtracts that amount from the remaining budget to limit further queries.
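The three steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the toolkit's implementation: the `PrivateCounter` class and `BudgetExceeded` exception are hypothetical names, the query is a simple count (which changes by at most 1 when one record is added or removed), and the noise comes from the standard Laplace mechanism. The sampler uses ordinary floating-point arithmetic and is not hardened against floating-point attacks, so it is unsuitable for production use.

```python
import random


class BudgetExceeded(Exception):
    """Raised when a query would overspend the remaining privacy budget."""


class PrivateCounter:
    """Hypothetical layer between count queries and the raw rows."""

    def __init__(self, rows, total_epsilon, seed=None):
        self.rows = list(rows)
        self.remaining = total_epsilon  # privacy budget still available
        self._rng = random.Random(seed)

    def _laplace(self, scale):
        # The difference of two independent Exp(1) draws is Laplace(0, 1);
        # multiplying by `scale` gives Laplace(0, scale).
        return scale * (self._rng.expovariate(1.0) - self._rng.expovariate(1.0))

    def count(self, predicate, epsilon):
        """Return a noisy count of rows matching `predicate`."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        # Step 2: the cost of this query is its epsilon; check it fits.
        if epsilon > self.remaining:
            raise BudgetExceeded("not enough privacy budget left")
        true_count = sum(1 for row in self.rows if predicate(row))
        # Step 1: add Laplace noise with scale = sensitivity / epsilon,
        # where the sensitivity of a count is 1.
        noisy_count = true_count + self._laplace(1.0 / epsilon)
        # Step 3: subtract the cost from the remaining budget.
        self.remaining -= epsilon
        return noisy_count
```

For example, `PrivateCounter(rows, total_epsilon=1.0)` would allow two calls to `count` at epsilon 0.5 each, after which any further query raises `BudgetExceeded`. A smaller epsilon per query spends the budget more slowly but adds more noise to each answer.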

Applications

Differential Privacy for Statistical Analysis

Built-in support for commonly used mathematical and statistical operators

Use cases span healthcare, sensitive socio-economic data and more

Examples

Differential Privacy in Machine Learning

Built-in support for training simple machine learning models like linear and logistic regression

Compatible with open-source training libraries such as TensorFlow Privacy

Examples

Getting Started

Build and deploy easily with the Differential Privacy toolkit!

Install the toolkit

Contribute

This is a community project, and we encourage you to join the effort and contribute feedback, algorithms, ideas, and more so we can evolve the toolkit together!


Key Contributors

With support from

Use Data.
Preserve Privacy.

A differential privacy toolkit for analytics and machine learning
