Cape Privacy is the data privacy solution for collaborative data science and machine learning.
Open community. Open learning.
Cape Privacy was conceived from the beginning as an open-source effort. We believe that, to truly have an impact, the Cape solution must be open source so that it is freely accessible to developers. At Cape Privacy, we are committed to growing an open online community that will change the worlds of data privacy and data science.
Get started now
Get Cape Python in seconds from Docker Hub or with pip. Easily import it and use it with your favorite data science libraries, like Pandas and Apache Spark.
Made for data science and machine learning
Get started with our Python, Apache Spark and Apache Arrow integrations, which let you keep using the tools you know and love in a safe and secure way. New integrations are in development for our open-source tools. Message us on Slack or visit our Cape Feature Roadmap to learn more about upcoming features and let us know what you would like to see.
Extensible advanced privacy and security features
Cape Transformations ship in our open-source repository. We encourage you to edit existing transformations and add new ones for data protection, data security and data privacy in your data science workflows. Check out our repository and examples to learn how to write your own data protection functions and use them in Cape!
Latest Release Notes
- Write Policy for Pandas and Apache Spark
  - Write clear and simple policy for privacy in your data science projects.
  - Apply policies to your data science workflows to transform and protect sensitive data.
- Five Built-in Transformations
  - Tokenization with Linkability
  - Perturbation for Numeric and Date Fields
  - Date Truncation
  - Rounding
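To give a feel for what perturbation, date truncation and rounding do to a value, here is a minimal sketch in plain Python. It is an illustration only, not Cape Python's API: the function names (`truncate_date`, `round_numeric`, `perturb_numeric`, `perturb_date`) and their parameters are hypothetical and were chosen for this example.

```python
import random
from datetime import date, timedelta


def truncate_date(d: date, frequency: str) -> date:
    """Drop detail below the given frequency ("year" or "month")."""
    if frequency == "year":
        return d.replace(month=1, day=1)
    if frequency == "month":
        return d.replace(day=1)
    raise ValueError(f"unsupported frequency: {frequency}")


def round_numeric(x: float, precision: int) -> float:
    """Round a number to a fixed number of decimal places."""
    return round(x, precision)


def perturb_numeric(x: float, low: float, high: float, rng: random.Random) -> float:
    """Add uniform random noise drawn from [low, high] to a number."""
    return x + rng.uniform(low, high)


def perturb_date(d: date, low_days: int, high_days: int, rng: random.Random) -> date:
    """Shift a date by a random whole number of days in [low_days, high_days]."""
    return d + timedelta(days=rng.randint(low_days, high_days))


# A seeded generator keeps the noise reproducible for this demonstration.
rng = random.Random(0)
birthday = date(2020, 7, 14)
truncated = truncate_date(birthday, "month")       # day of month is removed
rounded = round_numeric(3.14159, 2)                 # two decimal places kept
noisy_salary = perturb_numeric(100.0, -5.0, 5.0, rng)
noisy_birthday = perturb_date(birthday, -3, 3, rng)
```

Functions like these operate on single values, so with Pandas they could be applied to a column with `Series.map`.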
Things to try in this version
For Safer Apache Spark Data Flows
Apply policy in your Apache Spark pipelines to ensure sensitive data is protected. You can apply Cape policy directly in Spark, making it easier for your data engineers to protect data as it moves in and out of your organisation.
For Exploratory Data Analysis
Leverage tokenization, cipher-based transformations and redaction to improve privacy protections on your data science projects. With the ability to implement linkable, reversible and reproducible tokenization, you can perform most EDA tasks without having to sacrifice customer trust and privacy.
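As a rough illustration of linkable, reproducible tokenization, the sketch below uses a keyed hash (HMAC-SHA256) from the Python standard library. This is an assumption made for the example, not necessarily how Cape implements its tokenizer; the `tokenize` function, the `max_length` parameter and the keys are hypothetical. A keyed hash is not reversible, so reversible tokenization would need a cipher-based approach instead.

```python
import hashlib
import hmac


def tokenize(value: str, key: bytes, max_length: int = 16) -> str:
    """Replace a value with a keyed-hash token.

    The same value and key always give the same token, so tokenized
    columns can still be joined or grouped (linkability).
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:max_length]


# Hypothetical keys for demonstration; a real key should be kept secret.
key = b"example-secret-key"
other_key = b"another-secret-key"

token_a = tokenize("alice@example.com", key)
token_a_again = tokenize("alice@example.com", key)
token_b = tokenize("bob@example.com", key)
token_a_other_key = tokenize("alice@example.com", other_key)
```

Because `token_a` and `token_a_again` match, the same customer can be followed across datasets during EDA without exposing the underlying email address, while a different key produces unrelated tokens.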
For Shared Policy Writing
Write policy that is applied across different data science workflows, allowing your data science and privacy teams to collaborate on what data should be protected and how. This shared policy writing experience is a sneak peek of more features to come in Cape Core.
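One way to picture a shared policy is as plain data that maps field names to protection rules, which any workflow can then apply. The sketch below is a simplified stand-in, not Cape's policy format: `Policy`, `apply_policy` and `example_policy` are hypothetical names invented for this illustration.

```python
from typing import Any, Callable, Dict, List

# Hypothetical policy shape: field name -> function that protects the value.
Policy = Dict[str, Callable[[Any], Any]]


def apply_policy(records: List[Dict[str, Any]], policy: Policy) -> List[Dict[str, Any]]:
    """Return new records with the policy's rule applied to each listed field.

    Fields the policy does not mention pass through unchanged.
    """
    protected = []
    for record in records:
        protected.append({
            field: policy[field](value) if field in policy else value
            for field, value in record.items()
        })
    return protected


example_policy: Policy = {
    "salary": lambda x: round(x, -3),   # round to the nearest thousand
    "name": lambda _: "REDACTED",       # remove the value entirely
}

records = [{"name": "Alice", "salary": 84321.0, "team": "data"}]
protected_records = apply_policy(records, example_policy)
```

Keeping the policy separate from the pipeline code is what lets a privacy team review and change the rules while data scientists reuse the same policy in different workflows.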
Cape Privacy's Open Source Contributions
TF Encrypted For Machine Learning
Check out our work on TF Encrypted, which allows you to perform deep learning on encrypted data.