Open source

Cape Privacy is the data privacy solution for collaborative data science and machine learning.

Open community. Open learning.

Cape Privacy was conceived from the beginning as an open source effort. To truly have an impact, we believe the Cape solution must be open source, so that it’s freely accessible to developers. At Cape Privacy, we are committed to growing an online and open community that will change the worlds of Data Privacy and Data Science.

Get started now

Install Cape Python in seconds from DockerHub or with pip, then import it and use it alongside your favorite data science libraries, such as Pandas and Apache Spark.

Join us on GitHub

Made for data science and machine learning

Our Python, Apache Spark and Apache Arrow integrations let you keep using the tools you know and love in a safe and secure way. New integrations are in development for our open-source tools. Message us on Slack or visit our Cape Feature Roadmap to learn about upcoming features and tell us what you would like to see.

Extensible advanced privacy and security features

Cape Transformations are shipped in our open-source repository. We encourage you to edit, change and add new transformations for data protection, data security and data privacy in your data science workflows. Check out our repository and examples to learn how to write your own data protection functions and use them in Cape!

Latest Release Notes

View changelog

  • Write Policy for Pandas and Apache Spark
    • Write clear and simple policy for privacy in your data science projects.
    • Apply policies to your data science workflows to transform and protect sensitive data.
  • Five Built-in Transformations
    • Tokenization with Linkability
    • Perturbation for Numeric and Date Fields
    • Date Truncation
    • Rounding
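
To give a feel for what rounding, date truncation and numeric perturbation do to a value, here is a minimal sketch in plain Python. It is an illustration only and does not use the Cape Python API: the function names `round_value`, `truncate_date` and `perturb_numeric`, and their parameters, are hypothetical.

```python
import random
from datetime import date


def round_value(value: float, precision: int = 0) -> float:
    """Reduce numeric precision so exact values are harder to single out."""
    return round(value, precision)


def truncate_date(d: date, frequency: str) -> date:
    """Keep only the coarse part of a date ("year" or "month")."""
    if frequency == "year":
        return date(d.year, 1, 1)
    if frequency == "month":
        return date(d.year, d.month, 1)
    raise ValueError(f"unsupported frequency: {frequency!r}")


def perturb_numeric(value: float, low: float, high: float,
                    rng: random.Random) -> float:
    """Add uniform random noise drawn from [low, high] to a numeric value."""
    return value + rng.uniform(low, high)


if __name__ == "__main__":
    rng = random.Random(42)  # seeded here only so the demo is repeatable
    print(round_value(3.14159, 2))
    print(truncate_date(date(2020, 7, 14), "month"))
    print(perturb_numeric(100.0, -5.0, 5.0, rng))
```

Each function trades some precision for privacy, so the ranges and precision you choose depend on how much accuracy your analysis needs.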

Things to try in this version

For Safer Apache Spark DataFlows

Apply policy in your Apache Spark pipelines to ensure sensitive data is protected. You can apply Cape policy directly in Spark, making it easier for your data engineers to protect data as it moves in and out of your organisation.

For Exploratory Data Analysis

Leverage tokenization, cipher-based transformations and redaction to improve privacy protections on your data science projects. With the ability to implement linkable, reversible and reproducible tokenization, you can perform most EDA tasks without having to sacrifice customer trust and privacy.
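
As a rough illustration of linkable, reproducible tokenization, the sketch below derives a token from a value with a keyed hash (HMAC-SHA256) from the Python standard library. This is an assumed technique chosen for the example, not a description of how Cape implements tokenization, and the `tokenize` function and its parameters are hypothetical. A keyed hash is not reversible; reversible tokenization would need encryption or a lookup table instead.

```python
import hashlib
import hmac


def tokenize(value: str, key: bytes, max_length: int = 64) -> str:
    """Map a value to a token; the same value and key always give the same token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:max_length]


if __name__ == "__main__":
    key = b"demo-key-do-not-use-in-production"  # hypothetical secret key
    a = tokenize("alice@example.com", key)
    b = tokenize("alice@example.com", key)
    c = tokenize("bob@example.com", key)
    print(a == b)  # equal tokens, so records can still be joined or grouped
    print(a == c)  # different inputs give different tokens
```

Because equal inputs produce equal tokens, joins, group-bys and value counts still work on the tokenized column, which is what makes most EDA tasks possible without exposing the raw values.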

For Shared Policy Writing

Write policy that is applied across different data science workflows, allowing your data science and privacy teams to collaborate on what data should be protected and how. This shared policy writing experience is a sneak peek of more features to come in Cape Core.
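
The idea of one policy shared across workflows can be sketched as a mapping from field names to transformations that any pipeline applies the same way. The example below is a simplified stand-in written in plain Python; it is not Cape's policy format, and `POLICY` and `apply_policy` are hypothetical names.

```python
from datetime import date
from typing import Any, Callable, Dict, List

Transformation = Callable[[Any], Any]

# A shared policy: which fields are protected, and how.
POLICY: Dict[str, Transformation] = {
    "salary": lambda v: round(v, -3),           # round to the nearest thousand
    "birthdate": lambda d: date(d.year, 1, 1),  # keep only the year
}


def apply_policy(records: List[Dict[str, Any]],
                 policy: Dict[str, Transformation]) -> List[Dict[str, Any]]:
    """Return transformed copies of the records; fields not in the policy pass through."""
    return [
        {field: policy[field](value) if field in policy else value
         for field, value in record.items()}
        for record in records
    ]


if __name__ == "__main__":
    rows = [{"name": "a", "salary": 52345.0, "birthdate": date(1990, 5, 17)}]
    print(apply_policy(rows, POLICY))
```

Keeping the policy separate from the pipeline code means a privacy team can review and change the rules in one place while data scientists keep their workflows unchanged.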

Cape Privacy's Open Source Contributions

TF Encrypted For Machine Learning
Check out our work on TF Encrypted, which lets you perform deep learning on encrypted data.

Join the discussion on Slack

Be a part of our open community and embrace the learning opportunities only Cape can provide.

Join our workspace

Our mission is simple: data privacy for all

At Cape, our core values are respect, collaboration, and trust.

Learn about us

Get started today

Find out more

Contact us