Encrypted Learning: Helping Data Scientists Solve Problems, Protect Privacy

  • Reesha Dedhia

Over the next ten years, data science roles will change as the skills and tools that data scientists rely on evolve, a shift expected to drive 15% growth for the profession, according to the Bureau of Labor Statistics. The ability to extract valuable insights from data is too important for organizations to ignore, and investments and innovations in data science will improve decision-making. A big part of that evolution will come from advances in machine learning tools that make it easier for data scientists to do their jobs. Those innovations are already unfolding.

Familiar Tools

Whether it's simplified Python packaging, automated machine learning (AutoML) frameworks, or privacy-enhancing technologies (PETs), there is a clear trend toward expanding the portfolio of tools available to data scientists and simplifying the tools they use today. Broader capabilities and a more intuitive user experience will give data scientists more options for extracting precise insights from data stores, including third-party data that is currently out of reach.

One area that is ripe for innovation, and where Cape Privacy is focused, is expanding the horizons of data science through advancements in PETs that can be used to train machine learning algorithms on protected data without exposing the actual data. We call it Encrypted Learning. Encrypted learning is a new method for training machine learning algorithms on data that is highly classified or would otherwise be off-limits to data scientists because of privacy regulations. With encrypted learning, privacy is protected by default because the data is always encrypted. Data owners have assurance that their data is safe, and data science teams can maintain the confidentiality of their models.

Focus on Solving Problems

When the tools of the data science trade are easier to use, data scientists can spend less time on things like data preparation, coding, and cryptography, and more time developing and applying a diverse set of complementary technical and soft skills, such as communication, teamwork, and creative thinking, that are essential to innovative problem-solving. When data scientists can focus their energy on actual problem-solving, the opportunities for innovation grow.

That's why we've focused our efforts on building a cloud-based platform that integrates open-source technologies like Python, Docker, NumPy, Apache Spark, and TensorFlow, which are already in wide use within the data science community. We've also built our own pycape tool, which supports cryptography without requiring the user to learn it. Much like a Jupyter notebook, pycape has an intuitive user interface that makes it easy to put to work.

Audentes Fortuna Iuvat

At the center of our strategy is the idea of giving data scientists the ability to use those tools with new, powerful sources of data that were previously unavailable to them. We believe that the best data science comes from the best data, and the best data is often prohibited from use due to privacy and data protection regulations. Now, with encrypted learning, new stores of data can be put to work solving problems without compromising privacy.

More than two millennia ago, the Roman poet Virgil wrote, "Audentes fortuna iuvat." That translates to "Fortune favors the bold." In this age of data, the means are available to take bold action based on insights once hidden behind privacy concerns. Encrypted learning can let you be bold. Get in touch for more information or a demo.
