Operationalize Your Encrypted Data in a Snowflake Data Clean Room

  • Reesha Dedhia

Keeping data secure is a business imperative. The costs associated with a data breach are significant, and the risk of technical or human error, or of an attack by threat actors, is ever-present. According to a 2021 cost of a data breach report, compromised data costs an average of $4.24 million per incident, and for certain industries the costs are much higher. Healthcare and financial services top the list, with average costs of $9.23 million and $5.72 million respectively. Fines, legal fees, remediation, customer churn, and erosion of brand trust are among the factors contributing to the resulting losses.

Those risks conflict with another business imperative: putting data to work to create decision intelligence. Putting data stores to work often carries some element of data security risk. Traditionally, running models against data means moving the data and decrypting it, and every time data is moved or decrypted, there is a chance that it will be inadvertently or intentionally exposed to unauthorized parties. But the lure of insights that can lead to improved performance and competitive advantage is strong, and it drives innovation in how enterprises apply their data models as they try to strike a balance between risk and reward.

One such innovation is known as a data clean room. Data clean rooms are secure facilities in which a data aggregator keeps its sensitive user data segregated while allowing third parties to bring their models and test performance, usually in connection with targeted advertising, against the aggregator's data. Because of the security risks associated with using consumer data, partners do not touch the data themselves: the aggregator is provided with the models a partner wants to run, and only the results are shared with the partner. As you might imagine, data clean rooms are expensive to build and operate, and they do not afford much flexibility to the organizations testing their models.

For all their shortcomings, data clean rooms have the advantage of using actual and complete user datasets, rather than partial samples or synthetic data. However, because of the lack of flexibility and the processes involved in safely moving and segregating data, precision and timeliness are sacrificed for the sake of security.

The concept of the data clean room remains valid, and the value of the results produced by running models in a data clean room environment is high. That led to the creation of the software-based data clean room. However, the security limitations associated with collecting and sharing data, especially third-party data, restrict the software-based approach to a narrow band of use cases, and the data run in a software data clean room are often incomplete.

Recently, Snowflake introduced its own take on the idea with its "distributed data clean room," which combines several features of its data cloud so that Snowflake users can safely share their data with other subscribers. By taking advantage of Snowflake's data sharing technology, platform, secure functions, and secure join capabilities, Snowflake customers can replicate the operation of a data clean room. Originally, data shared through the distributed data clean room was secured by hashing it. Now there is another way, one that extends the utility of the distributed data clean room and affords access to all types of data while ensuring data security throughout the entire process.
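To make the hashing approach mentioned above more concrete, here is a minimal sketch of how two parties might compare hashed identifiers instead of raw ones. It is an illustration only, not Snowflake's implementation: the function names, the shared key, and the sample email addresses are all hypothetical, and a keyed hash (HMAC-SHA-256) is an assumption chosen for the example.

```python
import hashlib
import hmac
from typing import Iterable


def hash_identifier(identifier: str, shared_key: bytes) -> str:
    """Return a keyed SHA-256 digest of a normalized identifier (hypothetical helper)."""
    normalized = identifier.strip().lower()
    return hmac.new(shared_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def hashed_overlap(party_a: Iterable[str], party_b: Iterable[str], shared_key: bytes) -> int:
    """Count identifiers present on both sides by comparing digests only."""
    hashed_a = {hash_identifier(x, shared_key) for x in party_a}
    hashed_b = {hash_identifier(x, shared_key) for x in party_b}
    return len(hashed_a & hashed_b)


# Hypothetical sample data: only the digests would be exchanged, never the raw values.
advertiser = ["Alice@example.com", "bob@example.com"]
publisher = ["alice@example.com ", "carol@example.com"]
overlap = hashed_overlap(advertiser, publisher, shared_key=b"agreed-out-of-band")
```

Hashing of this kind supports matching on equality, which helps explain why an approach that can compute on other types of data extends what a clean room can do.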

When Cape Privacy announced its partnership with Snowflake on November 16, it brought to bear its cloud-based platform, which delivers a novel combination of two secure data sharing methods: secret sharing and secure multiparty computation (MPC). These two techniques take advantage of innovations in data processing and artificial intelligence that make it possible, for the first time, to operationalize fully encrypted data.
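For readers unfamiliar with secret sharing, the sketch below shows the textbook additive scheme and how a sum can be computed on shares without any party seeing the inputs. It is a generic illustration under stated assumptions, not Cape Privacy's protocol: the modulus, the three-party setup, and all function names are hypothetical choices made for the example.

```python
import secrets
from typing import List

# Assumed modulus for the example: a Mersenne prime large enough for small integers.
PRIME = 2**61 - 1


def share(secret: int, n_parties: int = 3) -> List[int]:
    """Split a secret into n additive shares that sum to the secret modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (secret - sum(shares)) % PRIME
    return shares + [last]


def add_shared(a_shares: List[int], b_shares: List[int]) -> List[int]:
    """Each party adds the two shares it holds locally; no party learns either input."""
    return [(a + b) % PRIME for a, b in zip(a_shares, b_shares)]


def reconstruct(shares: List[int]) -> int:
    """Recombine all shares to reveal the value they encode."""
    return sum(shares) % PRIME


# Two hypothetical data owners share their private values across three parties.
a_shares = share(120)
b_shares = share(45)
total = reconstruct(add_shared(a_shares, b_shares))
```

Any subset of fewer than all the shares is statistically independent of the secret, so the data stays hidden unless every party's share is combined. Real MPC systems build on this idea with additional protocols for multiplication and comparison.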

Now, when a Snowflake customer collects data, they can apply encryption at the point of capture and keep the data encrypted as it is transferred to Snowflake and then used within a Snowflake distributed data clean room. Because the data is never decrypted, even when run through whatever AI prediction a customer chooses, it remains secure throughout the entire process. There is no single point of failure at which operator error or threat activity could compromise the data and cause a costly data breach.

That means Snowflake users can not only keep their own data secure while creating predictive and decision intelligence, but also share third-party data within the distributed data clean room environment, resulting in more precise outcomes while keeping all data secure throughout the entire process. With secure access to new, rich sources of data, timelier, more precise predictive modeling is now possible for financial services and other data-rich industries. And timelier, more precise predictive modeling translates to better business outcomes.

For more information about Cape Privacy, or how you can take advantage of the Cape Privacy platform within the Snowflake Data Cloud to operationalize your encrypted data, contact us.
