Transform data to secure it: Use Cloud DLP

Head of Solutions Strategy
Product Manager, Google Cloud

When you want to protect data in motion, at rest, or in use, you usually think about data discovery and about data loss detection and prevention. Few would immediately consider transforming or modifying data in order to protect it.

But doing so can be a powerful and relatively easy tactic to prevent data loss. Our data security vision includes transforming data to secure it, and that’s why our DLP product includes powerful data transformation capabilities.

So what are some data modification techniques that you can use to protect your data, and what are the use cases for them?

Delete sensitive elements

Let’s start with a simple example: one of the best ways to protect payment card data and comply with PCI DSS is to simply delete it. Deleting sensitive data as soon as it’s collected (or better yet, never collecting it in the first place) saves resources on encryption and data access control, and it removes, rather than merely reduces, the risk of data exposure or theft.

More generally, deleting data is one way to practice data minimization. Holding less data that attracts attackers is both a security best practice (one of the few that is as true in the 2020s as it was in the 1980s) and a compliance requirement (for example, data minimization is one of the core principles of GDPR).

Naturally, there are plenty of types of sensitive data that you can’t simply delete, and for which this strategy will not work, like business secrets or patient information at a hospital. But for many cases, transforming data to protect it satisfies the triad of security, compliance and privacy use cases.

In many cases, data retains its full value even when sensitive or regulated elements are removed. Customer support chat logs work just as well after an accidentally shared payment card number is removed. A doctor can make a diagnosis without seeing a Social Security Number (SSN) or Medical Record Number (MRN). Transaction trend analysis works just as well when bank account numbers are not included. For many contexts, the sensitive, personal or regulated parts don’t matter at all. 
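To make the chat log case concrete, here is a minimal local sketch of removing card numbers from free text. It does not call Cloud DLP; it uses a regular expression plus the Luhn checksum as a stand-in detector, and the function names and the `[CARD_NUMBER]` placeholder are assumptions made for this illustration.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated
# by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def redact_card_numbers(text: str, placeholder: str = "[CARD_NUMBER]") -> str:
    """Replace Luhn-valid, card-like digit runs with a placeholder."""

    def replace(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        # Leave digit runs that fail the checksum (order IDs, etc.) untouched.
        return placeholder if luhn_valid(digits) else match.group()

    return CARD_CANDIDATE.sub(replace, text)
```

The rest of the message stays intact, so a support transcript cleaned this way remains readable. A production detector would need to handle more formats and context than this sketch does.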

Another area where this works well is when a communication’s purpose is satisfied even with data removed. For example, a support rep can help a customer use an app without knowing that customer’s first name and last name.

As another example, our DLP system can clean up the datasets used to train an AI, so that the AI systems can learn without being exposed to any personal or sensitive data. Even first and last names can be automatically removed from a stream of data before it’s used to train an AI. Does your DLP do that? 
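As a rough sketch of what name removal could look like as a de-identification request, the helper below builds a request dictionary for the Cloud DLP `deidentify_content` method. The helper name, the project ID, and the choice of infoTypes (`PERSON_NAME`, `FIRST_NAME`, `LAST_NAME`) are assumptions for this example, and the field names should be checked against the current API reference before use.

```python
def build_name_removal_request(project_id: str, text: str) -> dict:
    """Build a deidentify_content request that replaces detected names.

    Hypothetical helper: it only assembles the request dictionary and
    makes no network call.
    """
    # infoTypes to look for; assumed to be sufficient for this example.
    name_info_types = [
        {"name": "PERSON_NAME"},
        {"name": "FIRST_NAME"},
        {"name": "LAST_NAME"},
    ]
    return {
        "parent": f"projects/{project_id}/locations/global",
        "inspect_config": {"info_types": name_info_types},
        "deidentify_config": {
            "info_type_transformations": {
                "transformations": [
                    {
                        "info_types": name_info_types,
                        # Replace each finding with the name of its infoType.
                        "primitive_transformation": {
                            "replace_with_info_type_config": {}
                        },
                    }
                ]
            }
        },
        "item": {"value": text},
    }
```

With the `google-cloud-dlp` package installed and credentials configured, a dictionary like this would be passed as `request=` to `DlpServiceClient().deidentify_content(...)`. Applying it to each record before the data reaches a training pipeline is one way to approach the use case described above.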

In practice, this tactic can be applied to both structured (databases) and unstructured (email, chats, image captures, voice recordings) data. Removing “toxic” elements that are a target for attackers or subject to regulations reduces risk while preserving the business value of a dataset.
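For structured data, the same idea can be as simple as dropping columns before a dataset is shared or analyzed. The sketch below is a plain Python illustration, not a Cloud DLP call, and the column names in `SENSITIVE_FIELDS` are hypothetical.

```python
from typing import Iterable

# Hypothetical column names treated as sensitive in this example.
SENSITIVE_FIELDS = frozenset({"ssn", "account_number", "card_number"})


def drop_sensitive_fields(
    rows: Iterable[dict], sensitive: frozenset = SENSITIVE_FIELDS
) -> list:
    """Return copies of the rows with sensitive columns removed."""
    return [
        {key: value for key, value in row.items() if key not in sensitive}
        for row in rows
    ]


transactions = [
    {"customer_id": 7, "amount": 42.5, "account_number": "000123"},
    {"customer_id": 9, "amount": 13.0, "account_number": "000456"},
]
cleaned = drop_sensitive_fields(transactions)
```

The amounts and customer identifiers needed for trend analysis survive, while the account numbers never reach the downstream dataset. The input rows are left unmodified.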