Automatically clean up unneeded data with TTL for Cloud Spanner

Software Engineering Manager; Product Manager, Cloud Spanner

CalculatedRoutes holds the possible options for a given delivery. The RouteSteps table captures the related turn-by-turn routing for a given option. Each calculated route has many steps. The RouteSteps table is interleaved in its related CalculatedRoutes. Table interleaving is a helpful performance optimization for one-to-many related data that is frequently accessed together, for example, to display turn-by-turn directions or calculate the total score of a route in the optimization algorithm. 
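
The relationship between the two tables can be sketched as Spanner DDL, held here in Python strings so the statements could be handed to a schema-update call. The table names come from the text above; every column name and type (RouteId, DeliveryId, CreatedAt, StepNumber, Instruction) is an assumption made for illustration, not the app's actual schema.

```python
# Hypothetical schema sketch: only the table names come from the article.
# All column names and types below are illustrative assumptions.
CALCULATED_ROUTES_DDL = """
CREATE TABLE CalculatedRoutes (
  RouteId    STRING(36) NOT NULL,
  DeliveryId STRING(36) NOT NULL,
  CreatedAt  TIMESTAMP NOT NULL
) PRIMARY KEY (RouteId)
""".strip()

# The child table's primary key starts with the parent's key, and the
# INTERLEAVE clause stores each step physically with its parent route.
ROUTE_STEPS_DDL = """
CREATE TABLE RouteSteps (
  RouteId     STRING(36) NOT NULL,
  StepNumber  INT64 NOT NULL,
  Instruction STRING(MAX)
) PRIMARY KEY (RouteId, StepNumber),
  INTERLEAVE IN PARENT CalculatedRoutes ON DELETE CASCADE
""".strip()


def schema_statements():
    """Return the DDL statements in creation order (parent before child)."""
    return [CALCULATED_ROUTES_DDL, ROUTE_STEPS_DDL]
```

With ON DELETE CASCADE on the child table, deleting a calculated route also removes its steps, which is the behavior this sketch assumes for clean-up.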

The quality and speed of the matching algorithm are key to the overall user experience, so this data must be highly available and consistent during the real-time optimization process. Spanner provides scale for these types of read-write, interactive workloads, with up to 99.999% availability and global consistency. Once a route is confirmed, however, the alternatives are no longer needed. At the high volume of deliveries that our fictitious app handles, this short-lived data becomes unwieldy. Spanner lets applications set row-level deletion policies via time to live (TTL) to better manage such data.

Fully Managed Background Clean-up with TTL

Spanner supports deleting data today using DELETE in SQL or partitioned deletes. These are useful for removing data where you need precise control over the transaction boundaries or if you want to minimize the processing time by devoting high-priority resources. TTL complements this with a fully managed background process that minimizes the overall system impact—no need to deal with manual partitioning, batching, or retrying. With TTL you can just set it and forget it.
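
For comparison, the manual approach might look roughly like the sketch below, which only builds the DELETE statement an application would have to schedule, run, and retry itself. The helper name `manual_cleanup_sql` and the `CreatedAt` column are hypothetical; the statement could be run as an ordinary transaction or submitted as a partitioned delete.

```python
def manual_cleanup_sql(table: str, timestamp_column: str, days: int) -> str:
    """Build a DELETE statement that removes rows older than `days` days.

    Hypothetical helper for illustration; the application would still own
    the scheduling, batching, and retry logic around this statement.
    """
    if not isinstance(days, int) or days < 0:
        raise ValueError("days must be a non-negative integer")
    return (
        f"DELETE FROM {table} "
        f"WHERE {timestamp_column} < "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {days} DAY)"
    )


# Example: the clean-up our delivery app would otherwise run by hand.
statement = manual_cleanup_sql("CalculatedRoutes", "CreatedAt", 9)
```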

This new approach in Spanner has several important benefits: 

  • Simplicity: Row deletion policies are declarative. You tell Spanner when rows are eligible to be deleted, not how to delete them. TTL logic is centrally defined in your schema, making it easy to manage along with the tables it governs and straightforward to reason about. There is no more digging through complex application code or external scripts to understand critical clean-up logic.

  • Scalability: TTL scales with your databases. Spanner seamlessly distributes the scanning for expired rows and their deletion across all nodes in your instance. As your database grows, TTL dynamically adjusts without additional intervention.

  • Predictability: TTL is designed to minimize the impact on other database workloads. The TTL sweeper process works in the background at low system priority. It spreads work over time and across available instance resources, which keeps overhead lower than custom clean-up queries typically can.

  • Observability: TTL is integrated into Cloud Monitoring for end-to-end insight into progress and warnings alongside the rest of your stack without additional plumbing for your developers to build and maintain. 
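
To make the declarative idea concrete, here is a minimal sketch of the eligibility rule a row deletion policy expresses: a row becomes eligible once its timestamp plus the configured number of days has passed. This models the rule only; it is not how the background sweeper is implemented, and eligibility says nothing about exactly when the row is removed. The function name and the strict comparison are assumptions.

```python
from datetime import datetime, timedelta, timezone


def is_eligible_for_deletion(created_at: datetime,
                             now: datetime,
                             older_than_days: int) -> bool:
    """Return True once `created_at` plus the retention window has passed.

    Illustrative model of an "older than N days" rule; the boundary
    behavior (strictly earlier than `now`) is an assumption.
    """
    return created_at + timedelta(days=older_than_days) < now


# Example with fixed timestamps so the result does not depend on the clock.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
old_route = datetime(2023, 12, 31, tzinfo=timezone.utc)
recent_route = datetime(2024, 1, 5, tzinfo=timezone.utc)

old_is_eligible = is_eligible_for_deletion(old_route, now, 9)
recent_is_eligible = is_eligible_for_deletion(recent_route, now, 9)
```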

Configuring a Row Deletion Policy

To see it in action, let’s configure Spanner to automatically remove the ephemeral calculated routes in our example after nine days. 

The first step is to add a row deletion policy to the CalculatedRoutes table in the schema definition. As with other schema changes, setting or changing a row deletion policy is a privileged action.
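
A minimal sketch of that schema change follows, assuming the table has a TIMESTAMP column named `CreatedAt` (a hypothetical name; substitute the column your table actually uses). The helper only builds the ALTER TABLE statement, which you would then apply through whichever schema-update path you normally use.

```python
def add_row_deletion_policy_ddl(table: str,
                                timestamp_column: str,
                                days: int) -> str:
    """Build the DDL that attaches a row deletion policy to an existing table.

    `timestamp_column` must name a TIMESTAMP column; `CreatedAt` in the
    example below is an assumed column name, not taken from the article.
    """
    if not isinstance(days, int) or days < 0:
        raise ValueError("days must be a non-negative integer")
    return (
        f"ALTER TABLE {table} ADD ROW DELETION POLICY "
        f"(OLDER_THAN({timestamp_column}, INTERVAL {days} DAY))"
    )


# Rows in CalculatedRoutes become eligible for deletion after nine days.
policy_ddl = add_row_deletion_policy_ddl("CalculatedRoutes", "CreatedAt", 9)
```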
