More than just relational data at scale with Spanner’s new JSON data typeMore than just relational data at scale with Spanner’s new JSON data typeProduct Manager, Cloud SpannerCloud Spanner Software Engineer

JSON, or JavaScript Object Notation, is the format that developers rely on for hierarchical or semi-structured data. As a subset of JavaScript, JSON’s popularity has been driven by explosive growth of rich, interactive experiences in the browser and scripting environments like Node.js. Cloud Spanner’s new JSON data type allows you to extend your relational data with sparse, nested, or less structured JSON data. This provides flexibility and agility without having to compromise on the availability and consistency at scale that your applications rely on with Spanner.

Relational Is No Longer Enough

There are very few technologies that can match the ubiquity and staying power of the relational data model. E. F. Codd’s original paper likely predates many readers of this blog. Tables of rows and columns, related by keys are a natural way to capture structured data for operational applications: A “Customer” has “Sales Orders” which are made up of “Order Lines”, each with a well defined set of attributes. However, not all today’s data lends itself well to strict modeling in tables. For example, what if Customer data is sourced from three different systems, each with its own set of attributes, or the definition of an Order Line changes frequently or is defined on the fly by users? 

Take, for example, a large electronics manufacturer with hundreds of different products. Each of these products has its own unique set of attributes. Modeling this relationally would require schema changes for each new attribute, even if their users or applications don’t need to query on them. With a growing business and new products coming online all the time, the analysis, modeling, deployment, and testing cycle for schema changes can be a drag on innovation. What they really need is the ability to query over a consistent set of key attributes, common to most products, and then to easily manage the long tail of other attributes without having to completely abandon the transactions and rich queries that Spanner provides. 

JSON is great for representing key-value pairs (objects), ordered lists (arrays), strings, numbers, and Booleans, without having to predefine anything about the structure or the allowable values. 

Our electronics manufacturer might model products with the following (grossly simplified) Products table.

Leave a Comment