Jay Grossman
Jay Grossman
Published on:

Snowflake's lateral flatten function on variant data type

Snowflake's lateral flatten function on variant data type

Snowflake is a really interesting new data warehouse built on top of AWS. I like their architecture because they had the interesting idea to separate data storage (backed by small files on S3) and compute to run queries (EC2 instances running their API).

I inherited a project where we would store complex JSON in a string in a field as varchar(64000). Then we would use regex patterns to get the values we wanted from them. Sometimes these regexes would get really involved, yuck.

The Variant data type

Snowflake offers mechanism to store semi structured data in field that is easy to parse - the Variant data type

In snowflake I have a table variant_demo with a single Variant field named json_data as shown below:

sf_variant

If we were to click on the field, we could see the JSON elements as shown below:

sf_json_data

If I wanted to get the value of the "membership" field, I could write the following SQL:

1
2
SELECT json_data:membership::string as membership 
FROM variant_demo;

sf_flatten_1

Now what if I wanted to pull out the list associated with "products" and join them to our products table?

Lateral Flatten Function

Snowflake has this really cool function that allow us to normalize a row from a list from JSON attribute from variant field. The SQL below uses lateral flatten to take the items in the list from json_data:products make them their own dataset:

1
2
3
4
5
6
7
8
9
10
WITH p as (
SELECT
    json_data
FROM
    variant_demo
)
SELECT
    b.value::string as product_style
FROM p,
    lateral flatten(input=> p.json_data:products) b  

sf_flatten_2

Once you have a nice clean record set, we can join on data with other tables:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
WITH p as (
    SELECT 
        json_data
    FROM
        variant_demo
) 
SELECT 
    b.value::string as product_style
    , designer
    , list_price
FROM p,
    lateral flatten(input=> p.json_data:products) b  
    INNER JOIN products prod  
        ON v.value::string=prod.style 

sf_flatten_3