Data Engineering & Lake

Jun 2, 2024

Snowflake Database Change Management (+SQL Templates, +Jinja)

Jinja is a powerful templating engine that brings the flexibility and reusability of code templates to your Snowflake environment. 

With Jinja, you can create dynamic SQL scripts that adapt to different environments, parameters, and variables, significantly simplifying your data engineering workflows.

Why Jinja Templating?

Here's why Jinja templating is a valuable asset in your Snowflake toolkit:

  • Environment Flexibility: Easily manage database object creation, data loading, and transformations across various environments (e.g., dev, QA, production) using the same template.

  • Parameterization: Define variables and placeholders that can be dynamically replaced during script execution, allowing you to reuse scripts with different input values.

  • Code Reusability: Create reusable templates for common tasks, minimizing code duplication and promoting consistency across projects.

  • Improved Readability: Jinja's syntax makes your SQL code more structured and easier to read, enhancing maintainability.

A Practical Example: Dynamic Data Model Setup

Let's explore a practical example demonstrating how Jinja can streamline the setup of your OrdersDB data model across multiple environments.

Jinja Demo Code Example (available on GitHub):

/*
------------------------------------------------------------------------------
-- Snowflake Demo Script: Orders Data Model
--
-- Description:
-- This script sets up a sales data model in Snowflake. It includes the creation
-- of tables for customer, device, sales_order, and sales_order_item, along with
-- sample data insertion and tagging of columns for PII, lead source, and sales
-- stage. Additionally, it defines RBAC privileges, functions, stored procedures,
-- and views for analysis purposes.
--
-- Author: Fru N.
-- Website: DemoHub.dev
--
-- Date: June 2, 2024
--
-- Copyright: (c) 2024 DemoHub.dev. All rights reserved.
--
-- Disclaimer:  
-- This script is for educational and demonstration purposes only. It is not
-- affiliated with or endorsed by Snowflake Computing. Use this code at your 
-- own risk.
------------------------------------------------------------------------------
*/


--!jinja

{% set environments = ['dev', 'qa', 'staging', 'prod'] %}
{% set DATA_PRODUCT = "ORDERS" %}
{% set BU_GEO = "US" %}

{% for environment in environments %}

   {% set database_name = DATA_PRODUCT + 'db_' + BU_GEO + '_' + environment %}
   {% set s3_url = 's3://demohubpublic/data/' %}  -- Public dataset used for demos. Change this if you have to.
   

   -- +----------------------------------------------------+
   -- |       1. DATABASE AND SCHEMA SETUP       |
   -- +----------------------------------------------------+
   
   CREATE OR REPLACE DATABASE {{ database_name }};
   USE DATABASE {{ database_name }};
   
   CREATE OR REPLACE FILE FORMAT CSV_SCHEMA_DETECTION TYPE = CSV PARSE_HEADER = TRUE SKIP_BLANK_LINES = TRUE TRIM_SPACE = TRUE ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE;
   
   CREATE OR REPLACE STAGE DEMOHUB_S3_INT URL = '{{ s3_url }}' DIRECTORY = (ENABLE = true) COMMENT = 'DemoHub S3 datasets';
   
   -- +----------------------------------------------------+
   -- |       2. LOADING DATA WITH SCHEMATIZATION      |
   -- +----------------------------------------------------+
   
   {% for table_name in ['customer', 'device', 'sales_order', 'sales_order_item'] %}
    CREATE OR REPLACE TABLE {{ table_name }} USING TEMPLATE (SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*)) FROM TABLE (INFER_SCHEMA(LOCATION => '@demohub_s3_int/orders/{{ table_name }}/', FILE_FORMAT => 'CSV_SCHEMA_DETECTION')));
   {% endfor %}
   
   
   -- +----------------------------------------------------+
   -- |       3. COPY DATA INTO TABLE       |
   -- +----------------------------------------------------+
   
   {% for table_name in ['customer', 'device', 'sales_order', 'sales_order_item'] %}
   COPY INTO {{ table_name }} FROM '@demohub_s3_int/orders/{{ table_name }}/' FILE_FORMAT = 'CSV_SCHEMA_DETECTION' MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
   {% endfor %}

   
{% endfor %}

The provided code leverages Jinja templating in Snowflake to streamline the creation and management of database objects across multiple environments (dev, QA, staging, and prod).

It dynamically generates SQL statements based on predefined variables, ensuring consistency and reducing manual effort. This approach is essential for efficient data engineering workflows, as it allows you to easily replicate your data models and load data across different stages of your development process. 

It also promotes maintainability by centralizing the schema definitions and data loading logic in a single template. This is a key component of effective database change management, allowing you to track, version, and deploy your database changes more reliably.

Step-by-Step Breakdown

1. Define Variables and Configurations:

Start by defining the necessary variables using Jinja's {% set %} syntax. In this case, we'll specify the environments, data product name, business unit, and data source:

Jinja

{% set environments = ['dev', 'qa', 'staging', 'prod'] %}  -- Environments for deployment
{% set DATA_PRODUCT = "ORDERS" %}                          -- Name of the data product
{% set BU_GEO = "US" %}                                   -- Business unit/geography
{% set s3_url = 's3://demohubpublic/data/' %}             -- Base S3 URL for data
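
Note that DATA_PRODUCT and BU_GEO are hard-coded here, while the Git-integration step later passes them in through a USING clause. One way to support both, assuming Snowflake's template rendering exposes the standard Jinja default filter, is to fall back to a default only when no value is supplied. A minimal sketch:

Jinja

{% set DATA_PRODUCT = DATA_PRODUCT | default("ORDERS") %}  -- use the USING value if provided, else "ORDERS"
{% set BU_GEO = BU_GEO | default("US") %}                  -- use the USING value if provided, else "US"
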
2. Create Databases Dynamically:

Loop through the environments list and create databases for each environment:

Jinja

{% for environment in environments %}
  {% set database_name = DATA_PRODUCT + 'db_' + BU_GEO + '_' + environment %}  -- Dynamically construct database names
  CREATE OR REPLACE DATABASE {{ database_name }};  -- Create database using Jinja variable
  USE DATABASE {{ database_name }};                 -- Switch to the newly created database
{% endfor %}
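
For the first pass through the loop (environment = 'dev'), the template renders to plain SQL along these lines (illustrative output only):

SQL

CREATE OR REPLACE DATABASE ORDERSdb_US_dev;
USE DATABASE ORDERSdb_US_dev;
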
3. Define File Format and Stage:

Inside each database, define a CSV file format with schema detection enabled and an external stage pointing at the public S3 bucket that holds the demo data:

Jinja

CREATE OR REPLACE FILE FORMAT CSV_SCHEMA_DETECTION
  TYPE = CSV
  PARSE_HEADER = TRUE
  SKIP_BLANK_LINES = TRUE
  TRIM_SPACE = TRUE
  ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE;

CREATE OR REPLACE STAGE DEMOHUB_S3_INT
  URL = '{{ s3_url }}'
  DIRECTORY = (ENABLE = true)
  COMMENT = 'DemoHub S3 datasets';
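
To sanity-check that the stage resolves to the expected S3 location before loading, you can list its contents. This is an optional check, not part of the original script:

SQL

LIST @DEMOHUB_S3_INT/orders/;
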
4. Create Tables and Load Data:

Use Jinja to loop through your table names (customer, device, sales_order, sales_order_item) and dynamically create tables using schema inference. Then, load data into these tables from your S3 stage:

Jinja

{% for table_name in ['customer', 'device', 'sales_order', 'sales_order_item'] %}
  CREATE OR REPLACE TABLE {{ table_name }} USING TEMPLATE (
    SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*)) 
    FROM TABLE (INFER_SCHEMA(LOCATION => '@demohub_s3_int/orders/{{ table_name }}/', FILE_FORMAT => 'CSV_SCHEMA_DETECTION'))
  );

  COPY INTO {{ table_name }} 
  FROM '@demohub_s3_int/orders/{{ table_name }}/' 
  FILE_FORMAT = 'CSV_SCHEMA_DETECTION' 
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
{% endfor %}
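
Once the COPY statements have run, a quick way to verify the loads is to count rows in each table. The snippet below is an illustrative addition, not part of the original template:

Jinja

{% for table_name in ['customer', 'device', 'sales_order', 'sales_order_item'] %}
  SELECT '{{ table_name }}' AS table_name, COUNT(*) AS row_count FROM {{ table_name }};
{% endfor %}
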
5. Execute or Schedule Scripts:

Snowflake simplifies database change management through its Git integration. You can fetch the latest code changes from your Git repository with ALTER GIT REPOSITORY ... FETCH and apply them to your Snowflake environment with EXECUTE IMMEDIATE FROM, keeping your database objects synchronized and version-controlled.

-- Fetch the latest changes from the remote Git repository into the Snowflake repository stage
ALTER GIT REPOSITORY Tutorials_DemoHub_Git_Repo FETCH;

-- Execute the SQL script stored in the Git repository to create or update database objects
EXECUTE IMMEDIATE FROM 
  @tutorials_demohub_git_repo/branches/main/DCM/OrdersDB_DDL_Template.sql  -- Specify the file path in the repository stage
  USING (
    DATA_PRODUCT => 'ORDERS',      -- Set the data product name (used as a variable in the script)
    BU_GEO => 'US'                -- Set the business unit/geography (used as a variable in the script)
  );

Alternatively, you can execute the script directly from the Snowflake UI (Snowsight).
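
The EXECUTE IMMEDIATE FROM example assumes the Tutorials_DemoHub_Git_Repo repository object already exists in your account. If it does not, the one-time setup looks roughly like the sketch below; the API integration name and repository URL are placeholders to replace with your own:

SQL

-- Allow Snowflake to reach your Git provider (integration name and URL prefix are placeholders)
CREATE OR REPLACE API INTEGRATION demohub_git_api_int
  API_PROVIDER = git_https_api
  API_ALLOWED_PREFIXES = ('https://github.com/')
  ENABLED = TRUE;

-- Register the repository that contains the Jinja template (origin URL is a placeholder)
CREATE OR REPLACE GIT REPOSITORY Tutorials_DemoHub_Git_Repo
  API_INTEGRATION = demohub_git_api_int
  ORIGIN = 'https://github.com/<your-org>/<your-repo>.git';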

Remember

This is just a simplified example. You can tailor Jinja templates to perform complex data transformations, run custom SQL queries, and even generate reports – all with the power of dynamic variables and code reuse.
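
For instance, a template can branch on the environment variable to apply environment-specific settings. A minimal sketch, with illustrative retention values:

Jinja

{% for environment in environments %}
  {% set database_name = DATA_PRODUCT + 'db_' + BU_GEO + '_' + environment %}
  {% if environment == 'prod' %}
  ALTER DATABASE {{ database_name }} SET DATA_RETENTION_TIME_IN_DAYS = 30;  -- longer Time Travel retention in prod (illustrative)
  {% else %}
  ALTER DATABASE {{ database_name }} SET DATA_RETENTION_TIME_IN_DAYS = 1;   -- minimal retention in non-prod (illustrative)
  {% endif %}
{% endfor %}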
