How to Handle Change Management for Dimensional Data Models

airbyte.com

Summary

Change can be thought of in two contexts: (1) modifying model logic (CI/CD), or (2) re-materializing models after source data refreshes. Having a way to safely deploy models to production is important. Two important aspects of model deployments are consistency and transparency.

Transparency for data consumers means knowing what data is available at a given time, and whether that data meets consistency and freshness SLAs. Consider a data transformation job that builds models A and B: A succeeds but B fails. If a model succeeds, it should never need to be rebuilt until more data arrives or its definition changes.

“Blue-green” deployments are a software deployment approach that can be used in many different contexts, from provisioning cloud infrastructure to deploying new versions of an application. In the context of data transformation deployments, we can consider the current version of our production analytics database to be “blue.” When we need to re-deploy models, we create a clone of that database (“green”) where transformations can run. If the transformation job succeeds and all tests pass, then we can atomically swap the blue and green databases.
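The clone, transform, test, swap sequence can be sketched in a few lines of Python. This is only an in-memory illustration, not the article's implementation: the catalog dictionary, the database names, and the `transform` and `tests_pass` callables are all hypothetical stand-ins for a real warehouse, a transformation job, and a test suite. In a real warehouse the swap would need to be a single atomic operation provided by the database itself.

```python
import copy
from typing import Callable, Dict, List

# Hypothetical stand-in for a warehouse: database name -> {table name -> rows}
Database = Dict[str, List[int]]
Catalog = Dict[str, Database]


def blue_green_deploy(
    catalog: Catalog,
    transform: Callable[[Database], None],
    tests_pass: Callable[[Database], bool],
    blue: str = "analytics",
    green: str = "analytics_green",
) -> bool:
    """Run transformations on a clone and promote it only if everything passes."""
    # 1. Clone production ("blue") into a scratch database ("green").
    catalog[green] = copy.deepcopy(catalog[blue])

    # 2. Run the transformations and tests against green only.
    try:
        transform(catalog[green])
        ok = tests_pass(catalog[green])
    except Exception:
        ok = False

    # 3. On success, swap the two databases in one step.
    if ok:
        catalog[blue], catalog[green] = catalog[green], catalog[blue]

    # 4. Drop whatever is now under the green name:
    #    the old production copy on success, the failed clone on failure.
    del catalog[green]
    return ok
```

Because consumers only ever read from the blue name, a failed run leaves them on the last good version rather than on a half-built one.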

Before rolling back any models, we must first identify the affected subgraph for a given node invocation failure. For dbt users, the article gives a practical way to identify the affected subgraph. Once we have a list of affected models, there are a variety of methods to achieve the rollbacks.
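The article's dbt recipe is not reproduced in this summary, but the underlying idea can be sketched generically. The example below assumes the affected subgraph means the failed node plus everything downstream of it, and it uses a hypothetical adjacency map of model names rather than anything read from a real dbt project.

```python
from collections import deque
from typing import Dict, List, Set


def affected_subgraph(children: Dict[str, List[str]], failed: str) -> Set[str]:
    """Return the failed model and every model downstream of it.

    `children` maps each model to the models that depend on it directly.
    """
    seen = {failed}
    queue = deque([failed])
    # Breadth-first walk along dependency edges, visiting each model once.
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical model names, for illustration only.
dag = {
    "stg_orders": ["fct_orders"],
    "fct_orders": ["rpt_revenue"],
    "stg_customers": ["dim_customers"],
}
to_roll_back = affected_subgraph(dag, "stg_orders")
```

Here `to_roll_back` contains `stg_orders`, `fct_orders`, and `rpt_revenue`, while the unrelated customer models are left alone. In dbt itself, the graph operator in a selector such as `stg_orders+` selects a model together with its descendants, which expresses the same traversal.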

3 Comments

throwschenOP

Any time I've run through this process, it's been a nerve-wracking month of making sure things don't blow up.

practicalmagic

Do you follow something like what's written in the article, or do you have your own process?

throwschenOP

We follow what the article describes as the naive approach. I'm looking at improving this to be more like the blue-green one. It'll be less stressful with that approach.
