AutomateDV advice

Has anyone implemented AutomateDV in a Data Vault project on Snowflake? I have watched all the videos about AutomateDV, and I have a few questions on how to use it with Snowflake:

  1. How do I create and use Streams and Tasks with AutomateDV macros when loading data into Hubs, Links, and Sats?
  2. How will Snowflake Task dependencies work with AutomateDV?
  3. Is there any community help available from AutomateDV?

Any input would be appreciated.

Hey there Rudra,

This feels more like a dbt issue than an AutomateDV or Data Vault 2.0 question. Do you know how to create and use streams and tasks with dbt (outside of AutomateDV)? It's worth checking in with the dbt team before applying the DV2.0 methodology to your tech.

Hope that helps

Frankie

Thanks for the reply. Here are my two cents.

AutomateDV is built on dbt macros. This means that when we load data into DV objects using AutomateDV's Hub, Sat, and Link patterns, the resulting data pipelines load data into Snowflake tables. Eventually these patterns need to be scheduled to run as Snowflake Tasks and to use Streams for near-real-time loading.

So, does AutomateDV have any future plans to support Snowflake Tasks and Streams in the framework?

While creating Tasks and Streams within Snowflake is a straightforward process, the integration approach with AutomateDV introduces additional considerations. The manner in which AutomateDV interacts with Snowflake Streams and Tasks will significantly influence its suitability for the use case, as well as any potential implementation challenges. A thorough evaluation of this integration is essential to ensure alignment with architectural and operational requirements.

I see your two cents and raise you four :coin: ,

AutomateDV is a tool that templates your SQL for Data Vault 2.0 purposes; it doesn’t touch your data platform.

dbt is a tool that runs SQL against your data platform (and a bunch more things, but that’s the general gist).

AutomateDV is a package for dbt, so it’s not capable of directly adding dbt functionality.

You need to ask the dbt team whether they support streams and tasks scheduling or not (I think it’s still a no for now?).

Hope that helps,

Frankie

Hi both,

Sorry for weighing in on this late!

Short version: no, we are not currently planning to support this. However:

Frankie is correct that this is not directly an AutomateDV problem to solve; it is more of a dbt support consideration.

As it is, dbt does not have any direct support for Streams and Tasks, and it is not really designed for this kind of processing. You would need an orchestrator to trigger dbt every few minutes on a small selection of models, which could be costly.
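
To make the "orchestrator triggers dbt every few minutes" idea concrete, here is a minimal sketch in Python. It is an illustration only, not something AutomateDV or dbt provides: it assumes the `dbt` CLI is on the PATH, and the tag `near_real_time`, the function names, and the five-minute interval are all hypothetical. A real setup would more likely use a proper scheduler than a sleep loop.

```python
import subprocess
import time
from typing import Callable, List, Optional


def build_dbt_command(selector: str, project_dir: Optional[str] = None) -> List[str]:
    """Build a `dbt run` command limited to a small selection of models."""
    cmd = ["dbt", "run", "--select", selector]
    if project_dir is not None:
        cmd += ["--project-dir", project_dir]
    return cmd


def run_every(
    interval_seconds: float,
    selector: str,
    iterations: int,
    runner: Callable = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Run the selected models `iterations` times, pausing between runs.

    `runner` and `sleep` are injectable so the loop can be exercised
    without actually invoking dbt or waiting.
    """
    return_codes = []
    for i in range(iterations):
        result = runner(build_dbt_command(selector), check=False)
        return_codes.append(result.returncode)
        if i < iterations - 1:
            sleep(interval_seconds)
    return return_codes


# Hypothetical usage: run models tagged `near_real_time` every 5 minutes, 3 times.
# run_every(300, "tag:near_real_time", iterations=3)
```

Every iteration is a full dbt invocation against the warehouse, which is where the cost concern comes from: the shorter the interval, the more often compute is spun up for what may be only a handful of new rows.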

However, we have some ideas about how to support the use case of real-time data feeds and taking better advantage of functionality such as Streams and Tasks (and other platform equivalents). We’re in the early stages of experimenting so I cannot go into more detail at this time - watch this space!