Structure of the PIT table for Snowflake DB

Hello,

For PIT tables, do we have to repeat the hashkey for each Satellite table along with the load_dts, if the DV is implemented in Snowflake database? In other DBs to help the query plan we might want to do that but is it necessary for Snowflake. I see examples in Dan’s book and also Patrick Cuba’s article, they repeat the hashkey for each satellite. Please advise. Thanks in advance!

Hi,

I’m not sure what exactly you mean to ask.

Is it “do I need PITtables when using Snowflake”?

Or - which I assume - just about the hahskey of the satellite.

In that case, I would say yes (if you use hashes at all), because a specific satellite might not contain any data about that business object (hub-record). So, you might have to store the hash of the dummy-record.

Forgive me if my assumption is not correct .

Regards,

Klaas

Hi, your assumption is right, I am asking about the repetition of hash key for each satellite in the PIT table for a given hub.

hashkey, snapshot_dt, sat1_hk, sat1_ldts, sat2_hk, sat2_ldts, …

So in the above PIT table structure, hash_key = sat1_hk = sat2_hk.

Can we just do in snowflake DB the below as I heard snowflake is smart enough to identify the right query plan

hashkey, snapshot_dt, sat1_ldts, sat2_ldts, …

?

the extra hkeys that seems equal/extras, is mainly for the single INSERT of the ghost row (hkey=00000000000000000000000000000000 (32 zeros)) SAT record that all new created SATs shall insert upon SAT ddl initial execution, so for example PITs can have either entitiy hkey ornthe specific SAT hkey.