Computation in DataVault

Hello,
I have stored my data in a DataVault and need to run complex computations that involve entities from multiple schemas of the foundation layer. The results need to be persisted in the DataVault. What is the right place for them? Should they be stored in the input or foundation layer? I guess the foundation layer makes no sense, because the result does not have the structure of a hub, satellite, etc.?
I did not find any literature on that.
Best regards,
Jules

Hi Jules,

Generally speaking, the outcomes of computations are considered Business Vault objects. I’m not sure what you mean by "foundation layer" (raw vault?) or why you’ve separated it into multiple schemata. However, generally speaking again, derived data (the outcome of calculations) is based on raw-vault data (and/or already created business-vault data).
Does that clarify things?
Regards,
Klaas


Hi Jules, perhaps some nomenclature can help resolve some issues.

Source data: (Multiple schema data, disparate data)

Stage: (load and stage tables) for landing data and implementing hard business rules (HBRs) or data type alignment.

Raw Data Vault: (data loaded from stage) contains hubs, links, and satellites. No calculations applied at this layer. Only HBRs.

Business Vault: (PIT and bridge tables, well-defined soft business rules (SBRs)). Optional.

Information layer: calculated data, SBRs, sources raw data vault, may source business vault.

Your computations persist in the information layer, which can be physicalized or virtualized.
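As a minimal sketch of a calculated information-layer object (all table and column names below are hypothetical, chosen for illustration only): two raw-vault satellites from different source schemas share a customer hub key, and a soft business rule derives a net value that is persisted in the information layer.

```python
# Raw-vault satellites from two source schemas, keyed by the customer
# hub hash key (simplified to plain dicts for the sketch).
sat_orders = {"hk1": {"order_total": 120.0}, "hk2": {"order_total": 80.0}}
sat_refunds = {"hk1": {"refund_total": 20.0}}

def net_revenue(hub_key: str) -> float:
    """Soft business rule: order total minus refunds per customer."""
    orders = sat_orders.get(hub_key, {}).get("order_total", 0.0)
    refunds = sat_refunds.get(hub_key, {}).get("refund_total", 0.0)
    return orders - refunds

# The information-layer object persists (or, if virtualized, exposes)
# the derived values, sourced from raw-vault data.
info_net_revenue = {hk: net_revenue(hk) for hk in sat_orders}
```

Whether `info_net_revenue` is physicalized as a table or virtualized as a view is an implementation choice; the rule logic is the same either way.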

Hope this provides some insight and clarity.

Cheers,

Z

  • RV + BV = DV
  • RV = Hubs, Links and Satellites
  • BV = Links and Satellites

All DV content is the captured output of calculations, business rules, etc.

  • RV stores the outcome of business rules from source systems (the software used to automate business processes)
  • BV stores the output of soft rules; these use RV tables as a source (and sometimes other BV artefacts). This is why BV never has a hub table: it merely extends RV with derived rules in your data warehouse/lakehouse (or whatever).
  • PITs and Bridges are NOT BV artefacts, because they are ephemeral: they are structures used to simplify and speed up extraction of DV content for your Information Mart / Presentation layer. PITs and Bridges can be built up and torn down at will; they reference the keys and dates from RV and BV.
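The ephemeral PIT idea above can be sketched as follows (names and dates are made up for illustration): for each hub key and snapshot date, the PIT records the latest load date of each satellite that is effective at that point in time.

```python
from datetime import date

# Satellite load dates per hub key (one RV and one BV satellite,
# both hanging off the same hub -- hypothetical names).
sat_loads = {
    "sat_customer_rv": {"hk1": [date(2024, 1, 1), date(2024, 2, 1)]},
    "sat_customer_bv": {"hk1": [date(2024, 1, 15)]},
}

def build_pit(hub_keys, snapshot_dates):
    """One PIT row per (hub key, snapshot date), carrying the
    effective load_date of every satellite at that snapshot."""
    pit = []
    for hk in hub_keys:
        for snap in snapshot_dates:
            row = {"hub_key": hk, "snapshot_date": snap}
            for sat, loads in sat_loads.items():
                eligible = [d for d in loads.get(hk, []) if d <= snap]
                row[sat] = max(eligible) if eligible else None
            pit.append(row)
    return pit

pit = build_pit(["hk1"], [date(2024, 1, 20), date(2024, 2, 10)])
```

Because the PIT only holds keys and dates, it can be dropped and rebuilt for any set of snapshot dates without touching RV or BV content.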

Applying calculations and storing them as BV artefacts is easy:

  • They can be written in any language you choose: Python, SQL, Rust, whatever. The same is true for loading RV from its sources.

  • You must have an applied date, so that your BV artefacts are tied to the upstream artefacts they are based on. Joining RV and BV around a hub or a link is then easy, because every record carries the date the rules were based on. It also means your DV is always bi-temporal: you have an applied date (extract date) and a load date (version date).

  • To load BV, you simply stage the output of running soft rules on RV and use the same loading patterns as for satellites and links; through naming standards you have then sparsely extended the RV, i.e. RV + BV = DV. Building ephemeral structures such as PITs and Bridges on top removes the need for users to apply creativity when querying your DV; you have solved the complexity for them.
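A hedged sketch of that loading pattern (all names are illustrative, not a fixed standard): stage the soft-rule output with its applied date, hash the descriptive attributes, and insert a new satellite row only when the hashdiff differs from the current row, stamping the load date as the version date.

```python
import hashlib
from datetime import date

bv_sat = []  # target BV satellite: one row per (hub_key, load_date)

def hashdiff(attrs: dict) -> str:
    """Hash of the descriptive attributes, for delta detection."""
    payload = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.md5(payload.encode()).hexdigest()

def load_bv_satellite(staged_rows, load_date):
    """Standard satellite pattern: insert only changed rows."""
    latest = {}  # current hashdiff per hub key
    for row in sorted(bv_sat, key=lambda r: r["load_date"]):
        latest[row["hub_key"]] = row["hashdiff"]
    for row in staged_rows:
        hd = hashdiff(row["attrs"])
        if latest.get(row["hub_key"]) != hd:
            bv_sat.append({
                "hub_key": row["hub_key"],
                "applied_date": row["applied_date"],  # ties BV to upstream RV
                "load_date": load_date,               # version date
                "hashdiff": hd,
                **row["attrs"],
            })

# Staged output of a soft rule run against RV (hypothetical attribute).
staged = [{"hub_key": "hk1", "applied_date": date(2024, 3, 1),
           "attrs": {"risk_score": 0.7}}]
load_bv_satellite(staged, load_date=date(2024, 3, 2))
load_bv_satellite(staged, load_date=date(2024, 3, 3))  # unchanged: no insert
```

Note the two dates on each row: the applied date says which upstream state the rule ran against, the load date says when this version landed, which is the bi-temporal property described above.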