Ha, while I haven’t written any books, I was mentioned in the Acknowledgments section of one: Data Governance Needs Risk Management, by Mark Atkins and Terry Smith. Their company is Intraversed and their tool is Intralign. They teach you how to define business terms by classification, purpose, function, and characteristics, along with other “rule types.” It’s great!
So, I’ve developed my own “ontology” of business concepts, e.g., account, portfolio, asset, client, participant, agent, and a few other core business elements. Those are my Hubs.
As I understand it, a table/file has two grains: the grain of the BK columns it contains, i.e., the Unit of Work, and the grain that makes each row unique. So, if account, portfolio, and asset together give me the unique-row grain, are we saying that any other BK columns in the file should NOT be in the link?
That means, then, that if we put those extra BKs in a satellite and later need them in an information mart, we’d have to create a link in the BV or a Bridge. In my head, to do that, I would need to query that satellite in landing/stage to hash the BKs, coordinate a job to insert them into the hub(s), then go back to my BV and write a query that creates a new link from the hashed BKs. Then, for the IM, I could properly join back to the Hub to get descriptive data from the Hub satellite, correct? Aside from that being a lot of future technical debt, the rule, if I’m correct, is that you can’t take a BK stored in a satellite and join it back to its hub on the BK, as this violates data vault rules.
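To make the hashing steps I have in my head concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the function names `hash_key` and `build_link_row`, the `||` delimiter, MD5, and the trim/upper-case normalization are all hypothetical choices on my part, not something taken from a specific tool or standard.

```python
import hashlib

# Assumed delimiter between business keys before hashing (hypothetical choice).
DELIMITER = "||"


def hash_key(*business_keys):
    """Hash one or more business keys into a single hex digest.

    Normalization (strip + upper-case) is an assumption for this sketch.
    """
    normalized = [str(bk).strip().upper() for bk in business_keys]
    return hashlib.md5(DELIMITER.join(normalized).encode("utf-8")).hexdigest()


def build_link_row(row, bk_columns):
    """Build one link row: a hash key per hub, plus the link's own hash key."""
    link_row = {f"{col}_hk": hash_key(row[col]) for col in bk_columns}
    # The link hash key is computed over all participating BKs, in column order.
    link_row["link_hk"] = hash_key(*(row[col] for col in bk_columns))
    return link_row


# A staged row carrying four BKs; only the first three define row uniqueness.
row = {"account": "A1", "portfolio": "P1", "asset": "X9", "client": "C7"}

# Option 1: the link carries only the BKs that define the unique-row grain.
narrow_link = build_link_row(row, ["account", "portfolio", "asset"])

# Option 2: the link carries every BK in the file (the Unit of Work).
wide_link = build_link_row(row, ["account", "portfolio", "asset", "client"])
```

With the second option the client hash key is already on the link at load time, so nothing has to be re-hashed from a satellite later. With the first option that hashing is the extra step I would have to schedule in the BV.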
Hence my confusion: why don’t BKs in a file/table that are part of the file’s BK-column grain (the Unit of Work), but aren’t needed to make a row unique, belong in the link?
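To show what I mean by the two grains, here is a small Python sketch with made-up sample rows. The helper name `is_unique_grain` and the data are hypothetical; the point is only that account, portfolio, and asset already identify each row, while client is still a BK sitting in the same file.

```python
def is_unique_grain(rows, columns):
    """Return True if the given columns uniquely identify every row."""
    seen = set()
    for row in rows:
        key = tuple(row[col] for col in columns)
        if key in seen:
            return False
        seen.add(key)
    return True


# Made-up staged rows: four BK columns per row.
rows = [
    {"account": "A1", "portfolio": "P1", "asset": "X1", "client": "C1"},
    {"account": "A1", "portfolio": "P1", "asset": "X2", "client": "C1"},
    {"account": "A1", "portfolio": "P2", "asset": "X1", "client": "C1"},
]

# The unique-row grain: these three columns are enough.
row_grain_ok = is_unique_grain(rows, ["account", "portfolio", "asset"])

# Dropping asset breaks uniqueness, so it is part of the row grain.
too_narrow = is_unique_grain(rows, ["account", "portfolio"])

# client adds nothing to uniqueness, yet it is still a BK in the Unit of Work.
with_client = is_unique_grain(rows, ["account", "portfolio", "asset", "client"])
```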
As for “I think that links should contain only what is needed to define them. If you have another source system in the future that has this relationship but not your extra business keys is that going to cause you a problem?”: that is what the BV is for, correct?
Thank you,
John