Multiple (not composite) business keys in the same hub

Hi All,

I have two distinct business keys for the same business entity. One of them (A) covers every instance of the underlying entity, while the other (B) is better known to the business users but is not always present.
I have two non-overlapping sources:

  1. Source 1 contains A and B for each record
  2. Source 2 contains A but no B
  3. A’s from Source 1 and Source 2 do not overlap

I’m thinking of loading both keys into the same hub, with a same-as link to capture the relationship between A and B (where present). Technically, this means that a single record from Source 1 is represented by two records in the hub.
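
To make the idea concrete, here is a minimal Python sketch of that load. The sample keys, the `hash_key` helper and the MD5-over-uppercased-key hashing are my own assumptions for illustration, not anything prescribed by the sources; it also assumes A and B values never collide with each other.

```python
import hashlib

def hash_key(business_key: str) -> str:
    # Hypothetical hashing rule: trim, uppercase, MD5. Swap in your own standard.
    return hashlib.md5(business_key.strip().upper().encode("utf-8")).hexdigest()

def load(source_1, source_2):
    hub = {}      # hub hash key -> business key (A and B live side by side)
    sal = set()   # same-as link rows: (hash of A, hash of B)
    for a, b in source_1:
        hub[hash_key(a)] = a
        hub[hash_key(b)] = b
        sal.add((hash_key(a), hash_key(b)))
    for a in source_2:
        # Source 2 has no B, so it adds a hub row but no same-as link row.
        hub[hash_key(a)] = a
    return hub, sal

# Made-up sample data
source_1 = [("A-001", "B-9001"), ("A-002", "B-9002")]
source_2 = ["A-100", "A-101"]

hub, sal = load(source_1, source_2)
print(len(hub), "hub rows,", len(sal), "same-as link rows")
```

Each Source 1 record ends up as two hub rows plus one same-as link row, and each Source 2 record as a single hub row.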

Can you see any problem with this approach?

If the BKs differ for the same entity, then a SAL seems to be needed as the entry point for business users.

And regarding SATs, good practice is to:

  • split by source system.
  • split by rate of change.

Splitting by PII can be beneficial too depending on how strict your requirements are.


I agree with what Emanueol and Frankie have already mentioned. One additional consideration is the potential use of a Master Hub structure.

If having multiple records per entity becomes a challenge, you can leverage your same-as link to create a rule that assigns each entity a single Master Key.

This allows you to treat the Master Key as the sole identifier while still retaining all underlying linked records for context.


Can you outline what a master hub looks like? It’s a concept I’ve gathered people use, but I’ve not seen one in action. Is the idea to essentially filter down both sides of a same-as link to a single record in the hub?

I’ve always been intrigued by how people use same-as links in practice without things getting super messy in their curated layer/mart definitions.

It only gets as messy as the logic required to identify a master key…and that’s usually messy.

A ‘master hub’ itself is simple. Structurally it’s just a normal hub, but instead of storing every unique business key from every source, it stores one “master key” per entity after your deduplication logic has run. That logic usually already exists in teams that are reducing multiple customer instances to one, so the master key is just another business concept to record.

In practice, a same-as link connects all the source-level hub keys to that master key. This lets your final layer point consistently at the mastered entity while still being able to trace back to the original keys.
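
As a rough illustration, here is a Python sketch that groups hub keys via their same-as pairs and picks one master key per group. The rule used (the smallest key in each group wins) and the sample keys are purely hypothetical; real mastering rules are usually far messier. It assumes every key in a pair also exists in the hub.

```python
def assign_master_keys(hub_keys, same_as_pairs):
    # Union-find over hub keys: every key starts as its own group.
    parent = {k: k for k in hub_keys}

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    for left, right in same_as_pairs:
        root_l, root_r = find(left), find(right)
        if root_l != root_r:
            # Hypothetical master rule: the smaller key becomes the group root.
            parent[max(root_l, root_r)] = min(root_l, root_r)

    # Every source-level key -> its master key (the "master" same-as link).
    return {k: find(k) for k in hub_keys}

# Made-up sample data
hub_keys = ["A-001", "B-9001", "A-002", "B-9002", "A-100"]
same_as_pairs = [("A-001", "B-9001"), ("A-002", "B-9002")]

master_of = assign_master_keys(hub_keys, same_as_pairs)
master_hub = sorted(set(master_of.values()))
print(master_of)
print(master_hub)
```

Here `master_of` plays the role of the same-as link pointing at the master key, and `master_hub` holds one row per mastered entity. Keys with no same-as partner simply act as their own master.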


Hi! I found this paper of interest:

And Andrew Foad introduced the “keyset” notion, which may also be of interest!
