Loading multiple record versions in one run #262

maxmue · 2024-09-27T10:38:10Z

Greetings!

In my source, I have a CDC mechanism which results in a table "study" like the following:

ID	study_name	study_status	updated_timestamp
1	study a	active	2000-01-01 00:00:00
2	study b	active	2000-01-01 00:30:00
1	study a	closed	2000-01-01 01:00:00

As you can see, I have multiple versions of the same record.

My goal is to have all versions loaded into the study satellite to have the complete history in my DV.

Can I solve this with your library? When I try a regular satellite, I see duplicates in the satellite table: some rows from the source table are duplicated and some are not, so the result is inconsistent.

Or do I need to model this as a MAS (multi-active satellite) and apply the logic later downstream?

@tkirschke, we briefly touched on this during the conference.

Best regards,
Max

tkirschke (Maintainer) · 2024-10-07T06:14:44Z

Hi @maxmue ,

Yes, you can use our satellite macro for such a case, but the ldts column needs to be manipulated so that it represents the arrival order of the deltas.

One solution (which I do not recommend) would be to just use the updated_timestamp as the load date (in the staging model). This should achieve your desired loading behavior. I don't recommend it because you typically want full control over the ldts, and using the CDC timestamp for that purpose is a debatable choice.

A safe but more complex solution uses a ROW_NUMBER() calculated from the updated_timestamp. In your example, you first generate a ROW_NUMBER() OVER (PARTITION BY ID ORDER BY updated_timestamp). This can be done in a pre-stage model, inside a first CTE. In a second CTE, you use this ROW_NUMBER() in a TIMESTAMP_ADD() to add that many milliseconds to whatever you typically use as the ldts. This ensures a proper order of deltas while keeping the ldts fully under your control. The new column is then used as the ldts definition in a regular staging model on top, and from there in the regular satellite model.
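To make the two CTEs concrete, here is a minimal sketch that runs the pre-stage query against the example rows from the question. It is only an illustration under a few assumptions: it uses Python's built-in sqlite3 module as a stand-in warehouse, so the millisecond shift is written with SQLite's strftime() modifiers instead of TIMESTAMP_ADD(); `BATCH_LDTS`, `rn`, `numbered` and `with_ldts` are hypothetical names; and the load timestamp value is made up.

```python
import sqlite3

# Hypothetical load timestamp of this batch, i.e. what you would normally use as ldts.
BATCH_LDTS = "2000-01-01 02:00:00"

# In-memory SQLite database as a stand-in for the real warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE study ("
    "ID INTEGER, study_name TEXT, study_status TEXT, updated_timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO study VALUES (?, ?, ?, ?)",
    [
        (1, "study a", "active", "2000-01-01 00:00:00"),
        (2, "study b", "active", "2000-01-01 00:30:00"),
        (1, "study a", "closed", "2000-01-01 01:00:00"),
    ],
)

PRE_STAGE_SQL = """
WITH numbered AS (
    -- First CTE: number the versions of each record by the CDC timestamp.
    SELECT
        ID, study_name, study_status, updated_timestamp,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY updated_timestamp) AS rn
    FROM study
),
with_ldts AS (
    -- Second CTE: shift the batch load timestamp by rn milliseconds.
    -- SQLite stand-in for TIMESTAMP_ADD(); adapt to your warehouse's syntax.
    SELECT
        ID, study_name, study_status, updated_timestamp,
        strftime('%Y-%m-%d %H:%M:%f', :batch_ldts,
                 printf('+%.3f seconds', rn / 1000.0)) AS ldts
    FROM numbered
)
SELECT ID, study_status, ldts
FROM with_ldts
ORDER BY ID, ldts
"""

staged = conn.execute(PRE_STAGE_SQL, {"batch_ldts": BATCH_LDTS}).fetchall()
for row in staged:
    print(row)
```

In this sketch both versions of study 1 end up with distinct ldts values that follow the order of updated_timestamp, so each (ID, ldts) pair is unique within the run.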

Let me know if the second solution solves your problem!

Best regards
Tim
