Merged PR 4106: Avoids duplicates in athena verification with fees after price changes

# Description

There were some duplicated records that raised alarms in data tests.

The fix is simple:
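* Join on the checkout timestamp field (`checkout_at_utc`) rather than the checkout date (`checkout_date_utc`).
* This alone was not enough: in some cases the timestamp falls exactly at midnight, and `between` is inclusive on both ends, so such a checkout can still match two price periods. I replaced the `between` with explicit conditions: the start condition is inclusive, while the end condition is exclusive.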
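
The boundary behaviour can be reproduced with a small, self-contained sketch. It uses Python's built-in `sqlite3` module rather than the project's warehouse, and the tables `price_history` and `verifications` and their rows are hypothetical sample data. It assumes that one price period ends at the same instant the next one starts, which is the situation the second bullet describes.

```python
import sqlite3

# Hypothetical sample data: two price periods sharing the boundary
# 2025-01-01 00:00:00, and one checkout landing exactly on that boundary.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table price_history (price real, start_at_utc text, end_at_utc text);
    insert into price_history values
        (10.0, '2024-01-01 00:00:00', '2025-01-01 00:00:00'),
        (12.0, '2025-01-01 00:00:00', '2026-01-01 00:00:00');
    create table verifications (id_booking integer, checkout_at_utc text);
    insert into verifications values (1, '2025-01-01 00:00:00');
""")

# Old join: BETWEEN is inclusive on both ends, so the checkout matches
# both the period that ends at midnight and the one that starts there.
between_rows = conn.execute("""
    select v.id_booking, ph.price
    from verifications v
    left join price_history ph
        on v.checkout_at_utc between ph.start_at_utc and ph.end_at_utc
    order by ph.price
""").fetchall()

# New join: inclusive start, exclusive end, so only one period can match.
half_open_rows = conn.execute("""
    select v.id_booking, ph.price
    from verifications v
    left join price_history ph
        on v.checkout_at_utc >= ph.start_at_utc
        and v.checkout_at_utc < ph.end_at_utc
""").fetchall()

print(between_rows)    # [(1, 10.0), (1, 12.0)]
print(half_open_rows)  # [(1, 12.0)]
```

With the half-open interval, a checkout exactly on the boundary is assigned to the period that starts at that instant, so the booking gets the new price and appears only once.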

# Checklist

- [ ] The edited models and dependants run properly with production data.
- [ ] The edited models are sufficiently documented.
- [ ] The edited models contain PK tests, and I've run and passed them.
- [ ] I have checked for DRY opportunities with other models and docs.
- [ ] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #26622
Oriol Roqué Paniagua 2025-01-19 10:07:39 +00:00
parent 873402fd8e
commit 06bdb81cfe

@@ -32,7 +32,11 @@ select
 from ranked_verifications v
 left join
 stg_seed__athena_price_history ph
-on v.checkout_date_utc between ph.start_at_utc and ph.end_at_utc
+-- The following condition ensures avoiding duplicates.
+-- Keep in mind that the start_at_utc is inclusive to the price,
+-- while the end_at_utc is exclusive.
+on v.checkout_at_utc >= ph.start_at_utc
+and v.checkout_at_utc < ph.end_at_utc
 where
 -- Select only the most recent verification for each id_booking
 v.rn = 1
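
The inclusive-start, exclusive-end join matches each checkout to exactly one price only if consecutive periods in the price history meet with no gap or overlap. The following is a hedged sketch of such a check, again using Python's built-in `sqlite3` with a hypothetical `price_history` table and sample rows. It is not part of this PR, and whether the real seed is laid out this way is an assumption.

```python
import sqlite3

# Hypothetical sample data: the third period starts one day after the
# second one ends, leaving a gap.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table price_history (start_at_utc text, end_at_utc text);
    insert into price_history values
        ('2024-01-01 00:00:00', '2025-01-01 00:00:00'),
        ('2025-01-01 00:00:00', '2026-01-01 00:00:00'),
        ('2026-01-02 00:00:00', '2027-01-01 00:00:00');
""")

# For each period, find the start of the next period and report the pairs
# where it differs from the current end (a gap or an overlap).
mismatches = conn.execute("""
    select end_at_utc, next_start_at_utc
    from (
        select
            ph.end_at_utc,
            (
                select min(n.start_at_utc)
                from price_history n
                where n.start_at_utc > ph.start_at_utc
            ) as next_start_at_utc
        from price_history ph
    )
    where next_start_at_utc is not null
      and next_start_at_utc <> end_at_utc
    order by end_at_utc
""").fetchall()

print(mismatches)  # [('2026-01-01 00:00:00', '2026-01-02 00:00:00')]
```

A gap like the one reported here would leave checkouts inside it without a price after the left join, while an overlap would bring the duplicates back.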