Merged PR 2725: Force id user field to lower in staging

# Description

Forces all user ID fields to lower case in staging. Removes the hardcoded `lower()` calls in intermediate. Updates the README to document the lowercasing of user IDs.
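As a rough illustration of the staging change, a model such as `stg_core__booking` could normalise the field once at the point where it reads from the source. This is only a sketch: the file path, the `source('core', 'booking')` reference and the CTE name are assumptions made for the example, not the actual model code in this PR.

```sql
-- models/staging/core/stg_core__booking.sql (path is an assumption)
with raw_booking as (
    -- source and table names are hypothetical placeholders
    select * from {{ source('core', 'booking') }}
)

select
    -- lower the user UUID once, in staging, so downstream models can compare it directly
    lower(id_user_host) as id_user_host
    -- remaining columns would be listed here unchanged
from raw_booking
```

With the casing fixed in staging, intermediate models should be able to join on the raw column without wrapping both sides in `lower()`.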

I propose that we merge, run in prod, and then run the tests in prod as the proper way to evaluate this.

BTW, I only found one `id_user_host` that was in capital letters, which is probably why we didn't care much about this. Still, I prefer to have things clean from the start!

```
-- Bookings whose host ID matches a unified user only when case is ignored
select *
from staging.stg_core__booking scb
left join intermediate.int_core__unified_user icuu
on lower(scb.id_user_host) = lower(icuu.id_user)
where scb.id_user_host <> icuu.id_user
```
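One way to keep this from regressing could be a dbt singular test on the staging model, since a singular test fails whenever its query returns rows. The sketch below assumes a case-sensitive string comparison in the warehouse and uses a hypothetical file name; only `stg_core__booking` and `id_user_host` come from the query above.

```sql
-- tests/assert_stg_core__booking_id_user_host_is_lower.sql (hypothetical file name)
-- Returns every booking whose host ID still contains upper-case characters.
-- The test passes when no rows come back.
select id_user_host
from {{ ref('stg_core__booking') }}
where id_user_host <> lower(id_user_host)
```

Rows with a null `id_user_host` are not returned, because the comparison evaluates to null, so the test only flags values that are present and not fully lower case.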

# Checklist

- [ ] The edited models and dependants run properly with production data. **All models run in stg, did not check all the dependants**
- [ ] The edited models are sufficiently documented. **Have not checked**
- [ ] The edited models contain PK tests, and I've run and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [ ] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #20776
Oriol Roqué Paniagua 2024-09-03 14:36:21 +00:00
parent 1b30fbbca9
commit 6d59e21310
18 changed files with 22 additions and 22 deletions

@@ -57,7 +57,7 @@ For other matters, use a `chores` branch (i.e. `chores/add-dbt-package`).
## Project organization
-We organize models in four folders:
+We organize models in three folders:
- `staging`
- Pretty much this: <https://docs.getdbt.com/best-practices/how-we-structure/2-staging>
@@ -89,6 +89,8 @@ We organize models in four folders:
- Split schema and domain with double underscode (ie `stg_core__booking`).
- Always use sources to read into staging models.
- SQL formatting should be done with `sqlfmt`.
+- Other conventions
+- In staging, enforce a `lower()` to user UUID fields to avoid nasty propagations in the DWH.
When in doubt, do what dbt guys would do: <https://docs.getdbt.com/best-practices>
Or Gitlab: <https://handbook.gitlab.com/handbook/business-technology/data-team/platform/dbt-guide/>