Merged PR 2629: Integrates logic to detect New Dashboard users within DWH

# Description

The first step towards reporting on New Dash is being able to tell, within the DWH, which hosts have been migrated.
To do so, and anticipating that there will be further phases in the future, I've created an `int_core__user_migration` model that reads from a configuration macro, `get_new_dash_migration_phases_config`. This should allow semi-automatic user retrieval in the future and avoids hardcoding the phases within the model itself.
Whether a user is migrated, in which phase, and when that phase was deployed is available at user level in the `int_core__user_host` table.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've run and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models. **-> I selected a view for this model since I don't believe we should materialise this data anywhere other than the user host table**

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #19570
Oriol Roqué Paniagua 2024-08-22 12:10:25 +00:00
parent 3d09d04068
commit c8f4d2be70
4 changed files with 129 additions and 1 deletions


@ -0,0 +1,32 @@
/*
Macro: get_new_dash_migration_phases_config
    Provides a general configuration for the different phases of the
    New Dash migration. Each phase is identified by a phase_name,
    which is the "expected display" for users. The assumption is that
    each user migration is identified via claim_type. Lastly, the
    deployment date of each phase is hardcoded here.
    Important note: if a user migrates after a phase has started, we
    will not be able to tell when that happened. However, it is likely
    that other indicators will provide an estimate. For example:
        Migration A happens on 1st July 2024.
        User A is migrated on 1st July 2024 (as expected).
        User B is migrated on 10th July 2024 (not expected).
    It is likely that User B won't have Bookings from New Dash
    until they are migrated. So this migration date should be
    considered a hard lower limit for the actual migration date.
*/
{% macro get_new_dash_migration_phases_config() %}
    {% set migration_phases = [
        {
            'phase_name': 'MVP',
            'claim_type': 'KYGMVP',
            'deployment_date': '2024-07-30'
        }
    ] %}
    {{ return(migration_phases) }}
{% endmacro %}
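
To make the intent of the configuration concrete, here is a minimal Python sketch of how a list like the one above could expand into the `case` and `where` fragments that the model builds with Jinja loops. It is an illustration only, not the dbt rendering itself: `render_case` and `render_where` are hypothetical helpers, and the second phase (`Phase 2`, `KYGPHASE2`, `2024-10-01`) is invented purely to show how the loop would scale once more phases exist.

```python
# Stand-in for the list returned by get_new_dash_migration_phases_config().
# Only the MVP entry comes from the PR; the second entry is hypothetical.
MIGRATION_PHASES = [
    {"phase_name": "MVP", "claim_type": "KYGMVP", "deployment_date": "2024-07-30"},
    {"phase_name": "Phase 2", "claim_type": "KYGPHASE2", "deployment_date": "2024-10-01"},
]


def render_case(phases, value_key, alias):
    """Build a CASE expression with one WHEN branch per configured phase."""
    lines = ["case"]
    for phase in phases:
        lines.append(
            f"    when upper(claim_type) = '{phase['claim_type']}' "
            f"then '{phase[value_key]}'"
        )
    lines.append("    else null")
    lines.append(f"end as {alias}")
    return "\n".join(lines)


def render_where(phases):
    """Build the OR-joined filter, playing the role of the `loop.last` check."""
    return " or ".join(
        f"(upper(claim_type) = '{phase['claim_type']}')" for phase in phases
    )


phase_sql = render_case(MIGRATION_PHASES, "phase_name", "migration_phase")
date_sql = render_case(
    MIGRATION_PHASES, "deployment_date", "lower_limit_migration_date_utc"
)
where_sql = render_where(MIGRATION_PHASES)

print(phase_sql)
print(date_sql)
print("where " + where_sql)
```

Under this sketch, adding a phase means appending one dictionary to the list; the SQL fragments follow without touching the model.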


@ -6,6 +6,7 @@
 with
     int_core__unified_user as (select * from {{ ref("int_core__unified_user") }}),
     int_core__user_role as (select * from {{ ref("int_core__user_role") }}),
+    int_core__user_migration as (select * from {{ ref("int_core__user_migration") }}),
     stg_core__claim as (select * from {{ ref("stg_core__claim") }}),
     -- A USER CAN HAVE MULTIPLE ROLES, THUS DISTINCT IS NEEDED TO AVOID DUPLICATES
@ -40,6 +41,10 @@ select
     uu.joined_at_utc,
     uu.joined_date_utc,
     uu.created_date_utc,
-    uu.updated_date_utc
+    uu.updated_date_utc,
+    case when um.id_user_host is not null then true else false end as is_user_migrated,
+    um.migration_phase,
+    um.lower_limit_migration_date_utc
 from int_core__unified_user uu
 inner join unique_host_user uhu on uu.id_user = uhu.id_user
+left join int_core__user_migration um on uu.id_user = um.id_user_host
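
The left join plus null check above can be tried in isolation. The following is a small sketch using Python's built-in `sqlite3` module, not the DWH itself: the `hosts` and `user_migration` tables and their rows are made-up stand-ins for `int_core__unified_user` and `int_core__user_migration`, and SQLite returns `1`/`0` where the warehouse would return a boolean.

```python
import sqlite3

# In-memory database with hypothetical stand-in tables and rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table hosts (id_user text primary key);
    create table user_migration (
        id_user_host text primary key,
        migration_phase text,
        lower_limit_migration_date_utc text
    );
    insert into hosts values ('host_a'), ('host_b');
    insert into user_migration values ('host_a', 'MVP', '2024-07-30');
    """
)

# Hosts without a match in user_migration keep their row (left join)
# and get a false flag plus null phase and date.
rows = conn.execute(
    """
    select
        h.id_user,
        um.id_user_host is not null as is_user_migrated,
        um.migration_phase,
        um.lower_limit_migration_date_utc
    from hosts h
    left join user_migration um on h.id_user = um.id_user_host
    order by h.id_user
    """
).fetchall()

for row in rows:
    print(row)
```

The sketch writes the flag as a bare `is not null` comparison, which is logically the same as the `case when ... then true else false end` form used in the model.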


@ -0,0 +1,44 @@
{% set migration_phases = get_new_dash_migration_phases_config() %}
with
    stg_core__claim as (select * from {{ ref("stg_core__claim") }}),
    user_migration_from_claim as (
        select
            id_user as id_user_host,
            case
                {% for phase in migration_phases %}
                    when upper(claim_type) = '{{ phase.claim_type }}'
                    then '{{ phase.phase_name }}'
                {% endfor %}
                else null
            end as migration_phase,
            case
                {% for phase in migration_phases %}
                    when upper(claim_type) = '{{ phase.claim_type }}'
                    then '{{ phase.deployment_date }}'
                {% endfor %}
                else null
            end as lower_limit_migration_date_utc
        from stg_core__claim
        where
            {% for phase in migration_phases %}
                (upper(claim_type) = '{{ phase.claim_type }}')
                {% if not loop.last %} or {% endif %}
            {% endfor %}
    )
-- GET ONLY THE FIRST TIME THE USER WAS MIGRATED
select id_user_host, migration_phase, lower_limit_migration_date_utc
from
    (
        select
            id_user_host,
            migration_phase,
            lower_limit_migration_date_utc,
            row_number() over (
                partition by id_user_host order by lower_limit_migration_date_utc asc
            ) as rank
        from user_migration_from_claim
    )
where rank = 1
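
The "first migration only" step can be checked on toy data. Below is a sketch using Python's built-in `sqlite3` module (it assumes a bundled SQLite of 3.25 or later, which is when window functions were added). The rows are invented, including a hypothetical `Phase 2`, and the alias `row_num` and the subquery alias `ranked` are naming choices of this sketch rather than of the model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table user_migration_from_claim (
        id_user_host text,
        migration_phase text,
        lower_limit_migration_date_utc text
    );
    -- host_a holds claims for two phases; host_b only for the later one.
    insert into user_migration_from_claim values
        ('host_a', 'MVP', '2024-07-30'),
        ('host_a', 'Phase 2', '2024-10-01'),
        ('host_b', 'Phase 2', '2024-10-01');
    """
)

# Keep one row per host: the one with the earliest deployment date.
rows = conn.execute(
    """
    select id_user_host, migration_phase, lower_limit_migration_date_utc
    from (
        select
            id_user_host,
            migration_phase,
            lower_limit_migration_date_utc,
            row_number() over (
                partition by id_user_host
                order by lower_limit_migration_date_utc asc
            ) as row_num
        from user_migration_from_claim
    ) as ranked
    where row_num = 1
    order by id_user_host
    """
).fetchall()

for row in rows:
    print(row)
```

Because the dates are ISO-formatted strings, ordering them as text matches chronological order here. This one-row-per-host result is what the `unique` test on `id_user_host` relies on.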


@ -2160,6 +2160,53 @@ models:
        description: |
          Date of the last time the information of the Host was updated
          in our systems.
      - name: is_user_migrated
        data_type: boolean
        description: |
          Flag to determine if this user host has been migrated according
          to the logic implemented in the user_migration table.
      - name: migration_phase
        data_type: string
        description: |
          The name of the phase in which this user was first migrated.
      - name: lower_limit_migration_date_utc
        data_type: date
        description: |
          The date on which the deployment of the migration happened.
          It does not necessarily mean that this user was migrated on
          this date, only that this user could not have been migrated
          before this date.
  - name: int_core__user_migration
    description: |
      This table provides information on the host users that have been migrated.
      At this stage, the main objective is to account for the user migration within
      the scope of the New Dashboard migration.
      It uses the migration configuration set in the macro:
      - user_migration_configuration -> get_new_dash_migration_phases_config
    columns:
      - name: id_user_host
        data_type: character varying
        description: The unique user ID for the Host.
        tests:
          - not_null
          - unique
      - name: migration_phase
        data_type: string
        description: |
          The name of the phase in which this user was first migrated.
        tests:
          - not_null
      - name: lower_limit_migration_date_utc
        data_type: date
        description: |
          The date on which the deployment of the migration happened.
          It does not necessarily mean that this user was migrated on
          this date, only that this user could not have been migrated
          before this date.
        tests:
          - not_null
  - name: int_core__address_validations
    description: