Merged PR 2743: Fixes deal-based issues on the billing country dimension

# Description

Before deploying KPIs by Billing Country, we spotted inflated volumes on every metric of the by-billing-country dimension that is based on Deal. The affected models are `int_core__mtd_deal_metrics` and `int_xero__mtd_invoicing_metrics`.

This PR changes the following:
* The 2 models above now depend on the `int_core__deal` model instead of `int_core__user_host`, which removes the duplicated rows.
* All models now use the main billing country at deal level instead of at host level. The reason is that a small number of hosts that share the same deal can have different billing countries. To avoid inconsistencies, everything relies on this simplification, which in general does not change the output much.
* To make this easy, the 3 main-billing-country-per-deal fields have been propagated to `int_core__user_host`.
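
As a minimal sketch of why the host-level join inflates deal-based metrics, consider the self-contained query below. The data, the CTE names (`deals`, `hosts`, `deal_metrics`) and the PostgreSQL-style `values` syntax are assumptions for illustration only; only the column names come from the models touched in this PR.

```sql
-- Hypothetical data: deal D1 is shared by two hosts billed in different countries.
with
    deals (id_deal, main_billing_country_iso_3_per_deal) as (
        values ('D1', 'GBR')
    ),
    hosts (id_user_host, id_deal, billing_country_iso_3) as (
        values ('H1', 'D1', 'GBR'), ('H2', 'D1', 'FRA')
    ),
    deal_metrics (id_deal, deals_booked_in_month) as (
        values ('D1', 1)
    )
-- Old behavior: one output row per host of the deal, so the deal is counted twice.
select
    'joined on host' as approach,
    h.billing_country_iso_3 as dimension_value,
    sum(m.deals_booked_in_month) as deals_booked
from deal_metrics m
inner join hosts h on m.id_deal = h.id_deal
group by h.billing_country_iso_3
union all
-- New behavior: one row per deal, so the deal is counted once.
select
    'joined on deal',
    d.main_billing_country_iso_3_per_deal,
    sum(m.deals_booked_in_month)
from deal_metrics m
inner join deals d on m.id_deal = d.id_deal
group by d.main_billing_country_iso_3_per_deal
order by 1, 2
```

With this toy data, the host-level join reports the single deal under both FRA and GBR, while the deal-level join reports it once under GBR.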

To illustrate the fix, here is a snapshot of the differences in behavior:

```sql
select
    dimension,
    sum(deals_booked_in_month) as deals_booked_1,
    sum(deals_booked_in_6_months) as deals_booked_6,
    sum(deals_booked_in_12_months) as deals_booked_12,
    sum(total_revenue_in_gbp) as total_revenue,
    sum(xero_operator_net_fees_in_gbp) as operator_revenue,
    sum(xero_booking_net_fees_in_gbp) as booking_fees,
    sum(xero_listing_net_fees_in_gbp) as listing_fees,
    sum(xero_verification_net_fees_in_gbp) as verification_fees,
    sum(total_guest_revenue_in_gbp) as guest_revenue,
    sum(xero_waiver_paid_back_to_host_in_gbp) as waiver_paid_back_to_hosts,
    sum(waiver_net_fees_in_gbp) as waiver_net_fees
from intermediate.int_mtd_vs_previous_year_metrics
where date in ('2024-01-31')
group by 1
order by 1
```
Production:
![image.png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2743/attachments/image.png)

vs.
Local:
![image (2).png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2743/attachments/image%20%282%29.png)

Keep in mind that the Global dimension can still be greater than the aggregate of any other dimension, since not all users have a deal. Mismatches between the other 2 dimensions might be linked to the dump.
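
The gap between Global and the by-billing-country aggregate can be sketched as follows. The data and CTE names are hypothetical and the syntax assumes a PostgreSQL-style dialect; the join condition mirrors the one used in the models of this PR.

```sql
-- Hypothetical data: host H3 has no deal, hence no main billing country.
with
    hosts (id_user_host, id_deal, main_billing_country_iso_3_per_deal) as (
        values
            ('H1', 'D1', 'GBR'),
            ('H2', 'D1', 'GBR'),
            ('H3', null, null)
    ),
    bookings (id_booking, id_user_host) as (
        values (1, 'H1'), (2, 'H2'), (3, 'H3')
    )
-- Global counts every booking, including the one from the host without a deal.
select
    'global' as dimension,
    'global' as dimension_value,
    count(*) as booking_count
from bookings b
union all
-- The inner join drops hosts whose main billing country is null.
select
    'by_billing_country',
    u.main_billing_country_iso_3_per_deal,
    count(*)
from bookings b
inner join hosts u
    on b.id_user_host = u.id_user_host
    and u.main_billing_country_iso_3_per_deal is not null
group by u.main_billing_country_iso_3_per_deal
```

Here Global counts 3 bookings while the billing-country rows add up to 2, because the booking from H3 has no deal to attach a country to.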

Commits are meaningful and help navigate the changes.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've run and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #20823
Oriol Roqué Paniagua 2024-09-05 09:53:16 +00:00
parent bc77e5df08
commit 435db55c1e
10 changed files with 49 additions and 26 deletions

View file

@ -10,7 +10,7 @@ Please note that strings should be encoded with " ' your_value_here ' ",
{% set dimensions = [
{"dimension": "'global'", "dimension_value": "'global'"},
{"dimension": "'by_number_of_listings'", "dimension_value": "active_accommodations_per_deal_segmentation"},
-{"dimension": "'by_billing_country'", "dimension_value": "billing_country_iso_3"}
+{"dimension": "'by_billing_country'", "dimension_value": "main_billing_country_iso_3_per_deal"}
] %}
{{ return(dimensions) }}
{% endmacro %}

View file

@ -100,7 +100,7 @@ with
inner join int_core__unique_accommodation_to_user atu
on atu.id_accommodation = al.id_accommodation
inner join int_core__user_host u on atu.id_user_owner = u.id_user_host
-and u.billing_country_iso_3 is not null
+and u.main_billing_country_iso_3_per_deal is not null
{% endif %}
group by 1, 2, 3
{% if not loop.last %}

View file

@ -38,7 +38,7 @@ with
and d.date = mas.date
{% elif dimension.dimension == "'by_billing_country'" %}
inner join int_core__user_host u on b.id_user_host = u.id_user_host
-and u.billing_country_iso_3 is not null
+and u.main_billing_country_iso_3_per_deal is not null
{% endif %}
group by 1, 2, 3
{% if not loop.last %}
@ -65,7 +65,7 @@ with
and d.date = mas.date
{% elif dimension.dimension == "'by_billing_country'" %}
inner join int_core__user_host u on b.id_user_host = u.id_user_host
-and u.billing_country_iso_3 is not null
+and u.main_billing_country_iso_3_per_deal is not null
{% endif %}
group by 1, 2, 3
{% if not loop.last %}
@ -93,7 +93,7 @@ with
and d.date = mas.date
{% elif dimension.dimension == "'by_billing_country'" %}
inner join int_core__user_host u on b.id_user_host = u.id_user_host
-and u.billing_country_iso_3 is not null
+and u.main_billing_country_iso_3_per_deal is not null
{% endif %}
group by 1, 2, 3
{% if not loop.last %}
@ -123,7 +123,7 @@ with
{% elif dimension.dimension == "'by_billing_country'" %}
inner join int_core__bookings b on b.id_booking = bce.id_booking
inner join int_core__user_host u on b.id_user_host = u.id_user_host
-and u.billing_country_iso_3 is not null
+and u.main_billing_country_iso_3_per_deal is not null
{% endif %}
group by 1, 2, 3
{% if not loop.last %}

View file

@ -17,8 +17,8 @@ with
int_core__mtd_accommodation_segmentation as (
select * from {{ ref("int_core__mtd_accommodation_segmentation") }}
),
-int_core__user_host as (
-select * from {{ ref("int_core__user_host") }}
+int_core__deal as (
+select * from {{ ref("int_core__deal") }}
),
deals_metric_aggregation_per_date as (
{% for dimension in dimensions %}
@ -91,8 +91,8 @@ with
on al.id_deal = mas.id_deal
and al.date = mas.date
{% elif dimension.dimension == "'by_billing_country'" %}
-inner join int_core__user_host u on al.id_deal = u.id_deal
-and u.billing_country_iso_3 is not null
+inner join int_core__deal ud on al.id_deal = ud.id_deal
+and ud.main_billing_country_iso_3_per_deal is not null
{% endif %}
group by 1, 2, 3
{% if not loop.last %}

View file

@ -52,7 +52,7 @@ with
{% elif dimension.dimension == "'by_billing_country'" %}
inner join int_core__user_host u
on vr.id_user_host = u.id_user_host
-and u.billing_country_iso_3 is not null
+and u.main_billing_country_iso_3_per_deal is not null
{% endif %}
group by 1, 2, 3
{% if not loop.last %}
@ -81,7 +81,7 @@ with
{% elif dimension.dimension == "'by_billing_country'" %}
inner join int_core__user_host u
on vr.id_user_host = u.id_user_host
-and u.billing_country_iso_3 is not null
+and u.main_billing_country_iso_3_per_deal is not null
{% endif %}
group by 1, 2, 3
{% if not loop.last %}
@ -110,7 +110,7 @@ with
{% elif dimension.dimension == "'by_billing_country'" %}
inner join int_core__user_host u
on vr.id_user_host = u.id_user_host
-and u.billing_country_iso_3 is not null
+and u.main_billing_country_iso_3_per_deal is not null
{% endif %}
group by 1, 2, 3
{% if not loop.last %}
@ -141,7 +141,7 @@ with
inner join int_core__verification_requests vr on vr.id_verification_request = p.id_verification_request
inner join int_core__user_host u
on vr.id_user_host = u.id_user_host
-and u.billing_country_iso_3 is not null
+and u.main_billing_country_iso_3_per_deal is not null
{% endif %}
group by 1, 2, 3
{% if not loop.last %}

View file

@ -72,7 +72,7 @@ with
and d.date = mas.date
{% elif dimension.dimension == "'by_billing_country'" %}
inner join int_core__user_host u on vp.id_user_host = u.id_user_host
-and u.billing_country_iso_3 is not null
+and u.main_billing_country_iso_3_per_deal is not null
{% endif %}
where upper(vp.payment_status) = {{ var("paid_payment_state") }}
group by 1, 2, 3

View file

@ -8,6 +8,7 @@ with
int_core__user_role as (select * from {{ ref("int_core__user_role") }}),
int_core__user_migration as (select * from {{ ref("int_core__user_migration") }}),
stg_core__claim as (select * from {{ ref("stg_core__claim") }}),
+int_core__deal as (select * from {{ ref("int_core__deal") }}),
-- A USER CAN HAVE MULTIPLE ROLES, THUS DISTINCT IS NEEDED TO AVOID DUPLICATES
users_with_host_roles as (
@ -41,6 +42,9 @@ select
uu.company_name,
uu.email,
uu.id_deal,
+d.main_billing_country_name_per_deal,
+d.main_billing_country_iso_2_per_deal,
+d.main_billing_country_iso_3_per_deal,
uu.joined_at_utc,
uu.joined_date_utc,
uu.created_date_utc,
@ -51,3 +55,4 @@ select
from int_core__unified_user uu
inner join unique_host_user uhu on uu.id_user = uhu.id_user
left join int_core__user_migration um on uu.id_user = um.id_user_host
+left join int_core__deal d on uu.id_deal = d.id_deal

View file

@ -2207,6 +2207,24 @@ models:
description: |
Main identifier of the B2B clients. A Deal can have multiple Hosts.
A Host can have only 1 Deal or no Deal at all. This field can be null.
+- name: main_billing_country_name_per_deal
+data_type: string
+description: |
+Name of the main country in which the Deal is billed.
+It's a simplification of the billing country that is common to all users
+that share the same Deal. It can be null.
+- name: main_billing_country_iso_2_per_deal
+data_type: string
+description: |
+ISO 3166-1 alpha-2 main country code in which the Deal is billed.
+It's a simplification of the billing country that is common to all users
+that share the same Deal. It can be null.
+- name: main_billing_country_iso_3_per_deal
+data_type: string
+description: |
+ISO 3166-1 alpha-3 main country code in which the Deal is billed.
+It's a simplification of the billing country that is common to all users
+that share the same Deal. It can be null.
- name: joined_at_utc
data_type: timestamp
description: |

View file

@ -31,7 +31,7 @@ with
{% elif dimension.dimension == "'by_billing_country'" %}
inner join int_core__user_host h
on d.date >= h.created_date_utc
-and h.billing_country_iso_3 is not null
+and h.main_billing_country_iso_3_per_deal is not null
{% endif %}
{% if not loop.last %}
union all

View file

@ -34,7 +34,7 @@ with
select * from {{ ref("int_dates_mtd_by_dimension") }}
),
int_xero__contacts as (select * from {{ ref("int_xero__contacts") }}),
-int_core__user_host as (select * from {{ ref("int_core__user_host") }}),
+int_core__deal as (select * from {{ ref("int_core__deal") }}),
resolution_host_payment as (
{% for dimension in dimensions %}
select
@ -66,9 +66,9 @@ with
{% elif dimension.dimension == "'by_billing_country'" %}
inner join int_xero__contacts c on c.id_contact = bt.id_contact
inner join
-int_core__user_host u
-on c.id_deal = u.id_deal
-and u.billing_country_iso_3 is not null
+int_core__deal ud
+on c.id_deal = ud.id_deal
+and ud.main_billing_country_iso_3_per_deal is not null
{% endif %}
group by 1, 2, 3
{% if not loop.last %}
@ -108,9 +108,9 @@ with
and d.date = mas.date
{% elif dimension.dimension == "'by_billing_country'" %}
inner join
-int_core__user_host u
-on sdm.id_deal = u.id_deal
-and u.billing_country_iso_3 is not null
+int_core__deal ud
+on sdm.id_deal = ud.id_deal
+and ud.main_billing_country_iso_3_per_deal is not null
{% endif %}
where
upper(sdm.document_status) in {{ relevant_document_statuses }}
@ -170,9 +170,9 @@ with
and d.date = mas.date
{% elif dimension.dimension == "'by_billing_country'" %}
inner join
-int_core__user_host u
-on sdm.id_deal = u.id_deal
-and u.billing_country_iso_3 is not null
+int_core__deal ud
+on sdm.id_deal = ud.id_deal
+and ud.main_billing_country_iso_3_per_deal is not null
{% endif %}
where
upper(sdm.document_status) in {{ relevant_document_statuses }}