data-dwh-dbt-project/models/intermediate/xero/int_xero__mtd_invoicing_metrics.sql

240 lines
9.7 KiB
MySQL
Raw Normal View History

/*
This model provides Month-To-Date (MTD) based on anything related to
invoicing metrics. This includes Operator (Host) fees, waiver payments,
APIs and Host Resolutions.
*/
Merged PR 2583: Invoicing metrics per customer segmentation # Description Silly change: * Modifies `int_core__mtd_guest_payments_metrics` to apply the proper key on date, dimension and dimension_value. * -> *The weird thing is that the previous dbt test I run worked well. Is it possible that the configuration in the schema file prevails on top of the model configuration? I thought it was the other way around...* Main changes: * Modifies `int_xero__mtd_invoicing_metrics` to include the customer segmentation based on listings. * `schema.yaml` is also affected including new fields and tests * Added the macro to retrieve the production dimension in `int_core__mtd_vs_previous_year_metrics` to avoid propagating this upwards and messing up with the data display. Overall, follows a similar strategy as we did for Booking, Guest Journey, Deal, Accommodation and Guest Payments metrics. For reference, [here's the previous PR on Guest Payments](https://guardhog.visualstudio.com/Data/_git/data-dwh-dbt-project/pullrequest/2580). This is the last PR on the source models for KPIs. Will follow: propagation + exposure # Checklist - [X] The edited models and dependants run properly with production data. * **Important note**: this segmentation provides null values for all API-related KPIs. Makes sense, since the 4 deal id we have for APIs do NOT have, or have had, a listing linked to them. I'd say it's not a blocker. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19325
2024-08-20 12:05:53 +00:00
{% set dimensions = get_kpi_dimensions() %}
{% set relevant_document_statuses = "('PAID', 'AUTHORISED')" %}
{% set booking_fee_items = "('EU BOOKING FEE','ZAR BOOKINGS','BOOKING FEE - NON-UK','USD BOOKINGS','CAD BOOKINGS','BOOKING FEE - UK','AUD BOOKINGS')" %}
{% set listing_fee_items = "('USD LISTINGS','LISTING FEE - NON UK','ZAR LISTINGS','CAD LISTINGS','LISTING FEE - UK','AUD LISTINGS','EU LISTINGS')" %}
{% set waiver_items = "('DAMAGE WAVER', 'DAMAGE WAIVER')" %}
{% set verification_fee_items = "('VERIFICATION FEE')" %}
{% set resolutions_host_payment_account_name = "('RESOLUTIONS - HOST PAYMENT')" %}
{% set e_deposit_account_name = "('E-DEPOSIT FEES')" %}
{% set guesty_account_name = "('GUESTY FEES', 'GUESTY ADMINISTRATION FEE')" %}
Merged PR 2583: Invoicing metrics per customer segmentation # Description Silly change: * Modifies `int_core__mtd_guest_payments_metrics` to apply the proper key on date, dimension and dimension_value. * -> *The weird thing is that the previous dbt test I run worked well. Is it possible that the configuration in the schema file prevails on top of the model configuration? I thought it was the other way around...* Main changes: * Modifies `int_xero__mtd_invoicing_metrics` to include the customer segmentation based on listings. * `schema.yaml` is also affected including new fields and tests * Added the macro to retrieve the production dimension in `int_core__mtd_vs_previous_year_metrics` to avoid propagating this upwards and messing up with the data display. Overall, follows a similar strategy as we did for Booking, Guest Journey, Deal, Accommodation and Guest Payments metrics. For reference, [here's the previous PR on Guest Payments](https://guardhog.visualstudio.com/Data/_git/data-dwh-dbt-project/pullrequest/2580). This is the last PR on the source models for KPIs. Will follow: propagation + exposure # Checklist - [X] The edited models and dependants run properly with production data. * **Important note**: this segmentation provides null values for all API-related KPIs. Makes sense, since the 4 deal id we have for APIs do NOT have, or have had, a listing linked to them. I'd say it's not a blocker. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19325
2024-08-20 12:05:53 +00:00
{{ config(materialized="table", unique_key=["date", "dimension", "dimension_value"]) }}
with
int_xero__sales_denom_mart as (
select * from {{ ref("int_xero__sales_denom_mart") }}
),
int_dates_mtd as (select * from {{ ref("int_dates_mtd") }}),
int_xero__bank_transaction_line_items as (
select * from {{ ref("int_xero__bank_transaction_line_items") }}
),
int_xero__bank_transactions as (
select * from {{ ref("int_xero__bank_transactions") }}
),
Merged PR 2583: Invoicing metrics per customer segmentation # Description Silly change: * Modifies `int_core__mtd_guest_payments_metrics` to apply the proper key on date, dimension and dimension_value. * -> *The weird thing is that the previous dbt test I run worked well. Is it possible that the configuration in the schema file prevails on top of the model configuration? I thought it was the other way around...* Main changes: * Modifies `int_xero__mtd_invoicing_metrics` to include the customer segmentation based on listings. * `schema.yaml` is also affected including new fields and tests * Added the macro to retrieve the production dimension in `int_core__mtd_vs_previous_year_metrics` to avoid propagating this upwards and messing up with the data display. Overall, follows a similar strategy as we did for Booking, Guest Journey, Deal, Accommodation and Guest Payments metrics. For reference, [here's the previous PR on Guest Payments](https://guardhog.visualstudio.com/Data/_git/data-dwh-dbt-project/pullrequest/2580). This is the last PR on the source models for KPIs. Will follow: propagation + exposure # Checklist - [X] The edited models and dependants run properly with production data. * **Important note**: this segmentation provides null values for all API-related KPIs. Makes sense, since the 4 deal id we have for APIs do NOT have, or have had, a listing linked to them. I'd say it's not a blocker. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19325
2024-08-20 12:05:53 +00:00
int_core__mtd_accommodation_segmentation as (
select * from {{ ref("int_core__mtd_accommodation_segmentation") }}
),
int_dates_mtd_by_dimension as (
select * from {{ ref("int_dates_mtd_by_dimension") }}
),
int_xero__contacts as (select * from {{ ref("int_xero__contacts") }}),
Merged PR 2743: Fixes deal-based issues on the billing country dimension # Description Before deploying KPIs by Billing Country, we spotted some issues that were basically increases on the volumes of any metric on the by billing country dimension that was based on Deal. This means, `int_core__mtd_deal_metrics` and `int_xero__mtd_invoicing_metrics`. This PR changes the following: * Now the 2 abovementioned models depend on the `int_core__deal` model, instead of `int_core__user_host` (thus removing duplicated stuff) * Now all models use the main billing country at deal level, instead of doing it so at host level. The reason is that some small amount of hosts that share the same deal can have a different billing country. To avoid weird stuff, everything points to this simplification - that in general, it's not a massive change in the output. * In order to do so easily, the 3 main billing country per deal fields have been propagated to `int_core__user_host` To exemplify the solution, find here a snapshot of the differences in behavior: ``` select dimension, sum(deals_booked_in_month) as deals_booked_1, sum(deals_booked_in_6_months) as deals_booked_6, sum(deals_booked_in_12_months) as deals_booked_12, sum(total_revenue_in_gbp) as total_revenue, sum(xero_operator_net_fees_in_gbp) as operator_revenue, sum(xero_booking_net_fees_in_gbp) as booking_fees, sum(xero_listing_net_fees_in_gbp) as listing_fees, sum(xero_verification_net_fees_in_gbp) as verification_fees, sum(total_guest_revenue_in_gbp) as guest_revenue, sum(xero_waiver_paid_back_to_host_in_gbp) as waiver_paid_back_to_hosts, sum(waiver_net_fees_in_gbp) as waiver_net_fees from intermediate.int_mtd_vs_previous_year_metrics where date in ('2024-01-31') group by 1 order by 1 ``` Production: ![image.png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2743/attachments/image.png) vs. Local: ![image (2).png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2743/attachments/image%20%282%29.png) Keep in mind that still Global dimension can be greater than any other dimension aggregated since not all users have a deal. Mismatches between the other 2 dimensions might be linked to the dump. Commits are meaningful and help navigate in the changes. # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #20823
2024-09-05 09:53:16 +00:00
int_core__deal as (select * from {{ ref("int_core__deal") }}),
resolution_host_payment as (
Merged PR 2583: Invoicing metrics per customer segmentation # Description Silly change: * Modifies `int_core__mtd_guest_payments_metrics` to apply the proper key on date, dimension and dimension_value. * -> *The weird thing is that the previous dbt test I run worked well. Is it possible that the configuration in the schema file prevails on top of the model configuration? I thought it was the other way around...* Main changes: * Modifies `int_xero__mtd_invoicing_metrics` to include the customer segmentation based on listings. * `schema.yaml` is also affected including new fields and tests * Added the macro to retrieve the production dimension in `int_core__mtd_vs_previous_year_metrics` to avoid propagating this upwards and messing up with the data display. Overall, follows a similar strategy as we did for Booking, Guest Journey, Deal, Accommodation and Guest Payments metrics. For reference, [here's the previous PR on Guest Payments](https://guardhog.visualstudio.com/Data/_git/data-dwh-dbt-project/pullrequest/2580). This is the last PR on the source models for KPIs. Will follow: propagation + exposure # Checklist - [X] The edited models and dependants run properly with production data. * **Important note**: this segmentation provides null values for all API-related KPIs. Makes sense, since the 4 deal id we have for APIs do NOT have, or have had, a listing linked to them. I'd say it's not a blocker. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19325
2024-08-20 12:05:53 +00:00
{% for dimension in dimensions %}
select
d.date,
{{ dimension.dimension }} as dimension,
{{ dimension.dimension_value }} as dimension_value,
sum(
btli.line_amount_wo_taxes_in_gbp
) as xero_host_resolution_amount_paid_in_gbp,
count(
distinct bt.id_bank_transaction
) as xero_host_resolution_payment_count
from int_dates_mtd d
inner join
int_xero__bank_transactions bt
on date_trunc('month', bt.transaction_date_utc)::date = d.first_day_month
and extract(day from bt.transaction_date_utc) <= d.day
inner join
int_xero__bank_transaction_line_items btli
on bt.id_bank_transaction = btli.id_bank_transaction
and upper(btli.account_name)
in {{ resolutions_host_payment_account_name }}
{% if dimension.dimension == "'by_number_of_listings'" %}
inner join int_xero__contacts c on c.id_contact = bt.id_contact
inner join
int_core__mtd_accommodation_segmentation mas
on c.id_deal = mas.id_deal
and d.date = mas.date
Merged PR 2689: KPIs by Billing Country # Description Adds Billing Country dimension in KPIs, but does not expose them to reporting yet. Silly thing, based on the macros I built, I cannot make incremental changes unless changing all models. This will need to be adapted, happy to hear your thoughts on how we do it. Additionally, I have lack of performance of the model `mtd_guest_payments_metrics`. It takes around 5 min to execute, but technically the end-to-end runs in one shoot without breaking. It's a complex PR because it changes many files, but you will see that: * It mostly changes the join conditions for the dimensions or the schema tests, * I tried to be very careful and add things step-by-step in the commits. Goal is NOT to complete the PR yet until we see how we can improve performance. I can say though that data end-to-end looks ok to me, but would benefit from checking with production data for the new dimension Update 30th Aug * Added a new commit that includes `id_user_host` in `int_core__verification_payments`. Happy to discuss if it makes sense or not. But it changes the execution from ~600 sec to ~6 sec because it avoids a massive repeated join with `verification_requests`. # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [ ] I've picked the right materialization for the affected models. **To check because of performance issues** # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19082
2024-09-04 10:17:12 +00:00
{% elif dimension.dimension == "'by_billing_country'" %}
inner join int_xero__contacts c on c.id_contact = bt.id_contact
inner join
Merged PR 2743: Fixes deal-based issues on the billing country dimension # Description Before deploying KPIs by Billing Country, we spotted some issues that were basically increases on the volumes of any metric on the by billing country dimension that was based on Deal. This means, `int_core__mtd_deal_metrics` and `int_xero__mtd_invoicing_metrics`. This PR changes the following: * Now the 2 abovementioned models depend on the `int_core__deal` model, instead of `int_core__user_host` (thus removing duplicated stuff) * Now all models use the main billing country at deal level, instead of doing it so at host level. The reason is that some small amount of hosts that share the same deal can have a different billing country. To avoid weird stuff, everything points to this simplification - that in general, it's not a massive change in the output. * In order to do so easily, the 3 main billing country per deal fields have been propagated to `int_core__user_host` To exemplify the solution, find here a snapshot of the differences in behavior: ``` select dimension, sum(deals_booked_in_month) as deals_booked_1, sum(deals_booked_in_6_months) as deals_booked_6, sum(deals_booked_in_12_months) as deals_booked_12, sum(total_revenue_in_gbp) as total_revenue, sum(xero_operator_net_fees_in_gbp) as operator_revenue, sum(xero_booking_net_fees_in_gbp) as booking_fees, sum(xero_listing_net_fees_in_gbp) as listing_fees, sum(xero_verification_net_fees_in_gbp) as verification_fees, sum(total_guest_revenue_in_gbp) as guest_revenue, sum(xero_waiver_paid_back_to_host_in_gbp) as waiver_paid_back_to_hosts, sum(waiver_net_fees_in_gbp) as waiver_net_fees from intermediate.int_mtd_vs_previous_year_metrics where date in ('2024-01-31') group by 1 order by 1 ``` Production: ![image.png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2743/attachments/image.png) vs. Local: ![image (2).png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2743/attachments/image%20%282%29.png) Keep in mind that still Global dimension can be greater than any other dimension aggregated since not all users have a deal. Mismatches between the other 2 dimensions might be linked to the dump. Commits are meaningful and help navigate in the changes. # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #20823
2024-09-05 09:53:16 +00:00
int_core__deal ud
on c.id_deal = ud.id_deal
and ud.main_billing_country_iso_3_per_deal is not null
Merged PR 2583: Invoicing metrics per customer segmentation # Description Silly change: * Modifies `int_core__mtd_guest_payments_metrics` to apply the proper key on date, dimension and dimension_value. * -> *The weird thing is that the previous dbt test I run worked well. Is it possible that the configuration in the schema file prevails on top of the model configuration? I thought it was the other way around...* Main changes: * Modifies `int_xero__mtd_invoicing_metrics` to include the customer segmentation based on listings. * `schema.yaml` is also affected including new fields and tests * Added the macro to retrieve the production dimension in `int_core__mtd_vs_previous_year_metrics` to avoid propagating this upwards and messing up with the data display. Overall, follows a similar strategy as we did for Booking, Guest Journey, Deal, Accommodation and Guest Payments metrics. For reference, [here's the previous PR on Guest Payments](https://guardhog.visualstudio.com/Data/_git/data-dwh-dbt-project/pullrequest/2580). This is the last PR on the source models for KPIs. Will follow: propagation + exposure # Checklist - [X] The edited models and dependants run properly with production data. * **Important note**: this segmentation provides null values for all API-related KPIs. Makes sense, since the 4 deal id we have for APIs do NOT have, or have had, a listing linked to them. I'd say it's not a blocker. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19325
2024-08-20 12:05:53 +00:00
{% endif %}
group by 1, 2, 3
{% if not loop.last %}
union all
{% endif %}
{% endfor %}
),
apis_net_fees as (
Merged PR 2583: Invoicing metrics per customer segmentation # Description Silly change: * Modifies `int_core__mtd_guest_payments_metrics` to apply the proper key on date, dimension and dimension_value. * -> *The weird thing is that the previous dbt test I run worked well. Is it possible that the configuration in the schema file prevails on top of the model configuration? I thought it was the other way around...* Main changes: * Modifies `int_xero__mtd_invoicing_metrics` to include the customer segmentation based on listings. * `schema.yaml` is also affected including new fields and tests * Added the macro to retrieve the production dimension in `int_core__mtd_vs_previous_year_metrics` to avoid propagating this upwards and messing up with the data display. Overall, follows a similar strategy as we did for Booking, Guest Journey, Deal, Accommodation and Guest Payments metrics. For reference, [here's the previous PR on Guest Payments](https://guardhog.visualstudio.com/Data/_git/data-dwh-dbt-project/pullrequest/2580). This is the last PR on the source models for KPIs. Will follow: propagation + exposure # Checklist - [X] The edited models and dependants run properly with production data. * **Important note**: this segmentation provides null values for all API-related KPIs. Makes sense, since the 4 deal id we have for APIs do NOT have, or have had, a listing linked to them. I'd say it's not a blocker. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19325
2024-08-20 12:05:53 +00:00
{% for dimension in dimensions %}
select
d.date,
{{ dimension.dimension }} as dimension,
{{ dimension.dimension_value }} as dimension_value,
sum(
case
when upper(sdm.account_name) in {{ e_deposit_account_name }}
then sdm.line_amount_wo_taxes_in_gbp
else 0
end
) as xero_e_deposit_net_fees_in_gbp,
sum(
case
when upper(sdm.account_name) in {{ guesty_account_name }}
then sdm.line_amount_wo_taxes_in_gbp
else 0
end
) as xero_guesty_net_fees_in_gbp
from int_dates_mtd d
inner join
int_xero__sales_denom_mart sdm
on date_trunc('month', sdm.document_issued_date_utc) = d.first_day_month
and extract(day from sdm.document_issued_date_utc) <= d.day
{% if dimension.dimension == "'by_number_of_listings'" %}
inner join
int_core__mtd_accommodation_segmentation mas
on sdm.id_deal = mas.id_deal
and d.date = mas.date
Merged PR 2689: KPIs by Billing Country # Description Adds Billing Country dimension in KPIs, but does not expose them to reporting yet. Silly thing, based on the macros I built, I cannot make incremental changes unless changing all models. This will need to be adapted, happy to hear your thoughts on how we do it. Additionally, I have lack of performance of the model `mtd_guest_payments_metrics`. It takes around 5 min to execute, but technically the end-to-end runs in one shoot without breaking. It's a complex PR because it changes many files, but you will see that: * It mostly changes the join conditions for the dimensions or the schema tests, * I tried to be very careful and add things step-by-step in the commits. Goal is NOT to complete the PR yet until we see how we can improve performance. I can say though that data end-to-end looks ok to me, but would benefit from checking with production data for the new dimension Update 30th Aug * Added a new commit that includes `id_user_host` in `int_core__verification_payments`. Happy to discuss if it makes sense or not. But it changes the execution from ~600 sec to ~6 sec because it avoids a massive repeated join with `verification_requests`. # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [ ] I've picked the right materialization for the affected models. **To check because of performance issues** # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19082
2024-09-04 10:17:12 +00:00
{% elif dimension.dimension == "'by_billing_country'" %}
inner join
Merged PR 2743: Fixes deal-based issues on the billing country dimension # Description Before deploying KPIs by Billing Country, we spotted some issues that were basically increases on the volumes of any metric on the by billing country dimension that was based on Deal. This means, `int_core__mtd_deal_metrics` and `int_xero__mtd_invoicing_metrics`. This PR changes the following: * Now the 2 abovementioned models depend on the `int_core__deal` model, instead of `int_core__user_host` (thus removing duplicated stuff) * Now all models use the main billing country at deal level, instead of doing it so at host level. The reason is that some small amount of hosts that share the same deal can have a different billing country. To avoid weird stuff, everything points to this simplification - that in general, it's not a massive change in the output. * In order to do so easily, the 3 main billing country per deal fields have been propagated to `int_core__user_host` To exemplify the solution, find here a snapshot of the differences in behavior: ``` select dimension, sum(deals_booked_in_month) as deals_booked_1, sum(deals_booked_in_6_months) as deals_booked_6, sum(deals_booked_in_12_months) as deals_booked_12, sum(total_revenue_in_gbp) as total_revenue, sum(xero_operator_net_fees_in_gbp) as operator_revenue, sum(xero_booking_net_fees_in_gbp) as booking_fees, sum(xero_listing_net_fees_in_gbp) as listing_fees, sum(xero_verification_net_fees_in_gbp) as verification_fees, sum(total_guest_revenue_in_gbp) as guest_revenue, sum(xero_waiver_paid_back_to_host_in_gbp) as waiver_paid_back_to_hosts, sum(waiver_net_fees_in_gbp) as waiver_net_fees from intermediate.int_mtd_vs_previous_year_metrics where date in ('2024-01-31') group by 1 order by 1 ``` Production: ![image.png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2743/attachments/image.png) vs. Local: ![image (2).png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2743/attachments/image%20%282%29.png) Keep in mind that still Global dimension can be greater than any other dimension aggregated since not all users have a deal. Mismatches between the other 2 dimensions might be linked to the dump. Commits are meaningful and help navigate in the changes. # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #20823
2024-09-05 09:53:16 +00:00
int_core__deal ud
on sdm.id_deal = ud.id_deal
and ud.main_billing_country_iso_3_per_deal is not null
Merged PR 2583: Invoicing metrics per customer segmentation # Description Silly change: * Modifies `int_core__mtd_guest_payments_metrics` to apply the proper key on date, dimension and dimension_value. * -> *The weird thing is that the previous dbt test I run worked well. Is it possible that the configuration in the schema file prevails on top of the model configuration? I thought it was the other way around...* Main changes: * Modifies `int_xero__mtd_invoicing_metrics` to include the customer segmentation based on listings. * `schema.yaml` is also affected including new fields and tests * Added the macro to retrieve the production dimension in `int_core__mtd_vs_previous_year_metrics` to avoid propagating this upwards and messing up with the data display. Overall, follows a similar strategy as we did for Booking, Guest Journey, Deal, Accommodation and Guest Payments metrics. For reference, [here's the previous PR on Guest Payments](https://guardhog.visualstudio.com/Data/_git/data-dwh-dbt-project/pullrequest/2580). This is the last PR on the source models for KPIs. Will follow: propagation + exposure # Checklist - [X] The edited models and dependants run properly with production data. * **Important note**: this segmentation provides null values for all API-related KPIs. Makes sense, since the 4 deal id we have for APIs do NOT have, or have had, a listing linked to them. I'd say it's not a blocker. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19325
2024-08-20 12:05:53 +00:00
{% endif %}
where
upper(sdm.document_status) in {{ relevant_document_statuses }}
and (
upper(sdm.account_name) in {{ e_deposit_account_name }}
or upper(sdm.account_name) in {{ guesty_account_name }}
)
group by 1, 2, 3
{% if not loop.last %}
union all
{% endif %}
{% endfor %}
),
host_net_fees as (
Merged PR 2583: Invoicing metrics per customer segmentation # Description Silly change: * Modifies `int_core__mtd_guest_payments_metrics` to apply the proper key on date, dimension and dimension_value. * -> *The weird thing is that the previous dbt test I run worked well. Is it possible that the configuration in the schema file prevails on top of the model configuration? I thought it was the other way around...* Main changes: * Modifies `int_xero__mtd_invoicing_metrics` to include the customer segmentation based on listings. * `schema.yaml` is also affected including new fields and tests * Added the macro to retrieve the production dimension in `int_core__mtd_vs_previous_year_metrics` to avoid propagating this upwards and messing up with the data display. Overall, follows a similar strategy as we did for Booking, Guest Journey, Deal, Accommodation and Guest Payments metrics. For reference, [here's the previous PR on Guest Payments](https://guardhog.visualstudio.com/Data/_git/data-dwh-dbt-project/pullrequest/2580). This is the last PR on the source models for KPIs. Will follow: propagation + exposure # Checklist - [X] The edited models and dependants run properly with production data. * **Important note**: this segmentation provides null values for all API-related KPIs. Makes sense, since the 4 deal id we have for APIs do NOT have, or have had, a listing linked to them. I'd say it's not a blocker. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19325
2024-08-20 12:05:53 +00:00
{% for dimension in dimensions %}
select
d.date,
{{ dimension.dimension }} as dimension,
{{ dimension.dimension_value }} as dimension_value,
sum(
case
when upper(sdm.item_code) in {{ booking_fee_items }}
then sdm.line_amount_wo_taxes_in_gbp
else 0
end
) as xero_booking_net_fees_in_gbp,
sum(
case
when upper(sdm.item_code) in {{ listing_fee_items }}
then sdm.line_amount_wo_taxes_in_gbp
else 0
end
) as xero_listing_net_fees_in_gbp,
sum(
case
when upper(sdm.item_code) in {{ verification_fee_items }}
then sdm.line_amount_wo_taxes_in_gbp
else 0
end
) as xero_verification_net_fees_in_gbp,
sum(
case
when upper(sdm.item_code) in {{ waiver_items }}
then sdm.line_amount_wo_taxes_in_gbp
else 0
end
) as xero_waiver_paid_back_to_host_in_gbp
from int_dates_mtd d
inner join
int_xero__sales_denom_mart sdm
on date_trunc('month', sdm.document_issued_date_utc) = d.first_day_month
and extract(day from sdm.document_issued_date_utc) <= d.day
{% if dimension.dimension == "'by_number_of_listings'" %}
inner join
int_core__mtd_accommodation_segmentation mas
on sdm.id_deal = mas.id_deal
and d.date = mas.date
Merged PR 2689: KPIs by Billing Country # Description Adds Billing Country dimension in KPIs, but does not expose them to reporting yet. Silly thing, based on the macros I built, I cannot make incremental changes unless changing all models. This will need to be adapted, happy to hear your thoughts on how we do it. Additionally, I have lack of performance of the model `mtd_guest_payments_metrics`. It takes around 5 min to execute, but technically the end-to-end runs in one shoot without breaking. It's a complex PR because it changes many files, but you will see that: * It mostly changes the join conditions for the dimensions or the schema tests, * I tried to be very careful and add things step-by-step in the commits. Goal is NOT to complete the PR yet until we see how we can improve performance. I can say though that data end-to-end looks ok to me, but would benefit from checking with production data for the new dimension Update 30th Aug * Added a new commit that includes `id_user_host` in `int_core__verification_payments`. Happy to discuss if it makes sense or not. But it changes the execution from ~600 sec to ~6 sec because it avoids a massive repeated join with `verification_requests`. # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [ ] I've picked the right materialization for the affected models. **To check because of performance issues** # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19082
2024-09-04 10:17:12 +00:00
{% elif dimension.dimension == "'by_billing_country'" %}
inner join
Merged PR 2743: Fixes deal-based issues on the billing country dimension # Description Before deploying KPIs by Billing Country, we spotted some issues that were basically increases on the volumes of any metric on the by billing country dimension that was based on Deal. This means, `int_core__mtd_deal_metrics` and `int_xero__mtd_invoicing_metrics`. This PR changes the following: * Now the 2 abovementioned models depend on the `int_core__deal` model, instead of `int_core__user_host` (thus removing duplicated stuff) * Now all models use the main billing country at deal level, instead of doing it so at host level. The reason is that some small amount of hosts that share the same deal can have a different billing country. To avoid weird stuff, everything points to this simplification - that in general, it's not a massive change in the output. * In order to do so easily, the 3 main billing country per deal fields have been propagated to `int_core__user_host` To exemplify the solution, find here a snapshot of the differences in behavior: ``` select dimension, sum(deals_booked_in_month) as deals_booked_1, sum(deals_booked_in_6_months) as deals_booked_6, sum(deals_booked_in_12_months) as deals_booked_12, sum(total_revenue_in_gbp) as total_revenue, sum(xero_operator_net_fees_in_gbp) as operator_revenue, sum(xero_booking_net_fees_in_gbp) as booking_fees, sum(xero_listing_net_fees_in_gbp) as listing_fees, sum(xero_verification_net_fees_in_gbp) as verification_fees, sum(total_guest_revenue_in_gbp) as guest_revenue, sum(xero_waiver_paid_back_to_host_in_gbp) as waiver_paid_back_to_hosts, sum(waiver_net_fees_in_gbp) as waiver_net_fees from intermediate.int_mtd_vs_previous_year_metrics where date in ('2024-01-31') group by 1 order by 1 ``` Production: ![image.png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2743/attachments/image.png) vs. Local: ![image (2).png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2743/attachments/image%20%282%29.png) Keep in mind that still Global dimension can be greater than any other dimension aggregated since not all users have a deal. Mismatches between the other 2 dimensions might be linked to the dump. Commits are meaningful and help navigate in the changes. # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #20823
2024-09-05 09:53:16 +00:00
int_core__deal ud
on sdm.id_deal = ud.id_deal
and ud.main_billing_country_iso_3_per_deal is not null
Merged PR 2583: Invoicing metrics per customer segmentation # Description Silly change: * Modifies `int_core__mtd_guest_payments_metrics` to apply the proper key on date, dimension and dimension_value. * -> *The weird thing is that the previous dbt test I run worked well. Is it possible that the configuration in the schema file prevails on top of the model configuration? I thought it was the other way around...* Main changes: * Modifies `int_xero__mtd_invoicing_metrics` to include the customer segmentation based on listings. * `schema.yaml` is also affected including new fields and tests * Added the macro to retrieve the production dimension in `int_core__mtd_vs_previous_year_metrics` to avoid propagating this upwards and messing up with the data display. Overall, follows a similar strategy as we did for Booking, Guest Journey, Deal, Accommodation and Guest Payments metrics. For reference, [here's the previous PR on Guest Payments](https://guardhog.visualstudio.com/Data/_git/data-dwh-dbt-project/pullrequest/2580). This is the last PR on the source models for KPIs. Will follow: propagation + exposure # Checklist - [X] The edited models and dependants run properly with production data. * **Important note**: this segmentation provides null values for all API-related KPIs. Makes sense, since the 4 deal id we have for APIs do NOT have, or have had, a listing linked to them. I'd say it's not a blocker. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19325
2024-08-20 12:05:53 +00:00
{% endif %}
where
upper(sdm.document_status) in {{ relevant_document_statuses }}
and (
upper(sdm.item_code) in {{ booking_fee_items }}
or upper(sdm.item_code) in {{ listing_fee_items }}
or upper(sdm.item_code) in {{ verification_fee_items }}
or upper(sdm.item_code) in {{ waiver_items }}
)
group by 1, 2, 3
{% if not loop.last %}
union all
{% endif %}
{% endfor %}
)
-- Final aggregation of subqueries --
select
d.year,
d.month,
d.day,
d.date,
Merged PR 2583: Invoicing metrics per customer segmentation # Description Silly change: * Modifies `int_core__mtd_guest_payments_metrics` to apply the proper key on date, dimension and dimension_value. * -> *The weird thing is that the previous dbt test I run worked well. Is it possible that the configuration in the schema file prevails on top of the model configuration? I thought it was the other way around...* Main changes: * Modifies `int_xero__mtd_invoicing_metrics` to include the customer segmentation based on listings. * `schema.yaml` is also affected including new fields and tests * Added the macro to retrieve the production dimension in `int_core__mtd_vs_previous_year_metrics` to avoid propagating this upwards and messing up with the data display. Overall, follows a similar strategy as we did for Booking, Guest Journey, Deal, Accommodation and Guest Payments metrics. For reference, [here's the previous PR on Guest Payments](https://guardhog.visualstudio.com/Data/_git/data-dwh-dbt-project/pullrequest/2580). This is the last PR on the source models for KPIs. Will follow: propagation + exposure # Checklist - [X] The edited models and dependants run properly with production data. * **Important note**: this segmentation provides null values for all API-related KPIs. Makes sense, since the 4 deal id we have for APIs do NOT have, or have had, a listing linked to them. I'd say it's not a blocker. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19325
2024-08-20 12:05:53 +00:00
d.dimension,
d.dimension_value,
d.is_end_of_month,
d.is_current_month,
-- HOST/OPERATOR --
hnf.xero_booking_net_fees_in_gbp,
hnf.xero_listing_net_fees_in_gbp,
hnf.xero_verification_net_fees_in_gbp,
nullif(
coalesce(hnf.xero_booking_net_fees_in_gbp, 0)
+ coalesce(hnf.xero_listing_net_fees_in_gbp, 0)
+ coalesce(hnf.xero_verification_net_fees_in_gbp, 0),
0
) as xero_operator_net_fees_in_gbp,
-- APIs --
anf.xero_e_deposit_net_fees_in_gbp,
anf.xero_guesty_net_fees_in_gbp,
nullif(
coalesce(anf.xero_e_deposit_net_fees_in_gbp, 0)
+ coalesce(anf.xero_guesty_net_fees_in_gbp, 0),
0
) as xero_apis_net_fees_in_gbp,
-- WAIVERS PAID BACK TO HOST --
hnf.xero_waiver_paid_back_to_host_in_gbp,
-- HOST RESOLUTIONS --
rhp.xero_host_resolution_amount_paid_in_gbp,
rhp.xero_host_resolution_payment_count
Merged PR 2583: Invoicing metrics per customer segmentation # Description Silly change: * Modifies `int_core__mtd_guest_payments_metrics` to apply the proper key on date, dimension and dimension_value. * -> *The weird thing is that the previous dbt test I run worked well. Is it possible that the configuration in the schema file prevails on top of the model configuration? I thought it was the other way around...* Main changes: * Modifies `int_xero__mtd_invoicing_metrics` to include the customer segmentation based on listings. * `schema.yaml` is also affected including new fields and tests * Added the macro to retrieve the production dimension in `int_core__mtd_vs_previous_year_metrics` to avoid propagating this upwards and messing up with the data display. Overall, follows a similar strategy as we did for Booking, Guest Journey, Deal, Accommodation and Guest Payments metrics. For reference, [here's the previous PR on Guest Payments](https://guardhog.visualstudio.com/Data/_git/data-dwh-dbt-project/pullrequest/2580). This is the last PR on the source models for KPIs. Will follow: propagation + exposure # Checklist - [X] The edited models and dependants run properly with production data. * **Important note**: this segmentation provides null values for all API-related KPIs. Makes sense, since the 4 deal id we have for APIs do NOT have, or have had, a listing linked to them. I'd say it's not a blocker. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19325
2024-08-20 12:05:53 +00:00
from int_dates_mtd_by_dimension d
left join
resolution_host_payment rhp
on rhp.date = d.date
and rhp.dimension = d.dimension
and rhp.dimension_value = d.dimension_value
left join
apis_net_fees anf
on anf.date = d.date
and anf.dimension = d.dimension
and anf.dimension_value = d.dimension_value
left join
host_net_fees hnf
on hnf.date = d.date
and hnf.dimension = d.dimension
and hnf.dimension_value = d.dimension_value