From 901be930df108fdd5b63f637c01311df0a218b4c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Oriol=20Roqu=C3=A9=20Paniagua?= Date: Tue, 15 Oct 2024 06:51:41 +0000 Subject: [PATCH] Merged PR 3163: First version of 12m window contribution by deal # Description This PR creates a new model that depends on int_monthly_aggregated_metrics_history_by_deal. The idea is that this is used for Churn computation (Booking Churn, Revenue Churn, Listing Churn) later on. The idea is relatively simple. Measure how much a Deal has been contributing to a Global amount (sum of metric for all deals) over the preceding period of 12 months. You will notice that there's 2 computations, the "additive" and the "average" one. This is because we still need to align with Matt/Suzannah on which approach makes more sense, but we need data for it. I'm not sure the namings are good though so happy to see your suggestions. You will also notice that there's no filter by deal_lifecycle_state = '06-Churning'. This will be done in a separated model, whenever we attribute this model to the mtd computation. The reason is simple - this model stays at deal level, thus meaning we can do the dimension aggregation and even a lifecycle aggregation if needed, depending on the needs. Be aware that this effectively means that MTD KPIs models will depend on the "monthly by deal" models. This has some cons in terms of dependency management but cannot be overcome since we the metric total revenue depends on many subsets. In essence, I don't see another way of doing it unless doing a massive KPIs refactor. I prefer to wait until the Product KPIs discussions are finished and then we see how we approach it. # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #22691 --- ...onthly_12m_window_contribution_by_deal.sql | 140 ++++++++++++++++ models/intermediate/cross/schema.yml | 156 ++++++++++++++++++ 2 files changed, 296 insertions(+) create mode 100644 models/intermediate/cross/int_monthly_12m_window_contribution_by_deal.sql diff --git a/models/intermediate/cross/int_monthly_12m_window_contribution_by_deal.sql b/models/intermediate/cross/int_monthly_12m_window_contribution_by_deal.sql new file mode 100644 index 0000000..89b983d --- /dev/null +++ b/models/intermediate/cross/int_monthly_12m_window_contribution_by_deal.sql @@ -0,0 +1,140 @@ +{{ config(materialized="table", unique_key=["date", "id_deal"]) }} +with + int_monthly_aggregated_metrics_history_by_deal as ( + select * from {{ ref("int_monthly_aggregated_metrics_history_by_deal") }} + ), + aggregated_metrics_by_deal_over_12_months as ( + select + am.date, + am.id_deal, + am.deal_lifecycle_state, + count(*) over ( + partition by am.id_deal + order by am.date + rows between 12 preceding and 1 preceding + ) as preceding_months_count_by_deal, + sum(am.total_revenue_in_gbp) over ( + partition by am.id_deal + order by am.date + rows between 12 preceding and 1 preceding + ) as sum_total_revenue_preceding_12_months, + sum(am.created_bookings) over ( + partition by am.id_deal + order by am.date + rows between 12 preceding and 1 preceding + ) as sum_created_bookings_preceding_12_months, + sum(am.listings_booked_in_month) over ( + partition by am.id_deal + order by am.date + rows between 12 preceding and 1 preceding + ) as sum_listings_booked_in_month_preceding_12_months + from int_monthly_aggregated_metrics_history_by_deal am + ), + global_computation_over_12_months as ( + select + ambd.id_deal, + ambd.date, + ambd.deal_lifecycle_state, + + -- Amount of active months per deal used in the computation + ambd.preceding_months_count_by_deal, + case + when ambd.preceding_months_count_by_deal < 12 then true else false + end as has_deal_been_created_less_than_12_months_ago, + + -- ADDITIVE APPROACH -- + -- 12 months window sum of metrics + ambd.sum_total_revenue_preceding_12_months, + ambd.sum_created_bookings_preceding_12_months, + ambd.sum_listings_booked_in_month_preceding_12_months, + + -- Global amounts by sum of the 12 months window sum of metric on a given + -- date + sum(ambd.sum_total_revenue_preceding_12_months) over ( + partition by ambd.date + ) as global_total_revenue_preceding_12_months, + sum(ambd.sum_created_bookings_preceding_12_months) over ( + partition by ambd.date + ) as global_created_bookings_preceding_12_months, + sum(ambd.sum_listings_booked_in_month_preceding_12_months) over ( + partition by ambd.date + ) as global_listings_booked_in_month_preceding_12_months, + + -- AVERAGE APPROACH -- + -- 12 months window sum of metric divided by amount of active months per + -- deal + ambd.sum_total_revenue_preceding_12_months + / ambd.preceding_months_count_by_deal + as avg_total_revenue_preceding_12_months, + ambd.sum_created_bookings_preceding_12_months + / ambd.preceding_months_count_by_deal + as avg_created_bookings_preceding_12_months, + ambd.sum_listings_booked_in_month_preceding_12_months + / ambd.preceding_months_count_by_deal + as avg_listings_booked_in_month_preceding_12_months, + + -- Global amounts by sum of the 12 months window sum of metric on a given + -- date + sum( + ambd.sum_total_revenue_preceding_12_months + / ambd.preceding_months_count_by_deal + ) over (partition by ambd.date) + as avg_global_total_revenue_preceding_12_months, + sum( + ambd.sum_created_bookings_preceding_12_months + / ambd.preceding_months_count_by_deal + ) over (partition by ambd.date) + as avg_global_created_bookings_preceding_12_months, + sum( + ambd.sum_listings_booked_in_month_preceding_12_months + / ambd.preceding_months_count_by_deal + ) over (partition by ambd.date) + as avg_global_listings_booked_in_month_preceding_12_months + + from aggregated_metrics_by_deal_over_12_months ambd + ) +select + id_deal, + date, + deal_lifecycle_state, + + -- Amount of active months per deal used in the computation + preceding_months_count_by_deal, + + -- ADDITIVE APPROACH -- + -- 12 months window sum of metrics + coalesce( + sum_total_revenue_preceding_12_months + / nullif(global_total_revenue_preceding_12_months, 0), + 0 + ) as total_revenue_12m_additive_contribution, + coalesce( + sum_created_bookings_preceding_12_months + / nullif(global_created_bookings_preceding_12_months, 0), + 0 + ) as created_bookings_12m_additive_contribution, + coalesce( + sum_listings_booked_in_month_preceding_12_months + / nullif(global_listings_booked_in_month_preceding_12_months, 0), + 0 + ) as listings_booked_in_month_12m_additive_contribution, + + -- AVERAGE APPROACH -- + -- 12 months window sum of metric divided by amount of active months per deal + coalesce( + avg_total_revenue_preceding_12_months + / nullif(avg_global_total_revenue_preceding_12_months, 0), + 0 + ) as total_revenue_12m_average_contribution, + coalesce( + avg_created_bookings_preceding_12_months + / nullif(avg_global_created_bookings_preceding_12_months, 0), + 0 + ) as created_bookings_12m_average_contribution, + coalesce( + avg_listings_booked_in_month_preceding_12_months + / nullif(avg_global_listings_booked_in_month_preceding_12_months, 0), + 0 + ) as listings_booked_in_month_12m_average_contribution + +from global_computation_over_12_months diff --git a/models/intermediate/cross/schema.yml b/models/intermediate/cross/schema.yml index cc92a22..ae73fe6 100644 --- a/models/intermediate/cross/schema.yml +++ b/models/intermediate/cross/schema.yml @@ -1106,3 +1106,159 @@ models: - WINNER - TOP WINNER - UNSET + + - name: int_monthly_12m_window_contribution_by_deal + description: | + The main goal of this model is to provide how much a deal + contributes to a given metric on the global amount over a + period of 12 months. + + At the moment, this is only done for 3 metrics: + - total_revenue_in_gbp + - created_bookings + - listings_booked_in_month + + There's 2 different contribution methods, the additive and + the average approach. The additive approach will "penalise" + accounts that have been active for less than 12 months on the + contribution amount, while the average approach "boosts" the + contribution of those accounts that have been active for less + than 12 months. + + Mathematically speaking, it is encouraged to use the additive + approach ;-) + + Find below the details of the computation: + - Additive approach: + Over a period of 12 months, sum the value of a given a metric + for each deal. Sum all the deals to get a global. Divide the + deal value vs. the global one. + - Average approach: + Over a period of 12 months, sum the value of a given a metric + for each deal, and divide it by the amount of months we're considering + for that deal. Sum all the average amounts per deals to get a global. + Divide the avg per deal value vs. the sum of avgs global one. + + tests: + - dbt_utils.unique_combination_of_columns: + combination_of_columns: + - date + - id_deal + + columns: + - name: date + data_type: date + description: | + Date corresponding to the last day of the month. + Metrics are inclusive to this date. Together with id_deal, it + acts as the primary key of this model. + tests: + - not_null + + - name: id_deal + data_type: string + description: | + Unique identifier of a Deal. Together with date, it acts as + the primary key of this model. + tests: + - not_null + + - name: deal_lifecycle_state + data_type: string + description: | + Identifier of the lifecycle state of a given deal + in a given month. + + - name: preceding_months_count_by_deal + data_type: integer + description: | + Number of months preceding to the one given by date + that are used for the historic metric retrieval for + a given deal. In essence it states the amount of + months a given deal has been active before a the month + given by date, capped at 12 months. + tests: + - dbt_expectations.expect_column_values_to_be_between: + min_value: 0 + max_value: 12 + strictly: false + + - name: has_deal_been_created_less_than_12_months_ago + data_type: boolean + description: | + Flag to identify if a given deal has been created less + than 12 months ago (true) or not (false). It's based on the + preceding_months_count_by_deal, and will be true on the first + year of deal activity. + + - name: total_revenue_12m_additive_contribution + data_type: numeric + description: | + Share of the deal contribution on total revenue + vs. the global amount, on the preceding 12 months + with respect to date. It uses the additive approach. + It can be negative. + tests: + - not_null + + - name: created_bookings_12m_additive_contribution + data_type: numeric + description: | + Share of the deal contribution on created bookings + vs. the global amount, on the preceding 12 months + with respect to date. It uses the additive approach. + tests: + - not_null + - dbt_expectations.expect_column_values_to_be_between: + min_value: 0 + max_value: 1 + strictly: false + + - name: listings_booked_in_month_12m_additive_contribution + data_type: numeric + description: | + Share of the deal contribution on listings booked in month + vs. the global amount, on the preceding 12 months + with respect to date. It uses the additive approach. + tests: + - not_null + - dbt_expectations.expect_column_values_to_be_between: + min_value: 0 + max_value: 1 + strictly: false + + - name: total_revenue_12m_average_contribution + data_type: numeric + description: | + Share of the deal contribution on total revenue + vs. the global amount, on the preceding 12 months + with respect to date. It uses the average approach. + It can be negative. + tests: + - not_null + + - name: created_bookings_12m_average_contribution + data_type: numeric + description: | + Share of the deal contribution on created bookings + vs. the global amount, on the preceding 12 months + with respect to date. It uses the average approach. + tests: + - not_null + - dbt_expectations.expect_column_values_to_be_between: + min_value: 0 + max_value: 1 + strictly: false + + - name: listings_booked_in_month_12m_average_contribution + data_type: numeric + description: | + Share of the deal contribution on listings booked in month + vs. the global amount, on the preceding 12 months + with respect to date. It uses the average approach. + tests: + - not_null + - dbt_expectations.expect_column_values_to_be_between: + min_value: 0 + max_value: 1 + strictly: false