Merged PR 3163: First version of 12m window contribution by deal

# Description

This PR creates a new model that depends on int_monthly_aggregated_metrics_history_by_deal. The idea is that this is used for Churn computation (Booking Churn, Revenue Churn, Listing Churn) later on.

The idea is relatively simple. Measure how much a Deal has been contributing to a Global amount (sum of metric for all deals) over the preceding period of 12 months. You will notice that there's 2 computations, the "additive" and the "average" one. This is because we still need to align with Matt/Suzannah on which approach makes more sense, but we need data for it. I'm not sure the namings are good though so happy to see your suggestions.

You will also notice that there's no filter by deal_lifecycle_state = '06-Churning'. This will be done in a separated model, whenever we attribute this model to the mtd computation. The reason is simple - this model stays at deal level, thus meaning we can do the dimension aggregation and even a lifecycle aggregation if needed, depending on the needs.

Be aware that this effectively means that MTD KPIs models will depend on the "monthly by deal" models. This has some cons in terms of dependency management but cannot be overcome since we the metric total revenue depends on many subsets. In essence, I don't see another way of doing it unless doing a massive KPIs refactor. I prefer to wait until the Product KPIs discussions are finished and then we see how we approach it.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #22691
This commit is contained in:
Oriol Roqué Paniagua 2024-10-15 06:51:41 +00:00
parent ce65cf7fa5
commit 901be930df
2 changed files with 296 additions and 0 deletions

View file

@ -0,0 +1,140 @@
{{ config(materialized="table", unique_key=["date", "id_deal"]) }}
with
int_monthly_aggregated_metrics_history_by_deal as (
select * from {{ ref("int_monthly_aggregated_metrics_history_by_deal") }}
),
aggregated_metrics_by_deal_over_12_months as (
select
am.date,
am.id_deal,
am.deal_lifecycle_state,
count(*) over (
partition by am.id_deal
order by am.date
rows between 12 preceding and 1 preceding
) as preceding_months_count_by_deal,
sum(am.total_revenue_in_gbp) over (
partition by am.id_deal
order by am.date
rows between 12 preceding and 1 preceding
) as sum_total_revenue_preceding_12_months,
sum(am.created_bookings) over (
partition by am.id_deal
order by am.date
rows between 12 preceding and 1 preceding
) as sum_created_bookings_preceding_12_months,
sum(am.listings_booked_in_month) over (
partition by am.id_deal
order by am.date
rows between 12 preceding and 1 preceding
) as sum_listings_booked_in_month_preceding_12_months
from int_monthly_aggregated_metrics_history_by_deal am
),
global_computation_over_12_months as (
select
ambd.id_deal,
ambd.date,
ambd.deal_lifecycle_state,
-- Amount of active months per deal used in the computation
ambd.preceding_months_count_by_deal,
case
when ambd.preceding_months_count_by_deal < 12 then true else false
end as has_deal_been_created_less_than_12_months_ago,
-- ADDITIVE APPROACH --
-- 12 months window sum of metrics
ambd.sum_total_revenue_preceding_12_months,
ambd.sum_created_bookings_preceding_12_months,
ambd.sum_listings_booked_in_month_preceding_12_months,
-- Global amounts by sum of the 12 months window sum of metric on a given
-- date
sum(ambd.sum_total_revenue_preceding_12_months) over (
partition by ambd.date
) as global_total_revenue_preceding_12_months,
sum(ambd.sum_created_bookings_preceding_12_months) over (
partition by ambd.date
) as global_created_bookings_preceding_12_months,
sum(ambd.sum_listings_booked_in_month_preceding_12_months) over (
partition by ambd.date
) as global_listings_booked_in_month_preceding_12_months,
-- AVERAGE APPROACH --
-- 12 months window sum of metric divided by amount of active months per
-- deal
ambd.sum_total_revenue_preceding_12_months
/ ambd.preceding_months_count_by_deal
as avg_total_revenue_preceding_12_months,
ambd.sum_created_bookings_preceding_12_months
/ ambd.preceding_months_count_by_deal
as avg_created_bookings_preceding_12_months,
ambd.sum_listings_booked_in_month_preceding_12_months
/ ambd.preceding_months_count_by_deal
as avg_listings_booked_in_month_preceding_12_months,
-- Global amounts by sum of the 12 months window sum of metric on a given
-- date
sum(
ambd.sum_total_revenue_preceding_12_months
/ ambd.preceding_months_count_by_deal
) over (partition by ambd.date)
as avg_global_total_revenue_preceding_12_months,
sum(
ambd.sum_created_bookings_preceding_12_months
/ ambd.preceding_months_count_by_deal
) over (partition by ambd.date)
as avg_global_created_bookings_preceding_12_months,
sum(
ambd.sum_listings_booked_in_month_preceding_12_months
/ ambd.preceding_months_count_by_deal
) over (partition by ambd.date)
as avg_global_listings_booked_in_month_preceding_12_months
from aggregated_metrics_by_deal_over_12_months ambd
)
select
id_deal,
date,
deal_lifecycle_state,
-- Amount of active months per deal used in the computation
preceding_months_count_by_deal,
-- ADDITIVE APPROACH --
-- 12 months window sum of metrics
coalesce(
sum_total_revenue_preceding_12_months
/ nullif(global_total_revenue_preceding_12_months, 0),
0
) as total_revenue_12m_additive_contribution,
coalesce(
sum_created_bookings_preceding_12_months
/ nullif(global_created_bookings_preceding_12_months, 0),
0
) as created_bookings_12m_additive_contribution,
coalesce(
sum_listings_booked_in_month_preceding_12_months
/ nullif(global_listings_booked_in_month_preceding_12_months, 0),
0
) as listings_booked_in_month_12m_additive_contribution,
-- AVERAGE APPROACH --
-- 12 months window sum of metric divided by amount of active months per deal
coalesce(
avg_total_revenue_preceding_12_months
/ nullif(avg_global_total_revenue_preceding_12_months, 0),
0
) as total_revenue_12m_average_contribution,
coalesce(
avg_created_bookings_preceding_12_months
/ nullif(avg_global_created_bookings_preceding_12_months, 0),
0
) as created_bookings_12m_average_contribution,
coalesce(
avg_listings_booked_in_month_preceding_12_months
/ nullif(avg_global_listings_booked_in_month_preceding_12_months, 0),
0
) as listings_booked_in_month_12m_average_contribution
from global_computation_over_12_months

View file

@ -1106,3 +1106,159 @@ models:
- WINNER
- TOP WINNER
- UNSET
- name: int_monthly_12m_window_contribution_by_deal
description: |
The main goal of this model is to provide how much a deal
contributes to a given metric on the global amount over a
period of 12 months.
At the moment, this is only done for 3 metrics:
- total_revenue_in_gbp
- created_bookings
- listings_booked_in_month
There's 2 different contribution methods, the additive and
the average approach. The additive approach will "penalise"
accounts that have been active for less than 12 months on the
contribution amount, while the average approach "boosts" the
contribution of those accounts that have been active for less
than 12 months.
Mathematically speaking, it is encouraged to use the additive
approach ;-)
Find below the details of the computation:
- Additive approach:
Over a period of 12 months, sum the value of a given a metric
for each deal. Sum all the deals to get a global. Divide the
deal value vs. the global one.
- Average approach:
Over a period of 12 months, sum the value of a given a metric
for each deal, and divide it by the amount of months we're considering
for that deal. Sum all the average amounts per deals to get a global.
Divide the avg per deal value vs. the sum of avgs global one.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- id_deal
columns:
- name: date
data_type: date
description: |
Date corresponding to the last day of the month.
Metrics are inclusive to this date. Together with id_deal, it
acts as the primary key of this model.
tests:
- not_null
- name: id_deal
data_type: string
description: |
Unique identifier of a Deal. Together with date, it acts as
the primary key of this model.
tests:
- not_null
- name: deal_lifecycle_state
data_type: string
description: |
Identifier of the lifecycle state of a given deal
in a given month.
- name: preceding_months_count_by_deal
data_type: integer
description: |
Number of months preceding to the one given by date
that are used for the historic metric retrieval for
a given deal. In essence it states the amount of
months a given deal has been active before a the month
given by date, capped at 12 months.
tests:
- dbt_expectations.expect_column_values_to_be_between:
min_value: 0
max_value: 12
strictly: false
- name: has_deal_been_created_less_than_12_months_ago
data_type: boolean
description: |
Flag to identify if a given deal has been created less
than 12 months ago (true) or not (false). It's based on the
preceding_months_count_by_deal, and will be true on the first
year of deal activity.
- name: total_revenue_12m_additive_contribution
data_type: numeric
description: |
Share of the deal contribution on total revenue
vs. the global amount, on the preceding 12 months
with respect to date. It uses the additive approach.
It can be negative.
tests:
- not_null
- name: created_bookings_12m_additive_contribution
data_type: numeric
description: |
Share of the deal contribution on created bookings
vs. the global amount, on the preceding 12 months
with respect to date. It uses the additive approach.
tests:
- not_null
- dbt_expectations.expect_column_values_to_be_between:
min_value: 0
max_value: 1
strictly: false
- name: listings_booked_in_month_12m_additive_contribution
data_type: numeric
description: |
Share of the deal contribution on listings booked in month
vs. the global amount, on the preceding 12 months
with respect to date. It uses the additive approach.
tests:
- not_null
- dbt_expectations.expect_column_values_to_be_between:
min_value: 0
max_value: 1
strictly: false
- name: total_revenue_12m_average_contribution
data_type: numeric
description: |
Share of the deal contribution on total revenue
vs. the global amount, on the preceding 12 months
with respect to date. It uses the average approach.
It can be negative.
tests:
- not_null
- name: created_bookings_12m_average_contribution
data_type: numeric
description: |
Share of the deal contribution on created bookings
vs. the global amount, on the preceding 12 months
with respect to date. It uses the average approach.
tests:
- not_null
- dbt_expectations.expect_column_values_to_be_between:
min_value: 0
max_value: 1
strictly: false
- name: listings_booked_in_month_12m_average_contribution
data_type: numeric
description: |
Share of the deal contribution on listings booked in month
vs. the global amount, on the preceding 12 months
with respect to date. It uses the average approach.
tests:
- not_null
- dbt_expectations.expect_column_values_to_be_between:
min_value: 0
max_value: 1
strictly: false