Merged PR 3329: First version of KPIs refactored - created bookings
# Description

Creates the skeleton of the new KPIs data flow for the created_bookings metric. Details are accessible [here](https://www.notion.so/knowyourguest-superhog/KPIs-Refactor-Let-s-go-daily-2024-10-23-1280446ff9c980dc87a3dc7453e95f06?pvs=4#12a0446ff9c98085bf4dfc77f6fc22f7)

In essence:

* Models are created in intermediate, in a kpis folder.
* Models have a daily segmentation. This includes the `created_bookings` models, but also the daily lifecycle per listing and the segmentation. It also adds a `dimension_dates` model specific to KPIs. These have all the dimensions already in place and handle all the crazy logic.
* Other time aggregation models simply read from the existing daily models, which makes them much easier (`int_kpis__metric_mtd_created_bookings` and `int_kpis__metric_monthly_created_bookings`).
* Dimensionality aggregation can be easily added within a given timeframe (daily, mtd, monthly). For instance, I do it for mtd in `int_kpis__aggregated_mtd_created_bookings` and for monthly in `int_kpis__aggregated_monthly_created_bookings`.
* Macro configuration for dimensions: allows setting any specific dimension for `aggregated` models. By default, the subset of global, by billing country, by number of listings and by deal applies, since these are needed for Main KPIs. I added an example with Dash Source, which does not exist yet and is currently configured to appear only for created bookings.
* Testing `aggregated` models completeness. A new macro called `assert_dimension_completeness` is available that ensures additive metrics are consistent vs. the global result, configurable at schema level.
* Testing refactor impact. I'm aware that changing the lifecycle model to daily impacts the volumes for listing segments. For the rest, I added a `tmp` test that checks that the dimension and dimension value per date match exactly when comparing the new vs. the old computation.

Latest edits:

* Changed naming convention.
* Split of MTD and Monthly. Now these are 2 different entities, as stated in `int_kpis__dimension_dates`.
* Added start_date and end_date for models that contemplate a range (mtd, monthly).
* Added a small readme entry in the kpis folders. Mostly it states nomenclature and some first conventions.

Dbt docs:

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've run and passed them.
- [ ] I have checked for DRY opportunities with other models and docs. **Likely we'll be able to add macros for mtd and dim_agg models. We will see later on.**
- [ ] I've picked the right materialization for the affected models. **Models run ok except for the daily lifecycle of listings, which lasts several minutes in the first run. Model curr...
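
For illustration, the completeness check described above for the `aggregated` models could look roughly like the dbt singular test below. This is only a sketch under assumptions, not the actual `assert_dimension_completeness` macro: it hardcodes one model and one metric, reuses the column names seen in the `tmp` test of this PR (`end_date`, `dimension`, `dimension_value`, `created_bookings`), and assumes every dimension is expected to add up exactly to the `global` total.

```sql
-- Sketch only: NOT the real assert_dimension_completeness macro.
-- Assumption: for an additive metric, the sum over the values of any
-- dimension should equal the 'global' figure for the same end_date.
with
    totals_per_dimension as (
        select end_date, dimension, sum(created_bookings) as created_bookings
        from {{ ref("int_kpis__aggregated_mtd_created_bookings") }}
        group by end_date, dimension
    ),
    global_totals as (
        select end_date, created_bookings
        from totals_per_dimension
        where dimension = 'global'
    )
-- Rows returned here are the dates and dimensions that do not add up.
select
    d.end_date,
    d.dimension,
    d.created_bookings as dimension_total,
    g.created_bookings as global_total
from totals_per_dimension d
inner join global_totals g on d.end_date = g.end_date
where d.dimension <> 'global' and d.created_bookings <> g.created_bookings
```

As with any dbt singular test, the check passes when the query returns no rows.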
parent 450975301a
commit 875f91be26
13 changed files with 1149 additions and 7 deletions
52
tests/tmp_kpis_refactor_equivalent_created_bookings.sql
Normal file
@@ -0,0 +1,52 @@
{% set min_date = "2022-01-01" %}
{% set dimensions = ("global", "by_billing_country") %}
-- "by_number_of_listings" excluded on purpose - there's differences because of daily
-- segmentation
with
    new_mtd_created_bookings as (
        select end_date as date, dimension, dimension_value, created_bookings
        from {{ ref("int_kpis__aggregated_mtd_created_bookings") }}
        where
            end_date >= '{{ min_date }}'
            and dimension in {{ dimensions }}
            and dimension_value <> 'UNSET'
    ),
    new_monthly_created_bookings as (
        select end_date as date, dimension, dimension_value, created_bookings
        from {{ ref("int_kpis__aggregated_monthly_created_bookings") }}
        where
            end_date >= '{{ min_date }}'
            and dimension in {{ dimensions }}
            and dimension_value <> 'UNSET'
    ),
    new_created_bookings as (
        select *
        from new_mtd_created_bookings
        union all
        select *
        from new_monthly_created_bookings
    ),
    old_created_bookings as (
        select date, dimension, dimension_value, created_bookings
        from {{ ref("int_core__mtd_created_bookings_metric") }}
        where date >= '{{ min_date }}' and dimension in {{ dimensions }}
    ),
    comparison as (
        select
            coalesce(o.date, n.date) as date,
            coalesce(o.dimension, n.dimension) as dimension,
            coalesce(o.dimension_value, n.dimension_value) as dimension_value,
            o.created_bookings as old_created_bookings,
            n.created_bookings as new_created_bookings,
            coalesce(o.created_bookings, 0) - coalesce(n.created_bookings, 0) as diff
        from old_created_bookings o
        full outer join
            new_created_bookings n
            on o.date = n.date
            and o.dimension = n.dimension
            and o.dimension_value = n.dimension_value
    )
select *
from comparison
where diff <> 0
order by date desc, abs(diff) desc