Merged PR 2246: KPIs refactor: naming convention and PBI sources replication

Renames the KPI models to follow the naming convention.
This PR includes the following changes:
- The model `int_core__mtd_aggregated_metrics` has been moved to cross and renamed to `int_mtd_aggregated_metrics`.
- The model `int_core__monthly_aggregated_metrics_history_by_deal` has been moved to cross and renamed to `int_monthly_aggregated_metrics_history_by_deal`.
- The reporting models `core__mtd_aggregated_metrics` and `core__monthly_aggregated_metrics_history_by_deal` now source `int_mtd_aggregated_metrics` and `int_monthly_aggregated_metrics_history_by_deal` respectively, to avoid breaking the production dashboard.
- The reporting models have been duplicated from core into general with the correct names, i.e., `mtd_aggregated_metrics` and `monthly_aggregated_metrics_history_by_deal`.
- Documentation has been moved in intermediate and replicated in reporting, with comments marking the models currently in use as deprecated and due for removal soon.

This allows the PBI dashboard to transition from one source to the other. The exposures file has not been touched yet, since technically the report is still sourcing the 'legacy' models. The refactor is documented here: https://www.notion.so/knowyourguest-superhog/Refactoring-Business-KPIs-5deb6aadddb34884ae90339402ac16e3
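
Before pointing the dashboard at the new source, it may be worth checking that the duplicated reporting model returns exactly the same rows as the legacy one. The following is a minimal sketch of such a check, written as a dbt singular test (a SQL file that fails when it returns any rows). The file name is hypothetical, and the sketch assumes that both models expose the same columns in the same order and that the warehouse supports `except`, as PostgreSQL does:

```sql
-- tests/assert_mtd_aggregated_metrics_matches_legacy.sql (hypothetical file name)
-- Returns every row present in only one of the two reporting models.
-- dbt treats any returned row as a test failure.
with
    legacy as (select * from {{ ref("core__mtd_aggregated_metrics") }}),
    renamed as (select * from {{ ref("mtd_aggregated_metrics") }}),
    -- Assumption: both models have identical column lists and ordering.
    only_in_legacy as (
        select * from legacy
        except
        select * from renamed
    ),
    only_in_renamed as (
        select * from renamed
        except
        select * from legacy
    )
select 'only_in_legacy' as side, * from only_in_legacy
union all
select 'only_in_renamed' as side, * from only_in_renamed
```

The same pattern would apply to `core__monthly_aggregated_metrics_history_by_deal` versus `monthly_aggregated_metrics_history_by_deal`.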

Related work items: #18202
Oriol Roqué Paniagua 2024-07-09 15:14:50 +00:00
parent 976ac70949
commit 20e7220ffe
10 changed files with 350 additions and 138 deletions


@@ -143,41 +143,6 @@ models:
- not_null
- unique
- name: int_core__monthly_aggregated_metrics_history_by_deal
description: |
This model aggregates the monthly historic information regarding the different metrics computed
at deal level. The primary sources of data are the `int_core__monthly_XXXXX_history_by_deal`
models which contain the raw metrics data per source.
Unlike the int_core__mtd_aggregated_metrics, this model does not abstract each metric, since
no comparison versus last year is performed. In short, it just gathers the information stored
in the abovementioned models.
To keep in mind: aggregating the information of this model will not necessarily result into
the int_core__mtd_aggregated metrics because 1) the mtd version contains more computing dates
than the by deal version, the latest being a subset of the first, and 2) the deal based model
enforces that a booking/guest journey/listing/etc has a host with a deal assigned, which is
not necessarily the case.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- id_deal
columns:
- name: date
data_type: date
description: The last day of the month or yesterday for historic metrics.
tests:
- not_null
- name: id_deal
data_type: character varying
description: Id of the deal associated to the host.
tests:
- not_null
- name: int_core__monthly_accommodation_history_by_deal
description: |
This model contains the historic information regarding the accommodations, also known
@@ -307,103 +272,6 @@ models:
- not_null
- unique
- name: int_core__mtd_aggregated_metrics
description: |
The `int_core__mtd_aggregated_metrics` model aggregates multiple metrics on a year, month, and day basis.
The primary sources of data are the `int_core__mtd_XXXXX_metrics` models, which contain the raw metrics data per source.
This model uses Jinja templating to dynamically generate SQL code, combining various metrics into a single table.
This approach reduces repetition and enhances maintainability.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- metric
columns:
- name: year
data_type: int
description: year number of the given date.
tests:
- not_null
- name: month
data_type: int
description: month number of the given date.
tests:
- not_null
- name: day
data_type: int
description: day monthly number of the given date.
tests:
- not_null
- name: is_end_of_month
data_type: boolean
description: is end of month, 1 for yes, 0 for no.
tests:
- not_null
- name: is_current_month
data_type: boolean
description: |
checks if the date is within the current executed month,
1 for yes, 0 for no.
tests:
- not_null
- name: date
data_type: date
description: |
main date for the computation, that is used for filters.
It comes from int_dates_mtd logic.
tests:
- not_null
- name: previous_year_date
data_type: date
description: |
corresponds to the date of the previous year, with respect to the field date.
It comes from int_dates_mtd logic. It's only displayed for information purposes,
should not be needed for reporting.
- name: metric
data_type: text
description: name of the business metric.
tests:
- not_null
- name: order_by
data_type: integer
description: |
order for displaying purposes. Null values are accepted, but keep
in mind that then there's no default controlled display order.
- name: number_format
data_type: text
description: allows for grouping and formatting for displaying purposes.
tests:
- accepted_values:
values: ['integer', 'percentage']
- name: value
data_type: numeric
description: |
numeric value (integer or decimal) that corresponds to the MTD computation of the metric
at a given date.
- name: previous_year_value
data_type: numeric
description: |
numeric value (integer or decimal) that corresponds to the MTD computation of the metric
on the previous year at a given date.
- name: relative_increment
data_type: numeric
description: |
numeric value that corresponds to the relative increment between value and previous year value,
following the computation: value / previous_year_value - 1.
- name: int_core__verification_request_completeness
description: |
The `int_core__verification_request_completeness` model allows to determine if a verification request is


@@ -182,6 +182,141 @@ models:
tests:
- not_null
- name: id_deal
data_type: character varying
description: Id of the deal associated to the host.
tests:
- not_null
- name: int_mtd_aggregated_metrics
description: |
The `int_mtd_aggregated_metrics` model aggregates multiple metrics on a year, month, and day basis.
The primary source of data is the `int_mtd_vs_previous_year_metrics` model, which contains the combination
of metrics data per source. This model only changes the display format, unpivoting the information into
a set of metric, value, previous_year_value and relative_increment at a given date. It uses Jinja
to avoid code duplication.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- metric
columns:
- name: year
data_type: int
description: year number of the given date.
tests:
- not_null
- name: month
data_type: int
description: month number of the given date.
tests:
- not_null
- name: day
data_type: int
description: day monthly number of the given date.
tests:
- not_null
- name: is_end_of_month
data_type: boolean
description: is end of month, 1 for yes, 0 for no.
tests:
- not_null
- name: is_current_month
data_type: boolean
description: |
checks if the date is within the current executed month,
1 for yes, 0 for no.
tests:
- not_null
- name: date
data_type: date
description: |
main date for the computation, that is used for filters.
It comes from int_dates_mtd logic.
tests:
- not_null
- name: previous_year_date
data_type: date
description: |
corresponds to the date of the previous year, with respect to the field date.
It comes from int_dates_mtd logic. It's only displayed for information purposes,
should not be needed for reporting.
- name: metric
data_type: text
description: name of the business metric.
tests:
- not_null
- name: order_by
data_type: integer
description: |
order for displaying purposes. Null values are accepted, but keep
in mind that then there's no default controlled display order.
- name: number_format
data_type: text
description: allows for grouping and formatting for displaying purposes.
tests:
- accepted_values:
values: ['integer', 'percentage']
- name: value
data_type: numeric
description: |
numeric value (integer or decimal) that corresponds to the MTD computation of the metric
at a given date.
- name: previous_year_value
data_type: numeric
description: |
numeric value (integer or decimal) that corresponds to the MTD computation of the metric
on the previous year at a given date.
- name: relative_increment
data_type: numeric
description: |
numeric value that corresponds to the relative increment between value and previous year value,
following the computation: value / previous_year_value - 1.
- name: int_monthly_aggregated_metrics_history_by_deal
description: |
This model aggregates the monthly historic information regarding the different metrics computed
at deal level. The primary sources of data are the `int_yyy__monthly_XXXXX_history_by_deal`
models, which contain the raw metrics data per source.
Unlike `int_mtd_aggregated_metrics`, this model does not abstract each metric, since
no comparison versus last year is performed. In short, it just gathers the information stored
in the above-mentioned models.
Keep in mind: aggregating the information of this model will not necessarily result in
the `int_mtd_aggregated_metrics` figures because 1) the MTD version contains more computing dates
than the by-deal version, the latter being a subset of the former, and 2) the deal-based model
enforces that a booking/guest journey/listing/etc. has a host with a deal assigned, which is
not necessarily the case.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- id_deal
columns:
- name: date
data_type: date
description: The last day of the month or yesterday for historic metrics.
tests:
- not_null
- name: id_deal
data_type: character varying
description: Id of the deal associated to the host.


@@ -1,6 +1,6 @@
 with
-int_core__monthly_aggregated_metrics_history_by_deal as (
-select * from {{ ref("int_core__monthly_aggregated_metrics_history_by_deal") }}
+int_monthly_aggregated_metrics_history_by_deal as (
+select * from {{ ref("int_monthly_aggregated_metrics_history_by_deal") }}
 )
 select
@@ -25,4 +25,4 @@ select
 listings_booked_in_month as listings_booked_in_month,
 listings_booked_in_6_months as listings_booked_in_6_months,
 listings_booked_in_12_months as listings_booked_in_12_months
-from int_core__monthly_aggregated_metrics_history_by_deal
+from int_monthly_aggregated_metrics_history_by_deal


@@ -1,6 +1,6 @@
 with
-int_core__mtd_aggregated_metrics as (
-select * from {{ ref("int_core__mtd_aggregated_metrics") }}
+int_mtd_aggregated_metrics as (
+select * from {{ ref("int_mtd_aggregated_metrics") }}
 )
 select
@@ -17,4 +17,4 @@ select
 value as value,
 previous_year_value as previous_year_value,
 relative_increment as relative_increment
-from int_core__mtd_aggregated_metrics
+from int_mtd_aggregated_metrics


@@ -569,6 +569,10 @@ models:
 - name: core__mtd_aggregated_metrics
 description: |
+IMPORTANT: This model has moved to the general tab, into the mtd_aggregated_metrics
+Deprecated. This model will be burned to the ground.
 This model aggregates the historic information of our business by providing
 different metrics computed at global level.
 It's the main source of information for the Main KPIs reporting, specifically
@@ -668,6 +672,10 @@ models:
 - name: core__monthly_aggregated_metrics_history_by_deal
 description: |
+IMPORTANT: This model has moved to the general tab, into the monthly_aggregated_metrics_history_by_deal
+Deprecated. This model will be burned to the ground.
 This model aggregates the monthly historic information regarding the different metrics computed
 at deal level. The primary sources of data are the `int_core__monthly_XXXXX_history_by_deal`
 models which contain the raw metrics data per source.


@@ -0,0 +1,28 @@
with
int_monthly_aggregated_metrics_history_by_deal as (
select * from {{ ref("int_monthly_aggregated_metrics_history_by_deal") }}
)
select
year as year,
month as month,
day as day,
date as date,
id_deal as id_deal,
deal_lifecycle_state as deal_lifecycle_state,
created_bookings as created_bookings,
check_out_bookings as check_out_bookings,
cancelled_bookings as cancelled_bookings,
created_guest_journeys as created_guest_journeys,
started_guest_journeys as started_guest_journeys,
completed_guest_journeys as completed_guest_journeys,
start_rate_guest_journey as start_rate_guest_journey,
completion_rate_guest_journey as completion_rate_guest_journey,
incompletion_rate_guest_journey as incompletion_rate_guest_journey,
new_listings as new_listings,
first_time_booked_listings as first_time_booked_listings,
churning_listings as churning_listings,
listings_booked_in_month as listings_booked_in_month,
listings_booked_in_6_months as listings_booked_in_6_months,
listings_booked_in_12_months as listings_booked_in_12_months
from int_monthly_aggregated_metrics_history_by_deal


@@ -0,0 +1,20 @@
with
int_mtd_aggregated_metrics as (
select * from {{ ref("int_mtd_aggregated_metrics") }}
)
select
year as year,
month as month,
day as day,
is_end_of_month as is_end_of_month,
is_current_month as is_current_month,
date as date,
previous_year_date as previous_year_date,
order_by as order_by,
number_format as number_format,
metric as metric,
value as value,
previous_year_value as previous_year_value,
relative_increment as relative_increment
from int_mtd_aggregated_metrics


@@ -296,5 +296,158 @@ models:
For external sources, this will be the point in time when the
information was obtained from them. For stuff we make up here in the
DWH, this will be the point in time when we made the assumption.
tests:
- not_null
- name: mtd_aggregated_metrics
description: |
This model aggregates the historic information of our business by providing
different metrics computed at global level.
It's the main source of information for the Main KPIs reporting, specifically
on the MTD (Month To Date) and the Monthly Overview.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- metric
columns:
- name: year
data_type: int
description: year number of the given date.
tests:
- not_null
- name: month
data_type: int
description: month number of the given date.
tests:
- not_null
- name: day
data_type: int
description: day monthly number of the given date.
tests:
- not_null
- name: is_end_of_month
data_type: boolean
description: is end of month, 1 for yes, 0 for no.
tests:
- not_null
- name: is_current_month
data_type: boolean
description: |
checks if the date is within the current executed month,
1 for yes, 0 for no.
tests:
- not_null
- name: date
data_type: date
description: |
main date for the computation, that is used for filters.
It comes from int_dates_mtd logic.
tests:
- not_null
- name: previous_year_date
data_type: date
description: |
corresponds to the date of the previous year, with respect to the field date.
It comes from int_dates_mtd logic. It's only displayed for information purposes,
should not be needed for reporting.
- name: metric
data_type: text
description: name of the business metric.
tests:
- not_null
- name: order_by
data_type: integer
description: |
order for displaying purposes. Null values are accepted, but keep
in mind that then there's no default controlled display order.
- name: number_format
data_type: text
description: allows for grouping and formatting for displaying purposes.
tests:
- accepted_values:
values: ['integer', 'percentage']
- name: value
data_type: numeric
description: |
numeric value (integer or decimal) that corresponds to the MTD computation of the metric
at a given date. Note that if the month is not in progress, then this value corresponds
to the monthly figure.
- name: previous_year_value
data_type: numeric
description: |
numeric value (integer or decimal) that corresponds to the MTD computation of the metric
on the previous year at a given date.
- name: relative_increment
data_type: numeric
description: |
numeric value that corresponds to the relative increment between value and previous year value,
following the computation: value / previous_year_value - 1.
- name: monthly_aggregated_metrics_history_by_deal
description: |
This model aggregates the monthly historic information regarding the different metrics computed
at deal level. The primary sources of data are the `int_monthly_XXXXX_history_by_deal`
models, which contain the raw metrics data per source.
This table is used to provide "By Deal" metrics in the Business Overview reporting.
Unlike `mtd_aggregated_metrics`, this model does not abstract each metric, since
no comparison versus last year is performed. In short, it just gathers the information stored
in the above-mentioned models.
Keep in mind: aggregating the information of this model will not necessarily result in
the `int_mtd_aggregated_metrics` figures because 1) the MTD version contains more computing dates
than the by-deal version, the latter being a subset of the former, and 2) the deal-based model
enforces that a booking/guest journey/listing/etc. has a host with a deal assigned, which is
not necessarily the case.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- id_deal
columns:
- name: date
data_type: date
description: The last day of the month or yesterday for historic metrics.
tests:
- not_null
- name: id_deal
data_type: character varying
description: Id of the deal associated to the host.
tests:
- not_null
- name: year
data_type: int
description: year number of the given date.
tests:
- not_null
- name: month
data_type: int
description: month number of the given date.
tests:
- not_null
- name: day
data_type: int
description: day monthly number of the given date.
tests:
- not_null