Merged PR 2246: KPIs refactor: naming convention and PBI sources replication

Changing naming to follow convention.
This PR has the following changes:
- the model `int_core__mtd_aggregated_metrics` has been moved to cross and changed the name to `int_mtd_aggregated_metrics`
- the model `int_core__monthly_aggregated_metrics_history_by_deal` has been moved to cross and changed the name to `int_monthly_aggregated_metrics_history_by_deal`
- the reporting models `core__mtd_aggregated_metrics` and `core__monthly_aggregated_metrics_history_by_deal` now source the `int_mtd_aggregated_metrics` and `int_monthly_aggregated_metrics_history_by_deal` to avoid breaking the production dashboard
- the reporting models have been duplicated from core into general with the correct names, i.e., `mtd_aggregated_metrics` and `monthly_aggregated_metrics_history_by_deal`
- Documentation has been moved in intermediate and replicated in reporting, adding comments on the currently in use models that are going to die soon.

This will allow for a transition of the PBI dashboard from one source to another. Exposures file still not touched since technically the report is still sourcing the 'legacy' models. Documentation of the refactor here: https://www.notion.so/knowyourguest-superhog/Refactoring-Business-KPIs-5deb6aadddb34884ae90339402ac16e3

Related work items: #18202
This commit is contained in:
Oriol Roqué Paniagua 2024-07-09 15:14:50 +00:00
parent 976ac70949
commit 20e7220ffe
10 changed files with 350 additions and 138 deletions

View file

@ -143,41 +143,6 @@ models:
- not_null
- unique
- name: int_core__monthly_aggregated_metrics_history_by_deal
description: |
This model aggregates the monthly historic information regarding the different metrics computed
at deal level. The primary sources of data are the `int_core__monthly_XXXXX_history_by_deal`
models which contain the raw metrics data per source.
Unlike the int_core__mtd_aggregated_metrics, this model does not abstract each metric, since
no comparison versus last year is performed. In short, it just gathers the information stored
in the abovementioned models.
To keep in mind: aggregating the information of this model will not necessarily result into
the int_core__mtd_aggregated metrics because 1) the mtd version contains more computing dates
than the by deal version, the latest being a subset of the first, and 2) the deal based model
enforces that a booking/guest journey/listing/etc has a host with a deal assigned, which is
not necessarily the case.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- id_deal
columns:
- name: date
data_type: date
description: The last day of the month or yesterday for historic metrics.
tests:
- not_null
- name: id_deal
data_type: character varying
description: Id of the deal associated to the host.
tests:
- not_null
- name: int_core__monthly_accommodation_history_by_deal
description: |
This model contains the historic information regarding the accommodations, also known
@ -307,103 +272,6 @@ models:
- not_null
- unique
- name: int_core__mtd_aggregated_metrics
description: |
The `int_core__mtd_aggregated_metrics` model aggregates multiple metrics on a year, month, and day basis.
The primary sources of data are the `int_core__mtd_XXXXX_metrics` models, which contain the raw metrics data per source.
This model uses Jinja templating to dynamically generate SQL code, combining various metrics into a single table.
This approach reduces repetition and enhances maintainability.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- metric
columns:
- name: year
data_type: int
description: year number of the given date.
tests:
- not_null
- name: month
data_type: int
description: month number of the given date.
tests:
- not_null
- name: day
data_type: int
description: day monthly number of the given date.
tests:
- not_null
- name: is_end_of_month
data_type: boolean
description: is end of month, 1 for yes, 0 for no.
tests:
- not_null
- name: is_current_month
data_type: boolean
description: |
checks if the date is within the current executed month,
1 for yes, 0 for no.
tests:
- not_null
- name: date
data_type: date
description: |
main date for the computation, that is used for filters.
It comes from int_dates_mtd logic.
tests:
- not_null
- name: previous_year_date
data_type: date
description: |
corresponds to the date of the previous year, with respect to the field date.
It comes from int_dates_mtd logic. It's only displayed for information purposes,
should not be needed for reporting.
- name: metric
data_type: text
description: name of the business metric.
tests:
- not_null
- name: order_by
data_type: integer
description: |
order for displaying purposes. Null values are accepted, but keep
in mind that then there's no default controlled display order.
- name: number_format
data_type: text
description: allows for grouping and formatting for displaying purposes.
tests:
- accepted_values:
values: ['integer', 'percentage']
- name: value
data_type: numeric
description: |
numeric value (integer or decimal) that corresponds to the MTD computation of the metric
at a given date.
- name: previous_year_value
data_type: numeric
description: |
numeric value (integer or decimal) that corresponds to the MTD computation of the metric
on the previous year at a given date.
- name: relative_increment
data_type: numeric
description: |
numeric value that corresponds to the relative increment between value and previous year value,
following the computation: value / previous_year_value - 1.
- name: int_core__verification_request_completeness
description: |
The `int_core__verification_request_completeness` model allows to determine if a verification request is

View file

@ -182,6 +182,141 @@ models:
tests:
- not_null
- name: id_deal
data_type: character varying
description: Id of the deal associated to the host.
tests:
- not_null
- name: int_mtd_aggregated_metrics
description: |
The `int_mtd_aggregated_metrics` model aggregates multiple metrics on a year, month, and day basis.
The primary source of data is the `int_mtd_vs_previous_year_metrics` model, which contain the combination
of metrics data per source. This model just changes the display format to unpivot the information into
a set of metric, value, previous_year_value and relative_increment at a given date. It uses Jinja
code to avoid code replication.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- metric
columns:
- name: year
data_type: int
description: year number of the given date.
tests:
- not_null
- name: month
data_type: int
description: month number of the given date.
tests:
- not_null
- name: day
data_type: int
description: day monthly number of the given date.
tests:
- not_null
- name: is_end_of_month
data_type: boolean
description: is end of month, 1 for yes, 0 for no.
tests:
- not_null
- name: is_current_month
data_type: boolean
description: |
checks if the date is within the current executed month,
1 for yes, 0 for no.
tests:
- not_null
- name: date
data_type: date
description: |
main date for the computation, that is used for filters.
It comes from int_dates_mtd logic.
tests:
- not_null
- name: previous_year_date
data_type: date
description: |
corresponds to the date of the previous year, with respect to the field date.
It comes from int_dates_mtd logic. It's only displayed for information purposes,
should not be needed for reporting.
- name: metric
data_type: text
description: name of the business metric.
tests:
- not_null
- name: order_by
data_type: integer
description: |
order for displaying purposes. Null values are accepted, but keep
in mind that then there's no default controlled display order.
- name: number_format
data_type: text
description: allows for grouping and formatting for displaying purposes.
tests:
- accepted_values:
values: ['integer', 'percentage']
- name: value
data_type: numeric
description: |
numeric value (integer or decimal) that corresponds to the MTD computation of the metric
at a given date.
- name: previous_year_value
data_type: numeric
description: |
numeric value (integer or decimal) that corresponds to the MTD computation of the metric
on the previous year at a given date.
- name: relative_increment
data_type: numeric
description: |
numeric value that corresponds to the relative increment between value and previous year value,
following the computation: value / previous_year_value - 1.
- name: int_monthly_aggregated_metrics_history_by_deal
description: |
This model aggregates the monthly historic information regarding the different metrics computed
at deal level. The primary sources of data are the `int_yyy__monthly_XXXXX_history_by_deal`
models which contain the raw metrics data per source.
Unlike the int_mtd_aggregated_metrics, this model does not abstract each metric, since
no comparison versus last year is performed. In short, it just gathers the information stored
in the abovementioned models.
To keep in mind: aggregating the information of this model will not necessarily result into
the int_mtd_aggregated metrics because 1) the mtd version contains more computing dates
than the by deal version, the latest being a subset of the first, and 2) the deal based model
enforces that a booking/guest journey/listing/etc has a host with a deal assigned, which is
not necessarily the case.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- id_deal
columns:
- name: date
data_type: date
description: The last day of the month or yesterday for historic metrics.
tests:
- not_null
- name: id_deal
data_type: character varying
description: Id of the deal associated to the host.