# Description

Creates the skeleton for the new KPIs data flow for the created_bookings metric. Details are accessible [here](https://www.notion.so/knowyourguest-superhog/KPIs-Refactor-Let-s-go-daily-2024-10-23-1280446ff9c980dc87a3dc7453e95f06?pvs=4#12a0446ff9c98085bf4dfc77f6fc22f7)

In essence:
* Models are created in intermediate, in a kpis folder.
* Models have a daily segmentation. This includes the `created_bookings` models, but also the daily lifecycle per listing and the segmentation. It also adds a `dimension_dates` model specific to KPIs. These have all the dimensions already in place and handle all the crazy logic.
* Other time aggregation models simply read from the existing daily models, which is much easier (`int_kpis__metric_mtd_created_bookings` and `int_kpis__metric_monthly_created_bookings`).
* Dimensionality aggregation can be easily added within a given timeframe (daily, mtd, monthly). For instance, I do it for mtd in `int_kpis__aggregated_mtd_created_bookings` and for monthly in `int_kpis__aggregated_monthly_created_bookings`.
* Macro configuration for dimensions: allows setting any specific dimension for `aggregated` models. By default, the subset of global, by billing country, by number of listings and by deal applies, since these are needed for Main KPIs. I added an example with Dash Source, which does not exist yet and is currently configured to appear only for created bookings.
* Testing `aggregated` models completeness. A new macro called `assert_dimension_completeness` is available that ensures additive metrics are consistent vs. the global result, configurable at schema level.
* Testing refactor impact. I'm aware that changing the lifecycle model to daily impacts the volumes for listing segments. For the rest, I added a `tmp` test that checks that the dimension and dimension value per date match exactly when comparing the new vs. old computation.

Latest edits:
* Changed naming convention.
* Split of MTD and Monthly. Now these are 2 different entities, as stated in `int_kpis__dimension_dates`.
* Added start_date and end_date for models that contemplate a range (mtd, monthly).
* Added a small readme entry in the kpis folders. Mostly it states nomenclature and some first conventions.

Dbt docs:

# Checklist
- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've run and passed them.
- [ ] I have checked for DRY opportunities with other models and docs. **Likely we'll be able to add macros for mtd and dim_agg models. We will see later on.**
- [ ] I've picked the right materialization for the affected models. **Models run ok except for the daily lifecycle of listings, which lasts several minutes in the first run. Model curr...
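To make the completeness idea concrete, here is a minimal Python sketch of the check that `assert_dimension_completeness` is described as performing: for each date, an additive metric summed over the values of a dimension should equal the `global` value. This only illustrates the logic and is not the macro's code. The function name `find_incomplete_dimensions` and the row layout (dicts keyed by `end_date`, `dimension`, `dimension_value` and the metric) are assumptions made for the example.

```python
from collections import defaultdict


def find_incomplete_dimensions(rows, metric="created_bookings"):
    """Return (end_date, dimension, dimension_total, global_total) tuples
    for every dimension whose summed metric differs from the global one.

    Hypothetical helper: `rows` is an iterable of dicts shaped like the
    rows of an `aggregated` model.
    """
    # Sum the metric per (end_date, dimension); nulls count as zero.
    totals = defaultdict(int)
    for row in rows:
        totals[(row["end_date"], row["dimension"])] += row[metric] or 0

    failures = []
    for (end_date, dimension), total in sorted(totals.items()):
        if dimension == "global":
            continue
        # A missing global row yields None, which is also reported.
        global_total = totals.get((end_date, "global"))
        if total != global_total:
            failures.append((end_date, dimension, total, global_total))
    return failures


rows = [
    {"end_date": "2024-10-31", "dimension": "global",
     "dimension_value": "global", "created_bookings": 10},
    {"end_date": "2024-10-31", "dimension": "by_dash_source",
     "dimension_value": "New Dash", "created_bookings": 6},
    {"end_date": "2024-10-31", "dimension": "by_dash_source",
     "dimension_value": "Old Dash", "created_bookings": 4},
    {"end_date": "2024-10-31", "dimension": "by_billing_country",
     "dimension_value": "GBR", "created_bookings": 7},
    {"end_date": "2024-10-31", "dimension": "by_billing_country",
     "dimension_value": "USA", "created_bookings": 2},
]
failures = find_incomplete_dimensions(rows)
```

In this made-up sample, `by_dash_source` adds up to the global total while `by_billing_country` is one booking short, so only the latter is reported. An empty result corresponds to a passing test.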
version: 2

models:
  - name: int_kpis__dimension_dates
    description: |
      This model provides the daily time dimensionality needed for KPIs.
      It only considers dates up-to-yesterday.

    columns:
      - name: date
        data_type: date
        description: Specific date. It's the primary key of this model.
        tests:
          - unique
          - not_null

      - name: year
        data_type: int
        description: Year number of the given date.
        tests:
          - not_null

      - name: month
        data_type: int
        description: Month number of the given date.
        tests:
          - not_null

      - name: day
        data_type: int
        description: Day-of-month number of the given date.
        tests:
          - not_null

      - name: first_day_month
        data_type: date
        description: |
          First day of the month corresponding to the date field.
        tests:
          - not_null

      - name: last_day_month
        data_type: date
        description: |
          Last day of the month corresponding to the date field.
        tests:
          - not_null

      - name: is_end_of_month
        data_type: boolean
        description: True if it's end of month, false otherwise.
        tests:
          - not_null

      - name: is_current_month
        data_type: boolean
        description: |
          True if the date is within the current month, false otherwise.
        tests:
          - not_null

      - name: is_month_to_date
        data_type: boolean
        description: |
          True if the date is within the scope of month-to-date, false otherwise.
          The scope of month-to-date takes into account both 1) a date being in
          the current month or 2) a date corresponding to the same month of the
          previous year, whose day number cannot be higher than yesterday's day
          number.
        tests:
          - not_null

  - name: int_kpis__lifecycle_daily_accommodation
    description: |
      This model computes the daily lifecycle segment for each accommodation, also known as
      listings.
      The booking-related dates make it possible to determine the current status of any listing
      regarding its activity. This information is encapsulated in the following columns:

      accommodation_lifecycle_state: contains one of the following states
      - 01-New: Listings that have been created in the current month, without bookings.
      - 02-Never Booked: Listings that have been created before the current month, without bookings.
      - 03-First Time Booked: Listings that have been booked for the first time in the current month.
      - 04-Active: Listings that have booking activity in the past 12 months (that are not FTB nor reactivated).
      - 05-Churning: Listings that are becoming inactive because of lack of bookings in the past 12 months.
      - 06-Inactive: Listings that have not had a booking for more than 12 months.
      - 07-Reactivated: Listings that have had a booking in the current month that were inactive or churning before.
      - Finally, if none of the logic applies, which should not happen, null will be set and a dbt alert will be raised.

      Since the states of Active, First Time Booked and Reactivated indicate certain booking activity and are
      mutually exclusive, the model also provides information on the recency of the bookings through the following
      booleans:
      - has_been_booked_within_current_month: If a listing has had a booking created in the current month
      - has_been_booked_within_last_6_months: If a listing has had a booking created in the past 6 months
      - has_been_booked_within_last_12_months: If a listing has had a booking created in the past 12 months
      Note that if a listing has had a booking created in a given month, all 3 columns will be true. Similarly,
      if the last booking created for a listing was 5 months ago, only the column has_been_booked_within_current_month
      will be false, while the other 2 will be true.
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - date
            - id_accommodation

    columns:
      - name: date
        data_type: date
        description: Date in which a Listing has a given lifecycle state.
        tests:
          - not_null

      - name: id_accommodation
        data_type: bigint
        description: Id of the accommodation or listing.
        tests:
          - not_null

      - name: creation_date_utc
        data_type: date
        description: Date of when the listing was created.

      - name: first_time_booked_date_utc
        data_type: date
        description: |
          Date of the first booking created for a given listing. Can be null if the listing
          has never had a booking associated with it.

      - name: last_time_booked_date_utc
        data_type: date
        description: |
          Date of the last booking created for a given listing. Can be null if the listing
          has never had a booking associated with it. Can be the same as first_time_booked_date_utc
          if the listing only had 1 booking in its history.

      - name: second_to_last_time_booked_date_utc
        data_type: date
        description: |
          Date of the second-to-last booking created for a given listing, meaning the creation
          date of the booking that precedes the last one. It's relevant for the reactivation computation
          on the lifecycle. Can be null if the listing has never had a booking associated with it or if
          the listing only had 1 booking in its history.

      - name: accommodation_lifecycle_state
        data_type: character varying
        description: |
          Contains the lifecycle state of a Listing. The accepted values are:
          01-New, 02-Never Booked, 03-First Time Booked, 04-Active, 05-Churning, 06-Inactive,
          07-Reactivated. Failing to implement the logic will result in an alert.
        tests:
          - not_null
          - accepted_values:
              values:
                - 01-New
                - 02-Never Booked
                - 03-First Time Booked
                - 04-Active
                - 05-Churning
                - 06-Inactive
                - 07-Reactivated

      - name: has_been_booked_within_current_month
        data_type: boolean
        description: If the listing has had a booking created in the current month.

      - name: has_been_booked_within_last_6_months
        data_type: boolean
        description: If the listing has had a booking created in the past 6 months.

      - name: has_been_booked_within_last_12_months
        data_type: boolean
        description: If the listing has had a booking created in the past 12 months.

  - name: int_kpis__dimension_daily_accommodation
    description: |
      This model computes the deal segmentation per number of
      listings in a daily manner.

    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - date
            - id_deal

    columns:
      - name: date
        data_type: date
        description: Specific date in which the segmentation applies.
        tests:
          - not_null

      - name: id_deal
        data_type: string
        description: Unique identifier of an account.
        tests:
          - not_null

      - name: active_accommodations_per_deal_segmentation
        data_type: string
        description: |
          Segment value based on the number of listings booked in 12 months
          for a given deal and date.
        tests:
          - accepted_values:
              values:
                - "0"
                - "01-05"
                - "06-20"
                - "21-60"
                - "61+"

      - name: accommodations_booked_in_12_months
        data_type: bigint
        description: |
          Actual volume of listings that have been booked in the past 12 months
          for a given deal and date.

  - name: int_kpis__metric_daily_created_bookings
    description: |
      This model computes the Daily Created Bookings at the deepest granularity.

      The unique key corresponds to the deepest granularity of the model,
      in this case:
      - date,
      - id_deal,
      - dash_source.

    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - date
            - id_deal
            - dash_source

    columns:
      - name: date
        data_type: date
        description: Date of when Bookings have been created.
        tests:
          - not_null

      - name: id_deal
        data_type: string
        description: Unique identifier of an account.
        tests:
          - not_null

      - name: dash_source
        data_type: string
        description: Dashboard source, either old or new.
        tests:
          - not_null
          - accepted_values:
              values:
                - "New Dash"
                - "Old Dash"

      - name: active_accommodations_per_deal_segmentation
        data_type: string
        description: |
          Segment value based on the number of listings booked in 12 months
          for a given deal and date.
        tests:
          - not_null
          - accepted_values:
              values:
                - "0"
                - "01-05"
                - "06-20"
                - "21-60"
                - "61+"
                - "UNSET"

      - name: main_billing_country_iso_3_per_deal
        data_type: string
        description: |
          Main billing country of the host aggregated at Deal level.
        tests:
          - not_null

      - name: created_bookings
        data_type: bigint
        description: |
          Count of daily bookings created in a given date and per specified dimension.

  - name: int_kpis__metric_monthly_created_bookings
    description: |
      This model computes the Monthly Created Bookings at the
      deepest granularity.
      Be aware that any dimension that can change over the monthly period,
      such as daily segmentations, is included in the primary key of the
      model.

      The unique key corresponds to:
      - end_date,
      - id_deal,
      - dash_source,
      - active_accommodations_per_deal_segmentation.

    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - end_date
            - id_deal
            - dash_source
            - active_accommodations_per_deal_segmentation

    columns:
      - name: start_date
        data_type: date
        description: |
          The start date of the time range considered for the metrics in this record.
        tests:
          - not_null

      - name: end_date
        data_type: date
        description: |
          The end date of the time range considered for the metrics in this record.
        tests:
          - not_null

      - name: id_deal
        data_type: string
        description: Unique identifier of an account.
        tests:
          - not_null

      - name: dash_source
        data_type: string
        description: Dashboard source, either old or new.
        tests:
          - not_null
          - accepted_values:
              values:
                - "New Dash"
                - "Old Dash"

      - name: active_accommodations_per_deal_segmentation
        data_type: string
        description: |
          Segment value based on the number of listings booked in 12 months
          for a given deal and date.
        tests:
          - not_null
          - accepted_values:
              values:
                - "0"
                - "01-05"
                - "06-20"
                - "21-60"
                - "61+"
                - "UNSET"

      - name: main_billing_country_iso_3_per_deal
        data_type: string
        description: |
          Main billing country of the host aggregated at Deal level.
        tests:
          - not_null

      - name: created_bookings
        data_type: bigint
        description: |
          Count of accumulated bookings created in a given month up to the
          given date and per specified dimension.

  - name: int_kpis__metric_mtd_created_bookings
    description: |
      This model computes the Month-To-Date Created Bookings at the
      deepest granularity.
      Be aware that any dimension that can change over the monthly period,
      such as daily segmentations, is included in the primary key of the
      model.

      The unique key corresponds to:
      - end_date,
      - id_deal,
      - dash_source,
      - active_accommodations_per_deal_segmentation.

    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - end_date
            - id_deal
            - dash_source
            - active_accommodations_per_deal_segmentation

    columns:
      - name: start_date
        data_type: date
        description: |
          The start date of the time range considered for the metrics in this record.
        tests:
          - not_null

      - name: end_date
        data_type: date
        description: |
          The end date of the time range considered for the metrics in this record.
        tests:
          - not_null

      - name: id_deal
        data_type: string
        description: Unique identifier of an account.
        tests:
          - not_null

      - name: dash_source
        data_type: string
        description: Dashboard source, either old or new.
        tests:
          - not_null
          - accepted_values:
              values:
                - "New Dash"
                - "Old Dash"

      - name: active_accommodations_per_deal_segmentation
        data_type: string
        description: |
          Segment value based on the number of listings booked in 12 months
          for a given deal and date.
        tests:
          - not_null
          - accepted_values:
              values:
                - "0"
                - "01-05"
                - "06-20"
                - "21-60"
                - "61+"
                - "UNSET"

      - name: main_billing_country_iso_3_per_deal
        data_type: string
        description: |
          Main billing country of the host aggregated at Deal level.
        tests:
          - not_null

      - name: created_bookings
        data_type: bigint
        description: |
          Count of accumulated bookings created in a given month up to the
          given date and per specified dimension.

  - name: int_kpis__aggregated_monthly_created_bookings
    description: |
      This model computes the dimension aggregation for
      Monthly Created Bookings.

      The primary key of this model is end_date, dimension
      and dimension_value.

    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - end_date
            - dimension
            - dimension_value

    columns:
      - name: start_date
        data_type: date
        description: |
          The start date of the time range considered for the metrics in this record.
        tests:
          - not_null

      - name: end_date
        data_type: date
        description: |
          The end date of the time range considered for the metrics in this record.
        tests:
          - not_null

      - name: dimension
        data_type: string
        description: The dimension or granularity of the metrics.
        tests:
          - assert_dimension_completeness:
              metric_column_name: created_bookings
          - accepted_values:
              values:
                - global
                - by_number_of_listings
                - by_billing_country
                - by_dash_source
                - by_deal

      - name: dimension_value
        data_type: string
        description: The value or segment available for the selected dimension.
        tests:
          - not_null

      - name: created_bookings
        data_type: bigint
        description: The monthly created bookings for a given date, dimension and value.

  - name: int_kpis__aggregated_mtd_created_bookings
    description: |
      This model computes the dimension aggregation for
      Month-To-Date Created Bookings.

      The primary key of this model is end_date, dimension
      and dimension_value.

    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - end_date
            - dimension
            - dimension_value

    columns:
      - name: start_date
        data_type: date
        description: |
          The start date of the time range considered for the metrics in this record.
        tests:
          - not_null

      - name: end_date
        data_type: date
        description: |
          The end date of the time range considered for the metrics in this record.
        tests:
          - not_null

      - name: dimension
        data_type: string
        description: The dimension or granularity of the metrics.
        tests:
          - assert_dimension_completeness:
              metric_column_name: created_bookings
          - accepted_values:
              values:
                - global
                - by_number_of_listings
                - by_billing_country
                - by_dash_source
                - by_deal

      - name: dimension_value
        data_type: string
        description: The value or segment available for the selected dimension.
        tests:
          - not_null

      - name: created_bookings
        data_type: bigint
        description: The month-to-date created bookings for a given date, dimension and value.
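
The `is_month_to_date` rule documented for `int_kpis__dimension_dates` is easier to follow with a small worked example. The Python sketch below is an illustration of the documented rule, not the model's SQL. It assumes that "current month" means the month of yesterday (the model only covers dates up to yesterday), and the function name and the `today` parameter are hypothetical.

```python
from datetime import date, timedelta


def is_month_to_date(d: date, today: date) -> bool:
    """Illustrative version of the documented month-to-date scope.

    Assumption: the reference point is yesterday, so "current month"
    is the month that yesterday falls in.
    """
    yesterday = today - timedelta(days=1)

    # The dates model only considers dates up to yesterday.
    if d > yesterday:
        return False

    # Case 1: the date is in the current month.
    in_current_month = (d.year, d.month) == (yesterday.year, yesterday.month)

    # Case 2: same month of the previous year, with a day number
    # no higher than yesterday's day number.
    in_same_month_last_year = (
        d.year == yesterday.year - 1
        and d.month == yesterday.month
        and d.day <= yesterday.day
    )
    return in_current_month or in_same_month_last_year


today = date(2024, 10, 24)  # so yesterday is 2024-10-23
in_scope = [
    is_month_to_date(date(2024, 10, 1), today),
    is_month_to_date(date(2023, 10, 23), today),
]
out_of_scope = [
    is_month_to_date(date(2023, 10, 24), today),
    is_month_to_date(date(2024, 9, 30), today),
]
```

With yesterday on 2024-10-23, the first two dates fall in scope, while 2023-10-24 (day number too high) and 2024-09-30 (different month) do not. Capping the previous-year dates at yesterday's day number keeps the two month-to-date windows the same length, so they can be compared like for like.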