Merged PR 3451: Adds Deal Daily Lifecycle and metrics

# Description

Changes:
* Creates lifecycle_daily_deal, metric_daily_deals and agg_daily_deals. These follow a different strategy from the other models because the metrics count Deals, so they cannot be computed at Deal level
* Modifies the dimension macro to ensure the deal dimension is included in all models except these
* Fixes a production issue in the currently deployed deal lifecycle and metrics
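
As a reading aid for the lifecycle states documented in the schema below, here is a minimal sketch of how one deal could be classified on one day. It is not the model's SQL: the function names, the precedence of offboarding over booking activity, and the exact 12-month boundaries are assumptions made for illustration, and all inputs are assumed to be already restricted to events on or before `day`.

```python
from datetime import date
from typing import Optional


def months_between(earlier: date, later: date) -> int:
    """Calendar-month difference between two dates, ignoring the day of month."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def deal_lifecycle_state(
    day: date,
    creation: date,
    first_booked: Optional[date],
    last_booked: Optional[date],
    second_to_last_booked: Optional[date],
    cancelled: Optional[date],
) -> str:
    """Classify one deal on one day (hypothetical helper, not the dbt model)."""
    # Assumption: offboarding takes precedence over any booking activity.
    if cancelled is not None:
        return "05-Churning" if months_between(cancelled, day) == 0 else "06-Inactive"

    # Deals without bookings are split by whether they were created this month.
    if last_booked is None:
        return "01-New" if months_between(creation, day) == 0 else "02-Never Booked"

    since_last = months_between(last_booked, day)
    if since_last == 0:
        if first_booked is not None and months_between(first_booked, day) == 0:
            return "03-First Time Booked"
        # Assumed boundary: a gap of 12 or more months before the latest
        # booking means the deal was churning or inactive beforehand.
        if (
            second_to_last_booked is not None
            and months_between(second_to_last_booked, last_booked) >= 12
        ):
            return "07-Reactivated"
        return "04-Active"
    if since_last < 12:
        return "04-Active"
    if since_last == 12:
        return "05-Churning"  # assumed: the month in which the deal becomes inactive
    return "06-Inactive"
```

Note that this sketch only looks at the two most recent bookings, so a deal reactivated earlier in the month and booked again later would be reported as Active here; how the real model handles that case is not stated in this PR.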

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've run and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #23566
Oriol Roqué Paniagua 2024-11-07 10:49:06 +00:00
parent 9ef9a57c03
commit 8c23f91242
7 changed files with 821 additions and 6 deletions


@ -203,6 +203,129 @@ models:
data_type: boolean
description: If the listing has had a booking created in the past 12 months.
- name: int_kpis__lifecycle_daily_deal
description: |
This model computes the daily lifecycle of accounts, at deal level.
The booking-related dates determine the current status of any
deal regarding its activity. This information is encapsulated in the following columns:
deal_lifecycle_state: contains one of the following states
- 01-New: Deals that have been created in the current month, without bookings, that are not offboarded.
- 02-Never Booked: Deals that have been created before the current month, without bookings, that are not offboarded.
- 03-First Time Booked: Deals that have been booked for the first time in the current month, that are not offboarded.
- 04-Active: Deals that have booking activity in the past 12 months (that are not FTB nor reactivated), that are not offboarded.
- 05-Churning: Deals that are either offboarded in the current month or becoming inactive because of a lack of bookings in the past 12 months.
- 06-Inactive: Either Deals that have been previously offboarded or Deals that have not had a booking for more than 12 months.
- 07-Reactivated: Deals that have had a booking in the current month that were inactive or churning before, that are not offboarded.
- Finally, if none of the logic applies, which should not happen, null will be set and a dbt alert will be raised.
Since the states of Active, First Time Booked and Reactivated indicate certain booking activity and are
mutually exclusive, the model also provides information of the recency of the bookings by the following
booleans:
- has_been_booked_within_current_month: If a deal has had a booking created in the current month
- has_been_booked_within_last_6_months: If a deal has had a booking created in the past 6 months
- has_been_booked_within_last_12_months: If a deal has had a booking created in the past 12 months
Note that if a deal has had a booking created in a given month, all 3 columns will be true. Similarly,
if the last booking created for a deal was 5 months ago, only the column has_been_booked_within_current_month
will be false, while the other 2 will be true.
Some final considerations:
- It's possible but not common that a Deal gets offboarded in the same month in which it has had some bookings created.
- It shouldn't happen that an Inactive Deal has bookings created. However, there are a few cases in which
this happens, likely because of a misconfiguration between Hubspot and Core. These should be reported to improve
data quality.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- id_deal
columns:
- name: date
data_type: date
description: Date on which a Deal has a given lifecycle state.
tests:
- not_null
- name: id_deal
data_type: character varying
description: Unique identifier of the Account.
tests:
- not_null
- name: creation_date_utc
data_type: date
description: Date when the first host associated with the deal was created.
- name: first_time_booked_date_utc
data_type: date
description: |
Date of the first booking created for a given deal. Can be null if the deal
has never had a booking associated with it.
- name: last_time_booked_date_utc
data_type: date
description: |
Date of the last booking created for a given deal. Can be null if the deal
has never had a booking associated with it. Can be the same as first_time_booked_date_utc
if the deal only had 1 booking in its history.
- name: second_to_last_time_booked_date_utc
data_type: date
description: |
Date of the second-to-last booking created for a given deal, meaning the creation
date of the booking that precedes the last one. It's relevant for the reactivation computation
on the lifecycle. Can be null if the deal has never had a booking associated with it or if
the deal only had 1 booking in its history.
- name: cancellation_date_utc
data_type: date
description: |
Date when the deal was cancelled, according to Hubspot. This is the date we're considering
for hard offboarding. It can be null, meaning the account has not been offboarded.
- name: deal_lifecycle_state
data_type: character varying
description: |
Contains the lifecycle state of a deal. The accepted values are:
01-New, 02-Never Booked, 03-First Time Booked, 04-Active, 05-Churning, 06-Inactive,
07-Reactivated. If none of these states applies, the value is null and the tests raise an alert.
tests:
- not_null
- accepted_values:
values:
- 01-New
- 02-Never Booked
- 03-First Time Booked
- 04-Active
- 05-Churning
- 06-Inactive
- 07-Reactivated
- name: has_been_booked_within_current_month
data_type: boolean
description: |
If the deal has had a booking already created in the current month.
Note that if the booking is created on the 5th, this column will
be false from the 1st to the 4th, and true from the 5th onwards.
- name: has_been_booked_within_last_6_months
data_type: boolean
description: |
If the deal has had a booking created in the past 6 months.
- name: has_been_booked_within_last_12_months
data_type: boolean
description: |
If the deal has had a booking created in the past 12 months.
- name: has_been_offboarded
data_type: boolean
description: |
If the deal has been cancelled or not. Note that if the Deal
has been offboarded on the 5th, this column will be false
from the 1st to the 4th, and true from the 5th onwards.
- name: int_kpis__dimension_daily_accommodation
description: |
This model computes the deal segmentation per number of
@ -4957,3 +5080,243 @@ models:
description: |
The month-to-date Waiver Amount Paid Back to Hosts, in GBP,
without taxes for a given date, dimension and value.
- name: int_kpis__metric_daily_deals
description: |
This model computes the Daily Deal metrics at the deepest granularity.
Be aware that the granularity of this model differs from how the rest of
the models usually operate. Because the metrics count Deals, it does
not make sense to compute them at Deal level.
Also, Deal metrics at daily level already contain the time
aggregates needed, so there are no mtd or monthly equivalent models.
Instead, select the needed days from this daily model to recover
the necessary information.
The unique key corresponds to the deepest granularity of the model,
in this case:
- date,
- main_billing_country_iso_3_per_deal,
- active_accommodations_per_deal_segmentation
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- main_billing_country_iso_3_per_deal
- active_accommodations_per_deal_segmentation
columns:
- name: date
data_type: date
description: Date containing the Deal metrics.
tests:
- not_null
- name: active_accommodations_per_deal_segmentation
data_type: string
description: |
Segment value based on the number of listings booked in 12 months
for a given deal and date.
tests:
- not_null
- accepted_values:
values:
- "0"
- "01-05"
- "06-20"
- "21-60"
- "61+"
- "UNSET"
- name: main_billing_country_iso_3_per_deal
data_type: string
description: |
Main billing country of the host aggregated at Deal level.
tests:
- not_null
- name: new_deals
data_type: bigint
description: |
Count of new deals on a given date, per specified dimension.
- name: never_booked_deals
data_type: bigint
description: |
Count of never booked deals on a given date, per specified dimension.
- name: first_time_booked_deals
data_type: bigint
description: |
Count of first-time booked deals on a given date, per specified dimension.
- name: active_deals
data_type: bigint
description: |
Count of active deals on a given date, per specified dimension.
- name: inactive_deals
data_type: bigint
description: |
Count of inactive deals on a given date, per specified dimension.
- name: churning_deals
data_type: bigint
description: |
Count of churning deals on a given date, per specified dimension.
- name: reactivated_deals
data_type: bigint
description: |
Count of reactivated deals on a given date, per specified dimension.
- name: deals_booked_in_month
data_type: bigint
description: |
Count of deals booked within the month on a given date, per specified dimension.
- name: deals_booked_in_6_months
data_type: bigint
description: |
Count of deals booked within the past 6 months on a given date, per specified dimension.
- name: deals_booked_in_12_months
data_type: bigint
description: |
Count of deals booked within the past 12 months on a given date, per specified dimension.
- name: int_kpis__agg_daily_deals
description: |
This model computes the dimension aggregation for
Daily Deal metrics.
The primary key of this model is date, dimension
and dimension_value.
Be aware that the granularity of this model differs from how the rest of
the models usually operate. Because the metrics count Deals, it does
not make sense to compute them at Deal level.
Also, Deal metrics at daily level already contain the time
aggregates needed, so there are no mtd or monthly equivalent models.
Instead, select the needed days from this daily model to recover
the necessary information.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- dimension
- dimension_value
columns:
- name: date
data_type: date
description: Date containing the Deal metrics.
tests:
- not_null
- name: dimension
data_type: string
description: The dimension or granularity of the metrics.
tests:
- assert_dimension_completeness:
metric_column_names:
- new_deals
- never_booked_deals
- first_time_booked_deals
- active_deals
- churning_deals
- inactive_deals
- reactivated_deals
- deals_booked_in_month
- deals_booked_in_6_months
- deals_booked_in_12_months
- accepted_values:
values:
- global
- by_number_of_listings
- by_billing_country
- name: dimension_value
data_type: string
description: The value or segment available for the selected dimension.
tests:
- not_null
- name: is_end_of_month
data_type: boolean
description: True if the date is the end of the month, false otherwise.
tests:
- not_null
- name: is_current_month
data_type: boolean
description: |
True if the date is within the current month, false otherwise.
tests:
- not_null
- name: is_month_to_date
data_type: boolean
description: |
True if the date is within the scope of month-to-date, false otherwise.
The scope of month-to-date covers either 1) a date in
the current month or 2) a date in the same month of the
previous year whose day number is not higher than yesterday's day
number.
tests:
- not_null
- name: new_deals
data_type: bigint
description: |
Count of new deals for a given date, dimension and value.
- name: never_booked_deals
data_type: bigint
description: |
Count of never booked deals for a given date, dimension and value.
- name: first_time_booked_deals
data_type: bigint
description: |
Count of first-time booked deals for a given date, dimension and value.
- name: active_deals
data_type: bigint
description: |
Count of active deals for a given date, dimension and value.
- name: inactive_deals
data_type: bigint
description: |
Count of inactive deals for a given date, dimension and value.
- name: churning_deals
data_type: bigint
description: |
Count of churning deals for a given date, dimension and value.
- name: reactivated_deals
data_type: bigint
description: |
Count of reactivated deals for a given date, dimension and value.
- name: deals_booked_in_month
data_type: bigint
description: |
Count of deals booked within the month for a given date, dimension and value.
- name: deals_booked_in_6_months
data_type: bigint
description: |
Count of deals booked within the past 6 months for a given date, dimension and value.
- name: deals_booked_in_12_months
data_type: bigint
description: |
Count of deals booked within the past 12 months for a given date, dimension and value.
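
The is_month_to_date rule described for int_kpis__agg_daily_deals can be sketched as follows. This is an illustration only, not the model's SQL: the function name and the `today` parameter are hypothetical, and it assumes that "current month" means the month of yesterday's date, which the description does not state explicitly.

```python
from datetime import date, timedelta


def is_month_to_date(d: date, today: date) -> bool:
    """Return True if `d` falls within the month-to-date scope as of `today`."""
    yesterday = today - timedelta(days=1)
    # Assumption: the "current month" is the month that yesterday belongs to.
    in_current_month = (d.year, d.month) == (yesterday.year, yesterday.month)
    # Same month of the previous year, capped at yesterday's day number.
    in_same_month_last_year = (
        (d.year, d.month) == (yesterday.year - 1, yesterday.month)
        and d.day <= yesterday.day
    )
    return in_current_month or in_same_month_last_year
```

With `today` set to 2024-11-07, dates from 2024-11 and dates from 2023-11-01 to 2023-11-06 are in scope, which lets the current month be compared against the same span of the previous year.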