Merged PR 2090: Adding int_core__mtd_accommodation_lifecycle

Adding int_core__mtd_accommodation_lifecycle. Mainly, it recreates the history of the lifecycle of a listing for each date appearing in the MTD dates (so, last day of month + days for current month + days for current month of the previous year).

Implementation of lag function makes it much faster than self-join. Runs in approx 17 seconds (in local)

The logic behind the lifecycle is the same, and the most-up-to-date results in my local show the same values for the new model and the int_core__accommodation model (see screenshots)

previous model:
![image.png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2090/attachments/image.png)

new model:
![image (2).png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2090/attachments/image%20%282%29.png)

Following PRs will focus on readapting logic of int_core__accommodation to avoid the replication of lifecycle computation (just re-use the last available date in int_core__mtd_accommodation_lifecycle) and the creation of the desired metrics for the Biz Overview dashboard, including a refactor of the mtd_bookings to remove the listing logic from there.

Related work items: #17312
This commit is contained in:
Oriol Roqué Paniagua 2024-06-21 13:59:14 +00:00
parent fe93f594f5
commit ef80637a9b
2 changed files with 255 additions and 0 deletions

View file

@ -0,0 +1,165 @@
/*
This model provides Month-To-Date (MTD) accommodation lifecycle based on booking metrics.
*/
{{ config(materialized="table", unique_key=["date","id_accommodation"]) }}
with
int_core__bookings as (select * from {{ ref("int_core__bookings") }}),
stg_core__accommodation as (select * from {{ ref("stg_core__accommodation") }}),
int_dates_mtd as (select * from {{ ref("int_dates_mtd") }}),
booked_days_per_accommodation as (
select
id_accommodation,
created_date_utc,
lag(created_date_utc, 1) over (
partition by id_accommodation order by created_date_utc asc
) as previous_booked_date
from int_core__bookings
group by id_accommodation, created_date_utc
),
accommodation_historic_booking_dates as (
select
d.date,
a.id_accommodation,
date(a.created_at_utc) as creation_date_utc,
min(b.created_date_utc) as first_time_booked_date_utc,
max(b.created_date_utc) as last_time_booked_date_utc,
max(b.previous_booked_date) as second_to_last_time_booked_date_utc
from int_dates_mtd d
inner join stg_core__accommodation a on d.date >= date(a.created_at_utc)
left join
booked_days_per_accommodation b
on a.id_accommodation = b.id_accommodation
and d.date >= b.created_date_utc
group by d.date, a.id_accommodation, date(a.created_at_utc)
),
accommodation_historic_booking_features as (
select
ahbd.date,
ahbd.id_accommodation,
ahbd.creation_date_utc,
ahbd.first_time_booked_date_utc,
ahbd.last_time_booked_date_utc,
ahbd.second_to_last_time_booked_date_utc,
case
when ahbd.first_time_booked_date_utc is not null then true else false
end as accommodation_has_at_least_one_booking,
case
when
date_trunc('month', ahbd.creation_date_utc)
= date_trunc('month', ahbd.date)
then true
else false
end as accommodation_was_created_this_month,
case
when
date_trunc('month', ahbd.first_time_booked_date_utc)
= date_trunc('month', ahbd.last_time_booked_date_utc)
then true
else false
end as accommodation_has_first_booking,
case
when
date_trunc('month', ahbd.last_time_booked_date_utc)
= date_trunc('month', ahbd.date)
then true
else false
end as has_been_booked_within_current_month,
case
when
date_trunc('month', ahbd.last_time_booked_date_utc)
+ interval '6 months'
> date_trunc('month', ahbd.date)
then true
else false
end as has_been_booked_within_last_6_months,
case
when
date_trunc('month', ahbd.last_time_booked_date_utc)
+ interval '12 months'
> date_trunc('month', ahbd.date)
then true
else false
end as has_been_booked_within_last_12_months,
case
when
date_trunc('month', ahbd.last_time_booked_date_utc)
+ interval '12 months'
= date_trunc('month', ahbd.date)
then true
else false
end as last_booking_was_12_months_ago,
case
when
date_trunc('month', ahbd.second_to_last_time_booked_date_utc)
+ interval '12 months'
< date_trunc('month', ahbd.last_time_booked_date_utc)
then true
else false
end as had_previous_booking_more_than_12_months_before_the_last
from accommodation_historic_booking_dates ahbd
)
select
date,
id_accommodation,
creation_date_utc,
first_time_booked_date_utc,
last_time_booked_date_utc,
second_to_last_time_booked_date_utc,
case
-- 01-New: The listing has been created this month and has not had any booking
when
not accommodation_has_at_least_one_booking
and accommodation_was_created_this_month
then '01-New'
-- 02-Never Booked: The listing has been created before this month and has not
-- had any booking
when
not accommodation_has_at_least_one_booking
and not accommodation_was_created_this_month
then '02-Never Booked'
-- 03-First Time Booked: The listing has been booked for the first time and it
-- has been created this month
when accommodation_has_first_booking and has_been_booked_within_current_month
then '03-First Time Booked'
-- 04-Active: The listing has had at least 1 booking in its history and it's
-- been less than 12 months since the last booking
-- and is not reactivated and is not FTB
when
accommodation_has_at_least_one_booking
and has_been_booked_within_last_12_months
-- not reactivated
and not (
had_previous_booking_more_than_12_months_before_the_last
and has_been_booked_within_current_month
)
-- not FTB
and not (
accommodation_has_first_booking and has_been_booked_within_current_month
)
then '04-Active'
-- 05-Churning: The listing has been booked at least once and it's been 12
-- months since the last booking
when accommodation_has_at_least_one_booking and last_booking_was_12_months_ago
then '05-Churning'
-- 06-Inactive: The listing has been booked at least once and it's been more
-- than 12 months since the last booking
when
accommodation_has_at_least_one_booking
and not has_been_booked_within_last_12_months
and not last_booking_was_12_months_ago
then '06-Inactive'
-- 07-Reactivated: The listing was churned/inactive but has had a new booking
-- this month
when
had_previous_booking_more_than_12_months_before_the_last
and has_been_booked_within_current_month
then '07-Reactivated'
else null
end as accommodation_lifecycle_state,
has_been_booked_within_current_month,
has_been_booked_within_last_6_months,
has_been_booked_within_last_12_months
from accommodation_historic_booking_features

View file

@ -689,3 +689,93 @@ models:
data_type: timestamp
description: Timestamp of when the accommodation record was extracted from the backend into the DWH.
- name: int_core__mtd_accommodation_lifecycle
description: |
This model contains the historic information regarding the lifecycle of accommodations, also known as
listings. The information regarding the booking-related time allows for the current status of any listing
regarding its activity. This information is encapsulated in the following columns:
accommodation_lifecycle_state: contains one of the following states
- 01-New: Listings that have been created in the current month, without bookings
- 02-Never Booked: Listings that have been created before the current month, without bookings.
- 03-First Time Booked: Listings that have been booked for the first time in the current month.
- 04-Active: Listings that have booking activity in the past 12 months (that are not FTB nor reactivated)
- 05-Churning: Listings that are becoming inactive because of lack of bookings in the past 12 months
- 06-Inactive: Listings that have not had a booking for more than 12 months.
- 07-Reactivated: Listings that have had a booking in the current month that were inactive or churning before.
- Finally, if none of the logic applies, which should not happen, null will be set and a dbt alert will raise.
Since the states of Active, First Time Booked and Reactivated indicate certain booking activity and are
mutually exclusive, the model also provides information of the recency of the bookings by the following
booleans:
- has_been_booked_within_current_month: If a listing has had a booking created in the current month
- has_been_booked_within_last_6_months: If a listing has had a booking created in the past 6 months
- has_been_booked_within_last_12_months: If a listing has had a booking created in the past 12 months
Note that if a listing has had a booking created in a given month, all 3 columns will be true. Similarly,
if the last booking created to a listing was 5 months ago, only the column has_been_booked_in_1_month
will be false; while the other 2 will be true.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- date
- id_accommodation
columns:
- name: date
data_type: date
description: The date for the month-to-date. Information is inclusive to the date displayed.
tests:
- not_null
- name: id_accommodation
data_type: bigint
description: Id of the accommodation or listing.
tests:
- not_null
- name: creation_date_utc
data_type: date
description: Date of when the listing was created.
- name: first_time_booked_date_utc
data_type: date
description: |
Date of the first booking created for a given listing. Can be null if the listing
has never had a booking associated with it.
- name: last_time_booked_date_utc
data_type: date
description: |
Date of the last booking created for a given listing. Can be null if the listing
has never had a booking associated with it. Can be the same as first_time_booked_date_utc
if the listing only had 1 booking in its history.
- name: second_to_last_time_booked_date_utc
data_type: date
description: |
Date of the second-to-last booking created for a given listing, meaning the creation
date of the booking that precedes the last one. It's relevant for the reactivation computation
on the lifecycle. Can be null if the listing has never had a booking associated with it or if
the listing only had 1 booking in its history.
- name: accommodation_lifecycle_state
data_type: character varying
description: |
Contains the lifecycle state of a Listing. The accepted values are:
01-New, 02-Never Booked, 03-First Time Booked, 04-Active, 05-Churning, 06-Inactive,
07-Reactivated. Failing to implement the logic will result in alert.
tests:
- not_null
- name: has_been_booked_within_current_month
data_type: boolean
description: If the listing has had a booking created in the current month.
- name: has_been_booked_within_last_6_months
data_type: boolean
description: If the listing has had a booking created in the past 6 months.
- name: has_been_booked_within_last_12_months
data_type: boolean
description: If the listing has had a booking created in the past 12 months.