8.8 KiB
Refactoring Business KPIs - 2024-07-05
2024-07-05
Current status
In over a month, the amount of KPIs that we’re providing as Data Team has grown considerably. This goes together with an increase of the dbt modelisation to support the final display in the power bi report.
The current approach is the following:
-
Each metric defined in Reporting Needs , gets grouped into a model with the same typology. For example,
Created Bookings,Cancelled BookingsandCheckout Bookingsare metrics associated toBookings. There’s some more advanced weighted measures, such asGuest Journey Start Rate, that in this case it is included in theGuest Journey. -
For each eligible metric, we currently have 2 views:
-
A global view, which means is not granularized. It contains the prefix
mtd. This one is presented in 2 formats:-
For historical months, we retrieve the metric in a monthly basis. This is effectively the same as considering the end of the month in a month-to-date basis.
-
For the current month, we retrieve the information using a month-to-date approach. This means, if today is 4th of July 2024, we get the information from the 1st of July 2024 to the 3rd of July 2024 and compare with the same time period from the previous year, meaning 1st of July 2023 to 3rd July 2023. During the current month, we are able to retrieve this comparison for all days that have already of the current month. This means that apart from the 3rd of July 2024 comparison, we will also have the 1st and the 2nd. This gets cleaned up every month.
-
Structure:
- The global structure is the following:
-- Retrieval computation some_metrics as ( select d.date, count/sum/etc as aggregated_metric from int_dates_mtd d inner join some_table on something where condition group by 1 ), -- -- Repeat for each different date logic computation -- ), main_kpis as ( -- Final aggregation of subqueries -- select d.year, d.month, d.day, d.date, d.is_end_of_month, d.is_current_month, some_metric.aggregated_metric -- Repeat for other metrics from int_dates_mtd d left join some_metrics on some_metrics.date = d.date -- Repeat for other CTEs ) -- **Pivoting** to get previous year for each line & computing relative increment -- (rel_incr) -- select a.year, a.month, a.day, a.is_end_of_month, a.is_current_month, a.date, b.date as previous_year_date, a.aggregated_metric, b.aggregated_metric as previous_year_aggregated_metric, cast(a.aggregated_metric as decimal) / b.aggregated_metric - 1 as relative_increment_aggregated_metric, -- Repeat triple structure for any other metric from main_kpi a left join main_kpi b on a.month = b.month and a.year = b.year + 1 where (a.is_end_of_month = 1 or (a.is_current_month = 1 and a.day = b.day))
-
-
A by deal view, which extracts the information for each deal id as long as the deal is informed in the users table, and as long as we can trace that metric to a specific deal. This means, for instance, if we get some Guest Journey Completed but the Host User is not set, we won’t be able to attribute it. These model contain a
monthlyandby_dealprefixes and suffixes respectively.- Structure:
- The by deal structure is the following:
-- Retrieval computation some_metrics as ( select date_trunc('month', a_date_field)::date as first_day_month, **id_deal,** count/sum/etc as aggregated_metric from a inner join b on something where condition group by 1**, 2** -- Note that here we do NOT join with int_dates_by_deal because -- 1) we don't compute MTD so it's not needed -- 2) it heavily increases performance ), -- -- Repeat for each different date logic computation -- ) -- Final aggregation of subqueries -- select d.year, d.month, d.day, d.date, **d.id_deal**, some_metric.aggregated_metric -- Repeat for other metrics from int_dates_**by_deal** d left join some_metrics on some_metrics.**first_day_month** = d.**first_day_month** **and some_metrics.id_deal = d.id_deal** -- Repeat for other CTEs -- Note that here all data is at end of month or yesterday -- There's no relative increase vs. last year computation! - Structure:
-
-
Since we were only sourcing core models so far, all KPIs models are placed into core, with the usual
int_core__convention for intermediate andcore__for reporting.
Limitations
- The global view directly computes the
Value,Previous Year ValueandRelative Increment Valuewithin the same typology model. This means that, theBookings Created,Previous Year Created BookingsandRelative Increment Bookings Createdare handled within theBookingmodel.- This worked well so far, because all metrics that correspond to rates were self-contained in the same group. For example,
Guest Journey Completion RateusesGuest Journey CompletedandGuest Journey Started, that are within theGuest Journeymodel. - But it does not work well for the new metrics we aim to implement, for instance,
Revenue per Booking, sinceRevenuewill probably be an aggregation of aGuest Revenuemodel, anOperator Revenuemodel, and an already existingBookingmodel… - Solution → The pivoting logic to retrieve previous year should be abstracted into a superior layer
- This worked well so far, because all metrics that correspond to rates were self-contained in the same group. For example,
- The power bi report is sourcing directly a
core__mtd_aggregated_metricsand acore__monthly_aggregated_metrics_history_by_dealmodels.- This worked well so far, because all metrics were coming from Core.
- But it does not work well for the new metrics we aim to implement, for instance, Host Resolution Amount Paid, since this will come from Xero. The Resolution model could still be called int_xero__ABC but the aggregation would not only contain information from Core
- Solution → The aggregated metrics models and the exposures should migrate to Cross
Strategy
- Keep the reporting always alive by creating many but small changes to the code
- Track progress and details in this Notion page
Step-by-step details:
- Remove previous year comparison logic from
int_core__mtd_booking_metricsand create an abstraction layer inint_mtd_vs_previous_year_metricsmodel that includes the bookings. Makeint_core__aggregated_metricsread Bookings metrics from this table - Remove previous year comparison logic from
int_core__mtd_accommodation_metricsand add it into theint_mtd_vs_previous_year_metricsmodel. Makeint_core__aggregated_metricsread Accommodation metrics from this table - Remove previous year comparison logic from
int_core__mtd_deal_metricsand add it into theint_mtd_vs_previous_year_metricsmodel. Makeint_core__aggregated_metricsread Deal metrics from this table - Remove previous year comparison logic from
int_core__mtd_guest_journey_metricsand add it into theint_mtd_vs_previous_year_metricsmodel. Makeint_core__aggregated_metricsread Guest Journey metrics from this table - Remove previous year comparison logic from
int_mtd_guest_revenue_metricsand add it into theint_mtd_vs_previous_year_metricsmodel. Makeint_core__aggregated_metricsread Guest Revenue metrics from this table - Transition
int_core__mtd_aggregated_metricsandint_core__monthly_aggregated_metrics_history_by_dealintoint_mtd_aggregated_metricsandint_monthly_aggregated_metrics_history_by_dealrespectively. Make the respective reporting models read from the new cross models. - Create the reporting cross models and re-do the reporting
- Once validated, deploy the Power BI reading from cross reporting.
- Once
core__monthly_aggregated_metrics_history_by_dealandcore__mtd_aggregated_metricshave no downstream dependants at all, remove them.