sh-notion/notion_data_team_no_files/Technical Documentation - 2024-11-12 13c0446ff9c980719db3f4c420995f70.md
Pablo Martin a256b48b01 pages
2025-07-11 16:15:17 +02:00

8.6 KiB
Raw Blame History

Technical Documentation - 2024-11-12

Purpose

The intermediate/kpis folder is dedicated to KPIs modelisation, which include mostly any relevant dimension, measure and time aggregation needed for transforming data into business metrics. As Data Team, we should provide the maximum possible quality of KPIs.

Convention

Model names

  • Any model within the folder intermediate/kpis needs to follow this convention: int_kpis__{structure_type}_{time_dimension}_{relevant_entity_name}.
  • Structure types can be the following:
    • lifecycle: any modelling that classifies certain behavior on a given entity that can vary over time. For instance, the listing lifecycle in terms of booking creation could categorise the lifecycle of the listing based on whether a listing being new, active, never booked, inactive, etc.
    • dimension: any modelling that allows to segment or categorise data, so it can provide descriptive context for the measures. Segments resulting from lifecycles would likely have an equivalent dimension model.
    • metric: any model that computes a given metric per different dimensions that is not aggregated. This means that each dimension will have a dedicated column within the model.
    • agg: a model that aggregates the data into a 1) date range, 2) a dimension and 3) a dimension value for any given metric. These will always depend on metrics models.
  • Time dimension can be the following:
    • daily: if the time granularity is daily
    • monthly: if the time granularity is monthly, meaning metrics are aggregated to the month
    • mtd: if the time granularity is month-to-date, meaning metrics are cumulative to a certain date of the current month and so it's the case for the same days on the month of the previous days.
    • others.
  • Relevant entity name needs to easily and uniquely identify the entity being modelled, such as Created Bookings.
  • The only exception is int_kpis__dimension_dates, that even though is granular at daily level, it's simplified on purpose, to avoid the model being int_kpis__dimension_daily_dates.

Logic

  • The model that contains the deepest granularity for each entity should be the one handling the data gathering to compute raw metrics and dimensions. Likely, this model will be in the form of int_kpis__metric_daily_{relevant_entity_name}. In this case, joins outside of the kpis folder are accepted and expected in order to gather dimensions and metrics.
  • Downstream models within kpis folder, indistinctly of these being metric or aggregated models, should not join with other models outside of the kpis folder. Further enrichment can be done with outside models as long as the resulting models are directly located outside the kpis folder, namely into cross/general folders.
  • Downstream models within kpis folder could eventually join with other models within the kpis folder in order to create weighted or converted metrics.

Dimension aggregation

Models that are dimensions aggregates, namely aggregated or agg models, follow a common pattern of date, dimension and dimension_value.

For models that are not daily, such as monthly or mtd, date is substituted by a time range defined within start_date and end_date. Generally, end_date is part of the primary key alongside dimension and dimension_value, while start_date is only displayed for information purposes.

In order to specify which dimensions are considered to be retrieved for each aggregate model, we use the get_kpi_dimensions_per_model macro. This macro only takes as argument the name of the entity that were modelling, such as CREATED_BOOKINGS.

By default, the macro will consider the following base dimensions as the expected ones:

  • global
  • by_billing_country
  • by_number_of_listings

Generally, any model will also receive the by_deal dimension unless strictly removed in the macro configuration. Additional entity-specific dimensions can be configured for the aggregation. For instance, GUEST_PAYMENTS can receive both the 4 abovementioned dimension aggregations as well as by_has_id_check as its required for other purposes.

Lastly, be aware that when creating a new dimension, youd need to create a dedicated macro entry by the name of dim_{name_of_your_dimension}, that should provide 1) the dimension name to be used and 2) the field that contains the dimension_value used to compute the aggregation.

KPIs Products

This is a summary of the Data Products that depend on the KPIs.

Main KPIs

Reporting: Main KPIs

Data Product page: Business Overview Reporting Suite

Computation flows:

→ Note that these are shared within KPIs folder, and get split at cross level.

  • Name: MTD + Monthly per category
    • Downstream tables:
      • cross/int_mtd_vs_previous_year_metrics
        • In turn, this depends on cross/int_monthly_aggregated_metrics_history_by_deal due to the computation of Churn Rate metrics, that are deal-dependant.
      • cross/int_mtd_aggregated_metrics
      • general/mtd_aggregated_metrics
    • Time dimensions used:
      • Monthly (depends on daily)
      • MTD (depends on daily)
    • Dimensions used:
      • global
      • by_billing_country
      • by_number_of_listings
    • Entities used:
      • Created Bookings
      • Check Out Bookings
      • Cancelled Bookings
      • Billable Bookings
      • Created Guest Journeys
      • Started Guest Journeys
      • Completed Guest Journeys
      • Guest Journeys with Payment
      • Guest Payments
      • Invoiced Revenue
      • Host Resolutions
      • Listings
      • Deals
    • Depends on:
      • Flow: Monthly by Deal
        • Table: cross/int_monthly_aggregated_metrics_history_by_deal
  • Name: Monthly by Deal
    • Downstream tables:
      • cross/int_monthly_aggregated_metrics_history_by_deal
      • general/monthly_aggregated_metrics_history_by_deal
    • Time dimensions used:
      • Monthly (depends on daily)
    • Dimensions used:
      • by_deal
    • Entities used:
      • Created Bookings
      • Check Out Bookings
      • Cancelled Bookings
      • Billable Bookings
      • Created Guest Journeys
      • Started Guest Journeys
      • Completed Guest Journeys
      • Guest Journeys with Payment
      • Guest Payments
      • Invoiced Revenue
      • Host Resolutions
      • Listings

Account Managers Reporting

Reporting: Account Managers Reporting

Data Product page: Account Management Reporting Suite

Computation flows:

  • Name: growth score by deal
    • Downstream tables:

      • cross/int_monthly_growth_score_by_deal
      • general/monthly_growth_score_by_deal
    • Time dimensions used:

      • Monthly (depends on daily)
    • Dimensions used:

      • by_deal
    • Entities used:

      → At this stage, uses the same as Monthly by Deal from Main KPIs. In terms of pure business sense, it would only use:

      • Created Bookings
      • Guest Payments
      • Invoiced Revenue
      • Listings
    • Depends on:

      • Flow: Monthly by Deal (Main KPIs)
        • Table: cross/int_monthly_aggregated_metrics_history_by_deal
  • Name: monthly aggregated metrics history by deal by time window
    • Downstream tables:

      • cross/int_monthly_aggregated_metrics_history_by_deal_by_time_window
      • general/monthly_aggregated_metrics_history_by_deal_by_time_window
    • Time dimensions used:

      • Monthly (depends on daily. It aggregates different months to generate larger aggregations, ex.: Previous 6 months).
    • Dimensions used:

      • by_deal
    • Entities used:

      → At this stage, uses the same as Monthly by Deal from Main KPIs. In terms of pure business sense, it would only use:

      • Created Bookings
      • Guest Payments
      • Invoiced Revenue
      • Host Resolutions
      • Listings
    • Depends on:

      • Flow: Monthly by Deal (Main KPIs)
        • Table: cross/int_monthly_aggregated_metrics_history_by_deal