# Description
Modifies `int_core__mtd_deal_metrics` to include the customer segmentation based on listings. `schema.yaml` is also affected including new fields and tests. Hardcoded `int_core__mtd_vs_previous_year_metrics` to avoid propagating this upwards and messing up with the data display.
Overall, follows a similar strategy as we did for Booking and Guest Journey metrics. For reference, [here's the previous PR on GJ](https://guardhog.visualstudio.com/Data/_git/data-dwh-dbt-project/pullrequest/2533).
# Checklist
- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.
# Other
- [ ] Check if a full-refresh is required after this PR is merged.
Related work items: #19325
# Description
Modifies `int_core__mtd_guest_journey_metrics` to include the customer segmentation based on listings. `schema.yaml` is also affected including new fields and tests. Hardcoded `int_core__mtd_vs_previous_year_metrics` to avoid propagating this upwards and messing up with the data display.
Overall, follows a similar strategy as we did for Booking metrics.
# Checklist
- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.
# Other
- [ ] Check if a full-refresh is required after this PR is merged.
Related work items: #19325
# Description
This is a first idea of how I'd like to add dimensionality in the KPIs for the mtd models. For the moment, I keep deal_id apart, so I just touch the "mtd" models, that so far only contained "global" metrics.
In this case I include the listing segmentation (0, 1-5, 6-20, etc) in the bookings. To do this, I created 2 new fields: dimension and dimension_values.
I also created a "master" table with `date` - `dimension` - `dimension_value` called `int_dates_mtd_by_dimension`
Important notes:
- I force a hardcode in `int_mtd_vs_previous_year_metrics`. This is to not break production.
- You will notice how repetitive the code is starting to look. My intention with this PR is that we are happy with this approach on the naming, the strategy for joins, etc. If that's ok, next step is going to be doing macros on top. Think of the state of `int_core__mtd_booking_metrics` as the "compiled version" of the macro that should come afterwards.
# Checklist
- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [ ] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.
# Other
- [ ] Check if a full-refresh is required after this PR is merged.
Related work items: #19325
# Description
Adds 2 new tables:
- `int_core__user_role`: contains the relationship of a given user has a role.
- `int_core__user_host`: based on the previous table, it selects the users and main information from those users that are considered as hosts according to the role they have.
Note: I needed to change the test in stg. A user, generally, can have no role, one role, or multiple roles. Thus we cannot propagate this information in the unified_user model.
# Checklist
- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.
# Other
- [ ] Check if a full-refresh is required after this PR is merged.
Related work items: #19513
# Description
I modified the model of duplicated_bookings, I saw it had some errors and data that was not being used
# Checklist
- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.
# Other
- [ ] Check if a full-refresh is required after this PR is merged.
Related work items: #18875
This PR creates a new model int_core__mtd_accommodation_segmentation that provides the customer segments based on listing activity:
- 0
- 1-5
- 6-20
- 21-60
- 61+
For end of April 2024, the volume distribution on number of deals and total listings booked is:

For information, I estimate that around 3% of listings with bookings are missed, according to the data displayed in the KPIs for 30th April 2024. This is because we enforce deal-based categorisation (same happens with the deal view, anyway)
Related work items: #19325
Small refactor to follow up on last week's PR. We removed from the Guest Revenue models the host-takes-waiver aspect, thus these models are now only depending from Core. We just need to migrate it from cross to core.
One small detail as well, since we do not take into account at these models level the host-takes-waiver, technically, I would not call these models revenue but rather Guest Payments. This is why I also took the opportunity to apply this name.
Changes:
- `int_monthly_guest_revenue_by_deal` is now `int_core__monthly_guest_payments_history_by_deal`, and the location has changed from `intermediate.cross` to `intermediate.core`
- `int_mtd_guest_revenue_metrics` is now `int_core__mtd_guest_payments_metrics`, and the location has changed from `intermediate.cross` to `intermediate.core`
- Schema changes, moving these 2 models' documentation with the new naming from Cross to Core
- Provide continuity in following dependants: `int_mtd_vs_previous_year_metrics` and `int_monthly_aggregated_metrics_history_by_deal` now read from the 2 new models respectively. Additionally, the model alias has changed from `guest_revenue` to `guest_payments` to keep consistency.
This PR does not expose new metrics, but should keep the existing ones unaffected.
Related work items: #18914
Changing naming to follow convention.
This PR has the following changes:
- the model `int_core__mtd_aggregated_metrics` has been moved to cross and changed the name to `int_mtd_aggregated_metrics`
- the model `int_core__monthly_aggregated_metrics_history_by_deal` has been moved to cross and changed the name to `int_monthly_aggregated_metrics_history_by_deal`
- the reporting models `core__mtd_aggregated_metrics` and `core__monthly_aggregated_metrics_history_by_deal` now source the `int_mtd_aggregated_metrics` and `int_monthly_aggregated_metrics_history_by_deal` to avoid breaking the production dashboard
- the reporting models have been duplicated from core into general with the correct names, i.e., `mtd_aggregated_metrics` and `monthly_aggregated_metrics_history_by_deal`
- Documentation has been moved in intermediate and replicated in reporting, adding comments on the currently in use models that are going to die soon.
This will allow for a transition of the PBI dashboard from one source to another. Exposures file still not touched since technically the report is still sourcing the 'legacy' models. Documentation of the refactor here: https://www.notion.so/knowyourguest-superhog/Refactoring-Business-KPIs-5deb6aadddb34884ae90339402ac16e3
Related work items: #18202
New model for guests satisfaction report, I included columns to check what is the guest paying for that might be helpful for analysis as well
Related work items: #16947
Adds the following metrics:
- Guest Journey with Payment
- Guest Journey Payment Rate
by both visions (global and by deal id)
**Important**: it does not expose these metrics to the dashboard, this will be done after we have feedback from Ben R. on the paid GJ without GJ completeness. Missing steps to make them appear is to adapt `int_core__mtd_aggregated_metrics` and `int_core__monthly_aggregated_metrics_history_by_deal` and the respective reporting counterparts.
It adapts:
- `int_core__mtd_guest_journey_metrics`
- `int_core__monthly_guest_journey_history_by_deal`
the approaches are similar in the sense that we join with `int_core__verification_payments` and filter by a PAID status, that has been defined in the `dbt_project.yml` in a similar manner as we did with cancelled bookings. It can happen that the same verification request has multiple payments (see screenshot), which in this case we keep the first date in which the paid payment happens. The volume is quite low anyway.

code for the screenshot:
```
with pre as (
select
id_verification_request,
count(distinct icvp.id_payment) as total_paid_payments
from intermediate.int_core__verification_payments icvp
where icvp.payment_status = 'Paid'
group by 1
)
select
case when total_paid_payments > 2 then 'more than 2'
when total_paid_payments = 2 then '2'
when total_paid_payments = 1 then '1'
end as payment_volume_category,
count(1) as vr_volume
from pre
group by 1
order by 2 desc
```
I also added a missing reference in `schema.yaml` int about `int_core__mtd_guest_journey_metrics`
Related work items: #18105
This PR creates 2 new models:
- `int_core__monthly_aggregated_metrics_history_by_deal`, which just gathers the information of the previously created models that compute the kpis by deal id.
- `core__monthly_aggregated_metrics_history_by_deal`, effectively a copy from intermediate to reporting
It also includes documentation of these 2 models, differences between these and the `mtd_aggregated_metrics` equivalents and references it to exposures. I took the opportunity to update the documentation of the `core__mtd_aggregated_metrics` now that it's a bit more mature.
This should be the last PR for the first draft of 'by deal' metrics.
Related work items: #17689
Adding accommodation metrics by deal id with the model `int_core__monthly_accommodation_history_by_deal`.
With this PR, we have the full set of batch 1 metrics by deal id completed, although separated in different tables. Aggregation will come in a separated PR.
Similarly as the previous PR, this one it's a mix between the logic of `int_core__mtd_accommodation_metrics` and the logic existing for the `int_core__monthly_X_history_by_deal` . It also adds the tests in schema.
Related work items: #17689
Adding the 6 Guest Journey metrics by deal id by creating the model `int_core__monthly_guest_journey_history_by_deal`
The structure for the deal id detail follows yesterday's approach on bookings, namely `int_core__monthly_booking_history_by_deal`, but considering the metric computation of the guest journey, namely `int_core__mtd_guest_journey_metrics`.
It also adds the dbt tests ensuring that date and id_deal are not null and that the combination of both is unique.
Related work items: #17689
This is a first approach to compute some easy metrics for the "deal" based business kpis. At this stage, it contains the information of bookings (created, checkout, cancelled) per deal and month, including both historic months as well as the current one. This do not contain MTD computation because it's overkill to do a MTD at deal level (+ we have 1k deals, so scalability can become a problem in the future)
Models:
- **int_dates_by_deal**: simple model that reads from **int_dates** and just joins it with **unified_users** to retrieve the deals. It will be used as the 'source of truth' for which deals should be considered in a given month, basically, since the first host associated to a deal is created (not necessarily booked)
- **int_core__monthly_booking_history_by_deal**: it contains the history of bookings per deal id in a monthly basis. It should be easy enough to integrate here, in the future and if needed, B2B macro segmentation.
In terms of performance, comparing the model **int_core__monthly_booking_history_by_deal** and **int_core__mtd_booking_metrics** you'll see that I removed the joined with the **int_dates_xxx** in the CTEs. This is because I want to avoid a double join of date & deal that I tried and I stopped after 5 min running. Since this computation is in a monthly basis - no MTD - it's easy enough to just apply the **int_dates_by_deal** on the last part of the query. With this approach, it runs in 7 seconds.
Related work items: #17689
This PR closes the first draft of the first batch of business kpis. Host logic has changed to be applied at deal id level.
It's mostly an adapted copy-paste from the accommodation counterpart, specifically:
- `int_core__mtd_deal_lifecycle`: computes the historic deal lifecycle. One line for each deal and MTD date. **Important**: _Not all hosts have a deal set. This will need a data quality report for business teams to fix_
- `int_core__mtd_deal_metrics`: computes the aggregation at MTD date level of the metrics per lifecycle state and activity state
Additionally, this PR changes:
- `int_core__mtd_aggregated_metrics`: it includes the new 3 deal metrics and changes the source of the already existing 3 deal metrics from `mtd_booking_metrics` to the new `mtd_deal_metrics`
- `int_core__mtd_booking_metrics`: removes all code needed to compute the remaining deal metrics, speeding it up considerably.
After this PR, the mtd models run (locally) at the following speed:
- `int_core__mtd_accommodation_lifecycle`: 47 sec
- `int_core__mtd_deal_lifecycle`: 3 sec
- `int_core__mtd_accommodation_metrics`: 5 sec
- `int_core__mtd_deal_metrics`: < 1 sec
- `int_core__mtd_booking_metrics`: 8 sec (quite a reduction)
- `int_core__mtd_guest_journey_metrics`: 5 sec
- `int_core__mtd_aggregated_metrics` and `core__mtd_aggregated_metrics`: < 1 sec
Related work items: #17312
This PR will compute the listing metrics in an aggregated manner to be displayed in the Main KPIs dashboard, specifically:
- New Listings
- First Time Booked Listings
- Churning Listings
- It also adapts the computation for the already existing metrics of Listings Booked in X months
At code level, it contains the following:
- Adds `int_core__mtd_accommodation_metrics`, which computes the aggregation of the lifecycle of listings at date level (unique), being date the corresponding date from `int_dates_mtd`
- Changes `int_core__mtd_aggregated_metrics` to take the accommodation metrics from the new model. Those 3 already existing (Listings booked in X month) now read from the new model as well.
- Changes `int_core__mtd_booking_metrics` to remove unused computation, making it lighter. Specifically, it removes 1) listing related metrics, since now we have a dedicated model and 2) number of guests booked, since it's not used at all.
The resulting values in local are consistent with what is already reported in the staging report.
Related work items: #17312
Adding int_core__accommodation
Includes both:
- Main information of the accommodation, mostly coming from stg_core__accommodation and int_core__country.
- Listing lifecycle computation, based on the created bookings from stg_core__bookings. It's just the current state, no history.
Some considerations:
- I opted to use stg_core__bookings and not int_core__bookings. Main reason is in case at some point we want to add listing-based information to the booking table, it would avoid cyclic references.
- I opted to keep all the logic of 1) accommodation info and 2) lifecycle in the same model. This could be easily split into: lifecycle first that reads uniquely from staging and then the int_core__accommodation that could read from the staging version to retrieve accommodation attributes + the lifecycle one. Up to you
I'd suggest to review first the documentation in schema since it explains the logic applied.
Notion page linked to this task: https://www.notion.so/knowyourguest-superhog/Listing-lifecycle-4dc0311b21ca44f8859969e419872ebd
Related work items: #17312