Commit graph

26 commits

Author SHA1 Message Date
Oriol Roqué Paniagua
78707ef649 Merged PR 4352: Bulk update dash_source to business_scope
# Description

Changes:
* Switches dash_source to business_scope in all kpis models, schema and macro configuration.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #27356
2025-02-11 16:19:33 +00:00
Oriol Roqué Paniagua
dffb1d5a1e Merged PR 4336: MTD and Monthly Verification Requests dependant KPIs for Dash Source
# Description

Changes:
* Adds Dash Source in Monthly and MTD Metric models for Verification Request dependant KPIs
* Propagates to Aggregated Monthly and MTD models for VR dependant KPIs
* Eliminates unused Weekly Guest Payment models.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #27356
2025-02-11 08:47:48 +00:00
Oriol Roqué Paniagua
b569db3468 Merged PR 4298: Split Check-Out bookings by Dash Source
# Description

Adds new category Dash Source for Check-Out bookings KPIs

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #27356
2025-02-06 10:53:24 +00:00
Joaquin Ossa
f3201b3c77 changed to cube instead of rollup 2025-01-16 14:21:00 +01:00
Joaquin Ossa
54df02cac8 Addressed comments 2024-12-02 11:06:09 +01:00
Oriol Roqué Paniagua
42d70f72d5 Merged PR 3667: Creates Chargeable metrics for New Dash KPIs
# Description

Creates the models for KPIs for New Dash - Chargeable metrics. In essence, computes 4 metrics:
- Chargeable Services
- Chargeable Amount (in GBP)
- Chargeable Bookings (unique over a time period and dimension, not additive)
- Chargeable Listings (unique over a time period and dimension, not additive)

This is done by creating:
- A Weekly Metric and Monthly Metric model. Here we keep the granularity of id_booking / id_accommodation to be able to compute the uniqueness.
- A Daily, Weekly and Monthly Aggregated models. Same as usual.
- Integrates everything to the existing model for Product New Dash Agg Metrics
- Exposes everything into reporting

NB: I removed on "by_host" in Created Services - I forgot to clear it out.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #20809
2024-11-26 14:19:41 +00:00
Oriol Roqué Paniagua
b0bed479c7 Merged PR 3625: Excludes new dash users without id deal
# Description

This PR has 2 commits:
- The first one handles the removal from the computation any user that has not an id deal properly set. I just created a boolean field in int_core__user_host that identifies if the host has no id_deal. Then apply the new condition in the 2 main usages of New Dash info.
- The second one cleans New Dash KPIs. Since we do not have anymore users without deals, it means that the identification of the host/account is going to be exactly the same if done by id_user_host or id_deal. I hated having id_user_host in KPIs so I've removed it :)

Lastly, this should speed up massively the execution. Not because there's improvements on the code but rather the reduction of data.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #20809
2024-11-21 16:30:47 +00:00
Oriol Roqué Paniagua
c23380583b Merged PR 3605: Beautification of Reporting String values
# Description

Creates a macro for beautification of categorical values.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #20809
2024-11-20 11:01:22 +00:00
Oriol Roqué Paniagua
86c81c1f21 Merged PR 3599: New Dash KPIs skeleton with Created Services
# Description

This PR handles the computation of KPIs for New Dash, focusing on Created Services.

New dimensions configured in `business_kpis_configuration` and applied in this new models for `NEW_DASH_CREATED_SERVICES`:
* `dim_host`,
* `dim_has_upgraded_service`,
* `dim_new_dash_version`,
* `dim_pricing_service`

New daily metric model `int_kpis__metric_daily_new_dash_created_services`
* Follows a similar pattern as for the rest of daily metric models. The only difference is that is aggregated to `id_booking` to ensure we can handle count distinct of bookings per different time granularities.
* Reads from the new pricing tables `int_core__booking_summary` and `int_core__booking_service_detail`. The main filters applied are selecting only new dash users and only services created after the user move timestamp to new dash.

An additional metric model at monthly level is created `int_kpis__metric_monthly_new_dash_created_services`

These finally go to a dimension aggregated model (`dimension`, `dimension_value`), respectively:
* Daily: `int_kpis__agg_daily_new_dash_created_services`
* Monthly: `int_kpis__agg_monthly_new_dash_created_services`

A final model aims to aggregate the different dimension aggregated metrics for New Dash: `int_kpis__product_new_dash_agg_metrics`
* It computes a `time_granularity` aggregation
* Here I will add additional metrics (such as revenue) once we have them.

A final model reading from the previous is exposed to reporting: `kpis__product_new_dash_agg_metrics`

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #20809
2024-11-20 09:43:30 +00:00
Oriol Roqué Paniagua
9ba0edb82d Merged PR 3482: Adapts date dimension skeleton for Main KPIs
# Description

New model:
* int_kpis__agg_dates_main_kpis - Serves as the skeleton of dates and dimensions for Main KPIs. It's aggregated since it follows a similar aggregation strategy. It's a single model to feed both Main KPIs visualisations. Note boolean fields are real booleans (true/false) while before these were integers (1/0). This also affects downstream models.

Main KPIs flow adaptations to new skeleton model:
* int_monthly_aggregated_metrics_history_by_deal
* int_monthly_churn_metrics - additionally, calls usual KPIs macro instead of old one
* int_mtd_vs_previous_year_metrics

Reporting changes to ensure report is not down:
* mtd_aggregated_metrics - adaptations on booleans (true-1, false-0)

Cleaning:
* get_kpi_dimensions macro is no longer used
* int_dates_by_deal model and schema entry
* int_dates_mtd_by_dimension model and schema entry
* int_dates_mtd model and schema entry

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #23763
2024-11-11 15:57:37 +00:00
Joaquin Ossa
34ccd894de renamed test entity 2024-11-07 12:56:49 +01:00
Joaquin Ossa
c89c72aace commit agg_monthly_guest_payments 2024-11-07 12:09:20 +01:00
Oriol Roqué Paniagua
8c23f91242 Merged PR 3451: Adds Deal Daily Lifecycle and metrics
# Description

Changes:
* Creates lifecycle_daily_deal, metric_daily_deals and agg_daily_deals. These follow a different strategy due to the nature of the metrics
* Modifies the dimension macro to ensure deal dimension is included in all models except these ones
* Fixes production issue on currently deployed deal lifecycle and metrics

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #23566
2024-11-07 10:49:06 +00:00
Joaquin
d952613fd3 Fixed names 2024-11-04 15:28:31 +01:00
Joaquin
64c3c6d1f0 Addressed comments 2024-11-04 14:29:26 +01:00
Joaquin
19337a9918 save all guest journey check in attributed models 2024-11-02 11:28:22 +01:00
Joaquin
3e5fd5a761 save models wip 2024-11-02 07:02:58 +01:00
Oriol Roqué Paniagua
875f91be26 Merged PR 3329: First version of KPIs refactored - created bookings
# Description

Creates skeleton for new KPIs data flow for created_bookings metric. Details are accessible [here](https://www.notion.so/knowyourguest-superhog/KPIs-Refactor-Let-s-go-daily-2024-10-23-1280446ff9c980dc87a3dc7453e95f06?pvs=4#12a0446ff9c98085bf4dfc77f6fc22f7)

In essence:
* Models are created in intermediate in a kpis folder.
* Models have a daily segmentation. This includes `created_bookings` models, but also the daily lifecycle per listing and the segmentation. It also adds a `dimension_dates` model specific for KPIs. These have all the dimensions already in place and handle all the crazy logic.
* Other time aggregation models simply read from existing daily models which are much easier (`int_kpis__metric_mtd_created_bookings` and `int_kpis__metric_monthly_created_bookings`).
* Dimensionality aggregation can be easily added within a given timeframe (daily, mtd, monthly). For instance, I do it for mtd in the `int_kpis__aggregated_mtd_created_bookings` and for monthly in `int_kpis__aggregated_monthly_created_bookings`
* Macro configuration for dimensions: Allows to set any specific dimension for `aggregated` models. By default, the subset of global, by billing country, by number of listings and by deal apply - since these are needed for Main KPIs. I added an example with Dash Source, that currently does not exist and it's currently configured as only appearing for created bookings.
* Testing `aggregated` models completeness. A new macro called `assert_dimension_completeness` is available that ensures additive metrics are consistent vs. the global result, configurable at schema level.
* Testing refactor impact. I'm aware that changing the lifecycle model to daily impacts the volumes for listing segments. For the rest, I added a `tmp` test that checks that the dimension and dimension value per date exactly match comparing new vs. old computation.

Latest edits:
* Changed naming convention
* Split of MTD and Monthly. Now these are 2 different entities, as stated in `int_kpis__dimension_dates`.
* Added start_date and end_date for models that contemplate a range (mtd, monthly).
* Added a small readme entry in the kpis folders. Mostly it states nomenclature and some first conventions.

Dbt docs:
![image (5).png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/3329/attachments/image%20%285%29.png)

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [ ] I have checked for DRY opportunities with other models and docs. **Likely we'll be able to add macros for mtd and dim_agg models. We will see later on.**
- [ ] I've picked the right materialization for the affected models. **Models run ok except for the daily lifecycle of listings, which lasts several minutes in the first run. Model curr...
2024-10-30 08:55:19 +00:00
uri
836be7903b Fix config - again 2024-09-05 16:34:17 +02:00
uri
05058f9e05 Fix - wrong config 2024-09-05 16:18:49 +02:00
uri
2662527790 Enable By Billing Country dimension in prod 2024-09-05 16:17:05 +02:00
Oriol Roqué Paniagua
435db55c1e Merged PR 2743: Fixes deal-based issues on the billing country dimension
# Description

Before deploying KPIs by Billing Country, we spotted some issues that were basically increases on the volumes of any metric on the by billing country dimension that was based on Deal. This means, `int_core__mtd_deal_metrics` and `int_xero__mtd_invoicing_metrics`.

This PR changes the following:
* Now the 2 abovementioned models depend on the `int_core__deal` model, instead of `int_core__user_host` (thus removing duplicated stuff)
* Now all models use the main billing country at deal level, instead of doing it so at host level. The reason is that some small amount of hosts that share the same deal can have a different billing country. To avoid weird stuff, everything points to this simplification - that in general, it's not a massive change in the output.
* In order to do so easily, the 3 main billing country per deal fields have been propagated to `int_core__user_host`

To exemplify the solution, find here a snapshot of the differences in behavior:

```
select
    dimension,
    sum(deals_booked_in_month) as deals_booked_1,
    sum(deals_booked_in_6_months) as deals_booked_6,
    sum(deals_booked_in_12_months) as deals_booked_12,
    sum(total_revenue_in_gbp) as total_revenue,
    sum(xero_operator_net_fees_in_gbp) as operator_revenue,
    sum(xero_booking_net_fees_in_gbp) as booking_fees,
    sum(xero_listing_net_fees_in_gbp) as listing_fees,
    sum(xero_verification_net_fees_in_gbp) as verification_fees,
    sum(total_guest_revenue_in_gbp) as guest_revenue,
    sum(xero_waiver_paid_back_to_host_in_gbp) as waiver_paid_back_to_hosts,
    sum(waiver_net_fees_in_gbp) as waiver_net_fees
from intermediate.int_mtd_vs_previous_year_metrics
where date in ('2024-01-31')
group by 1
order by 1
```
Production:
![image.png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2743/attachments/image.png)

vs.
Local:
![image (2).png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2743/attachments/image%20%282%29.png)

Keep in mind that still Global dimension can be greater than any other dimension aggregated since not all users have a deal. Mismatches between the other 2 dimensions might be linked to the dump.

Commits are meaningful and help navigate in the changes.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #20823
2024-09-05 09:53:16 +00:00
Oriol Roqué Paniagua
556a52e991 Merged PR 2689: KPIs by Billing Country
# Description

Adds Billing Country dimension in KPIs, but does not expose them to reporting yet.
Silly thing, based on the macros I built, I cannot make incremental changes unless changing all models. This will need to be adapted, happy to hear your thoughts on how we do it.
Additionally, I have lack of performance of the model `mtd_guest_payments_metrics`. It takes around 5 min to execute, but technically the end-to-end runs in one shoot without breaking.
It's a complex PR because it changes many files, but you will see that:
* It mostly changes the join conditions for the dimensions or the schema tests,
* I tried to be very careful and add things step-by-step in the commits.

Goal is NOT to complete the PR yet until we see how we can improve performance. I can say though that data end-to-end looks ok to me, but would benefit from checking with production data for the new dimension

Update 30th Aug
* Added a new commit that includes `id_user_host` in `int_core__verification_payments`. Happy to discuss if it makes sense or not. But it changes the execution from ~600 sec to ~6 sec because it avoids a massive repeated join with `verification_requests`.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [ ] I've picked the right materialization for the affected models. **To check because of performance issues**

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #19082
2024-09-04 10:17:12 +00:00
Oriol Roqué Paniagua
3d09d04068 Merged PR 2624: Configuring by number of listings to prod
# Description

Sets the parameter to display the KPIs by number of listings to prod. I will move forward without the review as we need a simultaneous deployment. The combination of changes were reviewed yesterday in local.

# Checklist

- [ ] The edited models and dependants run properly with production data.
- [ ] The edited models are sufficiently documented.
- [ ] The edited models contain PK tests, and I've ran and passed them.
- [ ] I have checked for DRY opportunities with other models and docs.
- [ ] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #19325
2024-08-22 06:49:15 +00:00
Oriol Roqué Paniagua
85131985d8 Merged PR 2615: Beautification of KPIs dimensions
# Description

Changes:

* Separate 1) the internal naming of dimensions available within DWH vs. 2) the display of the dimensions in the reporting. Mainly it changes the "by_number_of_listings" to display "By # of Listings Booked in 12 Months". I edited the production macro since to me it's linked to when things are available for display.
* Add preceding zeros on the segmentation so it's ordered correctly. Before, the segment 21-60 was displayed before the 6-20.
* Also added some capital letters to the schema config of the reporting model :)

I attach a screenshot of how it looks in PBI in my local development branch to exemplify why this is "Beautification". Be aware that merging this also puts in production the dimensions.

![image.png](https://guardhog.visualstudio.com/4148d95f-4b6d-4205-bcff-e9c8e0d2ca65/_apis/git/repositories/54ac356f-aad7-46d2-b62c-e8c5b3bb8ebf/pullRequests/2615/attachments/image.png)

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #19325
2024-08-21 14:42:05 +00:00
Oriol Roqué Paniagua
997cb85c6c Merged PR 2577: Adding get_kpi_dimensions_for_production
# Description

Takes into account @<Pablo Martín> 's feedback from the previous PR, slightly modified. This PR separates 1) the dimensions while developing vs. 2) the dimensions once these are available for production. This are within the same file of macro configuration for KPIs, namely `business_kpis_configuration`.

End-goal, all CTEs in `int_mtd_vs_previous_year_metrics` will read from this new macro `get_kpi_dimensions_for_production`, so eventually we won't need any hardcode once we want to add new dimensions. In the meantime, I'll be adding this new line for each PR (still 2 missing :D)

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [ ] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [ ] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #19325
2024-08-19 09:57:28 +00:00
Renamed from macros/get_kpi_dimensions.sql (Browse further)