Merged PR 5677: Athena/Guesty high risk clients

# Description

* Adds the new snapshot for Guesty Claims, up to 1st July 2025.
* Creates a model named int_athena__high_risk_client_detector that handles the following logic:

1. The User has been using the agreed services for at least (3) months
2. The aggregated amount of claims filed by the User exceeds a total of £2300
3. The User has filed at least (5) claims
4. The User has a claim ratio of (7%) or higher throughout their entire use of agreed services, including any claim that has received a guarantee payment
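
As an illustration only, the four conditions above can be sketched in Python. This is not the SQL of int_athena__high_risk_client_detector. It assumes that all four conditions must hold at once and that the claim ratio is claims divided by bookings; neither is stated in this description. `ClientStats` and its field names are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ClientStats:
    """Hypothetical per-client aggregates; field names are illustrative."""

    months_of_service: int
    total_claimed_gbp: Decimal
    claim_count: int
    booking_count: int


# Thresholds taken from the PR description.
MIN_MONTHS = 3
MIN_TOTAL_CLAIMED_GBP = Decimal("2300")
MIN_CLAIMS = 5
MIN_CLAIM_RATIO = Decimal("0.07")


def is_high_risk(stats: ClientStats) -> bool:
    """Return True when every condition holds (assumed AND semantics)."""
    if stats.booking_count == 0:
        # No bookings means the ratio is undefined; treat as not high risk.
        return False
    # Assumption: claim ratio = number of claims / number of bookings.
    claim_ratio = Decimal(stats.claim_count) / Decimal(stats.booking_count)
    return (
        stats.months_of_service >= MIN_MONTHS
        and stats.total_claimed_gbp > MIN_TOTAL_CLAIMED_GBP
        and stats.claim_count >= MIN_CLAIMS
        and claim_ratio >= MIN_CLAIM_RATIO
    )
```

Note that condition 2 is a strict "exceeds", while conditions 1, 3 and 4 are "at least" comparisons, so a client with exactly £2300 claimed would not be flagged in this sketch.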

The model is heavily opinionated because the requirements are unclear and data quality is poor in both the Athena verifications and the Guesty claims. Please check the inline comments for more info.

With this model and these conditions, only 2 users would be tagged as high risk.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've run and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #31687
Oriol Roqué Paniagua 2025-07-11 10:28:24 +00:00
parent ddc0a6a3f4
commit bc3a364891
4 changed files with 1810 additions and 0 deletions


@@ -427,3 +427,48 @@ seeds:
          Name of the hubspot account owner.
        data_tests:
          - not_null
  - name: stg_seed__guesty_resolutions_snapshot_20250701
    description: |
      A snapshot of Guesty Resolutions data as of 2025-07-01.
      This is a static snapshot and we currently have no intent of maintaining up to date.
      The data was shared by Chloe from Resolutions in a static file.
      The fields described are those that are used in following models.
    columns:
      - name: id_booking
        data_type: character varying
        description: |
          The internal ID of this booking in Guesty. Matches with the booking ID
          in the Guesty verifications table.
          It can contain duplicated bookings, and this is out of our scope.
          It cannot be null.
        data_tests:
          - not_null
      - name: claim_date
        data_type: character varying
        description: |
          When was the claim received by Truvi, in format dd/mm/yyyy.
          It cannot be null.
        data_tests:
          - not_null
      - name: claim_amount
        data_type: character varying
        description: |
          The amount of the claim in the currency specified in claim_currency.
          It's text by default since it might contain data quality issues.
          The conversion to decimal is done in dependant models.
          It cannot be null.
        data_tests:
          - not_null
      - name: claim_currency
        data_type: character varying
        description: |
          The currency specified in the claim amount.
          It cannot be null.
        data_tests:
          - not_null
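
Since the seed keeps claim_date and claim_amount as text, dependent models have to cast them defensively. The sketch below shows one way that cast could behave, written in Python for illustration rather than as the SQL actually used in this PR. The function names are hypothetical, and stripping a comma as a thousands separator is an assumption about the data, not something the seed documentation states.

```python
from datetime import date, datetime
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_claim_date(raw: str) -> Optional[date]:
    """Parse a dd/mm/yyyy string; return None when it does not match."""
    try:
        return datetime.strptime(raw.strip(), "%d/%m/%Y").date()
    except ValueError:
        return None


def parse_claim_amount(raw: str) -> Optional[Decimal]:
    """Convert a text amount to Decimal; return None on dirty input."""
    # Assumption: commas, if present, are thousands separators.
    cleaned = raw.strip().replace(",", "")
    try:
        amount = Decimal(cleaned)
    except InvalidOperation:
        return None
    # Reject NaN and Infinity, which Decimal would otherwise accept.
    return amount if amount.is_finite() else None
```

Returning None instead of raising keeps bad rows visible downstream, where they can be counted or filtered explicitly.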

File diff suppressed because it is too large