# Description

* Adds the new snapshot for Guesty Claims, up to 1st July 2025.
* Creates a model named `int_athena__high_risk_client_detector` that handles the following logic:
  1. The User has been using the agreed services for at least (3) months
  2. The aggregated amount of claims filed by the User exceeds a total of £2300
  3. The User has filed at least (5) claims
  4. The User has a claim ratio of (7%) or higher throughout their entire use of agreed services, including any claim that has received a guarantee payment

The model is heavily opinionated because of unclear requirements and poor data quality in both Athena verifications and Guesty claims. Please check the inline comments for more info.

With this model and these conditions, only 2 users would be tagged as high risk.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've run and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #31687
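The four conditions listed in the description can be illustrated with a minimal Python sketch. This is not the dbt model itself, which is written in SQL and not shown here. The `ClientStats` fields and the helper names are hypothetical. Two things are assumptions rather than facts from this PR: that all four conditions must hold at once, and that the claim ratio is the number of claims divided by the number of bookings.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class ClientStats:
    """Hypothetical per-user aggregate; field names are illustrative only."""
    first_service_date: date    # first day the user used the agreed services
    total_claimed_gbp: Decimal  # sum of all claim amounts, converted to GBP
    claim_count: int            # number of claims filed by the user
    booking_count: int          # number of bookings (assumed ratio denominator)


def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end (never negative)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the last month is not complete yet
    return max(months, 0)


def is_high_risk(stats: ClientStats, as_of: date) -> bool:
    """Apply the four thresholds from the PR description."""
    tenure_ok = months_between(stats.first_service_date, as_of) >= 3
    amount_ok = stats.total_claimed_gbp > Decimal("2300")  # must exceed, not equal
    count_ok = stats.claim_count >= 5
    # Assumption: claim ratio = claims / bookings over the whole period.
    ratio_ok = (
        stats.booking_count > 0
        and Decimal(stats.claim_count) / Decimal(stats.booking_count) >= Decimal("0.07")
    )
    # Assumption: a user is tagged only when every condition holds.
    return tenure_ok and amount_ok and count_ok and ratio_ok
```

For instance, a user who started on 2025-01-01 with 6 claims worth £2500 across 50 bookings would be tagged as of 2025-07-01 under these assumptions, while the same user with exactly £2300 claimed would not.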
474 lines
16 KiB
YAML
version: 2

seeds:
  - name: stg_seed__currencies
    description: |
      A list of valid current currencies according to ISO 4217.

      The list was obtained from https://www.six-group.com/en/products-services/financial-information/data-standards.html#scrollTo=isin
    config:
      column_types:
        iso_4217_numeric_code: varchar(3)
    columns:
      - name: iso_4217_code
        data_type: character varying
        description: The 3 character ISO 4217 code for this currency, in uppercase.
        data_tests:
          - not_null
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: "^[A-Z]{3}$"
      - name: iso_4217_numeric_code
        data_type: character varying
        description: The 3 digit ISO 4217 numeric code for this currency.
        data_tests:
          - not_null
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: "^[0-9]{3}$"
      - name: decimal_positions
        data_type: int
        description: |
          The number of decimal positions that lead to this currency's smallest unit.

          For example: since the Japanese Yen (JPY) has no cents, this value is 0.

          On the other hand, since the US Dollar (USD) is composed of cents, and each dollar equals 100 cents, this value is 2.

          To convert from normal unit (Dollar) to smallest unit (Cent), multiply by `10^decimal_positions`.
          To convert from smallest unit (Cent) to normal unit (Dollar), divide by `10^decimal_positions`.
        data_tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 8
              strictly: False

  - name: stg_seed__guest_services_vat_rates_by_country
    description: |
      A list of applicable VAT rates for guest services, by country.

      The list was provided by the Finance team. A value of 0% does not
      necessarily mean that the country doesn't have VAT, but rather that we
      don't need to charge it to guests from that country.

      Country names and codes _almost_ follow ISO 3166-1 (https://en.wikipedia.org/wiki/ISO_3166-1).
      The only exception sits in the Kosovo record. Kosovo does not appear as a country in ISO 3166, but is nevertheless
      a valid country in the `Country` table of the Superhog backend database. Because of this, we need to include it.
      The codes used for Kosovo are made up (not truly ISO 3166 codes) and match the ones present in the backend.

      Read more here: https://www.notion.so/knowyourguest-superhog/Guest-Services-Taxes-How-to-calculate-a5ab4c049d61427fafab669dbbffb3a2?pvs=4

    config:
      column_types:
        country_code: varchar(3)
    columns:
      - name: country_name
        data_type: character varying
        description: The name of the country.

        data_tests:
          - not_null
          - unique
      - name: alpha_2
        data_type: character varying
        description: |
          The two-character ISO 3166-1 Alpha-2 code for the country.

        data_tests:
          - not_null
          - unique
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: "^[A-Za-z]{2}$"
      - name: alpha_3
        data_type: character varying
        description: |
          The three-character ISO 3166-1 Alpha-3 code for the country.

        data_tests:
          - not_null
          - unique
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: "^[A-Za-z]{3}$"
      - name: country_code
        data_type: character varying
        description: |
          The three-digit ISO 3166-1 Numeric code for the country.

        data_tests:
          - not_null
          - unique
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: "^[0-9]{3}$"
      - name: vat_rate
        data_type: numeric
        description: |
          The Superhog applicable VAT rate for guests of this country. A value
          of 0% does not necessarily mean that the country doesn't have VAT, but
          rather that we don't need to charge it to guests from that country.

        data_tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 1
              strictly: false

  - name: stg_seed_guesty_claims_snapshot_20241010
    description: |
      A list of claims that have been paid out within the Athena/Guesty line of
      business.

      The data was shared on 2024-10-10 by Chloe from Resolutions in a static
      file, and was added to the DWH to support this ticket: https://guardhog.visualstudio.com/Data/_boards/board/t/Data%20Team/Stories/?workitem=22703
      This is a static snapshot and we currently have no intention of keeping it up to date.

    columns:
      - name: "Booking ID"
        data_type: character varying
        description: |
          The internal ID of this booking in Athena. Matches with the booking ID
          in the Athena verifications table.

      - name: "Claim Date"
        data_type: timestamp
        description: When the claim was received by Superhog.

      - name: "Settled Date"
        data_type: timestamp
        description: |
          When the outcome of the claim was decided by Superhog. Not to be
          confused with when the payment was executed or received.

      - name: "Paid Date"
        data_type: timestamp
        description: |
          When the payment of the settlement amount was executed by Superhog.

      - name: Settlement Currency
        data_type: character varying
        description: ISO 4217 code of the currency in which the claim was posted.

      - name: Settlement Amount
        data_type: numeric
        description: |
          How much Superhog decided to pay out to the partner as part of this
          claim, defined in the settlement currency.

  - name: stg_seed__athena_price_history
    description: |
      A price history for the Athena fee per night.

      Yes, I know. It's terrible that we keep this here. Oh boy, how I wish it
      wasn't like this!

    columns:
      - name: start_at_utc
        data_type: timestamp
        description: |
          The start of the time range where this record is applicable.

      - name: end_at_utc
        data_type: timestamp
        description: The end of the time range where this record is applicable.

      - name: fee_per_night_gbp
        data_type: numeric
        description: |
          How much we charge per night in this time range.

  - name: stg_seed__accounting_aggregations
    description: |
      Account codes and their respective aggregations for reporting purposes.
    config:
      column_types:
        account_code: varchar(3)

    columns:
      - name: account_code
        data_type: character varying
        description: |
          The account code for this aggregation. This is the code that is used
          in the accounting system.

        data_tests:
          - not_null
          - unique
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: "^[0-9]{3}$"

      - name: root_aggregation
        data_type: character varying
        description: |
          The root aggregation for this account code. This is the main
          aggregation that is used to retrieve low-level data.

        data_tests:
          - not_null
          - accepted_values:
              values:
                - Other Invoiced Revenue
                - Verification Fees
                - Listing Fees
                - Old Dashboard Booking Fees
                - Athena API
                - E-Deposit API
                - Check in Hero API
                - Screen and Protect API
                - Guesty Resolutions
                - Basic Protection
                - Waiver Pro
                - Id Verification
                - Protection Plus
                - Screening Plus
                - Sex Offenders Check
                - Protection Pro
                - Basic Screening
                - Damage Host-Waiver Payments
                - Damage Waiver Fees
                - Deposit Fees
                - Check In Cover
                - Resolution Process for Protection Services
                - Resolution Process for Deposit Management Services
                - Basic Waiver
                - Waiver Plus
                - Basic Damage Deposit
                - Host Resolutions Payments
                - Damage Waiver - Truvi Risk
                - Confident Stay
                - Flex API

      - name: kpis_aggregation
        data_type: character varying
        description: |
          The default macro-aggregation for Invoiced KPIs.

        data_tests:
          - not_null
          - accepted_values:
              values:
                - Unknown
                - Invoiced Operator Revenue
                - Invoiced API Revenue
                - Damage Host-Waiver Payments
                - Accounting Resolutions
                - Accounting Guest Revenue
                - Host Resolutions Payments

      - name: financial_l1_aggregation
        data_type: character varying
        description: |
          The Level 1 aggregation for Financial reporting.

        data_tests:
          - not_null
          - accepted_values:
              values:
                - Unknown
                - 1-Guest Screening and Protection
                - 2-Deposit Management
                - 4-Mediation and Resolution
                - 3-Guest Products
                - 5-Damage Host-Waiver Payments

      - name: financial_l2_aggregation
        data_type: character varying
        description: |
          The Level 2 aggregation for Financial reporting.

        data_tests:
          - not_null
          - accepted_values:
              values:
                - Unknown
                - 10-Other Invoiced Revenue
                - 11-Booking Fees
                - 12-Listing Fees
                - 13-Verification Fees
                - 14-Athena API
                - 15-E-Deposit API
                - 16-Screening Services
                - 17-Protection Services
                - 18-Screen and Protect API
                - 19-Waiver Pro
                - 20-Deposit Management Services
                - 21-Deposit Management Services
                - 31-Check In Cover
                - 32-Check in Hero API
                - 34-Flex API
                - 33-Damage Waiver - Truvi Risk
                - 35-Confident Stay
                - 41-Guesty Resolutions
                - 42-Host Resolutions
                - 43-E-Deposit Resolutions
                - 44-Check In Hero Resolutions
                - 45-Screen and Protect API - Resolution
                - 51-Damage Host-Waiver Payments

      - name: financial_l3_aggregation
        data_type: character varying
        description: |
          The Level 3 aggregation for Financial reporting.

        data_tests:
          - not_null
          - accepted_values:
              values:
                - Unknown
                - 100-Other Invoiced Revenue
                - 111-Booking Fees
                - 121-Listing Fees
                - 131-Verification Fees
                - 141-Athena API
                - 151-E-Deposit API
                - 161-Basic Screening
                - 162-Screening Plus
                - 163-Id Verification
                - 164-Sex Offenders Checks
                - 171-Basic Protection
                - 172-Protection Plus
                - 173-Protection Pro
                - 174-Resolution Process for Protection Services
                - 181-Screen and Protect API
                - 191-Waiver Pro
                - 201-Deposit Fees
                - 210-Damage Waiver Fees
                - 211-Basic Waiver
                - 212-Waiver Plus
                - 213-Waiver Pro
                - 214-Basic Damage Deposit
                - 215-Resolution Process for Deposit Management Services
                - 311-Check In Cover
                - 321-Check in Hero API
                - 331-Damage Waiver - Truvi Risk
                - 341-Flex API
                - 351-Confident Stay
                - 411-Guesty Resolutions
                - 421-Host Resolutions
                - 431-E-Deposit Resolutions
                - 441-Check In Hero Resolutions
                - 451-Screen and Protect API - Resolution
                - 511-Damage Host-Waiver Payments

  - name: stg_seed__main_metrics_targets
    description: |
      A list of financial year targets for the main metrics that we track in the company.
    data_tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - id_metric
            - target_date

    columns:
      - name: id_metric
        data_type: bigint
        description: The id of the metric used for joining with other tables.
        data_tests:
          - not_null

      - name: metric_name
        data_type: character varying
        description: The name of the metric for human consumption
        data_tests:
          - not_null

      - name: target_date
        data_type: date
        description: |
          The date when this target is expected to be achieved.
        data_tests:
          - not_null

      - name: target_eom_value
        data_type: numeric
        description: |
          The EOM target value for this metric. This is the value that we aim to
          achieve by the end of the month.
        data_tests:
          - not_null

      - name: target_ytd_value
        data_type: numeric
        description: |
          The YTD target value for this metric. This is the cumulative value that we
          aim to achieve by the end of each month, counted from the beginning of the
          financial year, that puts us on track to reach the EOFY target.
        data_tests:
          - not_null

      - name: target_eofy_value
        data_type: numeric
        description: |
          The EOFY target value for this metric. This is the value that we aim to
          achieve by the end of the financial year.
        data_tests:
          - not_null

  - name: stg_seed__hubspot_account_owner
    description: |
      A seed that converts the id_hubspot_account_owner to the
      person's name for Hubspot onboarding purposes.
      To be revisited; ideally this is a standalone Hubspot
      property.
    config:
      column_types:
        id_hubspot_account_owner: varchar

    columns:
      - name: id_hubspot_account_owner
        data_type: character varying
        description: |
          ID of the hubspot account owner.
        data_tests:
          - not_null
          - unique

      - name: hubspot_account_owner
        data_type: character varying
        description: |
          Name of the hubspot account owner.
        data_tests:
          - not_null

  - name: stg_seed__guesty_resolutions_snapshot_20250701
    description: |
      A snapshot of Guesty Resolutions data as of 2025-07-01.
      This is a static snapshot and we currently have no intention of keeping it up to date.
      The data was shared by Chloe from Resolutions in a static file.

      Only the fields that are used in downstream models are described here.

    columns:
      - name: id_booking
        data_type: character varying
        description: |
          The internal ID of this booking in Guesty. Matches with the booking ID
          in the Guesty verifications table.
          It can contain duplicated bookings, and this is out of our scope.
          It cannot be null.
        data_tests:
          - not_null

      - name: claim_date
        data_type: character varying
        description: |
          When the claim was received by Truvi, in dd/mm/yyyy format.
          It cannot be null.
        data_tests:
          - not_null

      - name: claim_amount
        data_type: character varying
        description: |
          The amount of the claim in the currency specified in claim_currency.
          It's text by default since it might contain data quality issues.
          The conversion to decimal is done in dependent models.
          It cannot be null.
        data_tests:
          - not_null

      - name: claim_currency
        data_type: character varying
        description: |
          The currency specified in the claim amount.
          It cannot be null.
        data_tests:
          - not_null
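
The `claim_date` and `claim_amount` columns of the last seed are stored as text, with the typed conversion left to dependent models. Those models are SQL and are not shown here, so the following Python is only a sketch of what a defensive conversion could look like. The function names are hypothetical, and treating commas as thousands separators is an assumption about the data, not something the seed documentation states.

```python
from datetime import date, datetime
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_claim_date(raw: str) -> Optional[date]:
    """Parse a dd/mm/yyyy string; return None when it does not match."""
    try:
        return datetime.strptime(raw.strip(), "%d/%m/%Y").date()
    except ValueError:
        return None


def parse_claim_amount(raw: str) -> Optional[Decimal]:
    """Parse a textual amount; return None when it is not a finite number.

    Assumption: commas are thousands separators and outer whitespace is noise.
    """
    cleaned = raw.strip().replace(",", "")
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    # Decimal accepts "NaN" and "Infinity"; treat those as invalid amounts.
    return value if value.is_finite() else None
```

Returning `None` instead of raising keeps malformed rows visible for later inspection, which suits a source flagged as having data quality issues.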