data-dwh-dbt-project/seeds/schema.yml
Oriol Roqué Paniagua f799a8d30f Merged PR 4727: Bookings fees are now Old Dashboard Booking Fees in Main KPIs
# Description

"Booking Fees" is widely used with different meanings: for the old dashboard, for the new dashboard, for both, etc. This is painful. The first step towards aligning on proper naming is to ensure that what we report in Main KPIs is clearly stated, so in this case Booking Fees are now called Old Dashboard Booking Fees.

Changes:
* Modify the `stg_seed__accounting_aggregations` seed to rename Booking Fees to Old Dashboard Booking Fees. This is for our own clarity and only applies to the KPIs computation. I also added a space that I had mistakenly left out of `financial_l3_aggregation`.
* Modify the KPIs source, i.e., `int_kpis__metric_daily_invoiced_revenue`. Here I forcibly rename the field to `xero_old_dashboard_booking_net_fees_in_gbp`.
* Propagate the rename from `xero_booking_net_fees_in_gbp` to `xero_old_dashboard_booking_net_fees_in_gbp` across all downstream usages. This affects all models, including the reporting model, which still exposes both names to avoid breaking it. I will need to modify the data glossary in PBI anyway, so I'll make this change there as well.
* Modify the displayed metric name from Booking Fees Revenue to Old Dashboard Booking Fees Revenue.
* Modify the schema so that it reflects the proper names, descriptions, and tests.
* Ensure outlier and completion tests still work after this change.

I confirm that the field `xero_booking_net_fees_in_gbp` no longer exists in the rest of the DWH after these changes, except in the reporting model, as noted above.
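
A claim like this can be double-checked with a plain text search over the project files. The following is only a minimal sketch of such a check, not the method used for this PR: the `find_references` helper, the scanned file suffixes, and the `models` directory name are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Field name retired by this PR (taken from the description above).
OLD_FIELD = "xero_booking_net_fees_in_gbp"


def find_references(root, needle=OLD_FIELD, suffixes=(".sql", ".yml", ".yaml", ".csv")):
    """Return sorted (relative_path, line_number) pairs where `needle` appears as a whole identifier."""
    # \b treats "_" as a word character, so a longer identifier that merely
    # contains the needle (extra prefix or suffix) does not match.
    pattern = re.compile(r"\b" + re.escape(needle) + r"\b")
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for line_number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.relative_to(root).as_posix(), line_number))
    return sorted(hits)


if __name__ == "__main__":
    # "models" is a hypothetical project directory; point this at the real one.
    for rel_path, line_number in find_references("models"):
        print(f"{rel_path}:{line_number}")
```

With the changes described above, any remaining hits would be expected only in the reporting model, which deliberately keeps both names.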

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've run and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #28560
2025-03-18 14:55:32 +00:00

version: 2
seeds:
  - name: stg_seed__currencies
    description: |
      A list of valid current currencies according to ISO 4217.
      The list was obtained from https://www.six-group.com/en/products-services/financial-information/data-standards.html#scrollTo=isin
    config:
      column_types:
        iso_4217_numeric_code: varchar(3)
    columns:
      - name: iso_4217_code
        data_type: character varying
        description: The 3-character ISO 4217 code for this currency, in uppercase.
        data_tests:
          - not_null
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: "^[A-Z]{3}$"
      - name: iso_4217_numeric_code
        data_type: character varying
        description: The 3-digit ISO 4217 numeric code for this currency.
        data_tests:
          - not_null
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: "^[0-9]{3}$"
      - name: decimal_positions
        data_type: int
        description: |
          The number of decimal positions that lead to this currency's smallest unit.
          For example: since the Japanese Yen (JPY) has no cents, this value is 0.
          On the other hand, since the US Dollar (USD) is composed of cents, and each dollar equals 100 cents, this value is 2.
          To convert from normal unit (Dollar) to smallest unit (Cent), multiply by `10^decimal_positions`.
          To convert from smallest unit (Cent) to normal unit (Dollar), divide by `10^decimal_positions`.
        data_tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 8
              strictly: False
  - name: stg_seed__guest_services_vat_rates_by_country
    description: |
      A list of applicable VAT rates for guest services, by country.
      The list was provided by the Finance team. A value of 0% does not
      necessarily mean that the country doesn't have VAT, but rather that we
      don't need to charge it to guests from that country.
      Country names and codes _almost_ follow ISO 3166-1 (https://en.wikipedia.org/wiki/ISO_3166-1).
      The only exception is the Kosovo record. Kosovo does not appear as a country in ISO 3166, but is nevertheless
      a valid country in the `Country` table of the Superhog backend database. Because of this, we need to include it.
      Its codes are made up (not truly ISO 3166 codes) and match the ones present in the backend.
      Read more here: https://www.notion.so/knowyourguest-superhog/Guest-Services-Taxes-How-to-calculate-a5ab4c049d61427fafab669dbbffb3a2?pvs=4
    config:
      column_types:
        country_code: varchar(3)
    columns:
      - name: country_name
        data_type: character varying
        description: The name of the country.
        data_tests:
          - not_null
          - unique
      - name: alpha_2
        data_type: character varying
        description: |
          The two-character ISO 3166-1 Alpha-2 code for the country.
        data_tests:
          - not_null
          - unique
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: "^[A-Za-z]{2}$"
      - name: alpha_3
        data_type: character varying
        description: |
          The three-character ISO 3166-1 Alpha-3 code for the country.
        data_tests:
          - not_null
          - unique
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: "^[A-Za-z]{3}$"
      - name: country_code
        data_type: character varying
        description: |
          The three-digit ISO 3166-1 Numeric code for the country.
        data_tests:
          - not_null
          - unique
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: "^[0-9]{3}$"
      - name: vat_rate
        data_type: numeric
        description: |
          The Superhog applicable VAT rate for guests of this country. A value
          of 0% does not necessarily mean that the country doesn't have VAT, but
          rather that we don't need to charge it to guests from that country.
        data_tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 1
              strictly: false
  - name: stg_seed_guesty_claims_snapshot_20241010
    description: |
      A list of claims that have been paid out within the Athena/Guesty line of
      business.
      The data was shared on 2024-10-10 by Chloe from Resolutions in a static
      file, and was added to the DWH to support this ticket: https://guardhog.visualstudio.com/Data/_boards/board/t/Data%20Team/Stories/?workitem=22703
      This is a static snapshot and we currently have no intent of keeping it up to date.
    columns:
      - name: "Booking ID"
        data_type: character varying
        description: |
          The internal ID of this booking in Athena. Matches with the booking ID
          in the Athena verifications table.
      - name: "Claim Date"
        data_type: timestamp
        description: When the claim was received by Superhog.
      - name: "Settled Date"
        data_type: timestamp
        description: |
          When the outcome of the claim was decided by Superhog. Not to be
          confused with when the payment was executed or received.
      - name: "Paid Date"
        data_type: timestamp
        description: |
          When the settlement amount payment was executed by Superhog.
      - name: Settlement Currency
        data_type: character varying
        description: ISO 4217 code of the currency in which the claim was posted.
      - name: Settlement Amount
        data_type: numeric
        description: |
          How much Superhog decided to pay out to the partner as part of this
          claim, defined in the settlement currency.
  - name: stg_seed__athena_price_history
    description: |
      A price history for the Athena fee per night.
      Yes, I know. It's terrible that we keep this here. Oh boy, how I wish it
      wasn't like this!
    columns:
      - name: start_at_utc
        data_type: timestamp
        description: |
          The start of the time range where this record is applicable.
      - name: end_at_utc
        data_type: timestamp
        description: The end of the time range where this record is applicable.
      - name: fee_per_night_gbp
        data_type: numeric
        description: |
          How much we charge per night in this time range.
  - name: stg_seed__accounting_aggregations
    description: |
      Account codes and their respective aggregations for reporting purposes.
    config:
      column_types:
        account_code: varchar(3)
    columns:
      - name: account_code
        data_type: character varying
        description: |
          The account code for this aggregation. This is the code that is used
          in the accounting system.
        data_tests:
          - not_null
          - unique
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: "^[0-9]{3}$"
      - name: root_aggregation
        data_type: character varying
        description: |
          The root aggregation for this account code. This is the main
          aggregation that is used to retrieve low-level data.
        data_tests:
          - not_null
          - accepted_values:
              values:
                - Other Invoiced Revenue
                - Verification Fees
                - Listing Fees
                - Old Dashboard Booking Fees
                - Athena API
                - E-Deposit API
                - Guesty Resolutions
                - Basic Protection
                - Waiver Pro
                - Id Verification
                - Protection Plus
                - Screening Plus
                - Sex Offenders Check
                - Protection Pro
                - Basic Screening
                - Damage Host-Waiver Payments
                - Damage Waiver Fees
                - Deposit Fees
                - Check In Cover
                - Resolution Process for Protection Services
                - Resolution Process for Deposit Management Services
                - Basic Waiver
                - Waiver Plus
                - Basic Damage Deposit
      - name: kpis_aggregation
        data_type: character varying
        description: |
          The default macro-aggregation for Invoiced KPIs.
        data_tests:
          - not_null
          - accepted_values:
              values:
                - Unknown
                - Invoiced Operator Revenue
                - Invoiced API Revenue
                - Damage Host-Waiver Payments
                - Accounting Resolutions
                - Accounting Guest Revenue
      - name: financial_l1_aggregation
        data_type: character varying
        description: |
          The Level 1 aggregation for Financial reporting.
        data_tests:
          - not_null
          - accepted_values:
              values:
                - Unknown
                - 1-Guest Screening and Protection
                - 2-Deposit Management
                - 4-Mediation and Resolution
                - 3-Guest Products
                - 5-Damage Host-Waiver Payments
      - name: financial_l2_aggregation
        data_type: character varying
        description: |
          The Level 2 aggregation for Financial reporting.
        data_tests:
          - not_null
          - accepted_values:
              values:
                - Unknown
                - 10-Other Invoiced Revenue
                - 21-Deposit Management Services
                - 13-Verification Fees
                - 12-Listing Fees
                - 11-Booking Fees
                - 14-Athena API
                - 15-E-Deposit API
                - 41-Guesty Resolutions
                - 31-Check In Cover
                - 17-Protection Services
                - 16-Screening Services
                - 51-Damage Host-Waiver Payments
      - name: financial_l3_aggregation
        data_type: character varying
        description: |
          The Level 3 aggregation for Financial reporting.
        data_tests:
          - not_null
          - accepted_values:
              values:
                - Unknown
                - 100-Other Invoiced Revenue
                - 210-Damage Waiver Fees
                - 211-Deposit Fees
                - 131-Verification Fees
                - 121-Listing Fees
                - 111-Booking Fees
                - 141-Athena API
                - 151-E-Deposit API
                - 411-Guesty Resolutions
                - 311-Check In Cover
                - 171-Basic Protection
                - 213-Waiver Pro
                - 163-Id Verification
                - 172-Protection Plus
                - 162-Screening Plus
                - 164-Sex Offenders Checks
                - 173-Protection Pro
                - 161-Basic Screening
                - 174-Resolution Process for Protection Services
                - 211-Basic Waiver
                - 212-Waiver Plus
                - 214-Basic Damage Deposit
                - 215-Resolution Process for Deposit Management Services
                - 511-Damage Host-Waiver Payments
  - name: stg_seed__main_metrics_targets
    description: |
      A list of financial year targets for the main metrics that we track in the company.
    data_tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - id_metric
            - target_date
    columns:
      - name: id_metric
        data_type: bigint
        description: The id of the metric used for joining with other tables.
        data_tests:
          - not_null
      - name: metric_name
        data_type: character varying
        description: The name of the metric for human consumption.
        data_tests:
          - not_null
      - name: target_date
        data_type: date
        description: |
          The date when this target is expected to be achieved.
        data_tests:
          - not_null
      - name: target_eom_value
        data_type: numeric
        description: |
          The EOM target value for this metric. This is the value that we aim to
          achieve by the end of the month.
        data_tests:
          - not_null
      - name: target_ytd_value
        data_type: numeric
        description: |
          The YTD target value for this metric. This is the cumulative value that we
          aim to achieve by the end of each month, counted from the beginning of the
          financial year, which keeps us on track to reach the EOFY target.
        data_tests:
          - not_null
      - name: target_eofy_value
        data_type: numeric
        description: |
          The EOFY target value for this metric. This is the value that we aim to
          achieve by the end of the financial year.
        data_tests:
          - not_null