fix schemas in intermediate

Pablo Martin 2024-09-12 15:38:50 +02:00
parent aabb45dbd5
commit 05d5cc6d10
4 changed files with 327 additions and 387 deletions

@ -140,18 +140,18 @@ models:
and computes any necessary weighted metric across different sources. and computes any necessary weighted metric across different sources.
Each metric has a date, dimension and dimension value that defines Each metric has a date, dimension and dimension value that defines
the primary key of this model. the primary key of this model.
Finally, it displays any metric on the current date, the previous year Finally, it displays any metric on the current date, the previous year
date and it computes the relative increment by using the macro: date and it computes the relative increment by using the macro:
- calculate_safe_relative_increment - calculate_safe_relative_increment
tests: tests:
- dbt_utils.unique_combination_of_columns: - dbt_utils.unique_combination_of_columns:
combination_of_columns: combination_of_columns:
- date - date
- dimension - dimension
- dimension_value - dimension_value
columns: columns:
- name: date - name: date
data_type: date data_type: date
@ -164,11 +164,11 @@ models:
description: The dimension or granularity of the metrics. description: The dimension or granularity of the metrics.
tests: tests:
- accepted_values: - accepted_values:
values: values:
- global - global
- by_number_of_listings - by_number_of_listings
- by_billing_country - by_billing_country
- name: dimension_value - name: dimension_value
data_type: string data_type: string
description: The value or segment available for the selected dimension. description: The value or segment available for the selected dimension.
@ -197,7 +197,7 @@ models:
- not_null - not_null
- name: month - name: month
data_type: int data_type: int
description: Month number of the given date. description: Month number of the given date.
tests: tests:
- not_null - not_null
@ -249,7 +249,7 @@ models:
combination_of_columns: combination_of_columns:
- date - date
- id_deal - id_deal
columns: columns:
- name: year - name: year
data_type: int data_type: int
@ -258,7 +258,7 @@ models:
- not_null - not_null
- name: month - name: month
data_type: int data_type: int
description: Month number of the given date. description: Month number of the given date.
tests: tests:
- not_null - not_null
@ -324,14 +324,14 @@ models:
a set of metric, value, previous_year_value and relative_increment at a given date. It uses Jinja a set of metric, value, previous_year_value and relative_increment at a given date. It uses Jinja
code to avoid code replication. code to avoid code replication.
tests: tests:
- dbt_utils.unique_combination_of_columns: - dbt_utils.unique_combination_of_columns:
combination_of_columns: combination_of_columns:
- date - date
- metric - metric
- dimension - dimension
- dimension_value - dimension_value
columns: columns:
- name: year - name: year
data_type: int data_type: int
@ -340,7 +340,7 @@ models:
- not_null - not_null
- name: month - name: month
data_type: int data_type: int
description: month number of the given date. description: month number of the given date.
tests: tests:
- not_null - not_null
@ -386,11 +386,11 @@ models:
description: The dimension or granularity of the metrics. description: The dimension or granularity of the metrics.
tests: tests:
- accepted_values: - accepted_values:
values: values:
- global - global
- by_number_of_listings - by_number_of_listings
- by_billing_country - by_billing_country
- name: dimension_value - name: dimension_value
data_type: string data_type: string
description: The value or segment available for the selected dimension. description: The value or segment available for the selected dimension.
@ -420,35 +420,34 @@ models:
data_type: text data_type: text
description: allows for grouping and formatting for displaying purposes. description: allows for grouping and formatting for displaying purposes.
tests: tests:
- accepted_values: - accepted_values:
values: ['integer', 'percentage', 'currency_gbp'] values: ["integer", "percentage", "currency_gbp"]
- name: value - name: value
data_type: numeric data_type: numeric
description: | description: |
numeric value (integer or decimal) that corresponds to the MTD computation of the metric numeric value (integer or decimal) that corresponds to the MTD computation of the metric
at a given date. at a given date.
- name: previous_year_value - name: previous_year_value
data_type: numeric data_type: numeric
description: | description: |
numeric value (integer or decimal) that corresponds to the MTD computation of the metric numeric value (integer or decimal) that corresponds to the MTD computation of the metric
on the previous year at a given date. on the previous year at a given date.
- name: relative_increment - name: relative_increment
data_type: numeric data_type: numeric
description: | description: |
numeric value that corresponds to the relative increment between value and previous year value, numeric value that corresponds to the relative increment between value and previous year value,
following the computation: value / previous_year_value - 1. following the computation: value / previous_year_value - 1.
- name: relative_increment_with_sign_format - name: relative_increment_with_sign_format
data_type: numeric data_type: numeric
description: | description: |
relative_increment value multiplied by -1 in case this metric's growth doesn't have a relative_increment value multiplied by -1 in case this metric's growth doesn't have a
positive impact for Superhog, otherwise is equal to relative_increment. positive impact for Superhog, otherwise is equal to relative_increment.
This value is specially created for formatting in PBI This value is specially created for formatting in PBI
- name: int_monthly_aggregated_metrics_history_by_deal - name: int_monthly_aggregated_metrics_history_by_deal
description: | description: |
This model aggregates the monthly historic information regarding the different metrics computed This model aggregates the monthly historic information regarding the different metrics computed
@ -463,9 +462,9 @@ models:
the int_mtd_aggregated metrics because 1) the mtd version contains more computing dates the int_mtd_aggregated metrics because 1) the mtd version contains more computing dates
than the by deal version, the latest being a subset of the first, and 2) the deal based model than the by deal version, the latest being a subset of the first, and 2) the deal based model
enforces that a booking/guest journey/listing/etc has a host with a deal assigned, which is enforces that a booking/guest journey/listing/etc has a host with a deal assigned, which is
not necessarily the case. not necessarily the case.
tests: tests:
- dbt_utils.unique_combination_of_columns: - dbt_utils.unique_combination_of_columns:
combination_of_columns: combination_of_columns:
- date - date
@ -480,7 +479,7 @@ models:
- name: id_deal - name: id_deal
data_type: character varying data_type: character varying
description: Id of the deal associated to the host. description: Id of the deal associated to the host.
tests: tests:
- not_null - not_null
@ -496,7 +495,7 @@ models:
description: | description: |
ISO 3166-1 alpha-3 main country code in which the Deal is billed. ISO 3166-1 alpha-3 main country code in which the Deal is billed.
In some cases it's null. In some cases it's null.
- name: int_dates_mtd_by_dimension - name: int_dates_mtd_by_dimension
description: | description: |
This model provides Month-To-Date (MTD) necessary dates, dimension and dimension_values This model provides Month-To-Date (MTD) necessary dates, dimension and dimension_values
@ -504,7 +503,7 @@ models:
It provides the basic "empty" structure from which metrics will be built upon. This is, on It provides the basic "empty" structure from which metrics will be built upon. This is, on
top of the Date that characterises int_dates_mtd, including the dimensions and their top of the Date that characterises int_dates_mtd, including the dimensions and their
respective values that should appear in any mtd metric model. respective values that should appear in any mtd metric model.
Example: Example:
- For the "global" dimension, we will only have the "global" dimension value. - For the "global" dimension, we will only have the "global" dimension value.
- For the "by_number_of_listing" dimension, we will have different values - For the "by_number_of_listing" dimension, we will have different values
@ -513,7 +512,7 @@ models:
... and so on and so forth for any available dimension. These combinations should appear ... and so on and so forth for any available dimension. These combinations should appear
for each date of the MTD models. for each date of the MTD models.
tests: tests:
- dbt_utils.unique_combination_of_columns: - dbt_utils.unique_combination_of_columns:
combination_of_columns: combination_of_columns:
- date - date
@ -528,7 +527,7 @@ models:
- not_null - not_null
- name: month - name: month
data_type: int data_type: int
description: Month number of the given date. description: Month number of the given date.
tests: tests:
- not_null - not_null
@ -565,7 +564,7 @@ models:
data_type: date data_type: date
description: | description: |
Main date for the computation, metrics include monthly information Main date for the computation, metrics include monthly information
until this date. until this date.
tests: tests:
- not_null - not_null
@ -574,13 +573,13 @@ models:
description: The dimension or granularity of the metrics. description: The dimension or granularity of the metrics.
tests: tests:
- accepted_values: - accepted_values:
values: values:
- global - global
- by_number_of_listings - by_number_of_listings
- by_billing_country - by_billing_country
- name: dimension_value - name: dimension_value
data_type: string data_type: string
description: The value or segment available for the selected dimension. description: The value or segment available for the selected dimension.
tests: tests:
- not_null - not_null
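
The `relative_increment` columns documented in this file follow `value / previous_year_value - 1`. The following is a minimal Python sketch of that arithmetic, for illustration only. The function names are hypothetical, and returning `None` when the previous-year value is missing or zero is an assumption about what "safe" means in `calculate_safe_relative_increment`, whose implementation is not part of this diff. The `positive_impact` flag is likewise a hypothetical stand-in for however the model decides whether a metric's growth is good for Superhog.

```python
from typing import Optional


def relative_increment(
    value: Optional[float], previous_year_value: Optional[float]
) -> Optional[float]:
    """Compute value / previous_year_value - 1, or None when undefined."""
    # Assumption: a missing or zero denominator yields None rather than an error.
    if value is None or previous_year_value is None or previous_year_value == 0:
        return None
    return value / previous_year_value - 1


def relative_increment_with_sign_format(
    increment: Optional[float], positive_impact: bool
) -> Optional[float]:
    """Flip the sign when growth of the metric is not a positive outcome."""
    if increment is None:
        return None
    return increment if positive_impact else -increment
```

For example, a metric that moves from 100 to 150 year on year has a relative increment of 0.5; if growth in that metric is undesirable, the sign-formatted value is -0.5.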

@ -2,28 +2,27 @@ version: 2
models: models:
- name: int_edeposit__verifications - name: int_edeposit__verifications
description: description:
"This table holds records on verifications for e-deposit bookings. "This table holds records on verifications for e-deposit bookings.
It contains details on validations checked on the guests, guest information It contains details on validations checked on the guests, guest information
and some booking details like checkin-checkout date or the status of the verification. and some booking details like checkin-checkout date or the status of the verification.
The id values found here are completely unrelated to the ones found in Core DWH. The id values found here are completely unrelated to the ones found in Core DWH.
Note that id_verifications and booking_id should normally be 1 to 1. Note that id_verifications and booking_id should normally be 1 to 1.
Though there are exceptions: the API will accept a duplicate booking and the users Though there are exceptions: the API will accept a duplicate booking and the users
will be charged for it. A duplicate would return a unique id_verification." will be charged for it. A duplicate would return a unique id_verification."
columns: columns:
- name: id_verification - name: id_verification
data_type: text data_type: text
description: "unique Superhog generated id for this verification" description: "unique Superhog generated id for this verification"
tests: tests:
- unique - unique
- not_null - not_null
- name: id_booking - name: id_booking
data_type: text data_type: text
description: description: "unique Superhog generated id for a booking.
"unique Superhog generated id for a booking. note that this could be duplicated and both will be charged,
note that this could be duplicated and both will be charged,
it's up to the user to not generate duplicate verifications" it's up to the user to not generate duplicate verifications"
- name: id_user_partner - name: id_user_partner
@ -38,21 +37,19 @@ models:
- name: version - name: version
data_type: text data_type: text
description: description: "value to identify if it is Guesty (V1) or E-deposit (V2)"
"value to identify if it is Guesty (V1) or E-deposit (V2)"
tests: tests:
- accepted_values: - accepted_values:
values: values:
- V1 - V1
- V2 - V2
- name: verification_source - name: verification_source
data_type: text data_type: text
description: description: "source of the verification for the booking"
"source of the verification for the booking"
tests: tests:
- accepted_values: - accepted_values:
values: values:
- Guesty - Guesty
- Edeposit - Edeposit
@ -190,16 +187,15 @@ models:
- name: athena_creation_at_utc - name: athena_creation_at_utc
data_type: timestamp without time zone data_type: timestamp without time zone
description: description:
"Athena timestamp referring to when the booking was created. "Athena timestamp referring to when the booking was created.
It's provided by Guesty, but is not mandatory. It's provided by Guesty, but is not mandatory.
In case of doubt use created_at_utc or created_date_utc fields" In case of doubt use created_at_utc or created_date_utc fields"
- name: athena_creation_date_utc - name: athena_creation_date_utc
data_type: date data_type: date
description: description: "Athena date referring to when the booking was created.
"Athena date referring to when the booking was created. It's provided by Guesty, but is not mandatory.
It's provided by Guesty, but is not mandatory.
In case of doubt use created_at_utc or created_date_utc fields" In case of doubt use created_at_utc or created_date_utc fields"
- name: created_at_utc - name: created_at_utc
@ -211,7 +207,7 @@ models:
description: "Date of creation of the verification in the system" description: "Date of creation of the verification in the system"
- name: int_edeposit__verification_fees - name: int_edeposit__verification_fees
description: description:
"This table shows all fee charges per verification for E-deposit. "This table shows all fee charges per verification for E-deposit.
Cancellation fee is charged when the monthly rate of cancelled bookings over Cancellation fee is charged when the monthly rate of cancelled bookings over
total booking of the partner surpasses the threshold (currently set at 0.05). total booking of the partner surpasses the threshold (currently set at 0.05).
@ -220,8 +216,7 @@ models:
columns: columns:
- name: id_verification - name: id_verification
data_type: text data_type: text
description: description: "Unique Superhog generated id for this verification.
"Unique Superhog generated id for this verification.
Note that there are some users that have a different id in Cosmos. Note that there are some users that have a different id in Cosmos.
For those users we created a mapping to relate these ids." For those users we created a mapping to relate these ids."
tests: tests:
@ -230,9 +225,8 @@ models:
- name: id_booking - name: id_booking
data_type: text data_type: text
description: description: "unique Superhog generated id for a booking.
"unique Superhog generated id for a booking. note that this could be duplicated and both will be charged,
note that this could be duplicated and both will be charged,
it's up to the user to not generate duplicate verifications" it's up to the user to not generate duplicate verifications"
tests: tests:
- not_null - not_null
@ -281,7 +275,7 @@ models:
- name: cancelled_fee_in_txn_currency - name: cancelled_fee_in_txn_currency
data_type: numeric data_type: numeric
description: "fee charged in used currency for cancelled verifications" description: "fee charged in used currency for cancelled verifications"
tests: tests:
- not_null - not_null
- dbt_expectations.expect_column_values_to_be_between: - dbt_expectations.expect_column_values_to_be_between:
@ -290,7 +284,7 @@ models:
- name: cancelled_fee_in_gbp - name: cancelled_fee_in_gbp
data_type: numeric data_type: numeric
description: "fee charged in gbp for cancelled verifications" description: "fee charged in gbp for cancelled verifications"
tests: tests:
- not_null - not_null
- dbt_expectations.expect_column_values_to_be_between: - dbt_expectations.expect_column_values_to_be_between:
@ -299,21 +293,20 @@ models:
- name: checkout_date_utc - name: checkout_date_utc
data_type: date data_type: date
description: "Date of checkout for the booking" description: "Date of checkout for the booking"
tests: tests:
- not_null - not_null
- name: created_date_utc - name: created_date_utc
data_type: date data_type: date
description: "Date of creation of the verification in the system" description: "Date of creation of the verification in the system"
tests: tests:
- not_null - not_null
- name: int_edeposit__guesty_verifications - name: int_edeposit__guesty_verifications
description: description: "This table shows all verification for Guesty.
"This table shows all verification for Guesty.
The charged fee is 2GBP per booked night if booking is approved The charged fee is 2GBP per booked night if booking is approved
(considered 1 night when the checkin and checkout are on the same day), (considered 1 night when the checkin and checkout are on the same day),
to be charged on checkout." to be charged on checkout."
columns: columns:
- name: id_verification - name: id_verification
@ -325,9 +318,8 @@ models:
- name: id_booking - name: id_booking
data_type: text data_type: text
description: description: "unique Superhog generated id for a booking.
"unique Superhog generated id for a booking. note that this could be duplicated and both will be charged,
note that this could be duplicated and both will be charged,
it's up to the user to not generate or cancel duplicate verifications" it's up to the user to not generate or cancel duplicate verifications"
tests: tests:
- not_null - not_null
@ -344,8 +336,7 @@ models:
- name: ok_status_fee_in_gbp - name: ok_status_fee_in_gbp
data_type: integer data_type: integer
description: description: "total fee charged on checkout, this is only charged for approved verifications"
"total fee charged on checkout, this is only charged for approved verifications"
tests: tests:
- not_null - not_null
- dbt_expectations.expect_column_values_to_be_between: - dbt_expectations.expect_column_values_to_be_between:
@ -355,14 +346,12 @@ models:
- name: created_date_utc - name: created_date_utc
data_type: date data_type: date
description: description: "Date of creation of the verification in the system"
"Date of creation of the verification in the system"
tests: tests:
- not_null - not_null
- name: checkout_date_utc - name: checkout_date_utc
data_type: date data_type: date
description: description: "Date of checkout for the booking"
"Date of checkout for the booking"
tests: tests:
- not_null - not_null
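
The description of `int_edeposit__guesty_verifications` states the fee rule in words: 2 GBP per booked night for approved bookings, with a same-day checkin and checkout counted as 1 night. The following is a minimal Python sketch of that rule, for illustration only. The function and parameter names are hypothetical, and returning 0 for non-approved verifications is an assumption, based only on the column being documented as `not_null` and charged solely for approved verifications.

```python
from datetime import date

FEE_PER_NIGHT_GBP = 2  # rate stated in the model description


def ok_status_fee_in_gbp(checkin: date, checkout: date, approved: bool) -> int:
    """Fee charged on checkout for an approved Guesty verification."""
    if not approved:
        # Assumption: non-approved verifications carry a zero fee.
        return 0
    # A same-day stay is considered 1 night, per the model description.
    nights = max((checkout - checkin).days, 1)
    return FEE_PER_NIGHT_GBP * nights
```

Under these assumptions, an approved two-night booking is charged 4 GBP and an approved same-day booking is charged 2 GBP.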