version: 2

models:
  - name: int_core__duplicate_bookings
    description: |
      A list of bookings which are considered duplicates of other bookings.

      We currently consider two bookings to be duplicate if they have the same:
      - Guest user id
      - Accommodation id
      - Check-in date

      Bear in mind these bookings do have different booking ids.

      Out of a duplicated tuple of 2 or more bookings:
      - Our logic will consider the oldest one to be the "original",
        non-duplicate one.
      - This table will contain only the duplicates, and not the original.
    columns:
      - name: id_booking
        data_type: bigint
        description: The unique, Superhog generated id for this booking.
      - name: is_duplicate_booking
        data_type: boolean
        description: |
          True if the booking is duplicate.

          If you are thinking that this is redundant, you are right. All
          records in this table will be true. But we keep this field to
          make your life easier when joining with other tables.
      - name: is_duplicating_booking_with_id
        data_type: bigint
        description: |
          Indicates what's the original booking being duplicated.

          If there is a tuple of duplicate bookings {A, B, C}, where A is
          the original and the others are the duplicates:
          - B and C will appear in this table, A will not.
          - The value of this field for both B and C will be A's id.

  - name: int_core__booking_charge_events
    description: |
      Booking charge events is a fancy word for saying: a booking happened,
      the related host had a booking fee set up at the right time, hence
      we need to charge him.

      The table contains one record per booking and shows the associated
      booking fee, as well as the point in time in which the charge event
      was considered.

      Be wary of the booking fees: they don't have an associated currency.
      Crazy, I know, but we currently don't store that information in the
      backend.

      As for the charge dates: the exact point in time at which we consider
      that we should be charging a fee depends on billing details of the
      host customer. For some bookings, this will be the check-in.
      For others, it's when the guest begins the verification process.

      Not all bookings appear here since we don't charge a fee for all
      bookings.
    columns:
      - name: id_booking
        data_type: bigint
        description: The unique, Superhog generated id for this booking.
      - name: id_price_plan
        data_type: bigint
        description: The id of the price plan that relates to this booking.
      - name: booking_fee_local
        data_type: numeric
        description: The fee to apply to the booking, in host currency.
      - name: booking_fee_charge_at_utc
        data_type: timestamp without time zone
        description: |
          The point in time in which the booking should be invoiced.

          This could be the check-in date of the booking or the date
          in which the guest verification started, depending on the
          billing settings of the host.
      - name: booking_fee_charge_date_utc
        data_type: date
        description: |
          The date in which the booking should be invoiced.

          This could be the check-in date of the booking or the date
          in which the guest verification started, depending on the
          billing settings of the host.

  - name: int_core__check_in_cover_prices
    description: |
      This table shows the active price and cover for the Check-In Hero
      product.

      The prices are obtained through a gross `GROUP BY` thrown at the
      payment validation sets table. It works this way because the price
      settings of this product were done with a terrible backend data
      model design. How the prices could be changed remains a mystery,
      and the current design does not support any kind of history
      tracking.

      When the time comes to adjust prices, we will have a lot of careful
      work to do to make sure that we keep history and that no downstream
      dependencies of this model blow up.
    columns:
      - name: local_currency_iso_4217
        data_type: character varying
        description: A currency code.
      - name: checkin_cover_guest_fee_local_curr
        data_type: numeric
        description: |
          The fee that the guest user must pay if he wants to purchase
          the cover.
      - name: checkin_cover_cover_amount_local_curr
        data_type: numeric
        description: |
          The amount for which the guest user is covered if he faces
          problems during check-in.

  - name: int_core__unified_user
    columns:
      - name: id_user
        data_type: character varying
        description: The unique ID for the user.
        tests:
          - not_null
          - unique

  - name: int_core__vr_check_in_cover
    columns:
      - name: id_verification_request
        data_type: character varying
        description: The unique ID for the verification request.
        tests:
          - not_null
          - unique

  - name: int_core__mtd_booking_metrics
    description: |
      This model contains the historic information regarding the bookings
      in an aggregated manner. It's used for the business KPIs.
      Data is aggregated at the last day of the month and in the days
      necessary for the Month-to-Date computation of the current month.
    columns:
      - name: date
        data_type: date
        description: The date for the month-to-date booking-related metrics.
        tests:
          - not_null
          - unique

  - name: int_core__mtd_accommodation_metrics
    description: |
      This model contains the historic information regarding the
      accommodations in an aggregated manner. It's used for the business
      KPIs.
      Data is aggregated at the last day of the month and in the days
      necessary for the Month-to-Date computation of the current month.
    columns:
      - name: date
        data_type: date
        description: The date for the month-to-date accommodation-related metrics.
        tests:
          - not_null
          - unique

  - name: int_core__mtd_aggregated_metrics
    description: |
      The `int_core__mtd_aggregated_metrics` model aggregates multiple
      metrics on a year, month, and day basis. The primary sources of data
      are the `int_core__mtd_XXXXX_metrics` models, which contain the raw
      metrics data per source.

      This model uses Jinja templating to dynamically generate SQL code,
      combining various metrics into a single table. This approach reduces
      repetition and enhances maintainability.
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - date
            - metric
    columns:
      - name: year
        data_type: int
        description: year number of the given date.
        tests:
          - not_null
      - name: month
        data_type: int
        description: month number of the given date.
        tests:
          - not_null
      - name: day
        data_type: int
        description: day monthly number of the given date.
        tests:
          - not_null
      - name: is_end_of_month
        data_type: boolean
        description: is end of month, 1 for yes, 0 for no.
        tests:
          - not_null
      - name: is_current_month
        data_type: boolean
        description: |
          checks if the date is within the current executed month,
          1 for yes, 0 for no.
        tests:
          - not_null
      - name: date
        data_type: date
        description: |
          main date for the computation, that is used for filters.
          It comes from int_dates_mtd logic.
        tests:
          - not_null
      - name: previous_year_date
        data_type: date
        description: |
          corresponds to the date of the previous year, with respect to
          the field date. It comes from int_dates_mtd logic. It's only
          displayed for information purposes, should not be needed for
          reporting.
      - name: metric
        data_type: text
        description: name of the business metric.
        tests:
          - not_null
      - name: order_by
        data_type: integer
        description: |
          order for displaying purposes. Null values are accepted, but
          keep in mind that then there's no default controlled display
          order.
      - name: number_format
        data_type: text
        description: allows for grouping and formatting for displaying purposes.
        tests:
          - accepted_values:
              values: ['integer', 'percentage']
      - name: value
        data_type: numeric
        description: |
          numeric value (integer or decimal) that corresponds to the MTD
          computation of the metric at a given date.
      - name: previous_year_value
        data_type: numeric
        description: |
          numeric value (integer or decimal) that corresponds to the MTD
          computation of the metric on the previous year at a given date.
      - name: relative_increment
        data_type: numeric
        description: |
          numeric value that corresponds to the relative increment
          between value and previous year value, following the
          computation: value / previous_year_value - 1.

  - name: int_core__verification_request_completeness
    description: |
      The `int_core__verification_request_completeness` model determines
      whether a verification request is completed or not. To achieve
      this, it encapsulates the logic that covers the different
      possibilities.
      Its main output is the column is_verification_request_complete, but
      it also provides outputs of the intermediate logic steps to be used
      for further modeling, such as determining the completion date.
    columns:
      - name: id_verification_request
        data_type: bigint
        description: id of the verification request. It's the unique key for this model.
        tests:
          - not_null
          - unique
      - name: expected_verification_count
        data_type: int
        description: count of verifications that are expected to be passed in order to complete the request.
      - name: confirmed_from_same_verification_request_count
        data_type: int
        description: count of confirmed verifications whose logic is computed from the same verification request.
      - name: confirmed_from_previous_verification_requests_count
        data_type: int
        description: count of confirmed verifications whose logic is computed from previous verification requests.
      - name: confirmed_verification_count
        data_type: int
        description: |
          total count of confirmed verifications. Mainly, it's the sum
          of the confirmed verifications that come from the same
          verification request plus the ones that come from previous
          verification requests.
      - name: is_verification_request_complete
        data_type: boolean
        description: if the verification request can be considered as completed or not.
      - name: used_verification_from_same_verification_request
        data_type: boolean
        description: |
          if the verification request can be considered as completed and
          has at least one confirmed verification from the same
          verification request.
      - name: used_verification_from_previous_verification_requests
        data_type: boolean
        description: |
          if the verification request can be considered as completed and
          has at least one confirmed verification from a previous
          verification request.
      - name: is_complete_only_from_previous_verification_requests
        data_type: boolean
        description: |
          if the verification request can be considered as completed and
          all confirmed verifications are from previous verification
          requests.

  - name: int_core__verification_request_completed_date
    description: |
      The `int_core__verification_request_completed_date` model retrieves
      the time at which the guest journey, or verification request, was
      completed.
      It only considers a guest journey to be completed based on the
      positive outcome of the is_verification_request_complete boolean
      coming from the verification_request_completeness model.

      The completion time is computed as follows:
      - Only considering verification requests that have been tagged as
        completed. From here, we have:
        - If the verification request has, at least, one verification
          linked; the date will be the creation date of the last
          verification created linked to that verification request.
          To keep in mind: for some cases, the last verification can
          have updates after the creation, but these generally happen
          with very low time differences with respect to the creation
          date. However, there are some outliers - mostly linked to
          admin override - that we're not considering here, since these
          might not necessarily be linked to the Guest completing the
          Guest Journey.
        - If the verification request does not have any verification
          linked; we assume an automatic completion. In this case, we
          use the time from which the verification request was created.

      For some cases, it is possible that this logic still generates some
      completed times that are actually before the user's usage of the
      link. For these cases, we do an override and we apply the
      used_link_at_utc as the completed time. To account for these cases,
      check the boolean column is_completed_at_overriden_with_used_link_at.

      In summary, the guest journey completion time provided here is an
      estimation.

      Finally, this model only contains those requests that have been
      completed, so keep it in mind when joining this table.
    columns:
      - name: id_verification_request
        data_type: bigint
        description: id of the completed verification request. It's the unique key for this model.
        tests:
          - not_null
          - unique
      - name: estimated_completed_at_utc
        data_type: timestamp
        description: estimated timestamp of when the verification request was completed.
      - name: estimated_completed_date_utc
        data_type: date
        description: estimated date from the timestamp of when the verification request was completed.
      - name: is_completed_at_overriden_with_used_link_at
        data_type: boolean
        description: >
          boolean indicating if the estimated dates have been overridden
          with the used link since the initial computation was still
          considering an end date before a starting date.

  - name: int_core__verification_payments
    description: >-
      A simplified table that holds guest journey payments with details
      around when they happen, what service was being paid, what was the
      related verification request, etc.

      Currency rates are converted to GBP with our simple exchange rates
      view.
    columns:
      - name: id_verification_to_payment
        data_type: bigint
        description: Unique ID for the relation between the payment verification and the payment at hand.
        tests:
          - unique
          - not_null
      - name: id_payment
        data_type: bigint
        description: Unique ID for the payment itself.
        tests:
          - unique
          - not_null
      - name: is_refundable
        data_type: boolean
      - name: created_at_utc
        data_type: timestamp without time zone
      - name: updated_at_utc
        data_type: timestamp without time zone
      - name: payment_due_at_utc
        data_type: timestamp without time zone
        tests:
          - not_null
      - name: payment_due_date_utc
        data_type: date
        tests:
          - not_null
      - name: payment_paid_at_utc
        data_type: timestamp without time zone
      - name: payment_paid_date_utc
        data_type: date
      - name: payment_reference
        data_type: character varying
      - name: refund_due_at_utc
        data_type: timestamp without time zone
      - name: refund_due_date_utc
        data_type: date
      - name: payment_refunded_at_utc
        data_type: timestamp without time zone
      - name: payment_refunded_date_utc
        data_type: date
      - name: refund_payment_reference
        data_type: character varying
      - name: id_guest_user
        data_type: character varying
      - name: id_verification
        data_type: bigint
      - name: id_verification_request
        data_type: bigint
      - name: verification_payment_type
        data_type: character varying
      - name: amount_in_txn_currency
        data_type: numeric
        tests:
          - not_null
      - name: currency
        data_type: character varying
        tests:
          - not_null
      - name: amount_in_gbp
        data_type: numeric
        tests:
          - not_null
      - name: payment_status
        data_type: character varying
      - name: notes
        data_type: character varying

  - name: int_core__country
    description: |
      This model contains information regarding countries, such as codes,
      names and preferred currencies
    columns:
      - name: id_country
        data_type: bigint
        description: id of the country. It's the unique key for this model.
        tests:
          - not_null
          - unique
      - name: iso_2
        data_type: char(2)
        description: ISO 3166-1 alpha-2 country code. Cannot be null.
        tests:
          - not_null
      - name: iso_3
        data_type: char(3)
        description: |
          ISO 3166-1 alpha-3 country code. Some countries can have this
          value as not set, therefore it's nullable.
      - name: country_name
        data_type: character varying
        description: name of the country. Cannot be null.
        tests:
          - not_null
      - name: iso_num_code
        data_type: int
        description: |
          ISO 3166-1 numeric code. Usually it's 3 digits, but since it's
          categorised as an integer, the preceding zeros are removed.
          Nullable.
      - name: phone_code
        data_type: int
        description: |
          Phone code prefix for a given country. Can contain default /
          duplicated values.
      - name: id_preferred_currency
        data_type: int
        description: |
          Id of the preferred currency for a given country. Might not be
          the only currency used in the country, it's just the preferred
          one.
        tests:
          - not_null
      - name: preferred_currency_name
        data_type: character varying
        description: |
          Currency name of the preferred currency for a given country.
        tests:
          - not_null
      - name: preferred_iso4217_code
        data_type: char(3)
        description: |
          Three-letter code assigned to the preferred currency for a
          given country by the ISO. These codes are part of the ISO 4217
          standard.
        tests:
          - not_null

  - name: int_core__accommodation
    description: |
      This model contains information regarding accommodations, also known
      as listings. It contains information regarding the host this
      accommodation is linked to, the geographic details, the preferred
      currency according to the country, details about the listing itself
      (floors, bedrooms, etc) and time-related information of when the
      listing was created.
    columns:
      - name: id_accommodation
        data_type: bigint
        description: Id of the accommodation or listing. It's the unique key for this model.
        tests:
          - not_null
          - unique
      - name: id_user_host
        data_type: character varying
        description: The unique ID for the host. Can be null.
      - name: id_payment_validation_set
        data_type: bigint
        description: Id of the payment validation set linked to a listing. Can be null.
      - name: friendly_name
        data_type: character varying
      - name: country_iso_2
        data_type: char(2)
        description: ISO 3166-1 alpha-2 country code where the listing is located.
      - name: country_name
        data_type: character varying
        description: Name of the country where the listing is located.
      - name: country_preferred_currency_code
        data_type: char(3)
        description: |
          Three-letter code assigned to the preferred currency for a
          given country by the ISO. These codes are part of the ISO 4217
          standard. Keep in mind these are preferred, not necessarily the
          actual currency.
      - name: is_active
        data_type: boolean
        description: |
          Boolean to indicate if the listing is active or not. If false,
          this is considered as a hard deactivation - meaning no more
          bookings can be assigned to this listing. However, even if a
          listing is active, that does not necessarily mean that it's
          receiving bookings. Do not confuse this column with the
          lifecycle activity of a listing that is computed in
          int_core__mtd_accommodation_lifecycle.
      - name: town
        data_type: character varying
      - name: postcode
        data_type: character varying
      - name: address_line_1
        data_type: character varying
      - name: address_line_2
        data_type: character varying
      - name: verification_level
        data_type: integer
      - name: floor_area
        data_type: integer
      - name: number_of_floors
        data_type: integer
      - name: number_of_bedrooms
        data_type: integer
      - name: number_of_bathrooms
        data_type: integer
      - name: number_of_other_rooms
        data_type: integer
      - name: construction_details
        data_type: character varying
      - name: created_at_utc
        data_type: timestamp
        description: Timestamp of when the listing was created. Cannot be null.
        tests:
          - not_null
      - name: created_date_utc
        data_type: date
        description: Date of when the listing was created
      - name: updated_at_utc
        data_type: timestamp
        description: Timestamp of when the listing was last updated according to the backend.
      - name: updated_date_utc
        data_type: date
        description: Date of when the listing was last updated according to the backend.
      - name: dwh_extracted_at_utc
        data_type: timestamp
        description: Timestamp of when the accommodation record was extracted from the backend into the DWH.

  - name: int_core__mtd_accommodation_lifecycle
    description: |
      This model contains the historic information regarding the lifecycle
      of accommodations, also known as listings. The booking-related time
      information makes it possible to derive the current status of any
      listing regarding its activity. This information is encapsulated in
      the following columns:

      accommodation_lifecycle_state: contains one of the following states
      - 01-New: Listings that have been created in the current month,
        without bookings
      - 02-Never Booked: Listings that have been created before the
        current month, without bookings.
      - 03-First Time Booked: Listings that have been booked for the first
        time in the current month.
      - 04-Active: Listings that have booking activity in the past 12
        months (that are not FTB nor reactivated)
      - 05-Churning: Listings that are becoming inactive because of lack
        of bookings in the past 12 months
      - 06-Inactive: Listings that have not had a booking for more than
        12 months.
      - 07-Reactivated: Listings that have had a booking in the current
        month that were inactive or churning before.
      - Finally, if none of the logic applies, which should not happen,
        null will be set and a dbt alert will be raised.

      Since the states of Active, First Time Booked and Reactivated
      indicate certain booking activity and are mutually exclusive, the
      model also provides information of the recency of the bookings by
      the following booleans:
      - has_been_booked_within_current_month: If a listing has had a
        booking created in the current month
      - has_been_booked_within_last_6_months: If a listing has had a
        booking created in the past 6 months
      - has_been_booked_within_last_12_months: If a listing has had a
        booking created in the past 12 months

      Note that if a listing has had a booking created in a given month,
      all 3 columns will be true. Similarly, if the last booking created
      to a listing was 5 months ago, only the column
      has_been_booked_within_current_month will be false; while the other
      2 will be true.
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - date
            - id_accommodation
    columns:
      - name: date
        data_type: date
        description: The date for the month-to-date. Information is inclusive to the date displayed.
        tests:
          - not_null
      - name: id_accommodation
        data_type: bigint
        description: Id of the accommodation or listing.
        tests:
          - not_null
      - name: creation_date_utc
        data_type: date
        description: Date of when the listing was created.
      - name: first_time_booked_date_utc
        data_type: date
        description: |
          Date of the first booking created for a given listing. Can be
          null if the listing has never had a booking associated with it.
      - name: last_time_booked_date_utc
        data_type: date
        description: |
          Date of the last booking created for a given listing. Can be
          null if the listing has never had a booking associated with it.
          Can be the same as first_time_booked_date_utc if the listing
          only had 1 booking in its history.
      - name: second_to_last_time_booked_date_utc
        data_type: date
        description: |
          Date of the second-to-last booking created for a given listing,
          meaning the creation date of the booking that precedes the last
          one. It's relevant for the reactivation computation on the
          lifecycle. Can be null if the listing has never had a booking
          associated with it or if the listing only had 1 booking in its
          history.
      - name: accommodation_lifecycle_state
        data_type: character varying
        description: |
          Contains the lifecycle state of a Listing. The accepted values
          are: 01-New, 02-Never Booked, 03-First Time Booked, 04-Active,
          05-Churning, 06-Inactive, 07-Reactivated. Failing to implement
          the logic will result in an alert.
        tests:
          - not_null
      - name: has_been_booked_within_current_month
        data_type: boolean
        description: If the listing has had a booking created in the current month.
      - name: has_been_booked_within_last_6_months
        data_type: boolean
        description: If the listing has had a booking created in the past 6 months.
      - name: has_been_booked_within_last_12_months
        data_type: boolean
        description: If the listing has had a booking created in the past 12 months.
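
# Illustrative sketch only, not part of the schema above.
# The description of accommodation_lifecycle_state in
# int_core__mtd_accommodation_lifecycle lists seven accepted values, while
# the column itself is only tested with not_null. Assuming the team wants
# that list enforced, dbt's built-in accepted_values test (already used for
# number_format in int_core__mtd_aggregated_metrics) could be attached to
# the column roughly as follows. Whether this should replace or complement
# the existing alerting is an assumption left to the model owners.
#
#       - name: accommodation_lifecycle_state
#         tests:
#           - not_null
#           - accepted_values:
#               values:
#                 - '01-New'
#                 - '02-Never Booked'
#                 - '03-First Time Booked'
#                 - '04-Active'
#                 - '05-Churning'
#                 - '06-Inactive'
#                 - '07-Reactivated'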