# data-dwh-dbt-project/models/intermediate/cross/schema.yml

models:
  - name: int_daily_currency_exchange_rates
    description: >-
      This model holds daily currency exchange rates. Each record holds the
      rate of a currency pair for a specific day, source and version.

      Actual rates are sourced from xe.com data. The `guess` and `forecast`
      versions are built by simply extending the first/last exchange rate on
      record. In other words, wherever we don't have data for a date, we pick
      the closest actual data point that comes from xe.com. Bear in mind this
      means that `forecast` version records will change on a daily basis as
      actual data moves forwards, so you shouldn't assume that amounts
      converted at future dates will stay fixed.

      Note that, given the dimensionality, getting a simple time series for a
      currency pair will require a bit of filtering.

      Reverse rates are explicit. This means that, for any given day and any
      given currency pair, you will find two records with opposite from/to
      positions. So, for 2024-01-01, you will find both a EUR->USD record and a
      USD->EUR record with the inverse rate (1/rate).
    columns:
      - name: id_exchange_rate
        data_type: text
        description: A unique ID for the record, derived from concatenating the
          currencies, date, source and version. Currency order is relevant
          (EURUSD != USDEUR).
        tests:
          - not_null
          - unique
      - name: from_currency
        data_type: character
        description: The source currency, represented as an ISO 4217 code.
        tests:
          - not_null
      - name: to_currency
        data_type: character
        description: The target currency, represented as an ISO 4217 code.
        tests:
          - not_null
      - name: rate
        data_type: numeric
        description: >-
          The exchange rate, represented as the units of the target currency
          that one unit of source currency gets you. So, from_currency=USD to
          to_currency=PLN with rate=4.2 should be read as '1 US Dollar buys me
          4.2 Polish Zlotys'.

          For same-currency pairs (EUR to EUR, USD to USD, etc.), the rate will
          always be one.

          The rate can be smaller than one, but it is always strictly positive.
        tests:
          - not_negative_or_zero
          - not_null
      - name: rate_date_utc
        data_type: date
        description: The date on which the rate applies.
        tests:
          - not_null
      - name: source
        data_type: text
        description:
          Where the data comes from. Records that are built by making
          assumptions on real data will contain `_inferred`.
      - name: rate_version
        data_type: text
        description:
          The version of the rate. This can be one of `actual` (the rate is an
          observed fact), `forecast` (the rate sits in the future and is a
          guess by nature) or `guess` (the rate sits in the past and is a guess
          by nature). Note that one currency pair can have multiple rate
          versions on the same date.
        tests:
          - accepted_values:
              values:
                - guess
                - actual
                - forecast
          - not_null
      - name: updated_at_utc
        data_type: timestamp with time zone
        description:
          For external sources, this will be the point in time when the
          information was obtained from them. For records we infer here in the
          DWH, this will be the point in time when we made the assumption.
        tests:
          - not_null
  - name: int_simple_exchange_rates
    description: >-
      A simplified view of exchange rates, derived from
      `int_daily_currency_exchange_rates`. Come here if you don't want to
      deal with nuances and complexities and just want to convert amounts.

      The time granularity is daily. Each record holds a currency pair for a
      specific day. You will only find one conversion rate per currency pair and
      date.
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - from_currency
            - to_currency
            - rate_date_utc
    columns:
      - name: from_currency
        data_type: character
        description: The source currency, represented as an ISO 4217 code.
        tests:
          - not_null
      - name: to_currency
        data_type: character
        description: The target currency, represented as an ISO 4217 code.
        tests:
          - not_null
      - name: rate
        data_type: numeric
        description: The exchange rate, represented as the units of the target
          currency that one unit of source currency gets you.
        tests:
          - not_null
      - name: rate_date_utc
        data_type: date
        description: The date on which the rate applies.
        tests:
          - not_null
      - name: updated_at_utc
        data_type: timestamp with time zone
        description:
          For external sources, this will be the point in time when the
          information was obtained from them. For records we infer here in the
          DWH, this will be the point in time when we made the assumption.
        tests:
          - not_null

  - name: int_mtd_vs_previous_year_metrics
    description: |
      This model is used for global KPIs.

      It aggregates all the MTD models with the different metrics per source
      and computes any necessary weighted metric across different sources.
      Finally, it displays each metric on the current date and on the previous
      year date, and computes the relative increment using the
      `calculate_safe_relative_increment` macro.

    columns:
      - name: date
        data_type: date
        description: The date for the month-to-date metrics.
        tests:
          - not_null
          - unique

  - name: int_dates_mtd
    description: |
      This model provides the dates that Month-To-Date (MTD) based models need to work.
      - For month-to-month complete information, it retrieves all end-of-month dates that have elapsed since 2020.
      - For month-to-date information, it retrieves the days of the current month of this year up to yesterday.
        Additionally, it gets the same days of the equivalent month from last year, i.e. those before today's day of the month.

      Example:
      Imagine today is 4th June 2024.
      - We will get the dates for 1st, 2nd, 3rd of June 2024.
      - We will also get the dates for 1st, 2nd, 3rd of June 2023.
      - We will get all end of months from 2020 to yesterday,
        i.e., 31st January 2020, 29th February 2020, ..., 30th April 2024, 31st May 2024.

    columns:
      - name: year
        data_type: int
        description: Year number of the given date.
        tests:
          - not_null

      - name: month
        data_type: int
        description: Month number of the given date.
        tests:
          - not_null

      - name: day
        data_type: int
        description: Day of the month of the given date.
        tests:
          - not_null

      - name: is_end_of_month
        data_type: boolean
        description: Is end of month, 1 for yes, 0 for no.
        tests:
          - not_null

      - name: is_current_month
        data_type: boolean
        description: |
          Checks if the date is within the current executed month,
          1 for yes, 0 for no.
        tests:
          - not_null

      - name: first_day_month
        data_type: date
        description: |
          First day of the month corresponding to the date field.
          It comes from int_dates_mtd logic.
        tests:
          - not_null

      - name: date
        data_type: date
        description: |
          Main date for the computation, that is used for filters.
          It's the primary key for this model.
        tests:
          - not_null
          - unique

  - name: int_dates_by_deal
    description: |
      This model provides the dates needed for each deal so that deal-based KPI models can work.
      It only considers those dates starting from when the host user of the deal was first available.

    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - date
            - id_deal

    columns:
      - name: year
        data_type: int
        description: Year number of the given date.
        tests:
          - not_null

      - name: month
        data_type: int
        description: Month number of the given date.
        tests:
          - not_null

      - name: day
        data_type: int
        description: Day of the month of the given date.
        tests:
          - not_null

      - name: is_end_of_month
        data_type: boolean
        description: Is end of month, 1 for yes, 0 for no.
        tests:
          - not_null

      - name: is_current_month
        data_type: boolean
        description: |
          Checks if the date is within the current executed month,
          1 for yes, 0 for no.
        tests:
          - not_null

      - name: first_day_month
        data_type: date
        description: |
          First day of the month corresponding to the date field.
          It comes from int_dates_mtd logic.
        tests:
          - not_null

      - name: date
        data_type: date
        description: |
          Main date for the computation, that is used for filters.
          Together with id_deal, it forms the primary key for this model.
        tests:
          - not_null

      - name: id_deal
        data_type: string
        description: |
          Main identifier of the B2B clients. A deal can have multiple hosts.
          A host should usually have a deal, but this is not always the case.
          In this KPI reporting we require the deal to be not null to avoid potential
          data quality issues.
        tests:
          - not_null

  - name: int_mtd_aggregated_metrics
    description: |
      The `int_mtd_aggregated_metrics` model aggregates multiple metrics on a year, month, and day basis.
      The primary source of data is the `int_mtd_vs_previous_year_metrics` model, which contains the combination
      of metrics data per source. This model just changes the display format, unpivoting the information into
      a set of metric, value, previous_year_value and relative_increment at a given date. It uses Jinja
      code to avoid code duplication.

    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - date
            - metric

    columns:
      - name: year
        data_type: int
        description: Year number of the given date.
        tests:
          - not_null

      - name: month
        data_type: int
        description: Month number of the given date.
        tests:
          - not_null

      - name: day
        data_type: int
        description: Day of the month of the given date.
        tests:
          - not_null

      - name: is_end_of_month
        data_type: boolean
        description: Is end of month, 1 for yes, 0 for no.
        tests:
          - not_null

      - name: is_current_month
        data_type: boolean
        description: |
          Checks if the date is within the current executed month,
          1 for yes, 0 for no.
        tests:
          - not_null

      - name: first_day_month
        data_type: date
        description: |
          First day of the month corresponding to the date field.
          It comes from int_dates_mtd logic.
        tests:
          - not_null

      - name: date
        data_type: date
        description: |
          Main date for the computation, that is used for filters.
          It comes from int_dates_mtd logic.
        tests:
          - not_null

      - name: previous_year_date
        data_type: date
        description: |
          Corresponds to the date of the previous year, with respect to the field date.
          It comes from int_dates_mtd logic. It's only displayed for information purposes
          and should not be needed for reporting.

      - name: metric
        data_type: text
        description: Name of the business metric.
        tests:
          - not_null

      - name: order_by
        data_type: integer
        description: |
          Order for display purposes. Null values are accepted, but keep
          in mind that there is then no controlled default display order.

      - name: number_format
        data_type: text
        description: Allows for grouping and formatting for display purposes.
        tests:
          - accepted_values:
              values: ['integer', 'percentage', 'currency_gbp']

      - name: value
        data_type: numeric
        description: |
          Numeric value (integer or decimal) that corresponds to the MTD computation of the metric
          at a given date.

      - name: previous_year_value
        data_type: numeric
        description: |
          Numeric value (integer or decimal) that corresponds to the MTD computation of the metric
          on the previous year at a given date.

      - name: relative_increment
        data_type: numeric
        description: |
          Numeric value that corresponds to the relative increment between value and previous year value,
          following the computation: value / previous_year_value - 1.


  - name: int_monthly_aggregated_metrics_history_by_deal
    description: |
      This model aggregates the monthly historic information regarding the different metrics computed
      at deal level. The primary sources of data are the `int_yyy__monthly_XXXXX_history_by_deal`
      models which contain the raw metrics data per source.

      Unlike the int_mtd_aggregated_metrics, this model does not abstract each metric, since
      no comparison versus last year is performed. In short, it just gathers the information stored
      in the abovementioned models.

      Keep in mind: aggregating the information of this model will not necessarily result in
      the `int_mtd_aggregated_metrics` figures because 1) the MTD version contains more computing dates
      than the by-deal version, the latter being a subset of the former, and 2) the deal-based model
      requires that a booking/guest journey/listing/etc. has a host with a deal assigned, which is
      not necessarily the case.

    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - date
            - id_deal

    columns:
      - name: date
        data_type: date
        description: The last day of the month or yesterday for historic metrics.
        tests:
          - not_null

      - name: id_deal
        data_type: character varying
        description: Id of the deal associated to the host.
        tests:
          - not_null

  - name: int_dates_mtd_by_dimension
    description: |
      This model provides the dates, dimensions and dimension values that Month-To-Date (MTD)
      based models need to work.
      It provides the basic "empty" structure upon which metrics will be built. That is, on
      top of the date that characterises int_dates_mtd, it includes the dimensions and their
      respective values that should appear in any MTD metric model.

      Example:
       - For the "global" dimension, we will only have the "global" dimension value.
       - For the "by_number_of_listings" dimension, we will have different values
         according to the segments defined, e.g. 0, 1-5, 6-20, etc.

      ... and so on for any other available dimension. These combinations should appear
      for each date of the MTD models.

    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - date
            - dimension
            - dimension_value

    columns:
      - name: year
        data_type: int
        description: Year number of the given date.
        tests:
          - not_null

      - name: month
        data_type: int
        description: Month number of the given date.
        tests:
          - not_null

      - name: day
        data_type: int
        description: Day of the month of the given date.
        tests:
          - not_null

      - name: is_end_of_month
        data_type: boolean
        description: Is end of month, 1 for yes, 0 for no.
        tests:
          - not_null

      - name: is_current_month
        data_type: boolean
        description: |
          Checks if the date is within the current executed month,
          1 for yes, 0 for no.
        tests:
          - not_null

      - name: first_day_month
        data_type: date
        description: |
          First day of the month corresponding to the date field.
          It comes from int_dates_mtd logic.
        tests:
          - not_null

      - name: date
        data_type: date
        description: |
          Main date for the computation, metrics include monthly information
          until this date.
        tests:
          - not_null

      - name: dimension
        data_type: string
        description: The dimension or granularity of the metrics.
        tests:
          - accepted_values:
              values:
                - global
                - by_number_of_listings

      - name: dimension_value
        data_type: string
        description: The value or segment available for the selected dimension.
        tests:
          - not_null