Merged PR 2760: Adding latest_date_is_yesterday test

# Description

Adds a new dbt test that fails if the maximum date of a given column is not yesterday. Yesterday is computed as `current_date - 1`.
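For reviewers who want to try the logic without a warehouse, here is a minimal sketch that runs the same query shape against an in-memory SQLite table. It is an illustration only, not the dbt test itself: `some_model`, `some_date` and `failing_rows` are hypothetical names, and the reference date is bound as a parameter because SQLite does not support `current_date - 1` as date arithmetic.

```python
import sqlite3
from datetime import date, timedelta

# Same shape as the dbt test, with "yesterday" bound as a parameter
# instead of `current_date - 1` (assumption: SQLite stands in for the warehouse).
CHECK_SQL = """
with model_max_date as (
    select max(some_date) as max_date
    from some_model
)
select *
from model_max_date
where max_date <> :yesterday
"""


def failing_rows(dates, today):
    """Return the rows the check reports; an empty list means the check passes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table some_model (some_date text)")
    conn.executemany(
        "insert into some_model values (?)",
        [(d.isoformat(),) for d in dates],
    )
    yesterday = (today - timedelta(days=1)).isoformat()
    rows = conn.execute(CHECK_SQL, {"yesterday": yesterday}).fetchall()
    conn.close()
    return rows


today = date(2024, 9, 9)
up_to_date = [date(2024, 9, 7), date(2024, 9, 8)]

print(failing_rows(up_to_date, today))       # [] -> passes
print(failing_rows(up_to_date[:-1], today))  # [('2024-09-07',)] -> fails
print(failing_rows([], today))               # [] -> passes, since max() is NULL
```

Note the last case: in this sketch an empty table yields no failing rows, because a comparison against NULL never evaluates to true. That is likely why the column also carries a `not_null` test, but the behaviour on the actual warehouse was not checked here.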

Additionally, in the test `kpis_global_metrics_outlier_detection` I renamed the parameter `detector_strength` to `detector_tolerance`, since a higher value of the parameter makes the test less likely to raise an alert.
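To illustrate why "tolerance" is the better name, here is a small sketch of the bounds the test computes (average plus or minus `detector_tolerance` times the standard deviation, with the lower bound floored at 0). The sample values and the helper names `bounds` and `is_outlier` are made up for illustration, and it assumes the warehouse's `stddev` is the sample standard deviation.

```python
from statistics import mean, stdev


def bounds(previous_values, detector_tolerance):
    """Mirror the test's lower/upper bound formulas for a list of daily values."""
    avg = mean(previous_values)
    std = stdev(previous_values)  # sample standard deviation (assumption)
    lower = max(avg - detector_tolerance * std, 0)
    upper = avg + detector_tolerance * std
    return lower, upper


def is_outlier(value, previous_values, detector_tolerance):
    """True if the value falls outside the accepted band."""
    lower, upper = bounds(previous_values, detector_tolerance)
    return value < lower or value > upper


history = [100, 110, 90, 105, 95]  # hypothetical daily values

for tolerance in (2, 10):
    lower, upper = bounds(history, tolerance)
    print(tolerance, round(lower, 2), round(upper, 2), is_outlier(120, history, tolerance))
# 2 84.19 115.81 True
# 10 20.94 179.06 False
```

With the same history, a value of 120 is flagged at a tolerance of 2 but accepted at the recommended 10: the higher the value, the wider the band and the fewer alerts.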

Verified locally that the test passes on a normal run, and that it fails after manually deleting the latest date from the table.

# Checklist - Does not apply

- [ ] The edited models and dependants run properly with production data.
- [ ] The edited models are sufficiently documented.
- [ ] The edited models contain PK tests, and I've run and passed them.
- [ ] I have checked for DRY opportunities with other models and docs.
- [ ] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #20824
Oriol Roqué Paniagua 2024-09-09 12:52:53 +00:00 committed by Pablo Martín
commit 599f00093e
3 changed files with 21 additions and 5 deletions

@@ -0,0 +1,13 @@
+ {% test latest_date_is_yesterday(model, column_name) %}
+ with
+ model_max_date as (
+ select max({{ column_name }}) as max_date
+ from {{ model }}
+ )
+ select *
+ from model_max_date
+ where max_date <> current_date - 1
+ {% endtest %}

@@ -362,6 +362,7 @@ models:
  It comes from int_dates_mtd logic.
  tests:
  - not_null
+ - latest_date_is_yesterday
  - name: dimension
  data_type: string
@@ -461,7 +462,8 @@ models:
  description: The last day of the month or yesterday for historic metrics.
  tests:
  - not_null
+ - latest_date_is_yesterday
  - name: id_deal
  data_type: character varying
  description: Id of the deal associated to the host.

@@ -58,10 +58,11 @@ point it becomes too sensitive, just adapt the following parameters.
  {% set start_validating_on_this_day_month = 2 %}
  -- Specify here the strength of the detector. A higher value
- -- means that this test will allow for more variance to be accepted.
+ -- means that this test will allow for more variance to be accepted,
+ -- thus it will be more tolerant.
  -- A lower value means that the chances of detecting outliers
  -- and false positives will be higher. Recommended around 10.
- {% set detector_strength = 10 %}
+ {% set detector_tolerance = 10 %}
  -- Specify here the number of days in the past that will be used
  -- to compare against. Keep in mind that we only keep the daily
@@ -106,11 +107,11 @@ with
  stddev(abs_daily_value) as std_daily_value_previous_dates,
  greatest(
  avg(abs_daily_value)
- - {{ detector_strength }} * stddev(abs_daily_value),
+ - {{ detector_tolerance }} * stddev(abs_daily_value),
  0
  ) as lower_bound,
  avg(abs_daily_value)
- + {{ detector_strength }} * stddev(abs_daily_value) as upper_bound
+ + {{ detector_tolerance }} * stddev(abs_daily_value) as upper_bound
  from metric_data
  where is_max_date = 0
  group by 1