data-dwh-dbt-project/models/intermediate/xero/int_xero__contacts.sql
Oriol Roqué Paniagua 5928a198b6 Merged PR 4734: Bugfix - Ensures accounts cannot be duplicated in Xero sales monthly trends
# Description

This fixes the duplication error when the same deal has more than one contact. Impact was not massive, it just affected because of the Home to Host account and this has happened during the past week. There were other cases that contained duplicates but these were not invoiced so it didn't affect the total sum.

Changes:
* Contacts now has a boolean field indicating if a deal exists.
* Contacts now has a boolean field indicating if that contact has a deal informed and it's the last updated one. This is opinionated - it could have been done with creation, for instance.
* Sales monthly trends forces the last updated contact to be true.
* Improved robustness by adding tests.

Note that modifying the logic in intermediate on `int_xero__sales_monthly_trends` by changing `id_deal` per `id_contact` would not have worked - we need to do this compute by deal to ensure that any invoicing is reported MoM and YTD.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've ran and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models.

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #28576
2025-03-19 08:19:20 +00:00

74 lines
2.1 KiB
SQL

with
stg_xero__contacts as (select * from {{ ref("stg_xero__contacts") }}),
contact_with_deal as (
select
id_contact,
case
when account_number ~ '^[0-9]{10,11}$' then account_number else null
end::text as id_deal,
updated_at_utc
from stg_xero__contacts
),
deals_ranked_per_updated_at_utc as (
select
id_contact,
id_deal,
row_number() over (
partition by id_deal order by updated_at_utc desc
) as deal_rank_per_updated_at_utc
from contact_with_deal
-- Ensuring id deal is set
where id_deal is not null
)
select
c.id_contact,
c.account_number,
cwd.id_deal,
case
when cwd.id_deal is null then true else false
end as is_contact_missing_id_deal,
-- The following ensures is the last updated contact that actually contains a deal
case
when dr.deal_rank_per_updated_at_utc = 1 then true else false
end as is_latest_updated_contact_per_deal,
c.contact_name,
c.is_customer,
c.is_supplier,
c.tax_number,
c.contact_first_name,
c.contact_last_name,
c.phones,
c.website,
c.balances,
c.discount,
c.addresses,
c.attachments,
c.email_addresses,
c.payment_terms,
c.batch_payments,
c.branding_theme,
c.contact_groups,
c.contact_number,
c.contact_status,
c.skyper_user_name,
c.contact_persons,
c.has_attachments,
c.updated_at_utc,
c.updated_date_utc,
c.xero_network_key,
c.default_currency_iso_4217,
c.validation_errors,
c.bank_account_details,
c.has_validation_errors,
c.tracking_category_name,
c.account_payable_tax_type,
c.tracking_category_option,
c.sales_default_account_code,
c.sales_tracking_categories,
c.account_receivable_tax_type,
c.purchases_default_account_code,
c.purchases_tracking_categories,
c.dwh_extracted_at_utc
from stg_xero__contacts c
left join contact_with_deal cwd on c.id_contact = cwd.id_contact
left join deals_ranked_per_updated_at_utc dr on c.id_contact = dr.id_contact