Merged PR 4996: First tracking of flagging performance

# Description

Creates 2 new models in the scope of flagging: how good are we at identifying "at risk" bookings vs. 1) the number of claims generated and 2) the number of submitted payouts?

This only applies to Protected Bookings in New Dash that have been completed (14 days after the check-out), with potential resolutions appearing in Resolutions Center.

The first table `int_flagging_booking_categorisation` contains all the heavy logic to categorise the bookings. The second view `int_flagging_performance_analysis` computes standard binary classification scores for the 2 possible ways of tracking.

Tables are already in prod to help you understand while reviewing. You'll see that the figures are still quite low, especially due to the small number of claims/submitted payouts. This leaves the true positives at just... 1.

There's heavy test and documentation coverage to ensure there are no mistakes in the computation.

# Checklist

- [X] The edited models and dependants run properly with production data.
- [X] The edited models are sufficiently documented.
- [X] The edited models contain PK tests, and I've run and passed them.
- [X] I have checked for DRY opportunities with other models and docs.
- [X] I've picked the right materialization for the affected models. **Materialising the first model as a table despite it being just 1 record, since otherwise tests take ages**

# Other

- [ ] Check if a full-refresh is required after this PR is merged.

Related work items: #29284
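For reviewers who want a mental model of the categorisation before reading the SQL, here is a minimal Python sketch of how a single booking could be placed in a confusion-matrix cell for each of the two ways of tracking. It is not the model's implementation: the `Booking` fields, the function names and the exact 14-day boundary handling are assumptions made for illustration, derived only from the column descriptions in `schema.yml`.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Booking:
    # Hypothetical record: these field names are illustrative, not model columns.
    check_out: date
    is_flagged_at_risk: bool
    has_claim: bool = False
    claim_is_finished: bool = False
    has_submitted_payout: bool = False


def is_completed(booking: Booking, today: date) -> bool:
    # Completed = checked out more than 14 calendar days ago (boundary is an assumption).
    return booking.check_out < today - timedelta(days=14)


def claim_cell(booking: Booking) -> str:
    # Claim-based tracking: the actual positive is "the booking had any claim".
    if booking.is_flagged_at_risk:
        return "TP" if booking.has_claim else "FP"
    return "FN" if booking.has_claim else "TN"


def payout_cell(booking: Booking) -> Optional[str]:
    # Payout-based tracking: bookings whose claim is not finished yet are
    # excluded (None), since we don't know whether the claim will be paid out.
    if booking.has_claim and not booking.claim_is_finished:
        return None
    actual_positive = booking.has_claim and booking.has_submitted_payout
    if booking.is_flagged_at_risk:
        return "TP" if actual_positive else "FP"
    return "FN" if actual_positive else "TN"
```

Only bookings for which `is_completed` is true would be counted in either analysis. A booking can be a claim-based true positive and still be left out of the payout-based analysis while its claim is awaiting resolution.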
2025-04-15 10:14:02 +00:00 · 2025-04-15 10:14:02 +00:00 · a2cad661dd
commit a2cad661dd
parent 587661f818
3 changed files with 692 additions and 0 deletions
--- a/models/intermediate/cross/schema.yml
+++ b/models/intermediate/cross/schema.yml
@@ -2800,3 +2800,369 @@ models:
                - NONE
                - INVOICING
                - ONGOING_MONTH
+
+  - name: int_flagging_booking_categorisation
+    description: |
+      A model that computes different Booking counts depending on whether these
+      had claims or not, whether these were categorised as at risk or not, and
+      whether there was a submitted payout or not.
+      This only applies to Bookings:
+      - that come from New Dash users 
+      - that are protected, either by a protection or a deposit management service 
+
+      Additionally, we track Completed Bookings as those Bookings which, as of today,
+      have been checked out for more than 14 natural (calendar) days.
+
+      From these Bookings, we check whether these had a related incident in Resolution
+      Center:
+      - that is linked to a Booking 
+      - that is not in a duplicated status 
+
+      Since Bookings can be duplicated in the incidents data, we effectively consider:
+      - Bookings with "any" claim 
+      - Bookings with a finished claim, either with a payout or not
+      - Bookings with a finished claim and a submitted amount for payout
+
+    data_tests:
+      - dbt_expectations.expect_table_row_count_to_equal:
+          value: 1
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: total_bookings
+          column_B: completed_bookings + not_completed_bookings
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: total_with_claim_bookings
+          column_B: completed_with_claim_bookings + not_completed_with_claim_bookings
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: completed_bookings
+          column_B: completed_with_claim_bookings + completed_without_claim_bookings
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: completed_bookings
+          column_B: completed_risk_bookings + completed_no_risk_bookings
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: completed_risk_bookings
+          column_B: completed_risk_with_claim_bookings + completed_risk_without_claim_bookings
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: completed_with_claim_bookings
+          column_B: completed_risk_with_claim_bookings + completed_no_risk_with_claim_bookings
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: completed_no_risk_bookings
+          column_B: completed_no_risk_with_claim_bookings + completed_no_risk_without_claim_bookings
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: completed_without_claim_bookings
+          column_B: completed_risk_without_claim_bookings + completed_no_risk_without_claim_bookings
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: completed_bookings
+          column_B: completed_awaiting_resolution_bookings + completed_not_awaiting_resolution_bookings
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: completed_not_awaiting_resolution_bookings
+          column_B: completed_with_submitted_payout_bookings + completed_without_submitted_payout_bookings
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: completed_with_submitted_payout_bookings
+          column_B: completed_risk_with_submitted_payout_bookings + completed_no_risk_with_submitted_payout_bookings
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: completed_without_submitted_payout_bookings
+          column_B: completed_risk_without_submitted_payout_bookings + completed_no_risk_without_submitted_payout_bookings
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: completed_bookings
+          column_B: completed_risk_with_claim_bookings + completed_no_risk_without_claim_bookings + completed_risk_without_claim_bookings + completed_no_risk_with_claim_bookings
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: completed_not_awaiting_resolution_bookings
+          column_B: completed_risk_with_submitted_payout_bookings + completed_no_risk_without_submitted_payout_bookings + completed_risk_without_submitted_payout_bookings + completed_no_risk_with_submitted_payout_bookings
+
+    columns:
+      - name: total_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected Bookings, either a Protection Service 
+          or a Deposit Management service, for reference.
+
+      - name: completed_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected Bookings with a Checkout happening 
+          more than 14 days ago.
+
+      - name: not_completed_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected Bookings with a Checkout happening 
+          between 14 days ago and today, or in the future.
+
+      - name: total_with_claim_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected Bookings that have had a claim,
+          regardless of whether these bookings are considered completed or not.
+
+      - name: completed_with_claim_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          had a claim.
+
+      - name: not_completed_with_claim_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected, NOT Completed Bookings that have 
+          had a claim.
+
+      - name: completed_without_claim_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          NOT had a claim.
+
+      - name: completed_risk_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          been flagged as at Risk.
+
+      - name: completed_no_risk_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          NOT been flagged as at Risk.
+
+      - name: completed_awaiting_resolution_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have
+          a claim and are in a resolution status that is not finished. These 
+          Bookings are excluded for the submitted payout-based performance
+          analysis, as we don't know if the claim will be paid out or not.
+
+      - name: completed_not_awaiting_resolution_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that are 
+          not awaiting resolution, either because they have a claim in a finished
+          status or because they don't have a claim at all.
+
+      - name: completed_with_submitted_payout_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          had a submitted payout, with the claim being in a finished status.
+
+      - name: completed_without_submitted_payout_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          NOT had a submitted payout, either because there's a claim in
+          a finished status without a payout or because there's no claim at all.
+
+      - name: completed_risk_with_claim_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          been flagged as at Risk AND that have had a claim. 
+          For the claim-based performance analysis, this would be the true positive.
+
+      - name: completed_no_risk_without_claim_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          NOT been flagged as at Risk AND that have NOT had a claim. 
+          For the claim-based performance analysis, this would be the true negative.
+
+      - name: completed_risk_without_claim_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          been flagged as at Risk AND that have NOT had a claim. 
+          For the claim-based performance analysis, this would be the false positive.
+
+      - name: completed_no_risk_with_claim_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          NOT been flagged as at Risk AND that have had a claim. 
+          For the claim-based performance analysis, this would be the false negative.
+
+      - name: completed_risk_with_submitted_payout_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          been flagged as at Risk AND that have had a submitted payout, with 
+          the claim being in a finished status.
+          For the submitted payout-based performance analysis, this would be 
+          the true positive.
+
+      - name: completed_no_risk_without_submitted_payout_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          NOT been flagged as at Risk AND that have NOT had a submitted payout, 
+          either because there's a claim in a finished status without a
+          payout or because there's no claim at all.  
+          For the submitted payout-based performance analysis, this would be 
+          the true negative.
+
+      - name: completed_risk_without_submitted_payout_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          been flagged as at Risk AND that have NOT had a submitted payout, 
+          either because there's a claim in a finished status without a
+          payout or because there's no claim at all.  
+          For the submitted payout-based performance analysis, this would be 
+          the false positive.
+
+      - name: completed_no_risk_with_submitted_payout_bookings
+        data_type: integer
+        description: |
+          Current count of New Dash Protected and Completed Bookings that have 
+          NOT been flagged as at Risk AND that have had a submitted payout, with 
+          the claim being in a finished status.
+          For the submitted payout-based performance analysis, this would be 
+          the false negative.
+
+  - name: int_flagging_performance_analysis
+    description: |
+      Provides a basic statistical analysis with binary classification metrics 
+      on the flagging performance for New Dash Protected bookings, in the scope
+      of claims raised or submitted payouts.
+    data_tests:
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: count_total
+          column_B: count_true_positive + count_true_negative + count_false_positive + count_false_negative
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: recall_score
+          column_B: 1.0 * count_true_positive / (count_true_positive + count_false_negative)
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: precision_score
+          column_B: 1.0 * count_true_positive / (count_true_positive + count_false_positive)
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: false_positive_rate_score
+          column_B: 1.0 * count_false_positive / (count_false_positive + count_true_negative)
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: f1_score
+          column_B: 2.0 * count_true_positive / (2 * count_true_positive + count_false_negative + count_false_positive)
+      - dbt_expectations.expect_column_pair_values_to_be_equal:
+          column_A: f2_score
+          column_B: 5.0 * count_true_positive / (5 * count_true_positive + 4 * count_false_negative  + count_false_positive)
+
+    columns:
+      - name: flagging_analysis_type
+        data_type: string
+        description: |
+          Type of the analysis conducted, i.e., what do we consider as a 
+          positive - predicted (flagged) vs. actual (claim, payout).
+        data_tests:
+          - not_null
+          - unique
+          - accepted_values:
+              values:
+                - RISK_VS_CLAIM
+                - RISK_VS_SUBMITTED_PAYOUT
+
+      - name: count_total
+        data_type: integer
+        description: |
+          Total count of bookings considered for the flagging performance analysis.
+
+      - name: count_true_positive
+        data_type: integer
+        description: |
+          Count of True Positives: predicted positives that are also an actual positive.
+
+      - name: count_true_negative
+        data_type: integer
+        description: |
+          Count of True Negatives: predicted negatives that are also an actual negative.
+
+      - name: count_false_positive
+        data_type: integer
+        description: |
+          Count of False Positives: predicted positives that are not an actual positive.
+
+      - name: count_false_negative
+        data_type: integer
+        description: |
+          Count of False Negatives: predicted negatives that are not an actual negative.
+
+      - name: true_positive_score
+        data_type: decimal
+        description: |
+          True Positives, as a ratio over 1. This is the count of true positives divided
+          by the total count of bookings considered for the flagging performance analysis.
+
+      - name: true_negative_score
+        data_type: decimal
+        description: |
+          True Negatives, as a ratio over 1. This is the count of true negatives divided
+          by the total count of bookings considered for the flagging performance analysis.
+
+      - name: false_positive_score
+        data_type: decimal
+        description: |
+          False Positives, as a ratio over 1. This is the count of false positives divided
+          by the total count of bookings considered for the flagging performance analysis.
+
+      - name: false_negative_score
+        data_type: decimal
+        description: |
+          False Negatives, as a ratio over 1. This is the count of false negatives divided
+          by the total count of bookings considered for the flagging performance analysis.
+
+      - name: recall_score
+        data_type: decimal
+        description: |
+          Recall score, or true positive rate. This corresponds to the proportion of all
+          actual positives that were classified correctly as a positive. It can be seen 
+          as a probability of detection: in our case, it answers the question "what 
+          fraction of bookings with a claim/payout were flagged as at risk?".
+          This is the count of true positives divided by the sum of true positives and 
+          false negatives. Recall improves when false negatives decrease.
+          A hypothetical perfect model would have zero false negatives, and thus a 
+          recall of 1.0, or 100% detection rate.
+
+      - name: precision_score
+        data_type: decimal
+        description: |
+          Precision score, or positive predictive value. This corresponds to the 
+          proportion of all predicted positives that were classified correctly as a 
+          positive. In our case, it answers the question "what fraction of
+          bookings flagged as at risk actually had a claim/payout?".
+          This is the count of true positives divided by the sum of true positives and 
+          false positives. Precision improves when false positives decrease.
+          A hypothetical perfect model would have zero false positives, and thus a 
+          precision of 1.0, or 100% precision rate.
+
+      - name: false_positive_rate_score
+        data_type: decimal
+        description: |
+          False positive rate, or fall-out. This corresponds to the proportion of all
+          actual negatives that were classified incorrectly as a positive. It can be seen 
+          as a probability of false alarm: in our case, it answers the question "what 
+          fraction of bookings without a claim/payout were flagged as at risk?".
+          This is the count of false positives divided by the sum of false positives and
+          true negatives.
+          A hypothetical perfect model would have zero false positives, and thus a 
+          false positive rate of 0.0, or 0% false alarm rate.
+
+      - name: f1_score
+        data_type: decimal
+        description: |
+          F1 score, which computes the harmonic mean of precision and recall.
+          This metric balances the trade-off between precision and recall, and is useful 
+          when we want to find an optimal balance between the two.
+          It is defined as 2 * (precision * recall) / (precision + recall).
+          A hypothetical perfect model would have an F1 score of 1.0, or 100%.
+          When precision and recall are far apart, the F1 score will be closer to the 
+          lower of the two.
+
+      - name: f2_score
+        data_type: decimal
+        description: |
+          F2 score, which computes a weighted harmonic mean of precision and recall
+          that treats recall as twice as important as precision. In our case, it
+          effectively means that we want to reduce the number of false negatives, i.e.
+          the number of claims/payouts that are not flagged as at risk, while still
+          keeping a good precision.
+          This metric is useful when we want to prioritize recall over precision, 
+          and is defined as 5 * (precision * recall) / (4 * precision + recall).
+          A hypothetical perfect model would have an F2 score of 1.0, or 100%.
+          When precision and recall are far apart, the F2 score will be closer to the 
+          lower of the two.
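
The data tests on `int_flagging_performance_analysis` express the scores in terms of counts, while the column descriptions define F1 and F2 in terms of precision and recall. The Python sketch below illustrates that both forms agree. It is an illustration under assumptions, not the view's SQL: `safe_div`, `scores` and `f_beta` are hypothetical helper names, and returning `None` for an empty denominator is a choice of this sketch, since the source does not say how the view handles that case.

```python
from typing import Optional


def safe_div(numerator: float, denominator: float) -> Optional[float]:
    # Assumption of this sketch: an empty denominator yields None (undefined score).
    return numerator / denominator if denominator else None


def scores(tp: int, tn: int, fp: int, fn: int) -> dict:
    # Count-based formulas, mirroring the expressions used in the data tests.
    return {
        "count_total": tp + tn + fp + fn,
        "recall_score": safe_div(tp, tp + fn),
        "precision_score": safe_div(tp, tp + fp),
        "false_positive_rate_score": safe_div(fp, fp + tn),
        "f1_score": safe_div(2 * tp, 2 * tp + fn + fp),
        "f2_score": safe_div(5 * tp, 5 * tp + 4 * fn + fp),
    }


def f_beta(precision: float, recall: float, beta: float) -> Optional[float]:
    # Precision/recall form: (1 + beta^2) * P * R / (beta^2 * P + R).
    beta_sq = beta * beta
    return safe_div((1 + beta_sq) * precision * recall, beta_sq * precision + recall)


# Made-up counts for illustration only; these are not production figures.
example = scores(tp=1, tn=90, fp=6, fn=3)
f1_from_rates = f_beta(example["precision_score"], example["recall_score"], beta=1)
f2_from_rates = f_beta(example["precision_score"], example["recall_score"], beta=2)
```

With these made-up counts, `f1_from_rates` matches `example["f1_score"]` and `f2_from_rates` matches `example["f2_score"]` up to floating-point rounding, which is the equivalence the descriptions and the tests rely on.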