counterweight/data-dwh-dbt-project

Author	SHA1	Message	Date
Joaquin Ossa	d189aaa797	1st commit	2024-09-04 17:24:55 +02:00
Pablo Martín	dd57c28768	Merged PR 2728: edeposit verifications docs and small refactors # Description This PR: - Adds half-decent docs to `stg_edeposit__verifications` and tests. I say half-decent because I would describe our tests as "as strict as the backend guidance allows". But we can't do miracles, so it stays this way for now. - Shifts a few column operations that were being done in the `int` layer into the `stg` layer. - Also removes a couple of fields from `int` that were marked as deprecated by Ray. Would rather not have them at all beyond `stg`. # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #20123	2024-09-04 11:07:58 +00:00
Oriol Roqué Paniagua	556a52e991	Merged PR 2689: KPIs by Billing Country # Description Adds Billing Country dimension in KPIs, but does not expose them to reporting yet. Silly thing, based on the macros I built, I cannot make incremental changes unless changing all models. This will need to be adapted, happy to hear your thoughts on how we do it. Additionally, I see poor performance in the model `mtd_guest_payments_metrics`. It takes around 5 min to execute, but technically the end-to-end runs in one shot without breaking. It's a complex PR because it changes many files, but you will see that: * It mostly changes the join conditions for the dimensions or the schema tests, * I tried to be very careful and add things step-by-step in the commits. Goal is NOT to complete the PR yet until we see how we can improve performance. I can say though that data end-to-end looks ok to me, but would benefit from checking with production data for the new dimension Update 30th Aug * Added a new commit that includes `id_user_host` in `int_core__verification_payments`. Happy to discuss if it makes sense or not. But it changes the execution from ~600 sec to ~6 sec because it avoids a massive repeated join with `verification_requests`. # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [ ] I've picked the right materialization for the affected models. To check because of performance issues # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19082	2024-09-04 10:17:12 +00:00
Oriol Roqué Paniagua	940896824f	Merged PR 2730: Exposing Billable Bookings metric for KPIs # Description Exposes Billable Bookings metric for KPIs, both in the "global+dimension" view and in the "deal" view. Metrics have already been created for a while. Exposing them now after the changes carried out in the model `int_core__booking_charge_events`. Based on the current quality of the data, I opted for "Est. Billable Bookings" to account for the fact that this is an estimation. If you don't feel comfortable with it, let's remove the "Est. ". # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #18111	2024-09-04 08:15:37 +00:00
Joaquin Ossa	a3e6ad27c2	Set at 5%	2024-09-04 10:06:37 +02:00
Joaquin Ossa	f8131d9111	edeposit_verifications_fees to reporting, this would be the model to use in the business overview report	2024-09-04 09:27:13 +02:00
Pablo Martin	08abcb5373	add date fields to docs	2024-09-04 09:21:22 +02:00
Pablo Martin	443216bfaf	more fixes in docs	2024-09-04 09:15:33 +02:00
Pablo Martin	cfab0b4b33	add missing field	2024-09-04 09:12:21 +02:00
Pablo Martin	475116a32b	rename field in docs	2024-09-04 09:12:17 +02:00
Joaquin Ossa	470e5c7990	Merged PR 2679: edeposit_agg_fee_per_user to reporting # Description edeposit fees per verification I modified the model so it contains only the fees charged per verification so then we can do the grouping and filters in power bi depending on how it needs to be displayed. # Checklist - [x] The edited models and dependants run properly with production data. - [x] The edited models are sufficiently documented. - [x] The edited models contain PK tests, and I've ran and passed them. - [x] I have checked for DRY opportunities with other models and docs. - [x] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #20125	2024-09-04 07:08:38 +00:00
Pablo Martin	e750e288c5	fix typo in schema	2024-09-03 18:19:17 +02:00
Pablo Martin	9f80f0a916	shift date casts left	2024-09-03 18:03:19 +02:00
Pablo Martin	37856ec606	shift renames left, remove deprecated fields from int	2024-09-03 17:55:24 +02:00
Pablo Martin	c35c5cb033	schema and tests	2024-09-03 17:55:24 +02:00
Pablo Martin	9a6490e7fd	minor change in model	2024-09-03 17:55:24 +02:00
Joaquin Ossa	2602be2567	changed athena name	2024-09-03 17:09:52 +02:00
Joaquin Ossa	7d8983e0dd	Fixed model, added created_date and checkout_date	2024-09-03 17:05:13 +02:00
Oriol Roqué Paniagua	6d59e21310	Merged PR 2725: Force id user field to lower in staging # Description Forces lower case to all id_users in staging. Removes hardcoded lower case in intermediate. Adapts readme to contemplate the lowering of id users. I propose to merge, run in prod and run tests in prod as a proper evaluation method. BTW, I only find one id_user_host that was in capital letters, so that's why probably we didn't care that much about this. Still, I prefer have things clean from the start! ``` select * from staging.stg_core__booking scb left join intermediate.int_core__unified_user icuu on lower(scb.id_user_host) = lower(icuu.id_user) where scb.id_user_host <> icuu.id_user ``` # Checklist - [ ] The edited models and dependants run properly with production data. All models run in stg, did not check all the dependants - [ ] The edited models are sufficiently documented. Have not checked - [ ] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [ ] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #20776	2024-09-03 14:36:21 +00:00
Oriol Roqué Paniagua	1b30fbbca9	Merged PR 2724: Removing coalesce from gbp conversion in int_core__host_booking_fees # Description Removing coalesce from gbp conversion in `int_core__host_booking_fees` # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. Message sent in data team channel - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged.	2024-09-03 13:34:08 +00:00
Joaquin Ossa	c24c875336	Fixed test error	2024-09-03 15:19:04 +02:00
Joaquin Ossa	560e2a5994	Modified model to only have fees	2024-09-03 15:19:04 +02:00
Joaquin Ossa	920733fceb	Updating with Ray's comments	2024-09-03 15:19:04 +02:00
Joaquin Ossa	a99d4f622f	Modified model to only have fees	2024-09-03 15:18:10 +02:00
Joaquin Ossa	906bccce0e	modified model	2024-09-03 15:17:43 +02:00
Oriol Roqué Paniagua	4cfc0dcd45	Merged PR 2642: Booking Charge Events to have a similar logic as invoicing # Description Based on the Notion page [here](https://www.notion.so/knowyourguest-superhog/Data-quality-assessment-Billable-Bookings-97008b7f1cbb4beb98295a22528acd03), this PR mainly adds: - Charge at verification depends on when the Guest joined or the VR was updated (depending on whether the associated verification request does not exist or does, respectively) - Add the logic to retrieve the last plan that is available at the beginning of each month. - Additional where conditions, relatively documented, to imitate what is available in the invoicing process. This includes removal of duplicated bookings, guest verification and guest user existing. Additional changes: - Remove select star :) - Added dbt tests that didn't exist before - Add informative fields on 1) how many price plans were active in a given month, even though we just keep the last one and 2) cases in which bookings are created after the booking is supposed to be charged. Data quality: - I have mixed feelings. This does not correspond 100% to the volumes provided by the exporter, though they are quite close. For April, May and June 2024, this logic has more than 95% accuracy. Still, the fact of using the guest joined, and especially the updated date, I feel like this will make past data "disappear" if the guest has another journey. I don't know for sure since we do not store incremental updates of user information. I'd propose to move forward to have an estimated metric available anyway - with this or a similar logic, even the previous one based on the used link at but fixing the cases in which there's no VR associated. Let's discuss it! # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #18111	2024-09-03 13:15:40 +00:00
Joaquin Ossa	0cb03e0808	Merged PR 2707: E-Deposit users to staging # Description E-Deposit users to staging to have currencies for PBI report # Checklist - [x] The edited models and dependants run properly with production data. - [x] The edited models are sufficiently documented. - [x] The edited models contain PK tests, and I've ran and passed them. - [x] I have checked for DRY opportunities with other models and docs. - [x] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #20125	2024-09-03 12:53:41 +00:00
Joaquin Ossa	322b122925	Added missing tests	2024-09-03 10:21:23 +02:00
Joaquin Ossa	fa6f7f8ff8	Filtering out test users so all tests work correctly	2024-09-03 10:15:20 +02:00
Joaquin Ossa	ccef428020	Merged PR 2694: Basic model changes for edeposit # Description Basic model changes for edeposit # Checklist - [x] The edited models and dependants run properly with production data. - [x] The edited models are sufficiently documented. - [x] The edited models contain PK tests, and I've ran and passed them. - [x] I have checked for DRY opportunities with other models and docs. - [x] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #20125	2024-09-02 15:01:40 +00:00
Joaquin Ossa	89792cf0b7	final comments	2024-09-02 17:01:18 +02:00
Joaquin Ossa	6dedbc04d7	Added more tests but still waiting confirmation on tests users from Ray and Ana	2024-09-02 16:55:56 +02:00
Joaquin Ossa	ffef9e3ff2	Added all date_utc fields	2024-09-02 12:52:16 +02:00
Joaquin Ossa	129b00e29b	Added more description and tests to schema	2024-09-02 12:44:05 +02:00
Joaquin Ossa	ee4d213274	edeposit_users to staging	2024-09-02 11:33:25 +02:00
Joaquin Ossa	b6a752fd74	edeposit_users to staging	2024-09-02 11:33:21 +02:00
Joaquin Ossa	46d5e7c3c5	Updating with Ray's comments	2024-09-02 11:16:51 +02:00
Oriol Roqué Paniagua	d8e6ee3ab0	Merged PR 2704: Adding New Dash exposures # Description Adds New Dash exposures to close the ticket # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [ ] I have checked for DRY opportunities with other models and docs. N/A - [ ] I've picked the right materialization for the affected models. N/A # Other - [ ] Check if a full-refresh is required after this PR is merged. Adding New Dash exposures Related work items: #19570	2024-09-02 07:22:29 +00:00
Joaquin Ossa	7a77691b89	deleted schema from edeposit reporting	2024-08-30 10:58:24 +02:00
Joaquin Ossa	fd98f31fdd	Kept basic models to reduce complexity of models/20125_edeposit_migration_agg_model_reporting	2024-08-30 10:54:00 +02:00
Joaquin Ossa	42510bbb4d	Just committing to save change and create a new branch for basic to push basic changes	2024-08-30 10:33:43 +02:00
Oriol Roqué Paniagua	7ba65999c3	Merged PR 2687: Materialize int_core__verification_payments as a table # Description Just materializes `int_core__verification_payments` as a table instead of as a view to enhance compute. # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. Technically I get errors in local but from what I see it's because I have different dumps for the currency conversion and the other sources. There are no such cases in prod from what I observed. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19082	2024-08-29 14:25:18 +00:00
Joaquin Ossa	6adc424963	addressed Pablo's comments, removed the repetitive casts, added some not_null tests and fixed some of the name and description discrepancies	2024-08-29 14:25:00 +02:00
Joaquin Ossa	ad2eb2544c	edeposit_agg_fee_per_user to reporting	2024-08-29 11:09:09 +02:00
Joaquin Ossa	951bc70123	Merged PR 2671: New aggregated model for E-deposit report # Description New aggregated model for E-deposit report @<Oriol Roqué Paniagua> not sure if this is what you had in mind with categorizing the cases in a variable, if not let me know so maybe we can check it together # Checklist - [x] The edited models and dependants run properly with production data. - [x] The edited models are sufficiently documented. - [x] The edited models contain PK tests, and I've ran and passed them. - [x] I have checked for DRY opportunities with other models and docs. - [x] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #20125	2024-08-29 08:33:45 +00:00
Oriol Roqué Paniagua	2f77c8eea8	Merged PR 2676: Propagates Billing Country information # Description Propagates Billing Country information in unified_user and user_host intermediate models. This is a necessary step towards providing KPIs segmented by Billing Country. # Checklist - [X] The edited models and dependants run properly with production data. - [X] The edited models are sufficiently documented. - [X] The edited models contain PK tests, and I've ran and passed them. - [X] I have checked for DRY opportunities with other models and docs. - [X] I've picked the right materialization for the affected models. # Other - [ ] Check if a full-refresh is required after this PR is merged. Related work items: #19082	2024-08-29 08:25:05 +00:00
Joaquin Ossa	189e77dd76	fixed variable definitions and added comments for currency-less fees	2024-08-29 09:30:36 +02:00
Joaquin Ossa	be59ab258a	Fixed test	2024-08-28 16:42:10 +02:00
Joaquin Ossa	0b6239e5c2	New aggregated model in intermediate for e-deposit report	2024-08-28 16:38:30 +02:00
Joaquin Ossa	b333b45891	Added some comments to make it clear that ids here are unrelated to core dwh, I will come back to modify the schemas when Ray answers all of our questions related to this data	2024-08-28 15:22:17 +02:00


647 commits