instructions for local dwh
parent 7e60c6e6e3
commit ab5902ff1f
2 changed files with 34 additions and 0 deletions

README.md

@@ -4,6 +4,8 @@ Welcome to Superhog's DWH dbt project. Here we model the entire DWH.
## How to set up your environment
### Basics
- Prerequisites
  - You need a Unix-like environment: Linux, macOS, or WSL.
  - You need to have Python `>=3.10` installed.

@@ -23,6 +25,24 @@ Welcome to Superhog's DWH dbt project. Here we model the entire DWH.

- If you are in VSCode, you will probably want to install this extension: [dbt Power User](https://marketplace.visualstudio.com/items?itemName=innoverio.vscode-dbt-power-user)
- It is advised to use [this autoformatter](https://sqlfmt.com/) and to automatically [run it on save](https://docs.sqlfmt.com/integrations/vs-code).
### Local DWH
Having a database where you can run your WIP models is very useful to ease development. Obviously, we cannot do that in production. We could do it in a shared dev instance, but then we would step on each other's toes while developing.

To overcome these issues, we rely on local clones of the DWH. The idea is to have a PostgreSQL instance running on your laptop. You perform your `dbt run` statements for testing and validate the outcome of your work there. When you are confident and have tested properly, you can open a PR to master.

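For reference, pointing dbt at a local PostgreSQL means a `profiles.yml` target along these lines. This is an illustrative sketch only — the profile name, credentials, and schema below are assumptions, not the repo's actual values:

```yaml
# Illustrative only: profile name, credentials and schema are assumptions.
superhog_dwh:
  target: local
  outputs:
    local:
      type: postgres        # dbt-postgres adapter
      host: localhost
      port: 5432
      user: postgres
      password: postgres    # local-only credentials
      dbname: dwh
      schema: dbt_dev       # your personal development schema
      threads: 4
```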
You will find a Docker Compose file named `dev-dwh.docker-compose.yml`. It simply starts a PostgreSQL 16 database on your machine. You can bring it up, adjust it to your needs, and adapt the `profiles.yml` file to point to it when you are developing locally.

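The authoritative contents are in the repo's `dev-dwh.docker-compose.yml`; as a rough, assumed sketch, such a file typically looks like:

```yaml
# Hypothetical sketch -- check the actual dev-dwh.docker-compose.yml in the repo.
services:
  dev-dwh:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: postgres   # local-only credential
      POSTGRES_DB: dwh
    ports:
      - "5432:5432"                 # expose Postgres on localhost
    volumes:
      - dev-dwh-data:/var/lib/postgresql/data
volumes:
  dev-dwh-data:
```

Bring it up with `docker compose -f dev-dwh.docker-compose.yml up -d`.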
The only missing bit to make your local deployment behave like the production DWH is the source data from the source systems. The current policy is to generate a dump from the production database with what you need and restore it into your local Postgres. That way, you are working with accurate and representative data.

For example, if you are working on models that use data from Core, you can dump and restore from your terminal with something roughly like this:
```bash
# Dump only the sync_core schema from the production DWH (tar format)
pg_dump -h superhog-dwh-prd.postgres.database.azure.com -U airbyte_user -W -F t -n sync_core dwh > core.dump

# Restore it into the local dev database
pg_restore -h localhost -U postgres -W -d dwh core.dump
```
## Branching strategy
This repo follows a trunk-based development philosophy (<https://trunkbaseddevelopment.com/>).