# Pains
## Local app/infra
- Starting up the local app and running the E2E tests plus EL takes around 20 minutes.
- Having the Airflow UI to run things is nice, but having Airflow schedules trigger automatically is painful locally because it makes controlling the state of the environment hard.
- Simply checking the E2E flow takes a lot of steps and tools; it feels like there's always something breaking.
## Meltano
- Meltano logs are terrible to read, which makes debugging painful.
- Meltano EL takes a long time.
- Meltano configuration is painful:
  - Docs are terrible.
  - Wrong configurations often don't raise errors; they just get silently ignored.
- Meltano's handling of Python environments is sometimes more of an annoyance than a help.
## Data Pipeline needs data
- An anemic dataset makes development hard (many entities with few or no records).
## Improvable practices
- Loading all backend tables into the DW, regardless of whether they are used
  - More data, worse performance, no gain
  - More cognitive load when working on the DW ("What is this table? Where is it used? Can I modify it?")
  - "But then I have all backend tables handy" -> Well, let's make adding a backend table trivial (see the sketch after this list)
- dbt
  - Not documenting models
    - The next person has no clue what they are, which makes shared ownership hard
  - Not using exposures (the YAML entries that declare which downstream reports depend on which models)
    - Hard to know which models impact which reports
    - Hard to know which parts of the DW are truly used
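On the "make adding a backend table trivial" point: a minimal sketch, purely illustrative (table names are made up), of a declarative allowlist that a load script could read, so adding a backend table to the DW is a one-line change.

```python
# Hypothetical sketch: a declarative allowlist of backend tables to replicate.
# Adding a backend table to the DW becomes a one-line change here; whatever
# loads the data just iterates this list.
TABLES_TO_LOAD = [
    "accounts",
    "transactions",
    "wallets",
    # add new backend tables here, one line each
]
```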
## What are our bottlenecks/issues?
- Dealing with a convoluted output definition (understanding the laws + an unclear validation procedure with Vicky)
- Translating the needed reports into SQL transformations of the backend data
- Breaking changes in backend events/entities breaking downstream data dependencies
What is NOT a bottleneck:
- Data latency
- Scalability of data volume
- Having a pretty interface
## My proposal
- Fall back to a dramatically simpler stack that allows team members working on reports to move fast:
  - Hardcoded Bitfinex CSV
  - Hardcoded Sumsub CSV
  - Use another Postgres instance as the DW; move data with a simple, stateless Python script (a minimal sketch follows this list)
  - Ignore orchestration, UI delivery, and monitoring for now
- Work together with backend to find a convenient solution to have good testing data
- Make an exhaustive list of reports and align with Luis/Vicky on a plan to systematically meet and validate. Track it.
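A minimal sketch of what the stateless load script could look like, assuming pandas and SQLAlchemy are available and the two Postgres URLs come from environment variables; the `BACKEND_PG_URL`/`DW_PG_URL` names, the table allowlist, and the `raw` schema are illustrative assumptions, not our actual setup.

```python
# Minimal, stateless EL sketch: full-refresh copy of selected backend tables
# from the backend Postgres into the DW Postgres. Connection URLs, schema and
# the table list are hypothetical placeholders.
import os

import pandas as pd
from sqlalchemy import create_engine

BACKEND_URL = os.environ["BACKEND_PG_URL"]  # e.g. postgresql://user:pass@host/backend
DW_URL = os.environ["DW_PG_URL"]            # e.g. postgresql://user:pass@host/dw

TABLES_TO_LOAD = ["accounts", "transactions", "wallets"]  # illustrative allowlist


def run() -> None:
    backend = create_engine(BACKEND_URL)
    dw = create_engine(DW_URL)
    for table in TABLES_TO_LOAD:
        # Full refresh keeps the script stateless: no bookmarks, no state files.
        df = pd.read_sql_table(table, backend)
        df.to_sql(table, dw, schema="raw", if_exists="replace", index=False)
        print(f"loaded {len(df)} rows into raw.{table}")


if __name__ == "__main__":
    run()
```

The hardcoded Bitfinex/Sumsub CSVs could be loaded the same way by swapping `pd.read_sql_table` for `pd.read_csv` on those files.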
Then, once...
- We have ack from Luis/Vicky that all reports are domain-valid
- We have more clarity on integrations we must run with regulators/government systems after audit
- Backend data model is more stable
... we grab our domain-rich, deployment-poor setup and discuss the optimal tooling and strategies to deliver it with production-grade practices.
## North Star Ideas
- Use an asset-based orchestrator, like Dagster (see the toy sketch below)
- Step away from Meltano; use an EL framework such as dlt and combine it with the orchestrator
- Add a visualization tool to the stack, such as Evidence, Metabase, or Lightdash
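To make "asset-based orchestrator" concrete, a toy Dagster sketch (asset names are invented and the bodies are elided; this is not our actual pipeline) of how the raw load and a downstream report could be modeled as assets with explicit dependencies:

```python
# Toy Dagster sketch: model the raw load and a downstream report as assets.
from dagster import Definitions, asset


@asset
def raw_backend_tables():
    """Load the raw backend tables into the DW (e.g. run the stateless script)."""
    ...


@asset
def regulator_report(raw_backend_tables):
    """Build a report that depends on the raw tables."""
    ...


defs = Definitions(assets=[raw_backend_tables, regulator_report])
```

An asset graph like this gives lineage and per-asset scheduling/backfills out of the box, which is what "asset-based" buys over task-based DAGs.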