galoy-personal-notes/open-data-arch-topics.md
2025-08-13 17:40:21 +02:00

  • Supported SQL Engine

    • One or more?
    • If one, which? Postgres? Snowflake? BQ?
  • App to DW EL

    • Push all tables by default? Or craft them?
    • Activate incremental loading for all tables?
    • Include some kind of data contract/test to detect breaking changes? In prod? In CI? In both?
    • Current lack of visibility into the state of the sync
  • dbt project

    • conventions
      • code style
      • stricter definitions on things allowed/forbidden in each layer
      • add docs in staging and output?
      • translations only happen in output?
      • usage of exposures
    • testing
      • unit testing for logic models?
      • how thorough should testing be that a report "is valid"?
      • do we need a richer sim-bootstrap?
    • e2e testing for reports along with the app?
  • file transformation service

    • how much more mature/flexible/robust do we need this to be?
    • how thorough should testing be here since it's mostly glue?
    • are we happy with just incrementally growing this Python script? should we switch to a different approach already?
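
On the data contract question under "App to DW EL": a rough sketch of what a breaking-change check could look like, so the prod/CI discussion has something concrete to point at. Everything here is hypothetical (the `Column` type, `find_breaking_changes`, the example column names); it only assumes we can get the expected and actual column lists of a table from somewhere, e.g. a checked-in file and the warehouse's information schema. Extra columns are treated as non-breaking, which is itself a choice to debate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One column of a table as seen by the contract (hypothetical shape)."""
    name: str
    dtype: str


def find_breaking_changes(expected, actual):
    """Compare the contract (expected) with what the source exposes (actual).

    Returns a list of human-readable problems. A column that disappears or
    changes type is reported; a newly added column is ignored.
    """
    actual_by_name = {col.name: col.dtype for col in actual}
    problems = []
    for col in expected:
        if col.name not in actual_by_name:
            problems.append(f"missing column: {col.name}")
        elif actual_by_name[col.name] != col.dtype:
            problems.append(
                f"type changed: {col.name} {col.dtype} -> {actual_by_name[col.name]}"
            )
    return problems


if __name__ == "__main__":
    # Illustrative columns only, not taken from any real table.
    expected = [
        Column("id", "uuid"),
        Column("amount", "numeric"),
        Column("created_at", "timestamptz"),
    ]
    actual = [
        Column("id", "uuid"),
        Column("amount", "text"),
        Column("note", "text"),
    ]
    for problem in find_breaking_changes(expected, actual):
        print(problem)
```

The same function could run in CI against a freshly migrated app database and in prod before each sync, failing the job when the returned list is non-empty.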