add a few more conventions to readme

2024-05-31 10:35:57 +02:00 · 2024-05-31 10:35:57 +02:00 · d81c0b5e74
commit d81c0b5e74
parent ae10d30cd1
1 changed files with 23 additions and 6 deletions
--- a/README.md
+++ b/README.md
@@ -75,12 +75,29 @@ We organize models in four folders:

 ## Conventions

+- dbt practices:
  - Always use CTEs in your models to `source` and `ref` other models.
- We follow [snake case](https://en.wikipedia.org/wiki/Snake_case).
+- Columns and naming
+  - We follow [snake case](https://en.wikipedia.org/wiki/Snake_case) for column names and table names.
  - Identifier columns should begin with `id_`, not finish with `_id`.
  - Use question-like column names for binary, boolean, and flag columns (e.g. not `active` but `is_active`, not `verified` but `has_been_verified`, not `imported` but `was_imported`).
  - Datetime columns should end in either `_utc` or `_local`. If they end in `_local`, the table should also contain a `local_timezone` column that holds the [timezone identifier](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones).
  - We work with many currencies and lack a single main one. Hence, any money fields will be ambiguous on their own. To address this, any table that has money-related columns should also have a column named `currency`. We currently have no policy for tables where a single record has columns in different currencies. If you face this, assemble the data team and decide on something.
+- Folder structures and naming
+  - All models live in `models`, in one of the `staging`, `intermediate`, or `reporting` folders.
+  - Staging models should be prefixed with `stg_` and intermediate models with `int_`.
+  - Split schema and domain with a double underscore (e.g. `stg_core__booking`).
+  - Always use sources to read into staging models.
+- SQL formatting should be done with `sqlfmt`.
+
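The conventions above can be sketched as a single staging model. This is a minimal illustration, not a model from this repository: the file path, the `core` source, the `booking` table, and every column name are assumptions, and the `core` source would have to be declared in a sources YAML file for `source()` to resolve.

```sql
-- models/staging/stg_core__booking.sql
-- Hypothetical model: source, table, and column names are illustrative only.

with raw_booking as (

    -- Staging models read from sources, wrapped in a CTE
    select * from {{ source('core', 'booking') }}

),

renamed as (

    select
        id as id_booking,              -- identifiers start with id_
        user_id as id_user,
        active as is_active,           -- flags read like yes/no questions
        created_at as created_at_utc,  -- datetimes end in _utc or _local
        total_price,
        currency                       -- required next to any money column
    from raw_booking

)

select * from renamed
```

The file name follows the `stg_` prefix plus the double underscore between schema (`core`) and domain (`booking`).
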
+When in doubt, do what dbt guys would do: <https://docs.getdbt.com/best-practices>
+Or Gitlab: <https://handbook.gitlab.com/handbook/business-technology/data-team/platform/dbt-guide/>
+
+## Testing Standards
+
+- All tables in staging need primary key and not-null tests.
+- Tables in reporting should have more thorough testing. What to look for is up to you, but it should provide strong confidence in the quality of data.
+- Tests will be run after every `dbt run`.
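
As a rough sketch of what a primary key check on a staging table could look like, the query below is written as a dbt singular test: a SQL file under `tests/` that fails when it returns at least one row. The file name, the `stg_core__booking` model, and the `id_booking` column are assumptions for illustration. In practice the built-in `unique` and `not_null` generic tests, declared in the model's YAML properties, are the more common way to express the same check.

```sql
-- tests/assert_stg_core__booking_id_booking_is_primary_key.sql
-- Hypothetical singular test: model and column names are illustrative only.
-- Returns one row per id_booking value that is duplicated or null,
-- so the test passes only when the query returns no rows.

select
    id_booking,
    count(*) as n_rows
from {{ ref('stg_core__booking') }}
group by id_booking
having count(*) > 1 or id_booking is null
```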

 ## How to schedule