diff --git a/README.md b/README.md
index b3b73e5..1291dc8 100644
--- a/README.md
+++ b/README.md
@@ -75,12 +75,29 @@ We organize models in four folders:
 
 ## Conventions
 
-- Always use CTEs in your models to `source` and `ref` other models.
-- We follow [snake case](https://en.wikipedia.org/wiki/Snake_case).
-- Identifier columns should begin with `id_`, not finish with `_id`.
-- Use binary question-like column names for binary, bool, and flag columns (i.e. not `active` but `is_active`, not `verified` but `has_been_verified`, not `imported` but `was_imported`)
-- Datetime columns should either finish in `_utc` or `_local`. If they finish in local, the table should contain a `local_timezone` column that contains the [timezone identifier](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones).
-- We work with many currencies and lack a single main once. Hence, any money fields will be ambiguous on their own. To address this, any table that has money related columns should also have a column named `currency`. We currently have no policy for tables where a single record has columns in different currencies. If you face this, assemble the data team and decide on something.
+- dbt practices:
+  - Always use CTEs in your models to `source` and `ref` other models.
+- Columns and naming:
+  - We follow [snake case](https://en.wikipedia.org/wiki/Snake_case) for column names and table names.
+  - Identifier columns should begin with `id_`, not finish with `_id`.
+  - Use binary question-like column names for binary, bool, and flag columns (e.g. not `active` but `is_active`, not `verified` but `has_been_verified`, not `imported` but `was_imported`).
+  - Datetime columns should finish in either `_utc` or `_local`. If they finish in `_local`, the table should contain a `local_timezone` column that contains the [timezone identifier](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones).
+  - We work with many currencies and lack a single main one. Hence, any money fields will be ambiguous on their own. To address this, any table that has money-related columns should also have a column named `currency`. We currently have no policy for tables where a single record has columns in different currencies. If you face this, assemble the data team and decide on something.
+- Folder structures and naming:
+  - All models live in `models`, in one of `staging`, `intermediate` or `reporting`.
+  - Staging models should be prefixed with `stg_` and intermediate models with `int_`.
+  - Split schema and domain with a double underscore (e.g. `stg_core__booking`).
+  - Always use sources to read into staging models.
+- SQL formatting should be done with `sqlfmt`.
+
+When in doubt, do what dbt guys would do:
+Or GitLab:
+
+## Testing Standards
+
+- All tables in staging need Primary Key and Null tests.
+- Tables in reporting should have more thorough testing. What to look for is up to you, but it should provide strong confidence in the quality of data.
+- Tests will be run after every `dbt run`.
 
 ## How to schedule
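
Review note: the naming conventions added in this diff are mechanical enough to check automatically. The following is a minimal Python sketch of such a check, not part of the change itself. The function names (`check_model_name`, `check_columns`), the column type labels (`"boolean"`, `"timestamp"`), and the exact set of boolean prefixes are assumptions made for illustration. The sketch also assumes the double-underscore rule applies to staging models only, since the README gives only a staging example.

```python
import re

# Prefixes required per folder, as stated in the README conventions.
MODEL_PREFIXES = {"staging": "stg_", "intermediate": "int_"}
# Loose snake_case check: lowercase letters, digits and underscores only.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9_]*$")
# Assumed prefix set, derived from the README examples (is_, has_been_, was_).
BOOL_PREFIXES = ("is_", "has_", "was_")


def check_model_name(folder: str, name: str) -> list[str]:
    """Return the convention violations for a model name in a given folder."""
    problems = []
    if not SNAKE_CASE.match(name):
        problems.append(f"{name}: not snake case")
    prefix = MODEL_PREFIXES.get(folder)
    if prefix and not name.startswith(prefix):
        problems.append(f"{name}: {folder} models should start with '{prefix}'")
    # Assumption: schema and domain are split with '__' in staging models only.
    if folder == "staging" and "__" not in name:
        problems.append(f"{name}: split schema and domain with '__'")
    return problems


def check_columns(columns: dict[str, str]) -> list[str]:
    """Return the violations for a mapping of column name to type label."""
    problems = []
    for name, dtype in columns.items():
        if not SNAKE_CASE.match(name):
            problems.append(f"{name}: not snake case")
        if name.endswith("_id"):
            problems.append(f"{name}: identifiers should begin with 'id_'")
        if dtype == "boolean" and not name.startswith(BOOL_PREFIXES):
            problems.append(f"{name}: use a question-like name such as 'is_{name}'")
        if dtype == "timestamp" and not name.endswith(("_utc", "_local")):
            problems.append(f"{name}: datetimes should end in '_utc' or '_local'")
    # A table with any '_local' datetime must also carry 'local_timezone'.
    has_local = any(
        dtype == "timestamp" and name.endswith("_local")
        for name, dtype in columns.items()
    )
    if has_local and "local_timezone" not in columns:
        problems.append("table has '_local' datetimes but no 'local_timezone' column")
    return problems


if __name__ == "__main__":
    print(check_model_name("staging", "core_booking"))
    print(check_columns({"booking_id": "integer", "active": "boolean"}))
```

The currency rule is left out because a money column cannot be recognized from its name or type alone. A check for it would need an explicit list of money columns per table.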