Merge branch 'main' of ssh.dev.azure.com:v3/guardhog/Data/data-infra-script

This commit is contained in:
Pablo Martin 2024-06-14 10:59:43 +02:00
commit 409c23691b

View file

@ -410,60 +410,110 @@ Follow this to deploy the entire data infra.
- Validate the deployment by trying to log into the database with the `dwh_admin_<your-env>` user from your favourite SQL client (you can use DBeaver, for example). Be aware that your VPN connection should be active so that the DWH is reachable from your device. - Validate the deployment by trying to log into the database with the `dwh_admin_<your-env>` user from your favourite SQL client (you can use DBeaver, for example). Be aware that your VPN connection should be active so that the DWH is reachable from your device.
### 4.2 Create database and schemas ### 4.2 Create database
- Run the following commands to create a new database and the needed schemas - Run the following commands to create a new database
```sql ```sql
CREATE DATABASE dwh; CREATE DATABASE dwh;
-- Change active DB to dwh
CREATE SCHEMA staging;
CREATE SCHEMA intermediate;
CREATE SCHEMA reporting;
``` ```
### 4.3 Create users and roles - From now on, use this database for everything
### 4.3 Create schemas, roles and users
- Run the following script to create: - Run the following script to create:
- A `modeler` role, owner of the `staging`, `intermediate` and `reporting` schemas. - A `dwh_builder` role, which:
- A `consumer` role, capable of reading the `reporting` schema. - Owns the `staging`, `intermediate` and `reporting` schemas.
- A dbt user, with `modeler` role. - Can read `sync_` schemas.
- An airbyte user, with permission to create new schemas. - Thus, can do as he pleases in them.
- A Power BI user, with `consumer` role. - Designed to support `dbt run`.
- A `modeler` role, which:
- Can read the `staging`, `intermediate` and `reporting` schemas.
- Can read `sync_` schemas.
- Designed for data team members and power users to be able to read everything within the DWH.
- A `consumer` role, which:
- Can read the `reporting` schema.
- A user `dbt_user`, with `dwh_builder` role.
- A user `powerbi_user`, with `consumer` role.
- A user `airbyte user`, with permission to create new schemas.
- *Note: replace the password fields with serious passwords and note them down.* - *Note: replace the password fields with serious passwords and note them down.*
- *Note: replace the name of the admin user* - *Note: replace the name of the admin user*
```sql ```sql
-- Start here, logged in as the dwh admin
CREATE ROLE airbyte_user LOGIN PASSWORD 'password' VALID UNTIL 'infinity'; CREATE ROLE airbyte_user LOGIN PASSWORD 'password' VALID UNTIL 'infinity';
GRANT CREATE ON DATABASE dwh TO airbyte_user; GRANT CREATE ON DATABASE dwh TO airbyte_user;
CREATE ROLE modeler INHERIT; -- Login as the airbyte_user
GRANT USAGE ON SCHEMA staging TO modeler; CREATE SCHEMA sync_default;
GRANT USAGE ON SCHEMA intermediate TO modeler;
GRANT USAGE ON SCHEMA reporting TO modeler;
GRANT ALL ON ALL TABLES IN SCHEMA staging TO modeler;
GRANT ALL ON ALL TABLES IN SCHEMA intermediate TO modeler;
GRANT ALL ON ALL TABLES IN SCHEMA reporting TO modeler;
GRANT modeler TO dwh_admin_<your-env>; -- Login as the dwh admin again
ALTER SCHEMA staging OWNER TO modeler; CREATE SCHEMA staging;
ALTER SCHEMA intermediate OWNER TO modeler; CREATE SCHEMA intermediate;
ALTER SCHEMA reporting OWNER TO modeler; CREATE SCHEMA reporting;
CREATE ROLE dwh_builder INHERIT;
ALTER SCHEMA staging OWNER TO dwh_builder;
ALTER SCHEMA intermediate OWNER TO dwh_builder;
ALTER SCHEMA reporting OWNER TO dwh_builder;
CREATE ROLE dbt_user LOGIN PASSWORD 'password' VALID UNTIL 'infinity'; CREATE ROLE dbt_user LOGIN PASSWORD 'password' VALID UNTIL 'infinity';
GRANT modeler to dbt_user; GRANT dwh_builder to dbt_user;
CREATE ROLE consumer INHERIT; CREATE ROLE consumer INHERIT;
CREATE ROLE powerbi_user LOGIN PASSWORD 'password' VALID UNTIL 'infinity';
GRANT consumer to powerbi_user;
CREATE ROLE modeler INHERIT;
-- You might want to create a first personal user with modeler role here
-- Login as airbyte_user
GRANT USAGE ON SCHEMA sync_default TO dwh_builder;
GRANT SELECT ON ALL TABLES IN SCHEMA sync_default TO dwh_builder;
ALTER DEFAULT PRIVILEGES IN SCHEMA sync_default GRANT SELECT ON TABLES TO dwh_builder;
GRANT USAGE ON SCHEMA sync_default TO modeler;
GRANT SELECT ON ALL TABLES IN SCHEMA sync_default TO modeler;
ALTER DEFAULT PRIVILEGES IN SCHEMA sync_default GRANT SELECT ON TABLES TO modeler;
-- Login as dbt_user
GRANT USAGE ON SCHEMA staging TO modeler;
GRANT SELECT ON ALL TABLES IN SCHEMA staging TO modeler;
ALTER DEFAULT PRIVILEGES IN SCHEMA staging GRANT SELECT ON TABLES TO modeler;
GRANT USAGE ON SCHEMA intermediate TO modeler;
GRANT SELECT ON ALL TABLES IN SCHEMA intermediate TO modeler;
ALTER DEFAULT PRIVILEGES IN SCHEMA intermediate GRANT SELECT ON TABLES TO modeler;
GRANT USAGE ON SCHEMA reporting TO modeler;
GRANT SELECT ON ALL TABLES IN SCHEMA reporting TO modeler;
ALTER DEFAULT PRIVILEGES IN SCHEMA reporting GRANT SELECT ON TABLES TO modeler;
GRANT USAGE ON SCHEMA reporting TO consumer; GRANT USAGE ON SCHEMA reporting TO consumer;
GRANT SELECT ON ALL TABLES IN SCHEMA reporting TO consumer; GRANT SELECT ON ALL TABLES IN SCHEMA reporting TO consumer;
ALTER DEFAULT PRIVILEGES IN SCHEMA reporting GRANT SELECT ON TABLES TO consumer; ALTER DEFAULT PRIVILEGES IN SCHEMA reporting GRANT SELECT ON TABLES TO consumer;
CREATE ROLE powerbi_user LOGIN PASSWORD 'password' VALID UNTIL 'infinity';
GRANT consumer to powerbi_user;
``` ```
- If you want, you might also want to create more users depending on your needs. Typically, date team members should also have the `modeler` role. - On adding new users:
- Typically, you will want to create personal accounts for data team members with `modeler` role so that they can query everywhere in the dwh.
- Any other services or users that need to access the reporting layer can be given the `consumer` role.
- Furthermore, `sync_` schema permissions need to be dynamically managed from this point on. This means that:
- Generally, all `sync_` schemas should be created by the `airbyte_user`.
- Whenever a new `sync_` schema comes to life, both the `modeler` and `dwh_builder` roles should receive access. You can use the following command template:
```sql
-- Login as airbyte_user
GRANT USAGE ON SCHEMA sync_<some-new-source> TO dwh_builder;
GRANT SELECT ON ALL TABLES IN SCHEMA sync_<some-new-source> TO dwh_builder;
ALTER DEFAULT PRIVILEGES IN SCHEMA sync_<some-new-source> GRANT SELECT ON TABLES TO dwh_builder;
GRANT USAGE ON SCHEMA sync_<some-new-source> TO modeler;
GRANT SELECT ON ALL TABLES IN SCHEMA sync_<some-new-source> TO modeler;
ALTER DEFAULT PRIVILEGES IN SCHEMA sync_<some-new-source> GRANT SELECT ON TABLES TO modeler;
```
## 5. Airbyte ## 5. Airbyte