The data infra architecture provides the following services:
- A PostgreSQL server that acts as our data warehouse (DWH).
- A self-hosted Airbyte service that acts as a data integration tool (E and L out of ELT).
- A Power BI Data Gateway to allow the Power BI service to read from the DWH.
- A Power BI Service environment where we build reports and apps for our users.
- A simple scheduled dbt run for a dbt project that runs on top of the DWH.
- A VPN server with DNS resolution so that developers and power users can access the different services.

The infra serves Superhog in the following way:
- Data gets ingested from several sources into our DWH.
- We perform data cleaning and modeling inside the DWH with dbt. This results in tables in a reporting schema that support our data needs.
- Data team members and power users build Power BI (PBI) reports and other data products on top of the reporting schema.
- Data team members and other analysts can also query the DWH directly for ad-hoc analysis and for any data needs that go beyond PBI reports.
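
The flow above can be read as a small dependency graph: each step needs the previous one to have produced its data. The following is a minimal sketch of that reading, not code that exists in the infra. The stage names are hypothetical labels chosen for illustration, and the assumption that ad-hoc analysis only needs data to have landed in the DWH is ours.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on.
# Stage names are illustrative labels, not identifiers used in the real infra.
DEPENDENCIES = {
    "ingestion": set(),                # Airbyte loads source data into the DWH (E and L)
    "dbt_modeling": {"ingestion"},     # dbt cleans and models data into the reporting schema (T)
    "pbi_reports": {"dbt_modeling"},   # Power BI reads the reporting schema through the gateway
    "ad_hoc_analysis": {"ingestion"},  # assumption: direct DWH access only needs ingested data
}


def stage_order(dependencies):
    """Return the stages in an order that respects their dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = stage_order(DEPENDENCIES)
print(" -> ".join(order))
```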

The data infra relies on the following main components: