# Architecture Overview

Our infrastructure is designed to run on Azure.

The data infrastructure provides the following services:

- A PostgreSQL server that acts as our data warehouse (DWH).
- A self-hosted Airbyte service that acts as our data integration tool (the E and L of ELT).
- A Power BI Data Gateway to allow the Power BI service to read from the DWH.
- A Power BI Service environment where we build reports and apps for our users.
- A simple scheduled dbt run for a dbt project that runs on top of the DWH.
- A VPN server and DNS resolution so that developers and power users can access the different services.

The infrastructure serves Superhog as follows:

- Data is ingested from several sources into the DWH.
- We perform data cleaning and modeling inside the DWH with dbt. This results in tables in a reporting schema that support our data needs.
- Data team members and power users build Power BI reports and other data products on top of the reporting schema.
- Data team members and other analysts can also access the DWH directly to perform ad-hoc analysis and cover any data needs that go beyond Power BI reports.

The data infrastructure relies on the following main components:

- A subscription to hold everything.
- A resource group to hold all resources.
- A private network.
- Three subnets.
- A private DNS zone.
- A managed PostgreSQL server.
- Three VMs.
- Repositories in Azure DevOps.

More detailed components also get created for some of those (network security groups, disks, network interfaces, etc.).
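As a rough illustration of how a private network can be split into three subnets, here is a minimal Python sketch using the standard `ipaddress` module. The address range `10.0.0.0/16`, the `/24` subnet size, and the subnet names are hypothetical and are not taken from our actual configuration.

```python
import ipaddress

# Hypothetical address space for the private network (an assumption).
PRIVATE_NETWORK = ipaddress.ip_network("10.0.0.0/16")

# Hypothetical subnet names; the real names and purposes may differ.
SUBNET_NAMES = ["subnet-a", "subnet-b", "subnet-c"]


def plan_subnets(network, names, new_prefix=24):
    """Assign the first len(names) subnets of the given size to the names."""
    candidates = network.subnets(new_prefix=new_prefix)
    return {name: next(candidates) for name in names}


subnet_plan = plan_subnets(PRIVATE_NETWORK, SUBNET_NAMES)
for name, subnet in subnet_plan.items():
    print(f"{name}: {subnet} ({subnet.num_addresses} addresses)")
```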

The following elements are external to the data infrastructure but important to it:

- Superhog's application SQL Server database, plus the networking settings that make it reachable from Airbyte.
- Superhog's service status.
- VPN configurations on our laptops to access the private data network.
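Because the services sit inside the private network, one quick way to confirm that a laptop's VPN setup works is to test whether the service ports accept TCP connections. The sketch below is a minimal example using only the Python standard library. The hostnames are hypothetical placeholders, and the ports are just the common defaults for PostgreSQL (5432) and the Airbyte web UI (8000), which may not match our deployment.

```python
import socket

# Hypothetical hostnames; ports are common defaults, not confirmed values.
SERVICES = {
    "dwh": ("dwh.example.internal", 5432),
    "airbyte": ("airbyte.example.internal", 8000),
}


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_services(services):
    """Map each service name to whether its port accepted a connection."""
    return {
        name: is_reachable(host, port)
        for name, (host, port) in services.items()
    }
```

Calling `check_services(SERVICES)` while connected to the VPN would return a dictionary of service names mapped to `True` or `False`. A `False` only says the port did not accept a connection; it does not distinguish a VPN problem from a DNS problem or a stopped service.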