# Platform Overview

Our infrastructure is designed to run on Azure.

The data platform provides the following services:

- A PostgreSQL server that acts as the DWH.
- A self-hosted Airbyte service that acts as the data integration tool (the E and L of ELT).
- A Power BI Data Gateway that allows the Power BI service to read from the DWH.
- A Power BI Service environment where we build reports and apps for our users.
- A simple scheduled dbt run for a dbt project that runs on top of the DWH.
- A VPN server + DNS resolution that allow developers and power users to access the different services.

The platform serves Superhog in the following way:

- Data gets ingested from several sources into our DWH. This is typically done with Airbyte, but other options might be needed for specific cases.
- We perform data cleaning and modeling inside the DWH with dbt. This results in tables in a reporting schema that support our data needs.
- Data team members and power users build PBI reports and other data products on top of the reporting schema.
- Data team members and other analysts can also rely on direct access to the DWH to perform ad-hoc analysis and cover any data needs that go beyond PBI reports.

The data infra relies on the following main Azure components:

- A subscription to hold everything.
- A resource group to hold all resources.
- A private network and three subnets.
- A private DNS zone.
- A managed PostgreSQL server.
- A handful of VMs to host services.
- Repositories in Azure DevOps.

More detailed components also get created for some of those (network security groups, disks, network interfaces, etc.).

The following elements are external to the data infrastructure but important:

- Superhog's application SQL Server database + the networking settings that make it reachable from Airbyte.
- Superhog's service status page.
- VPN configurations on our laptops to access the data private network.
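
As an illustration of the direct DWH access mentioned above, here is a minimal sketch of how an analyst connected to the VPN might assemble a PostgreSQL connection URI. The host name `dwh.example.internal`, the database name `dwh`, and the `sslmode=require` setting are placeholders assumed for the example, not the platform's actual values. The helper `build_dwh_uri` is hypothetical as well.

```python
from urllib.parse import quote


def build_dwh_uri(
    user: str,
    password: str,
    host: str = "dwh.example.internal",  # assumed private DNS name, not the real one
    port: int = 5432,                    # default PostgreSQL port
    database: str = "dwh",               # assumed database name
    sslmode: str = "require",            # assumed TLS setting
) -> str:
    """Build a libpq-style connection URI for the DWH.

    User and password are percent-encoded so that characters such as
    '@' or '/' do not break the URI structure.
    """
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return (
        f"postgresql://{safe_user}:{safe_password}"
        f"@{host}:{port}/{database}?sslmode={sslmode}"
    )


if __name__ == "__main__":
    # Example with made-up credentials; real ones should come from a secret store.
    print(build_dwh_uri("analyst", "p@ss/word"))
```

The resulting URI can be passed to any PostgreSQL client that accepts libpq connection strings. The host only resolves while the VPN is connected, since name resolution goes through the private DNS zone.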