diff --git a/architecture-overview.md b/platform-overview.md
similarity index 81%
rename from architecture-overview.md
rename to platform-overview.md
index 044ec78..cb19da5 100644
--- a/architecture-overview.md
+++ b/platform-overview.md
@@ -1,8 +1,8 @@
-# Architecture Overview
+# Platform Overview
 
 Our infrastructure is designed to run on Azure.
 
-The data infra architecture provides the following services:
+The data platform provides the following services:
 
 - A PostgreSQL server which acts as a DWH.
 - A self-hosted Airbyte service that acts as a data integration tool (E and L out of ELT).
@@ -11,19 +11,18 @@ The data infra architecture provides the following services:
 - A simple scheduled dbt run for a dbt project that runs on top of the DWH.
 - A VPN Server + DNS Resolution to allow developers and power users to access the different services.
 
-The infra serves Superhog in the following way:
+The platform serves Superhog in the following way:
 
-- Data gets ingested from several sources into our DWH.
+- Data gets ingested from several sources into our DWH. Typically with Airbyte, but other options might be needed for specific cases.
 - We perform data cleaning and modeling inside the DWH with dbt. This results in tables in a reporting schema that support our data needs.
 - Data team members and power users build PBI reports and other data products on top of the reporting schema.
 - Data team members and other analysts can also rely on direct access to the DWH to perform ad-hoc analysis and basically cover any data needs that go beyond PBI reports.
 
-The data infra relies on the following main components:
+The data infra relies on the following main Azure components:
 
 - A subscription to hold everything.
 - A resource group to hold all resources.
-- A private network.
-- Three subnets.
+- A private network and three subnets.
 - A private DNS zone.
 - A managed PostgreSQL server.
 - A handful of VMs to host services.