diff --git a/human-script.md b/human-script.md
index 5c758f8..324a496 100644
--- a/human-script.md
+++ b/human-script.md
@@ -204,7 +204,7 @@ Follow this to deploy the entire data infra.
 - The first VM we must deploy is a jumphost, since that will be our door to all other services inside the virtual network.
     - Create the VM
         - Basic settings
-            - Name it: `jumphost`
+            - Name it: `jumphost-`
             - Use Ubuntu Server 22.04
             - Use Size: `Standard_B1s`
             - Use username: `azureuser`
@@ -243,6 +243,7 @@ Follow this to deploy the entire data infra.
 - Connect through SSH
 - We will now set up a VPN server and client with Wireguard
 - Run the following script (requires `sudo`) to install wireguard and configure it
+    - Pay attention: you need to fill in the public IP manually, as well as the network mask of the virtual network
 - *Note: the IPs chosen for the VPN can absolutely be changed. Just make sure they are consistent across the server and client configurations of the VPN.*
 
 ```bash
@@ -289,11 +290,12 @@ Follow this to deploy the entire data infra.
     # Jumphost VPN
     PrivateKey = ${CLIENT_PRIVATE_KEY}
     Address = 192.168.70.1/32
-    # Uncomment when DNS Server is ready DNS = 192.168.69.1
+    # Uncomment when DNS Server is ready
+    # DNS = 192.168.69.1
 
     [Peer]
     PublicKey = ${SERVER_PUBLIC_KEY}
-    AllowedIPs = 192.168.69.1/32
+    AllowedIPs = 192.168.69.1/32,
     Endpoint = :52420
 
     ##############################
@@ -373,12 +375,44 @@ Follow this to deploy the entire data infra.
 - In your client Wireguard configuration, uncomment the DNS server line we left before
 - Check that the service is running fine by running `dig google.com`. You should see in the output that your laptop has relied on our new DNS to do the name resolution.
 
-### 3.4 Harden the VM
+### 3.4 Harden the Jumphost VM
+
+- In the Jumphost, run the following command to disable password based SSH authentication fully. This way, access can only be granted with SSH key pairs, which is way more secure: `sudo sed -i -e 's/#PasswordAuthentication yes/PasswordAuthentication no/g' /etc/ssh/sshd_config; sudo systemctl restart ssh`.
+- Remove the AllowSSHInboundTemporarily rule that you added to the NSG `superhog-data-nsg-jumphost-`. We don't need that anymore since we can SSH through the VPN tunnel.
 
-- First, remove the AllowSSHInboundTemporarily rule that you added
 
 ## 4. DWH
 
+### 4.1 Deploy PostgreSQL Server
+
+- Next, we will deploy a Postgres server to act as the DWH.
+    - Create a new Azure Database for PostgreSQL flexible servers.
+        - Basics
+            - Name it: `superhog-dwh-`.
+            - On field `PostgreSQL version` pick version 16.
+            - Adapt the sizing to your needs. Only you know how much this server is going to take.
+            - For field `Authentication method` pick `PostgreSQL authentication only`.
+            - Name the user admin: `dwh_admin_`.
+            - Give it a password and make sure to note it down.
+        - Networking
+            - On field `Connectivity method` select `Private access (VNet Integration)`
+            - Pick the virtual network `superhog-data-vnet-` and the subnet `databases-subnet`.
+            - Create a new private dns zone. Unfortunately, we can't use `.data.superhog.com` for this service.
+        - Security
+            - Defaults are fine
+        - Add tags:
+            - `team: data`
+            - `environment: `
+            - `project: dwh`
+
+- Validate the deployment by trying to log into the database with the `dwh_admin_` user from your favourite SQL client (you can use DBeaver, for example). Be aware that your VPN connection should be active so that the DWH is reachable from your device.
+
+### 4.2 Create users and roles
+
+### 4.3 Create schemas
+
+### 4.4 Create permissions
+
 ## 5. Airbyte
 
 ## 6. Power BI