# Human Script
Follow this to deploy the entire data infra.
## 0. Pre-requisites and conventions
- You need an Azure subscription and a user with administrator rights in it.
- Whenever you see `<your-env>`, you should replace that with `dev`, `uat`, `prd` or whatever fits your environment.
- We traditionally deploy resources in the `UK South` region. Unless stated otherwise, you should deploy resources there.
- You have an SSH key pair ready to use for access to the different machines. You can always add more pairs later.
## 1. Resource group and SSH Keypair
### 1.1 Create Resource Group
- Create a resource group. This resource group will hold all the resources. For the rest of this guide, assume this is the resource group where you must create resources.
- Name it: `superhog-data-rg-<your-env>`
- Add tags:
- `team: data`
- `environment: <your-env>`
### 1.2 SSH Keypair
- We will create an SSH Keypair for this deployment. It will be used to access VMs, Git repos and other services.
- Create the SSH Key pair
- Name the key: `superhog-data-<your-env>-general-ssh`
- Add tags:
- `team: data`
- `environment: <your-env>`
- Pay attention when storing the private key. You probably want to store it in a safe password manager, like Keeper.
- Optionally, you can also be extra paranoid, generate the SSH key locally and only upload the public key to Azure. Up to you.
## 2. Networking
### 2.1 VNET
- Create a virtual network. This virtual network is where all our infra will live. For the rest of this guide, assume this is the network where you must connect services.
- Name it: `superhog-data-vnet-<your-env>`
- You need to decide what the network range should be. For reference, we should be fine with a `/24` space (256 addresses) since we will only have a handful of network interfaces connecting.
- As an example, we will use `10.69.0.0/24`. This link might be helpful: <https://www.davidc.net/sites/default/subnets/subnets.html?network=10.69.0.0&mask=24&division=11.f10>
- You need to add three subnets:
- Do not add network security groups to any of the subnets yet. We will create those later.
- Jumphost subnet
- This subnet is where jumphost boxes will live.
- It will be the only subnet where we allow inbound connections from WAN.
- Name it `jumphost-subnet`.
- For our example, we will make it `10.69.0.0/29` (8 addresses).
- Database subnet
- This subnet is where the DWH database will live.
- Inbound traffic will be allowed from both the jumphost subnet and the services subnet.
- Name it `database-subnet`
- For our example, we will make it `10.69.0.8/29` (8 addresses).
- Services subnet
- This subnet is where most VMs dedicated to data services live (Airbyte, dbt, PBI Data Gateway, etc).
- Inbound traffic will only be allowed from the jumphost subnet.
- Name it `services-subnet`
- For our example, we will make it `10.69.0.64/26` (64 addresses)
- Add tags:
- `team: data`
- `environment: <your-env>`
- `project: network`
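The subnet sizes in the example can be double-checked with a bit of shell arithmetic. The sketch below is not part of the deployment, and the helper names (`subnet_size`, `ip_to_int`, `cidr_range`, `cidr_within`) are made up for illustration. It counts the addresses in each example subnet and confirms that each one fits inside `10.69.0.0/24`. Keep in mind that Azure reserves 5 addresses in every subnet, so a `/29` leaves only 3 usable ones.

```bash
# Number of addresses in a CIDR block with the given prefix length.
subnet_size() {
  echo $(( 1 << (32 - $1) ))
}

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs="$IFS"; IFS=.
  set -- $1
  IFS="$old_ifs"
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print the first and last address (as integers) of a CIDR such as 10.69.0.8/29.
cidr_range() {
  base=$(ip_to_int "${1%/*}")
  size=$(subnet_size "${1#*/}")
  echo "$base $(( base + size - 1 ))"
}

# Succeed when the first CIDR lies entirely inside the second one.
cidr_within() {
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -ge "$3" ] && [ "$2" -le "$4" ]
}

VNET=10.69.0.0/24
for subnet in 10.69.0.0/29 10.69.0.8/29 10.69.0.64/26; do
  size=$(subnet_size "${subnet#*/}")
  if cidr_within "$subnet" "$VNET"; then
    echo "$subnet: $size addresses ($(( size - 5 )) usable in Azure), inside $VNET"
  else
    echo "$subnet: does not fit inside $VNET"
  fi
done
```

If you pick a different range for your vnet, change `VNET` and the subnet list and rerun the check before creating anything in Azure.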
### 2.2 Network security groups
- You will create three network security groups (NSG)
- Jumphost NSG
- Name it: `superhog-data-nsg-jumphost-<your-env>`
- Purpose: only allow connecting to the VPN server. We deny absolutely any other inbound traffic.
- Add tags:
- `team: data`
- `environment: <your-env>`
- `project: network`
- Add the following inbound rules
- VPN Rule
- Name: AllowWireguardInbound
- Source: Any
- Source port ranges: *
- Destination: the address range for the `jumphost-subnet`. In this example, `10.69.0.0/29`.
- Destination port ranges: 52420
- Protocol: UDP
- Action: Allow
- Priority: 100
- Deny Rule
- Name: DenyAllInbound
- Source: Any
- Source port ranges: *
- Destination: Any
- Destination port ranges: *
- Protocol: Any
- Action: Deny
- Priority: 1000
- Services NSG
- Name it: `superhog-data-nsg-services-<your-env>`
- Purpose: only allow the service VMs to be reached from our jumphost subnet. We deny absolutely any other inbound traffic.
- Add tags:
- `team: data`
- `environment: <your-env>`
- `project: network`
- Add the following inbound rules
- SSH Rule
- Name: AllowSSHFromJumphostInbound
- Source: the address range for the `jumphost-subnet`. In this example, `10.69.0.0/29`.
- Source port ranges: *
- Destination: the address range for the `services-subnet`. In this example, `10.69.0.64/26`.
- Destination port ranges: 22
- Protocol: TCP
- Action: Allow
- Priority: 100
- RDP Rule
- Name: AllowRDPFromJumphostInbound
- Source: the address range for the `jumphost-subnet`. In this example, `10.69.0.0/29`.
- Source port ranges: *
- Destination: the address range for the `services-subnet`. In this example, `10.69.0.64/26`.
- Destination port ranges: 3389
- Protocol: TCP
- Action: Allow
- Priority: 110
- Airbyte web rule
- Name: AllowAirbyteWebFromJumphostInbound
- Source: the address range for the `jumphost-subnet`. In this example, `10.69.0.0/29`.
- Source port ranges: *
- Destination: the address range for the `services-subnet`. In this example, `10.69.0.64/26`.
- Destination port ranges: 80
- Protocol: TCP
- Action: Allow
- Priority: 120
- Deny Rule
- Name: DenyAllInbound
- Source: Any
- Source port ranges: *
- Destination: Any
- Destination port ranges: *
- Protocol: Any
- Action: Deny
- Priority: 1000
- Database NSG
- Name it: `superhog-data-nsg-database-<your-env>`
- Purpose: make the database subnet reachable only from our services subnet and from our jumphost subnet.
- Add tags:
- `team: data`
- `environment: <your-env>`
- `project: network`
- Add the following inbound rules
- Postgres Jumphost Rule
- Name: AllowPostgresFromJumphostInbound
- Source: the address range for the `jumphost-subnet`. In this example, `10.69.0.0/29`.
- Source port ranges: *
- Destination: the address range for the `database-subnet`. In this example, `10.69.0.8/29`.
- Destination port ranges: 5432
- Protocol: TCP
- Action: Allow
- Priority: 100
- Postgres Services Rule
- Name: AllowPostgresFromServicesInbound
- Source: the address range for the `services-subnet`. In this example, `10.69.0.64/26`.
- Source port ranges: *
- Destination: the address range for the `database-subnet`. In this example, `10.69.0.8/29`.
- Destination port ranges: 5432
- Protocol: TCP
- Action: Allow
- Priority: 110
- Deny Rule
- Name: DenyAllInbound
- Source: Any
- Source port ranges: *
- Destination: Any
- Destination port ranges: *
- Protocol: Any
- Action: Deny
- Priority: 1000
- Finally, you need to attach each NSG to the related subnet
- Visit the virtual network page and look for the subnets list
- For each subnet, select its NSG and attach it
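If you prefer scripting the rules over clicking through the portal, the sketch below prints the Azure CLI commands for the two jumphost rules. It assumes the Azure CLI (`az`) is installed and logged in, that the NSG already exists, and that `dev` is your environment. `nsg_rule_cmd` is a hypothetical helper, not something that ships with this repo. It only prints the commands so you can review them before running anything.

```bash
# Print one `az network nsg rule create` command for an inbound rule.
# Arguments: resource group, NSG name, rule name, priority, access,
#            protocol, source prefix, destination prefix, destination port.
nsg_rule_cmd() {
  printf "az network nsg rule create --resource-group '%s' --nsg-name '%s'" "$1" "$2"
  printf " --name '%s' --direction Inbound --priority '%s' --access '%s'" "$3" "$4" "$5"
  printf " --protocol '%s' --source-address-prefixes '%s' --source-port-ranges '*'" "$6" "$7"
  printf " --destination-address-prefixes '%s' --destination-port-ranges '%s'\n" "$8" "$9"
}

ENV_NAME=dev   # assumption: stands in for <your-env>
RG="superhog-data-rg-${ENV_NAME}"
NSG="superhog-data-nsg-jumphost-${ENV_NAME}"

# The VPN rule and the catch-all deny rule of the jumphost NSG.
nsg_rule_cmd "$RG" "$NSG" AllowWireguardInbound 100 Allow Udp '*' 10.69.0.0/29 52420
nsg_rule_cmd "$RG" "$NSG" DenyAllInbound 1000 Deny '*' '*' '*' '*'
```

Once the printed commands look right, pipe the output to `sh` to run them. The same helper can be reused for the services and database NSGs by changing the arguments.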
### 2.3 Private DNS Zone
- We will set up a private DNS Zone to avoid using hardcoded IPs to refer to services within the virtual network. This makes integrations more resilient because a service can change its IP and still be reached by other services (as long as other network configs like firewalls are still fine).
- Create the Private DNS Zone
- Name it: `<your-env>.data.superhog.com`
- Add tags:
- `team: data`
- `environment: <your-env>`
- `project: network`
- Add a new virtual network link to the zone
- Name it: `privatelink-<your-env>.data.superhog.com`
- Associate it to the virtual network.
- Enable autoregistration
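For reference, the zone and its virtual network link can also be created from the command line. This is a sketch under the same assumptions as the rest of this guide (resource group and vnet named as above, Azure CLI installed and logged in). `private_dns_cmds` is a hypothetical helper that only prints the two commands for a given environment.

```bash
# Print the Azure CLI commands that create the private DNS zone and link it
# to the virtual network with autoregistration enabled.
# Argument: the environment name that stands in for <your-env>.
private_dns_cmds() {
  env_name="$1"
  zone="${env_name}.data.superhog.com"
  printf "az network private-dns zone create --resource-group 'superhog-data-rg-%s' --name '%s'" "$env_name" "$zone"
  printf " --tags team=data environment=%s project=network\n" "$env_name"
  printf "az network private-dns link vnet create --resource-group 'superhog-data-rg-%s' --zone-name '%s'" "$env_name" "$zone"
  printf " --name 'privatelink-%s' --virtual-network 'superhog-data-vnet-%s' --registration-enabled true\n" "$zone" "$env_name"
}

private_dns_cmds dev
```

Review the two printed commands and pipe them to `sh` when you are happy with them.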
### 2.4 Public IP
- We will need a public IP for the jumphost.
- Create the public IP
- Name it: `superhog-data-jumphost-ip-<your-env>`
- For setting `Routing preference` select option: `Microsoft Network`
- Add tags:
- `team: data`
- `environment: <your-env>`
- `project: network`
## 3. Jumphost
### 3.1 Deploy Jumphost VM
- The first VM we must deploy is a jumphost, since that will be our door to all other services inside the virtual network.
- Create the VM
- Basic settings
- Name it: `jumphost-<your-env>`
- Use Ubuntu Server 22.04
- Use Size: `Standard_B1s`
- Use username: `azureuser`
- Use the SSH Key: `superhog-data-<your-env>-general-ssh`
- Select the option `None` for Public inbound ports.
- Disk settings
- Defaults are fine. This barely needs any disk.
- Networking
- Attach to the virtual network `superhog-data-vnet-<your-env>`
- Attach to the subnet `jumphost-subnet`
- Attach the public IP `superhog-data-jumphost-ip-<your-env>`
- For setting `NIC network security group` select option `None`
- Management settings
- Defaults are fine.
- Monitoring
- Defaults are fine.
- Advanced
- Defaults are fine.
- Add tags:
- `team: data`
- `environment: <your-env>`
- `project: network`
### 3.2 Configure a VPN Server
- The jumphost we just created is not accessible via SSH from WAN due to the NSG set in the jumphost subnet.
- To make it accessible, you should temporarily create a new rule like this in the NSG `superhog-data-nsg-jumphost-<your-env>`.
- Name: AllowSSHInboundTemporarily
- Source: your IP.
- Source port ranges: *
- Destination: the address range for the `jumphost-subnet`. In this example, `10.69.0.0/29`.
- Destination port ranges: 22
- Protocol: TCP
- Action: Allow
- Priority: 110
- Connect through SSH
- We will now set up a VPN server and client with Wireguard
- Run the following script (requires `sudo`) to install wireguard and configure it
- Pay attention: you need to fill in the public IP manually, as well as the network mask of the virtual network
- *Note: the IPs chosen for the VPN can absolutely be changed. Just make sure they are consistent across the server and client configurations of the VPN.*
```bash
echo "Installing Wireguard."
apt update
apt install wireguard -y
echo "Wireguard installed."

echo "Creating keys."
SERVER_PRIVATE_KEY=$(wg genkey)
SERVER_PUBLIC_KEY=$(echo "$SERVER_PRIVATE_KEY" | wg pubkey)
CLIENT_PRIVATE_KEY=$(wg genkey)
CLIENT_PUBLIC_KEY=$(echo "$CLIENT_PRIVATE_KEY" | wg pubkey)
echo "Keys created."

echo "Writing server config file."
touch /etc/wireguard/wg0.conf
# The file holds the server private key, so restrict it to root.
chmod 600 /etc/wireguard/wg0.conf
cat > /etc/wireguard/wg0.conf << EOL
[Interface]
PrivateKey = ${SERVER_PRIVATE_KEY}
Address = 192.168.69.1/32
ListenPort = 52420
# IP forwarding
PreUp = sysctl -w net.ipv4.ip_forward=1
# IP masquerading
PreUp = iptables -t mangle -A PREROUTING -i wg0 -j MARK --set-mark 0x30
PreUp = iptables -t nat -A POSTROUTING ! -o wg0 -m mark --mark 0x30 -j MASQUERADE
PostDown = iptables -t mangle -D PREROUTING -i wg0 -j MARK --set-mark 0x30
PostDown = iptables -t nat -D POSTROUTING ! -o wg0 -m mark --mark 0x30 -j MASQUERADE

[Peer]
PublicKey = ${CLIENT_PUBLIC_KEY}
AllowedIPs = 192.168.70.1/32
EOL
echo "Server config file written."

echo "Configuration for client, copy paste in your machine."
cat << EOF
##############################
[Interface]
# Jumphost VPN
PrivateKey = ${CLIENT_PRIVATE_KEY}
Address = 192.168.70.1/32
# Uncomment when DNS Server is ready
# DNS = 192.168.69.1

[Peer]
PublicKey = ${SERVER_PUBLIC_KEY}
AllowedIPs = 192.168.69.1/32,<network-mask-for-vnet>
Endpoint = <fill-public-ip-here>:52420
##############################
EOF

echo "Setting the Wireguard server as a system service."
systemctl enable wg-quick@wg0.service
echo "Starting Wireguard server."
systemctl start wg-quick@wg0.service
echo "Finished."
```
- You should copy the client config that the script will produce and set up the Wireguard config on your local machine.
- Once you've done so, start Wireguard on the client and try to ping the server from the client with the Wireguard VPN IP. If the ping gets a reply, the VPN is working fine.
- Now, validate your setup by SSHing from your local device into the jumphost by referencing the VPN IP of the jumphost instead of the public IP.
- Once you verify everything works, you should go to the NSG of the jumphost and remove rule AllowSSHInboundTemporarily. From this point on, the only entrypoint from WAN to the virtual network is the VPN port in the jumphost machine.
- Next, we must allow IP forwarding on Azure.
- Look for the jumphost VM Network Interface.
- In the `IP configurations` section, activate the flag `Enable IP forwarding`.
### 3.3 Configure a DNS Server
- The jumphost is now ready. When the VPN is active on our local device, we can access the services within the virtual network.
- There is one issue, though: we would like to access services through names, not IPs.
- Our Private DNS Zone takes care of providing names to services within the virtual network. But this resolution only happens within the virtual network itself, so our external device can't rely on it.
- To solve this, we need to force DNS resolution of our laptops to happen from within the virtual network itself.
- To do so, we will set up a DNS server in the jumphost, and set up our VPN configuration to use it when the VPN connection in our device is active.
- Connect to the jumphost through SSH
- Run the following script with `sudo` from the home folder of `azureuser`
```bash
echo "Installing dependencies."
apt install dpkg-dev debhelper jq -y
echo "Cloning coredns."
git clone https://github.com/coredns/deployment.git coredns/deployment
cd coredns/deployment
echo "Building package."
dpkg-buildpackage -us -uc -b
cd ..
echo "Installing package."
dpkg -i coredns*.deb
echo "Disabling Stub resolver."
sed -i -e 's/#DNSStubListener=yes/DNSStubListener=no/g' /etc/systemd/resolved.conf
systemctl restart systemd-resolved
echo "Writing config file."
rm /etc/coredns/Corefile
cat > /etc/coredns/Corefile << EOL
. {
    hosts {
        # If you want to make custom mappings, place them here
        # Format is
        # xxx.xxx.xxx.xxx your.domain.name
        # By default, we delegate on Azure
        fallthrough
    }
    forward . 168.63.129.16 # This IP is Azure's DNS service
    log
    errors
}
EOL
echo "Restarting coredns to pick up new config."
systemctl restart coredns.service
```
- In your client Wireguard configuration, uncomment the DNS server line we left before
- Check that the service is running fine by running `dig google.com`. You should see in the output that your laptop has relied on our new DNS to do the name resolution.
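On a Linux client, uncommenting the DNS line can be done with `sed`. This is only a sketch: `enable_vpn_dns` is a hypothetical helper, and the path you pass to it depends on where you saved the client config (`/etc/wireguard/wg0.conf` is a common choice, but it is an assumption here).

```bash
# Uncomment the "# DNS = ..." line that the server script left in the client
# config. Other comment lines are left untouched.
# Argument: path to the client Wireguard config file.
enable_vpn_dns() {
  sed -i -E 's/^#[[:space:]]*(DNS[[:space:]]*=.*)$/\1/' "$1"
}
```

For example, run `enable_vpn_dns /etc/wireguard/wg0.conf` as root and then restart the tunnel (with `wg-quick`, that would be `wg-quick down wg0` followed by `wg-quick up wg0`). In the `dig` output, the `;; SERVER:` line should then show `192.168.69.1`.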
### 3.4 Harden the Jumphost VM
- In the Jumphost, run the following command to disable password based SSH authentication fully. This way, access can only be granted with SSH key pairs, which is way more secure: `sudo sed -i -e 's/#PasswordAuthentication yes/PasswordAuthentication no/g' /etc/ssh/sshd_config; sudo systemctl restart ssh`.
- Remove the AllowSSHInboundTemporarily rule that you added to the NSG `superhog-data-nsg-jumphost-<your-env>`. We don't need that anymore since we can SSH through the VPN tunnel.
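The `sed` one-liner above only matches the commented default line `#PasswordAuthentication yes`. If your image ships a different line (for instance an uncommented `PasswordAuthentication yes`, or no line at all), a slightly more defensive variant could look like the sketch below. `disable_password_auth` is a hypothetical helper name, and the sketch assumes GNU `sed` as found on Ubuntu.

```bash
# Force "PasswordAuthentication no" in the given sshd config file, whether the
# existing directive is commented out, set to yes, or missing altogether.
# Argument: path to the sshd config file.
disable_password_auth() {
  config_file="$1"
  pattern='^[[:space:]]*#?[[:space:]]*PasswordAuthentication[[:space:]].*'
  if grep -Eq "$pattern" "$config_file"; then
    # Rewrite the existing directive in place.
    sed -i -E "s/${pattern}/PasswordAuthentication no/" "$config_file"
  else
    # No directive found: append one.
    echo 'PasswordAuthentication no' >> "$config_file"
  fi
}
```

On the jumphost you would run `disable_password_auth /etc/ssh/sshd_config` as root and then restart the service with `systemctl restart ssh`. To see the value the server actually uses, check `sshd -T | grep -i passwordauthentication`; files under `/etc/ssh/sshd_config.d/` may override the main config, so this check is worth doing.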
## 4. DWH
### 4.1 Deploy PostgreSQL Server
- Next, we will deploy a Postgres server to act as the DWH.
- Create a new Azure Database for PostgreSQL flexible server.
- Basics
- Name it: `superhog-dwh-<your-env>`.
- On field `PostgreSQL version` pick version 16.
- Adapt the sizing to your needs. Only you know how much this server is going to take.
- For field `Authentication method` pick `PostgreSQL authentication only`.
- Name the admin user: `dwh_admin_<your-env>`.
- Give it a password and make sure to note it down.
- Networking
- On field `Connectivity method` select `Private access (VNet Integration)`
- Pick the virtual network `superhog-data-vnet-<your-env>` and the subnet `database-subnet`.
- Create a new private DNS zone. Unfortunately, we can't use `<your-env>.data.superhog.com` for this service.
- Security
- Defaults are fine
- Add tags:
- `team: data`
- `environment: <your-env>`
- `project: dwh`
- Validate the deployment by trying to log into the database with the `dwh_admin_<your-env>` user from your favourite SQL client (you can use DBeaver, for example). Be aware that your VPN connection should be active so that the DWH is reachable from your device.
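If you would rather validate from a terminal than from a GUI client, `psql` works too. The sketch below only builds the connection string. The host name pattern `<server-name>.postgres.database.azure.com` is an assumption, so check the server's overview page in the portal for the real value. `dwh_conninfo` is a hypothetical helper.

```bash
# Build a libpq connection string for the DWH admin user.
# Argument: the environment name that stands in for <your-env>.
# Assumption: the server is reachable at <server-name>.postgres.database.azure.com.
dwh_conninfo() {
  printf 'host=superhog-dwh-%s.postgres.database.azure.com port=5432 dbname=postgres user=dwh_admin_%s sslmode=require\n' "$1" "$1"
}

# Print the connection string for the dev environment.
dwh_conninfo dev
```

With the VPN active, something like `psql "$(dwh_conninfo dev)" -c 'SELECT version();'` should prompt for the admin password and print the server version.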
### 4.2 Create database and schemas
- Run the following script to create a new database and the needed schemas
```sql
CREATE DATABASE dwh;
\connect dwh;
CREATE SCHEMA staging;
CREATE SCHEMA intermediate;
CREATE SCHEMA reporting;
```
### 4.3 Create users and roles
- Run the following script to create:
- A `modeler` role, owner of the `staging`, `intermediate` and `reporting` schemas.
- A `consumer` role, capable of reading the `reporting` schema.
- A dbt user, with `modeler` role.
- An airbyte user, with permission to create new schemas.
- A Power BI user, with `consumer` role.
- *Note: replace the password fields with serious passwords and note them down.*
```sql
GRANT pg_read_all_data TO dwh_admin_<your-env>;
CREATE ROLE airbyte_user LOGIN PASSWORD 'password' VALID UNTIL 'infinity';
GRANT CREATE ON DATABASE dwh TO airbyte_user;
CREATE ROLE modeler INHERIT;
GRANT USAGE ON SCHEMA staging TO modeler;
GRANT USAGE ON SCHEMA intermediate TO modeler;
GRANT USAGE ON SCHEMA reporting TO modeler;
GRANT ALL ON ALL TABLES IN SCHEMA staging TO modeler;
GRANT ALL ON ALL TABLES IN SCHEMA intermediate TO modeler;
GRANT ALL ON ALL TABLES IN SCHEMA reporting TO modeler;
ALTER SCHEMA staging OWNER TO modeler;
ALTER SCHEMA intermediate OWNER TO modeler;
ALTER SCHEMA reporting OWNER TO modeler;
CREATE ROLE dbt_user LOGIN PASSWORD 'password' VALID UNTIL 'infinity';
GRANT modeler to dbt_user;
CREATE ROLE consumer INHERIT;
GRANT USAGE ON SCHEMA reporting TO consumer;
GRANT SELECT ON ALL TABLES IN SCHEMA reporting TO consumer;
ALTER DEFAULT PRIVILEGES IN SCHEMA reporting GRANT SELECT ON TABLES TO consumer;
CREATE ROLE powerbi_user LOGIN PASSWORD 'password' VALID UNTIL 'infinity';
GRANT consumer to powerbi_user;
```
- You might also want to create more users depending on your needs. Typically, data team members should also have the `modeler` role.
## 5. Airbyte
## 6. Power BI
## 7. dbt
## 8. Status monitoring
## 9. Backups
- If you are working on a dev or staging environment, you might want to skip this section.