01 Infra Setup

This describes how to prepare each machine before deploying services on them.

First steps

  • Create an SSH key or pick an existing one. We'll refer to it as the personal_ssh_key.
  • Deploy Ansible on the laptop (Lapy), which will act as the Ansible control node. To do so:
    • Create a venv: python3 -m venv venv
    • Activate it: source venv/bin/activate
    • Install the listed ansible requirements with pip install -r requirements.txt
  • Keep in mind you should activate this venv from now on when running ansible commands.
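
As a rough sketch, the whole control-node bootstrap looks like this (the repo root path is an assumption; adjust it to wherever you cloned the repo and to where requirements.txt actually lives):

    cd personal_infra                  # path is an assumption
    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
    ansible --version                  # sanity check: Ansible now resolves from the venv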

Domain

  • Some services are designed to be accessible from the WAN through a friendly URL.
  • You'll need a domain where you can set DNS records and create subdomains, as the guide assumes each service gets its own subdomain.
  • Getting and configuring the domain is outside the scope of this repo. Whenever a service needs you to set up a subdomain, it will be mentioned explicitly.
  • Add the domain to the root_domain variable in ansible/infra_vars.yml (see the sketch below).
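
A minimal sketch of the variable, using a placeholder domain:

    # ansible/infra_vars.yml
    root_domain: "example.com"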

Prepare the VPSs (vipy, watchtower and spacey)

Source the VPSs

  • The guide is agnostic to which provider you pick, but has been tested with VMs from https://99stack.com and contains some operations that are specifically relevant to their VPSs.
  • The expectations are that the VPS ticks the following boxes:
    • Runs Debian 12 (bookworm) or 13 (trixie).
    • Has a public IPv4 address and starts out with SSH listening on port 22.
    • Boots with one of your SSH keys already authorized. If this is not the case, you'll have to manually drop the pubkey there before using the playbooks.
  • You will need three VPSs:
    • One to host most services,
    • Another tiny one for uptime monitoring. We use a separate machine so the monitoring service doesn't go down together with the main one.
    • A final one to run the headscale server, since the main VPS needs to be part of the mesh network and can't do so while also running the coordination server.
  • Move on once your VPSs are running and satisfy the prerequisites.

Prepare Ansible vars

  • You have an example ansible/example.inventory.ini. Copy it with cp ansible/example.inventory.ini ansible/inventory.ini and fill in the [vps] group with host entries for each machine (vipy for services, watchtower for uptime monitoring, spacey for headscale).
  • A few notes:
    • The guides assume you'll have only one vipy host entry. Things will break if you have multiple, so avoid that.
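
A minimal inventory sketch with placeholder IPs; the exact host variable names come from ansible/example.inventory.ini, so defer to that file:

    # ansible/inventory.ini (illustrative)
    [vps]
    vipy       ansible_host=203.0.113.10
    watchtower ansible_host=203.0.113.11
    spacey     ansible_host=203.0.113.12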

Create user and secure VPS access

  • Ansible will create a user when you run the first playbook, infra/01_user_and_access_setup_playbook.yml. This is the user that will be used regularly. But since this user doesn't exist yet, you obviously need to run this first playbook as some other user. We assume your VPS provider has given you a root user, which is what you need to pass as the running user in the next command.
  • cd into ansible
  • Run ansible-playbook -i inventory.ini infra/01_user_and_access_setup_playbook.yml -e 'ansible_user="your root user here"'
  • Then, configure firewall access, fail2ban and auditd with ansible-playbook -i inventory.ini infra/02_firewall_and_fail2ban_playbook.yml. Since the user we will use is now present, there is no need to specify the user anymore.

Note that, by applying these playbooks, both the root user and the counterweight user will use the same SSH pubkey for auth.

Checklist:

  • All 3 VPSs are accessible with the counterweight user
  • All 3 VPSs have UFW up and running
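
A quick manual check, per machine (the IP is a placeholder):

    ssh counterweight@203.0.113.10     # should authenticate with the personal_ssh_key
    sudo ufw status verbose            # should report "Status: active"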

Prepare Nodito Server

Source the Nodito Server

  • This setup is designed for a local Nodito server running in your home environment.
  • The expectations are that the Nodito server:
    • Runs Proxmox VE (based on Debian).
    • Has a predictable local IP address.
    • Has a root user with password authentication enabled (the default Proxmox state).
    • Has SSH accessible on port 22.

Prepare Ansible vars for Nodito

  • Ensure your inventory contains a [nodito_host] group with a nodito host entry (copy the example inventory if needed) and fill in its values.
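
An illustrative entry, assuming the Proxmox host sits at 192.168.1.10 on your LAN (defer to the example inventory for the exact variable names):

    # ansible/inventory.ini (illustrative)
    [nodito_host]
    nodito ansible_host=192.168.1.10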

Bootstrap SSH Key Access and Create User

  • Nodito starts with password authentication enabled and no SSH keys configured. We need to bootstrap SSH key access first.
  • Run the complete setup with: ansible-playbook -i inventory.ini infra/nodito/30_proxmox_bootstrap_playbook.yml -e 'ansible_user=root'
  • This single playbook will:
    • Set up SSH key access for root
    • Create the counterweight user with SSH keys
    • Update and secure the system
    • Disable root login and password authentication
    • Test the final configuration
  • For all future playbooks targeting nodito, use the default configuration (no overrides needed).

Note that, by applying these playbooks, both the root user and the counterweight user will use the same SSH pubkey for auth, but root login will be disabled.

Switch to Community Repositories

  • Proxmox VE installations typically come with enterprise repositories enabled, which require a subscription. To avoid subscription warnings and use the community repositories instead:
  • Run the repository switch with: ansible-playbook -i inventory.ini infra/nodito/32_proxmox_community_repos_playbook.yml
  • This playbook will:
    • Detect whether your Proxmox installation uses modern deb822 format (Proxmox VE 9) or legacy format (Proxmox VE 8)
    • Remove enterprise repository files and create community repository files
    • Disable subscription nag messages in both web and mobile interfaces
    • Update Proxmox packages from the community repository
    • Verify the changes are working correctly
  • After running this playbook, clear your browser cache or perform a hard reload (Ctrl+Shift+R) before using the Proxmox VE Web UI to avoid UI display issues.
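
For reference, the legacy-format community entry the playbook ends up with looks roughly like the following (the file name is an assumption, and the deb822 variant used on Proxmox VE 9 expresses the same URI and pve-no-subscription component via Types/URIs/Suites/Components fields instead):

    # /etc/apt/sources.list.d/pve-no-subscription.list (Proxmox VE 8 / bookworm)
    deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription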

Deploy Infra Monitoring (Disk, Health, CPU Temp)

  • Nodito can run the same monitoring stack used elsewhere: disk usage, heartbeat healthcheck, and CPU temperature alerts feeding Uptime Kuma.
  • Playbooks to run (in any order):
    • ansible-playbook -i inventory.ini infra/410_disk_usage_alerts.yml
    • ansible-playbook -i inventory.ini infra/420_system_healthcheck.yml
    • ansible-playbook -i inventory.ini infra/430_cpu_temp_alerts.yml
  • Each playbook automatically:
    • Creates/updates the corresponding monitor in Uptime Kuma (including ntfy notification wiring)
    • Installs any required packages (curl, lm-sensors, jq, bc, etc.)
    • Creates the monitoring script(s) and log files
    • Sets up systemd services and timers for automated runs
    • Sends alerts to Uptime Kuma when thresholds are exceeded or heartbeats stop
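
To confirm the timers landed on a host, something like this is usually enough (the exact unit names depend on the playbooks, so grep loosely and substitute the real name):

    systemctl list-timers --all | grep -iE 'disk|health|temp'
    journalctl -u <monitor_service> -n 20    # replace <monitor_service> with a unit from the previous command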

Setup ZFS Storage Pool

  • The nodito server can be configured with a ZFS RAID 1 storage pool for Proxmox VM storage, providing redundancy and data integrity.
  • Before running the ZFS pool setup playbook, you need to identify your disk IDs and configure them in the variables file:
    • SSH into your nodito server and run: ls -la /dev/disk/by-id/ | grep -E "(ata-|scsi-|nvme-)"
    • This will show you the persistent disk identifiers for all your disks. Look for the two disks you want to use for the ZFS pool.
    • Example output:
      lrwxrwxrwx 1 root root  9 Dec 15 10:30 ata-WDC_WD40EFRX-68N32N0_WD-WCC7K1234567 -> ../../sdb
      lrwxrwxrwx 1 root root  9 Dec 15 10:30 ata-WDC_WD40EFRX-68N32N0_WD-WCC7K7654321 -> ../../sdc
      
    • Update ansible/infra/nodito/nodito_vars.yml with your actual disk IDs:
      zfs_disk_1: "/dev/disk/by-id/ata-WDC_WD40EFRX-68N32N0_WD-WCC7K1234567"
      zfs_disk_2: "/dev/disk/by-id/ata-WDC_WD40EFRX-68N32N0_WD-WCC7K7654321"
      
  • Run the ZFS pool setup with: ansible-playbook -i inventory.ini infra/nodito/32_zfs_pool_setup_playbook.yml
  • This will:
    • Validate Proxmox VE and ZFS installation
    • Install ZFS utilities and kernel modules
    • Create a RAID 1 (mirror) ZFS pool named proxmox-storage with optimized settings
    • Configure ZFS pool properties (ashift=12, compression=lz4, atime=off, etc.)
    • Export and re-import the pool for Proxmox compatibility
    • Configure Proxmox to use the ZFS pool storage (zfspool type)
    • Enable ZFS services for automatic pool import on boot
  • Warning: This will destroy all data on the specified disks. Make sure you're using the correct disk IDs and that the disks don't contain important data.
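
For orientation only, the pool the playbook builds is roughly equivalent to these by-hand commands (don't run them alongside the playbook; the disk paths reuse the example IDs above):

    zpool create -f -o ashift=12 proxmox-storage mirror \
      /dev/disk/by-id/ata-WDC_WD40EFRX-68N32N0_WD-WCC7K1234567 \
      /dev/disk/by-id/ata-WDC_WD40EFRX-68N32N0_WD-WCC7K7654321
    zfs set compression=lz4 proxmox-storage
    zfs set atime=off proxmox-storage
    zpool status proxmox-storage             # expect a healthy two-disk mirror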

Build Debian Cloud Template for Proxmox

  • After storage is ready, create a reusable Debian cloud template so future Proxmox VMs can be cloned in seconds.
  • Run: ansible-playbook -i inventory.ini infra/nodito/33_proxmox_debian_cloud_template.yml
  • This playbook:
    • Downloads the latest Debian generic cloud qcow2 image (override via debian_cloud_image_url/debian_cloud_image_filename)
    • Imports it into your Proxmox storage (defaults to the configured ZFS pool) and builds VMID 9001 as a template
    • Injects your SSH keys, enables qemu-guest-agent, configures DHCP networking, and sizes the disk (default 10GB)
    • Drops a cloud-init snippet so clones automatically install qemu-guest-agent and can run upgrades on first boot
  • Once it finishes, provision new machines with qm clone 9001 <vmid> --name <vmname> plus your usual cloud-init overrides.
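
A minimal clone example (VMID 120 and the VM name are arbitrary placeholders):

    qm clone 9001 120 --name testvm --full
    qm set 120 --ipconfig0 ip=dhcp           # adjust cloud-init networking as needed
    qm start 120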

Provision VMs with OpenTofu

  • Prefer a declarative workflow? The tofu/nodito project clones VM definitions from the template automatically.
  • Quick start (see tofu/nodito/README.md for full details):
    1. Install OpenTofu, copy terraform.tfvars.example to terraform.tfvars, and fill in the Proxmox API URL/token plus your SSH public key.
    2. Define VMs in the vms map (name, cores, memory, disk size, ipconfig0, optional vlan_tag). Disks default to the proxmox-tank-1 ZFS pool.
    3. Run tofu init, tofu plan -var-file=terraform.tfvars, and tofu apply -var-file=terraform.tfvars.
  • Each VM is cloned from the debian-13-cloud-init template (VMID 9001), attaches to vmbr0, and boots with qemu-guest-agent + your keys injected via cloud-init. Updates to the tfvars map let you grow/shrink the fleet with a single tofu apply.
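
An illustrative vms entry, assuming the attribute names match the ones listed above; the authoritative schema lives in tofu/nodito's variables and README:

    # terraform.tfvars (illustrative values)
    vms = {
      "demo-vm" = {
        cores     = 2
        memory    = 2048                     # MiB
        disk_size = 20                       # GiB
        ipconfig0 = "ip=192.168.1.60/24,gw=192.168.1.1"
        vlan_tag  = null                     # optional
      }
    }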

General prep for all machines

Set up Infrastructure Secrets

  • Create ansible/infra_secrets.yml based on the example file:
    cp ansible/infra_secrets.yml.example ansible/infra_secrets.yml
    
  • Edit ansible/infra_secrets.yml and add your Uptime Kuma credentials:
    uptime_kuma_username: "admin"
    uptime_kuma_password: "your_password"
    
  • Important: Never commit this file to version control (it's in .gitignore)

GPG Keys

Some of the backups are stored encrypted for security. To enable this, fill in the GPG variables listed in example.inventory.ini under the lapy block.