This commit is contained in:
counterweight 2025-11-14 23:36:00 +01:00
parent c8754e1bdc
commit fbbeb59c0e
Signed by: counterweight
GPG key ID: 883EDBAA726BD96C
28 changed files with 907 additions and 995 deletions


@@ -89,20 +89,19 @@ Note that, by applying these playbooks, both the root user and the `counterweigh
* Verify the changes are working correctly
* After running this playbook, clear your browser cache or perform a hard reload (Ctrl+Shift+R) before using the Proxmox VE Web UI to avoid UI display issues.
### Deploy CPU Temperature Monitoring
### Deploy Infra Monitoring (Disk, Health, CPU Temp)
* The nodito server can be configured with CPU temperature monitoring that sends alerts to Uptime Kuma when temperatures exceed a threshold.
* Before running the CPU temperature monitoring playbook, you need to create a secrets file with your Uptime Kuma push URL:
* Create `ansible/infra/nodito/nodito_secrets.yml` with:
```yaml
uptime_kuma_url: "https://your-uptime-kuma.com/api/push/your-push-key"
```
* Run the CPU temperature monitoring setup with: `ansible-playbook -i inventory.ini infra/nodito/40_cpu_temp_alerts.yml`
* This will:
* Install required packages (lm-sensors, curl, jq, bc)
* Create a monitoring script that checks CPU temperature every minute
* Set up a systemd service and timer for automated monitoring
* Send alerts to Uptime Kuma when temperature exceeds the threshold (default: 80°C)
* Nodito can run the same monitoring stack used elsewhere: disk usage, heartbeat healthcheck, and CPU temperature alerts feeding Uptime Kuma.
* Playbooks to run (in any order):
* `ansible-playbook -i inventory.ini infra/410_disk_usage_alerts.yml`
* `ansible-playbook -i inventory.ini infra/420_system_healthcheck.yml`
* `ansible-playbook -i inventory.ini infra/430_cpu_temp_alerts.yml`
* Each playbook automatically:
* Creates/updates the corresponding monitor in Uptime Kuma (including ntfy notification wiring)
* Installs any required packages (curl, lm-sensors, jq, bc, etc.)
* Creates the monitoring script(s) and log files
* Sets up systemd services and timers for automated runs
* Sends alerts to Uptime Kuma when thresholds are exceeded or heartbeats stop
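* For reference, the push these scripts perform is just an HTTP GET against the host's Uptime Kuma push URL. Below is a minimal sketch of the CPU temperature case (the scripts the playbooks actually install are more elaborate; `UPTIME_KUMA_URL` stands in for the host's push URL, and the exact status/upside-down handling depends on how the monitor is configured):
```bash
#!/usr/bin/env bash
# Sketch only: check the hottest reported CPU temperature and push to Uptime Kuma
# when it exceeds the threshold. Assumes lm-sensors, jq, bc and curl are installed.
THRESHOLD=80
TEMP=$(sensors -j | jq '[.. | objects | to_entries[] | select(.key | test("temp[0-9]*_input")) | .value] | max')
if (( $(echo "$TEMP > $THRESHOLD" | bc -l) )); then
  curl -fsS "${UPTIME_KUMA_URL}?status=down&msg=CPU+at+${TEMP}C" > /dev/null
fi
```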
### Setup ZFS Storage Pool
@@ -131,6 +130,26 @@ Note that, by applying these playbooks, both the root user and the `counterweigh
* Enable ZFS services for automatic pool import on boot
* **Warning**: This will destroy all data on the specified disks. Make sure you're using the correct disk IDs and that the disks don't contain important data.
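* Before running it, double-check which physical disks those IDs resolve to, for example:
```bash
# List stable disk IDs and the devices they point to, then cross-check
# size/serial/model before passing the IDs to the playbook.
ls -l /dev/disk/by-id/ | grep -v part
lsblk -o NAME,SIZE,SERIAL,MODEL
```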
### Build Debian Cloud Template for Proxmox
* After storage is ready, create a reusable Debian cloud template so future Proxmox VMs can be cloned in seconds.
* Run: `ansible-playbook -i inventory.ini infra/nodito/33_proxmox_debian_cloud_template.yml`
* This playbook:
* Downloads the latest Debian generic cloud qcow2 image (override via `debian_cloud_image_url`/`debian_cloud_image_filename`)
* Imports it into your Proxmox storage (defaults to the configured ZFS pool) and builds VMID `9001` as a template
* Injects your SSH keys, enables qemu-guest-agent, configures DHCP networking, and sizes the disk (default 10GB)
* Drops a cloud-init snippet so clones automatically install qemu-guest-agent and can run upgrades on first boot
* Once it finishes, provision new machines with `qm clone 9001 <vmid> --name <vmname>` plus your usual cloud-init overrides.
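* For illustration, a full clone-and-customize pass looks roughly like this (VMID `101`, the name, and the address are placeholders; the disk is assumed to be attached as `scsi0`):
```bash
# Clone the template and adjust cloud-init settings before first boot.
qm clone 9001 101 --name web-01 --full
qm set 101 --ipconfig0 ip=192.168.1.50/24,gw=192.168.1.1   # or keep the template's DHCP default
qm resize 101 scsi0 +10G                                   # grow beyond the template's 10GB default disk
qm start 101
```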
### Provision VMs with OpenTofu
* Prefer a declarative workflow? The `tofu/nodito` project takes declarative VM definitions and clones the corresponding VMs from the template automatically.
* Quick start (see `tofu/nodito/README.md` for full details):
1. Install OpenTofu, copy `terraform.tfvars.example` to `terraform.tfvars`, and fill in the Proxmox API URL/token plus your SSH public key.
2. Define VMs in the `vms` map (name, cores, memory, disk size, `ipconfig0`, optional `vlan_tag`). Disks default to the `proxmox-tank-1` ZFS pool.
3. Run `tofu init`, `tofu plan -var-file=terraform.tfvars`, and `tofu apply -var-file=terraform.tfvars`.
* Each VM is cloned from the `debian-13-cloud-init` template (VMID 9001), attaches to `vmbr0`, and boots with qemu-guest-agent + your keys injected via cloud-init. Updates to the tfvars map let you grow/shrink the fleet with a single `tofu apply`.
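* As a consolidated quick start (paths and variable names per `tofu/nodito/README.md`):
```bash
cd tofu/nodito
cp terraform.tfvars.example terraform.tfvars   # fill in the Proxmox API URL/token, SSH key, and vms map
tofu init
tofu plan -var-file=terraform.tfvars
tofu apply -var-file=terraform.tfvars
```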
## General prep for all machines
### Set up Infrastructure Secrets
@@ -146,32 +165,6 @@ Note that, by applying these playbooks, both the root user and the `counterweigh
```
* **Important**: Never commit this file to version control (it's in `.gitignore`)
### Deploy Disk Usage Monitoring
* Any machine can be configured with disk usage monitoring that sends alerts to Uptime Kuma when disk usage exceeds a threshold.
* This playbook automatically creates an Uptime Kuma push monitor for each host (idempotent - won't create duplicates).
* Prerequisites:
* Install the Uptime Kuma Ansible collection: `ansible-galaxy collection install -r ansible/requirements.yml`
* Install Python dependencies: `pip install -r requirements.txt` (includes uptime-kuma-api)
* Set up `ansible/infra_secrets.yml` with your Uptime Kuma API token (see above)
* Uptime Kuma must be deployed (the playbook automatically uses the URL from `uptime_kuma_vars.yml`)
* Run the disk monitoring setup with:
```bash
ansible-playbook -i inventory.ini infra/410_disk_usage_alerts.yml
```
* This will:
* Create an Uptime Kuma monitor group per host named "{hostname} - infra" (idempotent)
* Create a push monitor in Uptime Kuma with "upside down" mode (no news is good news)
* Assign the monitor to the host's group for better organization
* Install required packages (curl, bc)
* Create a monitoring script that checks disk usage at configured intervals (default: 15 minutes)
* Set up a systemd service and timer for automated monitoring
* Send alerts to Uptime Kuma only when usage exceeds threshold (default: 80%)
* Optional configuration:
* Change threshold: `-e "disk_usage_threshold_percent=85"`
* Change check interval: `-e "disk_check_interval_minutes=10"`
* Monitor different mount point: `-e "monitored_mount_point=/home"`
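* For example, to combine those overrides in a single run:
```bash
ansible-playbook -i inventory.ini infra/410_disk_usage_alerts.yml \
  -e "disk_usage_threshold_percent=85" \
  -e "disk_check_interval_minutes=10" \
  -e "monitored_mount_point=/home"
```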
## GPG Keys
Some of the backups are stored encrypted for security. To allow this, fill in the gpg variables listed in `example.inventory.ini` under the `lapy` block.
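If you still need a key, the standard commands below generate one and list the long key ID/fingerprint to copy into the inventory (the exact variable names are listed in `example.inventory.ini`):
```bash
# Generate a key interactively, then list it to copy the long key ID.
gpg --full-generate-key
gpg --list-secret-keys --keyid-format=long
```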