Compare commits


15 commits

61 changed files with 8893 additions and 78 deletions

.gitignore vendored

@@ -1,3 +1,21 @@
# OpenTofu / Terraform
.terraform/
.tofu/
.terraform.lock.hcl
.tofu.lock.hcl
terraform.tfstate
terraform.tfstate.*
crash.log
*.tfvars
*.tfvars.json
test-inventory.ini
inventory.ini
venv/*
.env
# Secrets and sensitive files
*_secrets.yml
*_secrets.yaml
secrets/
.secrets/


@@ -18,21 +18,24 @@ This describes how to prepare each machine before deploying services on them.
* Getting and configuring the domain is outside the scope of this repo. Whenever a service needs you to set up a subdomain, it will be mentioned explicitly.
* You should add the domain to the var `root_domain` in `ansible/infra_vars.yml`.
## Prepare the VPSs (vipy and watchtower)
## Prepare the VPSs (vipy, watchtower and spacey)
### Source the VPSs
* The guide is agnostic to which provider you pick, but has been tested with VMs from https://99stack.com and contains some operations that are specifically relevant to their VPSs.
* The expectations are that the VPS ticks the following boxes:
+ Runs Debian 12 bookworm.
+ Runs Debian 12 (bookworm) or Debian 13 (trixie).
+ Has a public IP4 and starts out with SSH listening on port 22.
+ Boots with one of your SSH keys already authorized. If this is not the case, you'll have to manually drop the pubkey there before using the playbooks.
* You will need two VPSs: one to host most services, and another tiny one to monitor Uptime. We use two to prevent the monitoring service from falling down with the main machine.
* You will need three VPSs:
+ One to host most services,
+ Another tiny one to monitor Uptime. We use a separate one so the monitoring service doesn't go down together with the main machine.
+ A final one to run the headscale server, since the main VPS needs to be part of the mesh network and can't do so while also running the coordination server.
* Move on once your VPSs are running and satisfy the prerequisites.
### Prepare Ansible vars
* You have an example `ansible/example.inventory.ini`. Copy it with `cp ansible/example.inventory.ini ansible/inventory.ini` and fill in with the values for your VPSs. `[vipy]` is the services VPS. `[watchtower]` is the watchtower VPS.
* You have an example `ansible/example.inventory.ini`. Copy it with `cp ansible/example.inventory.ini ansible/inventory.ini` and fill it in with the values for your VPSs. `[vipy]` is the services VPS. `[watchtower]` is the watchtower VPS. `[spacey]` is the headscale VPS.
* A few notes:
* The guides assume you'll only have one VPS in the `[vipy]` group. Stuff will break if you have multiple, so avoid that.
@@ -45,6 +48,130 @@ This describes how to prepare each machine before deploying services on them.
Note that, by applying these playbooks, both the root user and the `counterweight` user will use the same SSH pubkey for auth.
## Prepare Nodito Server
### Source the Nodito Server
* This setup is designed for a local Nodito server running in your home environment.
* The expectations are that the Nodito server:
+ Runs Proxmox VE (based on Debian).
+ Has a predictable local IP address.
+ Has root user with password authentication enabled (default Proxmox state).
+ SSH is accessible on port 22.
### Prepare Ansible vars for Nodito
* Add a `[nodito]` group to your `ansible/inventory.ini` (or simply use the one you get by copying `example.inventory.ini`) and fill in with values.
### Bootstrap SSH Key Access and Create User
* Nodito starts with password authentication enabled and no SSH keys configured. We need to bootstrap SSH key access first.
* Run the complete setup with: `ansible-playbook -i inventory.ini infra/nodito/30_proxmox_bootstrap_playbook.yml -e 'ansible_user=root'`
* This single playbook will:
* Set up SSH key access for root
* Create the counterweight user with SSH keys
* Update and secure the system
* Disable root login and password authentication
* Test the final configuration
* For all future playbooks targeting nodito, use the default configuration (no overrides needed).
Note that, by applying these playbooks, both the root user and the `counterweight` user will use the same SSH pubkey for auth, but root login will be disabled.
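A quick manual check (using the placeholder host and key names from `example.inventory.ini`) can confirm the bootstrap worked before moving on:
```bash
# Key-based login as the new user, with passwordless sudo (placeholder host/key names)
ssh -i ~/.ssh/your-key counterweight@your.proxmox.ip.here 'whoami && sudo -n true && echo "passwordless sudo OK"'

# Root login and password authentication should now be rejected
ssh -o PreferredAuthentications=password -o PubkeyAuthentication=no root@your.proxmox.ip.here \
  || echo "root/password login disabled, as expected"
```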
### Switch to Community Repositories
* Proxmox VE installations typically come with enterprise repositories enabled, which require a subscription. To avoid subscription warnings and use the community repositories instead:
* Run the repository switch with: `ansible-playbook -i inventory.ini infra/nodito/31_proxmox_community_repos_playbook.yml`
* This playbook will:
* Detect whether your Proxmox installation uses modern deb822 format (Proxmox VE 9) or legacy format (Proxmox VE 8)
* Remove enterprise repository files and create community repository files
* Disable subscription nag messages in both web and mobile interfaces
* Update Proxmox packages from the community repository
* Verify the changes are working correctly
* After running this playbook, clear your browser cache or perform a hard reload (Ctrl+Shift+R) before using the Proxmox VE Web UI to avoid UI display issues.
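* The last two checks mirror what the playbook itself verifies; you can also run them by hand on nodito if you want to confirm the switch:
```bash
# The no-subscription repository should be the one serving proxmox-ve
apt-cache policy proxmox-ve | grep -B1 -A1 download.proxmox.com

# Confirm the installed Proxmox VE version
pveversion
```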
### Deploy CPU Temperature Monitoring
* The nodito server can be configured with CPU temperature monitoring that sends alerts to Uptime Kuma when temperatures exceed a threshold.
* Before running the CPU temperature monitoring playbook, you need to create a secrets file with your Uptime Kuma push URL:
* Create `ansible/infra/nodito/nodito_secrets.yml` with:
```yaml
uptime_kuma_url: "https://your-uptime-kuma.com/api/push/your-push-key"
```
* Run the CPU temperature monitoring setup with: `ansible-playbook -i inventory.ini infra/nodito/40_cpu_temp_alerts.yml`
* This will:
* Install required packages (lm-sensors, curl, jq, bc)
* Create a monitoring script that checks CPU temperature every minute
* Set up a systemd service and timer for automated monitoring
* Send alerts to Uptime Kuma when temperature exceeds the threshold (default: 80°C)
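* As a rough sketch of the kind of check the installed script performs (the exact script lives in the playbook; the `sensors -j` field names vary by CPU, and the push URL is the one from `nodito_secrets.yml`):
```bash
#!/bin/bash
# Minimal sketch, NOT the deployed script: read the hottest temp*_input value
# reported by lm-sensors and push to Uptime Kuma when it exceeds the threshold.
# Exact sensor field names vary per CPU/board; status/msg semantics depend on
# how the Uptime Kuma monitor is configured.
THRESHOLD=80
PUSH_URL="https://your-uptime-kuma.com/api/push/your-push-key"   # from nodito_secrets.yml

TEMP=$(sensors -j | jq '[.. | objects | to_entries[] | select(.key | test("^temp[0-9]+_input$")) | .value] | max')
if [ "$(echo "$TEMP > $THRESHOLD" | bc -l)" -eq 1 ]; then
  curl -fsS -G "$PUSH_URL" \
    --data-urlencode "status=up" \
    --data-urlencode "msg=CPU temperature ${TEMP}C exceeds ${THRESHOLD}C"
fi
```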
### Setup ZFS Storage Pool
* The nodito server can be configured with a ZFS RAID 1 storage pool for Proxmox VM storage, providing redundancy and data integrity.
* Before running the ZFS pool setup playbook, you need to identify your disk IDs and configure them in the variables file:
* SSH into your nodito server and run: `ls -la /dev/disk/by-id/ | grep -E "(ata-|scsi-|nvme-)"`
* This will show you the persistent disk identifiers for all your disks. Look for the two disks you want to use for the ZFS pool.
* Example output:
```
lrwxrwxrwx 1 root root 9 Dec 15 10:30 ata-WDC_WD40EFRX-68N32N0_WD-WCC7K1234567 -> ../../sdb
lrwxrwxrwx 1 root root 9 Dec 15 10:30 ata-WDC_WD40EFRX-68N32N0_WD-WCC7K7654321 -> ../../sdc
```
* Update `ansible/infra/nodito/nodito_vars.yml` with your actual disk IDs:
```yaml
zfs_disk_1: "/dev/disk/by-id/ata-WDC_WD40EFRX-68N32N0_WD-WCC7K1234567"
zfs_disk_2: "/dev/disk/by-id/ata-WDC_WD40EFRX-68N32N0_WD-WCC7K7654321"
```
* Run the ZFS pool setup with: `ansible-playbook -i inventory.ini infra/nodito/32_zfs_pool_setup_playbook.yml`
* This will:
* Validate Proxmox VE and ZFS installation
* Install ZFS utilities and kernel modules
* Create a RAID 1 (mirror) ZFS pool named `proxmox-storage` with optimized settings
* Configure ZFS pool properties (ashift=12, compression=lz4, atime=off, etc.)
* Export and re-import the pool for Proxmox compatibility
* Configure Proxmox to use the ZFS pool storage (zfspool type)
* Enable ZFS services for automatic pool import on boot
* **Warning**: This will destroy all data on the specified disks. Make sure you're using the correct disk IDs and that the disks don't contain important data.
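* Once the playbook finishes, a few read-only commands on nodito can confirm the mirror is healthy and registered with Proxmox (pool name as configured above):
```bash
zpool status proxmox-storage          # both mirror members should show ONLINE, no errors
zfs get compression,atime proxmox-storage
pvesm status                          # the zfspool-type storage should be listed and active
```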
## General prep for all machines
### Set up Infrastructure Secrets
* Create `ansible/infra_secrets.yml` based on the example file:
```bash
cp ansible/infra_secrets.yml.example ansible/infra_secrets.yml
```
* Edit `ansible/infra_secrets.yml` and add your Uptime Kuma credentials:
```yaml
uptime_kuma_username: "admin"
uptime_kuma_password: "your_password"
```
* **Important**: Never commit this file to version control (it's in `.gitignore`)
### Deploy Disk Usage Monitoring
* Any machine can be configured with disk usage monitoring that sends alerts to Uptime Kuma when disk usage exceeds a threshold.
* This playbook automatically creates an Uptime Kuma push monitor for each host (idempotent - won't create duplicates).
* Prerequisites:
* Install the Uptime Kuma Ansible collection: `ansible-galaxy collection install -r ansible/requirements.yml`
* Install Python dependencies: `pip install -r requirements.txt` (includes uptime-kuma-api)
* Set up `ansible/infra_secrets.yml` with your Uptime Kuma credentials (see above)
* Uptime Kuma must be deployed (the playbook automatically uses the URL from `uptime_kuma_vars.yml`)
* Run the disk monitoring setup with:
```bash
ansible-playbook -i inventory.ini infra/410_disk_usage_alerts.yml
```
* This will:
* Create an Uptime Kuma monitor group per host named "{hostname} - infra" (idempotent)
* Create a push monitor in Uptime Kuma with "upside down" mode (no news is good news)
* Assign the monitor to the host's group for better organization
* Install required packages (curl, bc)
* Create a monitoring script that checks disk usage at configured intervals (default: 15 minutes)
* Set up a systemd service and timer for automated monitoring
* Send alerts to Uptime Kuma only when usage exceeds threshold (default: 80%)
* Optional configuration:
* Change threshold: `-e "disk_usage_threshold_percent=85"`
* Change check interval: `-e "disk_check_interval_minutes=10"`
* Monitor different mount point: `-e "monitored_mount_point=/home"`
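* For example, to monitor `/home` on a single host with a higher threshold and a faster check interval (the `--limit` flag is optional and just restricts the run to one host):
```bash
ansible-playbook -i inventory.ini infra/410_disk_usage_alerts.yml \
  --limit vipy \
  -e "disk_usage_threshold_percent=85" \
  -e "disk_check_interval_minutes=10" \
  -e "monitored_mount_point=/home"
```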
## GPG Keys
Some of the backups are stored encrypted for security. To allow this, fill in the gpg variables listed in `example.inventory.ini` under the `lapy` block.
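If you need a new key, a minimal sketch for creating one and listing the details the inventory asks for (the uid below is just an example; the exact variables to fill are listed in `example.inventory.ini`):
```bash
# Create a key without expiry and list its long ID and fingerprint
gpg --quick-generate-key "backups@contrapeso.xyz" default default never
gpg --list-secret-keys --keyid-format long backups@contrapeso.xyz
```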


@@ -237,20 +237,30 @@ Headscale is a self-hosted Tailscale control server that allows you to create yo
### Configure
* **Network Security**: The network starts with a deny-all policy - no devices can communicate with each other until you explicitly configure ACL rules in `/etc/headscale/acl.json`.
* After deployment, you need to create a namespace and generate pre-auth keys for your devices.
* SSH into your VPS and run the following commands:
```bash
# Create a namespace
headscale user create counter-net
# Generate a pre-auth key for device registration
headscale preauthkeys create --user 1 # Assumes you've only created one user
```
* Copy the generated pre-auth key - you'll need it to register your devices.
* After deployment, the namespace specified in `services/headscale/headscale_vars.yml` is automatically created.
### Connect devices
#### Automated method (for servers reachable via SSH from lapy)
* Use the Ansible playbook to automatically join machines to the mesh:
```bash
ansible-playbook -i inventory.ini infra/920_join_headscale_mesh.yml --limit <target-host>
```
* The playbook will:
* Generate an ephemeral pre-auth key (expires in 1 minute) by SSHing from lapy to the headscale server
* Install Tailscale on the target machine
* Configure Tailscale to connect to your headscale server
* Enable magic DNS so devices can talk to each other by hostname
#### Manual method (for mobile apps, desktop clients, etc.)
* Install Tailscale on your devices (mobile apps, desktop clients, etc.).
* Generate a pre-auth key by SSHing into your headscale server:
```bash
ssh <headscale-server>
sudo headscale preauthkeys create --user counter-net --reusable
```
* Instead of using the default Tailscale login, use your headscale server:
* Server URL: `https://headscale.contrapeso.xyz` (or your configured domain)
* Use the pre-auth key you generated above
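* On a Linux device, the equivalent command-line registration looks like this (mobile and desktop apps expose the same server URL and pre-auth key fields in their login settings):
```bash
sudo tailscale up \
  --login-server https://headscale.contrapeso.xyz \
  --authkey <pre-auth-key> \
  --accept-dns=true

tailscale status   # should list the other members of the mesh
```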

DEPENDENCY_GRAPH.md Normal file

@@ -0,0 +1,419 @@
# Infrastructure Dependency Graph
This document maps out the dependencies between all infrastructure components and services, providing a clear order for building out the personal infrastructure.
## Infrastructure Overview
### Machines (Hosts)
- **lapy**: Laptop (Ansible control node)
- **vipy**: Main VPS (207.154.226.192) - hosts most services
- **watchtower**: Monitoring VPS (206.189.63.167) - hosts Uptime Kuma and ntfy
- **spacey**: Headscale VPS (165.232.73.4) - hosts Headscale coordination server
- **nodito**: Proxmox server (192.168.1.139) - home infrastructure
- **memos-box**: Separate box for memos (192.168.1.149)
---
## Dependency Layers
### Layer 0: Prerequisites (No Dependencies)
These must exist before anything else can be deployed.
#### On lapy (Laptop - Ansible Control Node)
- Python venv with Ansible
- SSH keys configured
- Domain name configured (`root_domain` in `infra_vars.yml`)
**Commands:**
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
ansible-galaxy collection install -r ansible/requirements.yml
```
---
### Layer 1: Basic Machine Setup (Depends on: Layer 0)
Initial machine provisioning and security hardening.
#### All VPSs (vipy, watchtower, spacey)
**Playbooks (in order):**
1. `infra/01_user_and_access_setup_playbook.yml` - Create user, setup SSH
2. `infra/02_firewall_and_fail2ban_playbook.yml` - Firewall, fail2ban, auditd
**Dependencies:**
- SSH access with root user
- SSH key pair
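A typical first pass against one of the VPSs might look like the following; the root override on the first playbook is an assumption about how the initial connection is made, so adjust it to whatever your inventory uses for the first run:
```bash
# Run from lapy, inside the venv (Layer 0)
ansible-playbook -i inventory.ini infra/01_user_and_access_setup_playbook.yml --limit spacey -e 'ansible_user=root'
ansible-playbook -i inventory.ini infra/02_firewall_and_fail2ban_playbook.yml --limit spacey
```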
#### Nodito (Proxmox Server)
**Playbooks (in order):**
1. `infra/nodito/30_proxmox_bootstrap_playbook.yml` - SSH keys, user creation, security
2. `infra/nodito/31_proxmox_community_repos_playbook.yml` - Switch to community repos
3. `infra/nodito/32_zfs_pool_setup_playbook.yml` - ZFS storage pool (optional)
4. `infra/nodito/33_proxmox_debian_cloud_template.yml` - Cloud template (optional)
**Dependencies:**
- Root password access initially
- Disk IDs identified for ZFS (if using ZFS)
#### Memos-box
**Playbooks:**
1. `infra/01_user_and_access_setup_playbook.yml`
2. `infra/02_firewall_and_fail2ban_playbook.yml`
---
### Layer 2: General Infrastructure Tools (Depends on: Layer 1)
Common utilities needed across multiple services.
#### On All Machines (as needed per service requirements)
**Playbooks:**
- `infra/900_install_rsync.yml` - For backup operations
- `infra/910_docker_playbook.yml` - For Docker-based services
- `infra/920_join_headscale_mesh.yml` - Join machines to VPN mesh (requires Layer 5 - Headscale)
**Dependencies:**
- Layer 1 complete (user and firewall setup)
**Notes:**
- rsync needed on: vipy, watchtower, lapy (for backups)
- docker needed on: vipy, watchtower (for containerized services)
---
### Layer 3: Reverse Proxy (Depends on: Layer 2)
Caddy provides HTTPS termination and reverse proxying for all web services.
#### On vipy, watchtower, spacey, and memos-box
**Playbook:**
- `services/caddy_playbook.yml`
**Dependencies:**
- Layer 1 complete (firewall configured to allow ports 80/443)
- No other services required
**Critical Note:**
- Caddy is deployed to vipy, watchtower, spacey, and memos-box
- Each service deployed configures its own Caddy reverse proxy automatically
- All subsequent web services depend on Caddy being installed first
---
### Layer 4: Core Monitoring & Notifications (Depends on: Layer 3)
These services provide monitoring and alerting for all other infrastructure.
#### 4A: ntfy (Notification Service)
**Host:** watchtower
**Playbook:** `services/ntfy/deploy_ntfy_playbook.yml`
**Dependencies:**
- Caddy on watchtower (Layer 3)
- DNS record for ntfy subdomain
- NTFY_USER and NTFY_PASSWORD environment variables
**Used By:**
- Uptime Kuma (for notifications)
- ntfy-emergency-app
- Any service needing push notifications
#### 4B: Uptime Kuma (Monitoring Platform)
**Host:** watchtower
**Playbook:** `services/uptime_kuma/deploy_uptime_kuma_playbook.yml`
**Dependencies:**
- Caddy on watchtower (Layer 3)
- Docker on watchtower (Layer 2)
- DNS record for uptime kuma subdomain
**Used By:**
- All infrastructure monitoring (disk alerts, healthchecks, CPU temp)
- Service availability monitoring
**Backup:** `services/uptime_kuma/setup_backup_uptime_kuma_to_lapy.yml`
- Requires rsync on watchtower and lapy
---
### Layer 5: VPN Infrastructure (Depends on: Layer 3)
Headscale provides secure mesh networking between all machines.
#### Headscale (VPN Coordination Server)
**Host:** spacey
**Playbook:** `services/headscale/deploy_headscale_playbook.yml`
**Dependencies:**
- Caddy on spacey (Layer 3)
- DNS record for headscale subdomain
**Enables:**
- Secure communication between all machines
- Magic DNS for hostname resolution
- Join machines using: `infra/920_join_headscale_mesh.yml`
**Backup:** `services/headscale/setup_backup_headscale_to_lapy.yml`
- Requires rsync on spacey and lapy
---
### Layer 6: Infrastructure Monitoring (Depends on: Layer 4)
Automated monitoring scripts that report to Uptime Kuma.
#### On All Machines
**Playbooks:**
- `infra/410_disk_usage_alerts.yml` - Disk usage monitoring
- `infra/420_system_healthcheck.yml` - System health pings
**Dependencies:**
- Uptime Kuma deployed (Layer 4B)
- `infra_secrets.yml` with Uptime Kuma credentials
- Python uptime-kuma-api installed on lapy
#### On Nodito Only
**Playbook:**
- `infra/nodito/40_cpu_temp_alerts.yml` - CPU temperature monitoring
**Dependencies:**
- Uptime Kuma deployed (Layer 4B)
- `nodito_secrets.yml` with Uptime Kuma push URL
---
### Layer 7: Core Services (Depends on: Layers 3-4)
Essential services for personal infrastructure.
#### 7A: Vaultwarden (Password Manager)
**Host:** vipy
**Playbook:** `services/vaultwarden/deploy_vaultwarden_playbook.yml`
**Dependencies:**
- Caddy on vipy (Layer 3)
- Docker on vipy (Layer 2)
- Fail2ban on vipy (Layer 1)
- DNS record for vaultwarden subdomain
**Post-Deploy:**
- Create first user account
- Run `services/vaultwarden/disable_vaultwarden_sign_ups_playbook.yml` to disable registrations
**Backup:** `services/vaultwarden/setup_backup_vaultwarden_to_lapy.yml`
- Requires rsync on vipy and lapy
#### 7B: Forgejo (Git Server)
**Host:** vipy
**Playbook:** `services/forgejo/deploy_forgejo_playbook.yml`
**Dependencies:**
- Caddy on vipy (Layer 3)
- DNS record for forgejo subdomain
**Used By:**
- Personal blog (Layer 8)
- Any service pulling from git repos
#### 7C: LNBits (Lightning Wallet)
**Host:** vipy
**Playbook:** `services/lnbits/deploy_lnbits_playbook.yml`
**Dependencies:**
- Caddy on vipy (Layer 3)
- DNS record for lnbits subdomain
- Python 3.12 via pyenv
- Poetry for dependency management
**Backup:** `services/lnbits/setup_backup_lnbits_to_lapy.yml`
- Requires rsync on vipy and lapy
- Backups are GPG encrypted (requires GPG keys configured)
---
### Layer 8: Secondary Services (Depends on: Layer 7)
Services that depend on core services being available.
#### 8A: Personal Blog (Static Site)
**Host:** vipy
**Playbook:** `services/personal-blog/deploy_personal_blog_playbook.yml`
**Dependencies:**
- Caddy on vipy (Layer 3)
- Forgejo on vipy (Layer 7B) - blog content hosted in Forgejo repo
- rsync on vipy (Layer 2)
- DNS record for blog subdomain
- PERSONAL_BLOG_DEPLOY_TOKEN environment variable (Forgejo deploy token)
**Notes:**
- Auto-updates hourly via cron from Forgejo repo
- Serves static files directly through Caddy
#### 8B: ntfy-emergency-app
**Host:** vipy
**Playbook:** `services/ntfy-emergency-app/deploy_ntfy_emergency_app_playbook.yml`
**Dependencies:**
- Caddy on vipy (Layer 3)
- Docker on vipy (Layer 2)
- ntfy on watchtower (Layer 4A)
- DNS record for emergency app subdomain
**Notes:**
- Configured with ntfy server URL and credentials
- Sends emergency notifications to ntfy topics
#### 8C: Memos (Note-taking)
**Host:** memos-box
**Playbook:** `services/memos/deploy_memos_playbook.yml`
**Dependencies:**
- Caddy on memos-box (Layer 3)
- DNS record for memos subdomain
---
## Deployment Order Summary
### Phase 1: Foundation
1. Setup lapy as Ansible control node
2. Configure domain and DNS
3. Deploy Layer 1 on all machines (users, firewall)
4. Deploy Layer 2 tools (rsync, docker as needed)
### Phase 2: Web Infrastructure
5. Deploy Caddy (Layer 3) on vipy, watchtower, spacey
### Phase 3: Monitoring Foundation
6. Deploy ntfy on watchtower (Layer 4A)
7. Deploy Uptime Kuma on watchtower (Layer 4B)
8. Configure Uptime Kuma with ntfy notifications
### Phase 4: Mesh Network (Optional but Recommended)
9. Deploy Headscale on spacey (Layer 5)
10. Join machines to mesh using 920 playbook
### Phase 5: Infrastructure Monitoring
11. Deploy disk usage alerts on all machines (Layer 6)
12. Deploy system healthcheck on all machines (Layer 6)
13. Deploy CPU temp alerts on nodito (Layer 6)
### Phase 6: Core Services
14. Deploy Vaultwarden on vipy (Layer 7A)
15. Deploy Forgejo on vipy (Layer 7B)
16. Deploy LNBits on vipy (Layer 7C)
### Phase 7: Secondary Services
17. Deploy Personal Blog on vipy (Layer 8A)
18. Deploy ntfy-emergency-app on vipy (Layer 8B)
19. Deploy Memos on memos-box (Layer 8C)
### Phase 8: Backups
20. Configure all backup playbooks (to lapy)
---
## Critical Dependencies Map
```
Legend: → (depends on)
MONITORING CHAIN:
ntfy (Layer 4A) → Caddy (Layer 3)
Uptime Kuma (Layer 4B) → Caddy (Layer 3) + Docker (Layer 2) + ntfy (Layer 4A)
Disk Alerts (Layer 6) → Uptime Kuma (Layer 4B)
System Healthcheck (Layer 6) → Uptime Kuma (Layer 4B)
CPU Temp Alerts (Layer 6) → Uptime Kuma (Layer 4B)
WEB SERVICES CHAIN:
Caddy (Layer 3) → Firewall configured (Layer 1)
Vaultwarden (Layer 7A) → Caddy (Layer 3) + Docker (Layer 2)
Forgejo (Layer 7B) → Caddy (Layer 3)
LNBits (Layer 7C) → Caddy (Layer 3)
Personal Blog (Layer 8A) → Caddy (Layer 3) + Forgejo (Layer 7B)
ntfy-emergency-app (Layer 8B) → Caddy (Layer 3) + Docker (Layer 2) + ntfy (Layer 4A)
Memos (Layer 8C) → Caddy (Layer 3)
VPN CHAIN:
Headscale (Layer 5) → Caddy (Layer 3)
All machines can join mesh → Headscale (Layer 5)
BACKUP CHAIN:
All backups → rsync (Layer 2) on source + lapy
LNBits backups → GPG keys configured on lapy
```
---
## Host-Service Matrix
| Service | vipy | watchtower | spacey | nodito | memos-box |
|---------|------|------------|--------|--------|-----------|
| Caddy | ✓ | ✓ | ✓ | - | ✓ |
| Docker | ✓ | ✓ | - | - | - |
| Uptime Kuma | - | ✓ | - | - | - |
| ntfy | - | ✓ | - | - | - |
| Headscale | - | - | ✓ | - | - |
| Vaultwarden | ✓ | - | - | - | - |
| Forgejo | ✓ | - | - | - | - |
| LNBits | ✓ | - | - | - | - |
| Personal Blog | ✓ | - | - | - | - |
| ntfy-emergency-app | ✓ | - | - | - | - |
| Memos | - | - | - | - | ✓ |
| Disk Alerts | ✓ | ✓ | ✓ | ✓ | ✓ |
| System Healthcheck | ✓ | ✓ | ✓ | ✓ | ✓ |
| CPU Temp Alerts | - | - | - | ✓ | - |
---
## Pre-Deployment Checklist
### Before Starting
- [ ] SSH keys generated and added to VPS providers
- [ ] Domain name acquired and accessible
- [ ] Python venv created on lapy with Ansible installed
- [ ] `inventory.ini` created and populated with all host IPs
- [ ] `infra_vars.yml` configured with root domain
- [ ] All VPSs accessible via SSH as root initially
### DNS Records to Configure
Create A records pointing to appropriate IPs:
- Uptime Kuma subdomain → watchtower IP
- ntfy subdomain → watchtower IP
- Headscale subdomain → spacey IP
- Vaultwarden subdomain → vipy IP
- Forgejo subdomain → vipy IP
- LNBits subdomain → vipy IP
- Personal Blog subdomain → vipy IP
- ntfy-emergency-app subdomain → vipy IP
- Memos subdomain → memos-box IP
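Before deploying, you can spot-check that the records resolve to the right machines (the subdomain names below are examples; use the ones you configured in `services_config.yml`):
```bash
# Spot-check a few records with dig
for sub in uptime ntfy headscale vaultwarden forgejo lnbits blog emergency memos; do
  printf '%-12s -> %s\n' "$sub" "$(dig +short "$sub.contrapeso.xyz" A)"
done
```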
### Secrets to Configure
- [ ] `infra_secrets.yml` created with Uptime Kuma credentials
- [ ] `nodito_secrets.yml` created with Uptime Kuma push URL
- [ ] NTFY_USER and NTFY_PASSWORD environment variables for ntfy deployment
- [ ] PERSONAL_BLOG_DEPLOY_TOKEN environment variable (from Forgejo)
- [ ] GPG keys configured on lapy (for encrypted backups)
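The environment variables are assumed to be read from the shell that runs the corresponding playbooks, so export them on lapy beforehand (values are placeholders):
```bash
export NTFY_USER="admin"
export NTFY_PASSWORD="change-me"
export PERSONAL_BLOG_DEPLOY_TOKEN="<deploy token generated in Forgejo>"
```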
---
## Notes
### Why This Order Matters
1. **Caddy First**: All web services need reverse proxy, so Caddy must be deployed before any service that requires HTTPS access.
2. **Monitoring Early**: Deploying ntfy and Uptime Kuma early means all subsequent services can be monitored from the start. Infrastructure alerts can catch issues immediately.
3. **Forgejo Before Blog**: The personal blog pulls content from Forgejo, so the git server must exist first.
4. **Headscale Separation**: Headscale runs on its own VPS (spacey) because vipy needs to be part of the mesh network and can't run the coordination server itself.
5. **Backup Setup Last**: Backups should be configured after services are stable and have initial data to backup.
### Machine Isolation Strategy
- **watchtower**: Runs monitoring services (Uptime Kuma, ntfy) separately so they don't fail when vipy fails
- **spacey**: Runs Headscale coordination server isolated from the mesh clients
- **vipy**: Main services server - most applications run here
- **nodito**: Local Proxmox server for home infrastructure
- **memos-box**: Separate dedicated server for memos service
This isolation ensures monitoring remains functional even when primary services are down.


@@ -0,0 +1,4 @@
new_user: counterweight
ssh_port: 22
allow_ssh_from: "any"
root_domain: contrapeso.xyz


@@ -4,6 +4,12 @@ your.vps.ip.here ansible_user=counterweight ansible_port=22 ansible_ssh_private_
[watchtower]
your.vps.ip.here ansible_user=counterweight ansible_port=22 ansible_ssh_private_key_file=~/.ssh/your-key
[nodito]
your.proxmox.ip.here ansible_user=counterweight ansible_port=22 ansible_ssh_private_key_file=~/.ssh/your-key ansible_ssh_pass=your_root_password
[spacey]
your.vps.ip.here ansible_user=counterweight ansible_port=22 ansible_ssh_private_key_file=~/.ssh/your-key
# Local connection to laptop: this assumes you're running ansible commands from your personal laptop
# Make sure to adjust the username
[lapy]


@@ -1,5 +1,5 @@
- name: Secure Debian VPS
hosts: vipy,watchtower
hosts: vipy,watchtower,spacey
vars_files:
- ../infra_vars.yml
become: true


@@ -1,5 +1,5 @@
- name: Secure Debian VPS
hosts: vipy,watchtower
hosts: vipy,watchtower,spacey
vars_files:
- ../infra_vars.yml
become: true


@@ -0,0 +1,331 @@
- name: Deploy Disk Usage Monitoring
hosts: all
become: yes
vars_files:
- ../infra_vars.yml
- ../services_config.yml
- ../infra_secrets.yml
- ../services/uptime_kuma/uptime_kuma_vars.yml
- ../services/ntfy/ntfy_vars.yml
vars:
disk_usage_threshold_percent: 80
disk_check_interval_minutes: 15
monitored_mount_point: "/"
monitoring_script_dir: /opt/disk-monitoring
monitoring_script_path: "{{ monitoring_script_dir }}/disk_usage_monitor.sh"
log_file: "{{ monitoring_script_dir }}/disk_usage_monitor.log"
systemd_service_name: disk-usage-monitor
# Uptime Kuma configuration (auto-configured from services_config.yml and infra_secrets.yml)
uptime_kuma_api_url: "https://{{ subdomains.uptime_kuma }}.{{ root_domain }}"
tasks:
- name: Validate Uptime Kuma configuration
assert:
that:
- uptime_kuma_api_url is defined
- uptime_kuma_api_url != ""
- uptime_kuma_username is defined
- uptime_kuma_username != ""
- uptime_kuma_password is defined
- uptime_kuma_password != ""
fail_msg: "uptime_kuma_api_url, uptime_kuma_username and uptime_kuma_password must be set"
- name: Get hostname for monitor identification
command: hostname
register: host_name
changed_when: false
- name: Set monitor name and group based on hostname and mount point
set_fact:
monitor_name: "disk-usage-{{ host_name.stdout }}-{{ monitored_mount_point | replace('/', 'root') }}"
monitor_friendly_name: "Disk Usage: {{ host_name.stdout }} ({{ monitored_mount_point }})"
uptime_kuma_monitor_group: "{{ host_name.stdout }} - infra"
- name: Create Uptime Kuma monitor setup script
copy:
dest: /tmp/setup_uptime_kuma_monitor.py
content: |
#!/usr/bin/env python3
import sys
import json
from uptime_kuma_api import UptimeKumaApi
def main():
api_url = sys.argv[1]
username = sys.argv[2]
password = sys.argv[3]
group_name = sys.argv[4]
monitor_name = sys.argv[5]
monitor_description = sys.argv[6]
interval = int(sys.argv[7])
ntfy_topic = sys.argv[8] if len(sys.argv) > 8 else "alerts"
api = UptimeKumaApi(api_url, timeout=60, wait_events=2.0)
api.login(username, password)
# Get all monitors
monitors = api.get_monitors()
# Get all notifications and find ntfy notification
notifications = api.get_notifications()
ntfy_notification = next((n for n in notifications if n.get('name') == f'ntfy ({ntfy_topic})'), None)
notification_id_list = {}
if ntfy_notification:
notification_id_list[ntfy_notification['id']] = True
# Find or create group
group = next((m for m in monitors if m.get('name') == group_name and m.get('type') == 'group'), None)
if not group:
group_result = api.add_monitor(type='group', name=group_name)
# Refresh to get the full group object with id
monitors = api.get_monitors()
group = next((m for m in monitors if m.get('name') == group_name and m.get('type') == 'group'), None)
# Find or create/update push monitor
existing_monitor = next((m for m in monitors if m.get('name') == monitor_name), None)
monitor_data = {
'type': 'push',
'name': monitor_name,
'parent': group['id'],
'interval': interval,
'upsideDown': True,
'description': monitor_description,
'notificationIDList': notification_id_list
}
if existing_monitor:
monitor = api.edit_monitor(existing_monitor['id'], **monitor_data)
# Refresh to get the full monitor object with pushToken
monitors = api.get_monitors()
monitor = next((m for m in monitors if m.get('name') == monitor_name), None)
else:
monitor_result = api.add_monitor(**monitor_data)
# Refresh to get the full monitor object with pushToken
monitors = api.get_monitors()
monitor = next((m for m in monitors if m.get('name') == monitor_name), None)
# Output result as JSON
result = {
'monitor_id': monitor['id'],
'push_token': monitor['pushToken'],
'group_name': group_name,
'group_id': group['id'],
'monitor_name': monitor_name
}
print(json.dumps(result))
api.disconnect()
if __name__ == '__main__':
main()
mode: '0755'
delegate_to: localhost
become: no
- name: Run Uptime Kuma monitor setup script
command: >
{{ ansible_playbook_python }}
/tmp/setup_uptime_kuma_monitor.py
"{{ uptime_kuma_api_url }}"
"{{ uptime_kuma_username }}"
"{{ uptime_kuma_password }}"
"{{ uptime_kuma_monitor_group }}"
"{{ monitor_name }}"
"{{ monitor_friendly_name }} - Alerts when usage exceeds {{ disk_usage_threshold_percent }}%"
"{{ (disk_check_interval_minutes * 60) + 60 }}"
"{{ ntfy_topic }}"
register: monitor_setup_result
delegate_to: localhost
become: no
changed_when: false
- name: Parse monitor setup result
set_fact:
monitor_info_parsed: "{{ monitor_setup_result.stdout | from_json }}"
- name: Set push URL and monitor ID as facts
set_fact:
uptime_kuma_disk_usage_push_url: "{{ uptime_kuma_api_url }}/api/push/{{ monitor_info_parsed.push_token }}"
uptime_kuma_monitor_id: "{{ monitor_info_parsed.monitor_id }}"
- name: Install required packages for disk monitoring
package:
name:
- curl
state: present
- name: Create monitoring script directory
file:
path: "{{ monitoring_script_dir }}"
state: directory
owner: root
group: root
mode: '0755'
- name: Create disk usage monitoring script
copy:
dest: "{{ monitoring_script_path }}"
content: |
#!/bin/bash
# Disk Usage Monitoring Script
# Monitors disk usage and sends alerts to Uptime Kuma
# Mode: "No news is good news" - only sends alerts when disk usage is HIGH
LOG_FILE="{{ log_file }}"
USAGE_THRESHOLD="{{ disk_usage_threshold_percent }}"
UPTIME_KUMA_URL="{{ uptime_kuma_disk_usage_push_url }}"
MOUNT_POINT="{{ monitored_mount_point }}"
# Function to log messages
log_message() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}
# Function to get disk usage percentage
get_disk_usage() {
local mount_point="$1"
local usage=""
# Get disk usage percentage (without % sign)
usage=$(df -h "$mount_point" 2>/dev/null | awk 'NR==2 {gsub(/%/, "", $5); print $5}')
if [ -z "$usage" ]; then
log_message "ERROR: Could not read disk usage for $mount_point"
return 1
fi
echo "$usage"
}
# Function to get disk usage details
get_disk_details() {
local mount_point="$1"
df -h "$mount_point" 2>/dev/null | awk 'NR==2 {print "Used: "$3" / Total: "$2" ("$5" full)"}'
}
# Function to send alert to Uptime Kuma when disk usage exceeds threshold
# With upside-down mode enabled, sending status=up will trigger an alert
send_uptime_kuma_alert() {
local usage="$1"
local details="$2"
local message="DISK FULL WARNING: ${MOUNT_POINT} is ${usage}% full (Threshold: ${USAGE_THRESHOLD}%) - ${details}"
log_message "ALERT: $message"
# Send push notification to Uptime Kuma with status=up
# In upside-down mode, status=up is treated as down/alert
response=$(curl -s -w "\n%{http_code}" -G \
--data-urlencode "status=up" \
--data-urlencode "msg=$message" \
"$UPTIME_KUMA_URL" 2>&1)
http_code=$(echo "$response" | tail -n1)
if [ "$http_code" = "200" ] || [ "$http_code" = "201" ]; then
log_message "Alert sent successfully to Uptime Kuma (HTTP $http_code)"
else
log_message "ERROR: Failed to send alert to Uptime Kuma (HTTP $http_code)"
fi
}
# Main monitoring logic
main() {
log_message "Starting disk usage check for $MOUNT_POINT"
# Get current disk usage
current_usage=$(get_disk_usage "$MOUNT_POINT")
if [ $? -ne 0 ] || [ -z "$current_usage" ]; then
log_message "ERROR: Could not read disk usage"
exit 1
fi
# Get disk details
disk_details=$(get_disk_details "$MOUNT_POINT")
log_message "Current disk usage: ${current_usage}% - $disk_details"
# Check if usage exceeds threshold
if [ "$current_usage" -gt "$USAGE_THRESHOLD" ]; then
log_message "WARNING: Disk usage ${current_usage}% exceeds threshold ${USAGE_THRESHOLD}%"
send_uptime_kuma_alert "$current_usage" "$disk_details"
else
log_message "Disk usage is within normal range - no alert needed (no news is good news)"
fi
}
# Run main function
main
owner: root
group: root
mode: '0755'
- name: Create systemd service for disk usage monitoring
copy:
dest: "/etc/systemd/system/{{ systemd_service_name }}.service"
content: |
[Unit]
Description=Disk Usage Monitor
After=network.target
[Service]
Type=oneshot
ExecStart={{ monitoring_script_path }}
User=root
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
owner: root
group: root
mode: '0644'
- name: Create systemd timer for disk usage monitoring
copy:
dest: "/etc/systemd/system/{{ systemd_service_name }}.timer"
content: |
[Unit]
Description=Run Disk Usage Monitor every {{ disk_check_interval_minutes }} minute(s)
Requires={{ systemd_service_name }}.service
[Timer]
OnBootSec={{ disk_check_interval_minutes }}min
OnUnitActiveSec={{ disk_check_interval_minutes }}min
Persistent=true
[Install]
WantedBy=timers.target
owner: root
group: root
mode: '0644'
- name: Reload systemd daemon
systemd:
daemon_reload: yes
- name: Enable and start disk usage monitoring timer
systemd:
name: "{{ systemd_service_name }}.timer"
enabled: yes
state: started
- name: Test disk usage monitoring script
command: "{{ monitoring_script_path }}"
register: script_test
changed_when: false
- name: Verify script execution
assert:
that:
- script_test.rc == 0
fail_msg: "Disk usage monitoring script failed to execute properly"
- name: Clean up temporary Uptime Kuma setup script
file:
path: /tmp/setup_uptime_kuma_monitor.py
state: absent
delegate_to: localhost
become: no


@@ -0,0 +1,313 @@
- name: Deploy System Healthcheck Monitoring
hosts: all
become: yes
vars_files:
- ../infra_vars.yml
- ../services_config.yml
- ../infra_secrets.yml
- ../services/uptime_kuma/uptime_kuma_vars.yml
- ../services/ntfy/ntfy_vars.yml
vars:
healthcheck_interval_seconds: 60 # Send healthcheck every 60 seconds (1 minute)
healthcheck_timeout_seconds: 90 # Uptime Kuma should alert if no ping received within 90s
healthcheck_retries: 1 # Number of retries before alerting
monitoring_script_dir: /opt/system-healthcheck
monitoring_script_path: "{{ monitoring_script_dir }}/system_healthcheck.sh"
log_file: "{{ monitoring_script_dir }}/system_healthcheck.log"
systemd_service_name: system-healthcheck
# Uptime Kuma configuration (auto-configured from services_config.yml and infra_secrets.yml)
uptime_kuma_api_url: "https://{{ subdomains.uptime_kuma }}.{{ root_domain }}"
tasks:
- name: Validate Uptime Kuma configuration
assert:
that:
- uptime_kuma_api_url is defined
- uptime_kuma_api_url != ""
- uptime_kuma_username is defined
- uptime_kuma_username != ""
- uptime_kuma_password is defined
- uptime_kuma_password != ""
fail_msg: "uptime_kuma_api_url, uptime_kuma_username and uptime_kuma_password must be set"
- name: Get hostname for monitor identification
command: hostname
register: host_name
changed_when: false
- name: Set monitor name and group based on hostname
set_fact:
monitor_name: "system-healthcheck-{{ host_name.stdout }}"
monitor_friendly_name: "System Healthcheck: {{ host_name.stdout }}"
uptime_kuma_monitor_group: "{{ host_name.stdout }} - infra"
- name: Create Uptime Kuma monitor setup script
copy:
dest: /tmp/setup_uptime_kuma_healthcheck_monitor.py
content: |
#!/usr/bin/env python3
import sys
import json
from uptime_kuma_api import UptimeKumaApi
def main():
api_url = sys.argv[1]
username = sys.argv[2]
password = sys.argv[3]
group_name = sys.argv[4]
monitor_name = sys.argv[5]
monitor_description = sys.argv[6]
interval = int(sys.argv[7])
retries = int(sys.argv[8])
ntfy_topic = sys.argv[9] if len(sys.argv) > 9 else "alerts"
api = UptimeKumaApi(api_url, timeout=60, wait_events=2.0)
api.login(username, password)
# Get all monitors
monitors = api.get_monitors()
# Get all notifications and find ntfy notification
notifications = api.get_notifications()
ntfy_notification = next((n for n in notifications if n.get('name') == f'ntfy ({ntfy_topic})'), None)
notification_id_list = {}
if ntfy_notification:
notification_id_list[ntfy_notification['id']] = True
# Find or create group
group = next((m for m in monitors if m.get('name') == group_name and m.get('type') == 'group'), None)
if not group:
group_result = api.add_monitor(type='group', name=group_name)
# Refresh to get the full group object with id
monitors = api.get_monitors()
group = next((m for m in monitors if m.get('name') == group_name and m.get('type') == 'group'), None)
# Find or create/update push monitor
existing_monitor = next((m for m in monitors if m.get('name') == monitor_name), None)
monitor_data = {
'type': 'push',
'name': monitor_name,
'parent': group['id'],
'interval': interval,
'upsideDown': False, # Normal mode: receiving pings = healthy
'maxretries': retries,
'description': monitor_description,
'notificationIDList': notification_id_list
}
if existing_monitor:
monitor = api.edit_monitor(existing_monitor['id'], **monitor_data)
# Refresh to get the full monitor object with pushToken
monitors = api.get_monitors()
monitor = next((m for m in monitors if m.get('name') == monitor_name), None)
else:
monitor_result = api.add_monitor(**monitor_data)
# Refresh to get the full monitor object with pushToken
monitors = api.get_monitors()
monitor = next((m for m in monitors if m.get('name') == monitor_name), None)
# Output result as JSON
result = {
'monitor_id': monitor['id'],
'push_token': monitor['pushToken'],
'group_name': group_name,
'group_id': group['id'],
'monitor_name': monitor_name
}
print(json.dumps(result))
api.disconnect()
if __name__ == '__main__':
main()
mode: '0755'
delegate_to: localhost
become: no
- name: Run Uptime Kuma monitor setup script
command: >
{{ ansible_playbook_python }}
/tmp/setup_uptime_kuma_healthcheck_monitor.py
"{{ uptime_kuma_api_url }}"
"{{ uptime_kuma_username }}"
"{{ uptime_kuma_password }}"
"{{ uptime_kuma_monitor_group }}"
"{{ monitor_name }}"
"{{ monitor_friendly_name }} - Regular healthcheck ping every {{ healthcheck_interval_seconds }}s"
"{{ healthcheck_timeout_seconds }}"
"{{ healthcheck_retries }}"
"{{ ntfy_topic }}"
register: monitor_setup_result
delegate_to: localhost
become: no
changed_when: false
- name: Parse monitor setup result
set_fact:
monitor_info_parsed: "{{ monitor_setup_result.stdout | from_json }}"
- name: Set push URL and monitor ID as facts
set_fact:
uptime_kuma_healthcheck_push_url: "{{ uptime_kuma_api_url }}/api/push/{{ monitor_info_parsed.push_token }}"
uptime_kuma_monitor_id: "{{ monitor_info_parsed.monitor_id }}"
- name: Install required packages for healthcheck monitoring
package:
name:
- curl
state: present
- name: Create monitoring script directory
file:
path: "{{ monitoring_script_dir }}"
state: directory
owner: root
group: root
mode: '0755'
- name: Create system healthcheck script
copy:
dest: "{{ monitoring_script_path }}"
content: |
#!/bin/bash
# System Healthcheck Script
# Sends regular heartbeat pings to Uptime Kuma
# This ensures the system is running and able to communicate
LOG_FILE="{{ log_file }}"
UPTIME_KUMA_URL="{{ uptime_kuma_healthcheck_push_url }}"
HOSTNAME=$(hostname)
# Function to log messages
log_message() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}
# Function to send healthcheck ping to Uptime Kuma
send_healthcheck() {
local uptime_seconds=$(awk '{print int($1)}' /proc/uptime)
local uptime_days=$((uptime_seconds / 86400))
local uptime_hours=$(((uptime_seconds % 86400) / 3600))
local uptime_minutes=$(((uptime_seconds % 3600) / 60))
local message="System healthy - Uptime: ${uptime_days}d ${uptime_hours}h ${uptime_minutes}m"
log_message "Sending healthcheck ping: $message"
# Send push notification to Uptime Kuma with status=up
encoded_message=$(printf '%s\n' "$message" | sed 's/ /%20/g; s/(/%28/g; s/)/%29/g; s/:/%3A/g; s/\//%2F/g')
response=$(curl -s -w "\n%{http_code}" "$UPTIME_KUMA_URL?status=up&msg=$encoded_message" 2>&1)
http_code=$(echo "$response" | tail -n1)
if [ "$http_code" = "200" ] || [ "$http_code" = "201" ]; then
log_message "Healthcheck ping sent successfully (HTTP $http_code)"
else
log_message "ERROR: Failed to send healthcheck ping (HTTP $http_code)"
return 1
fi
}
# Main healthcheck logic
main() {
log_message "Starting system healthcheck for $HOSTNAME"
# Send healthcheck ping
if send_healthcheck; then
log_message "Healthcheck completed successfully"
else
log_message "ERROR: Healthcheck failed"
exit 1
fi
}
# Run main function
main
owner: root
group: root
mode: '0755'
- name: Create systemd service for system healthcheck
copy:
dest: "/etc/systemd/system/{{ systemd_service_name }}.service"
content: |
[Unit]
Description=System Healthcheck Monitor
After=network.target
[Service]
Type=oneshot
ExecStart={{ monitoring_script_path }}
User=root
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
owner: root
group: root
mode: '0644'
- name: Create systemd timer for system healthcheck
copy:
dest: "/etc/systemd/system/{{ systemd_service_name }}.timer"
content: |
[Unit]
Description=Run System Healthcheck every minute
Requires={{ systemd_service_name }}.service
[Timer]
OnBootSec=30sec
OnUnitActiveSec={{ healthcheck_interval_seconds }}sec
Persistent=true
[Install]
WantedBy=timers.target
owner: root
group: root
mode: '0644'
- name: Reload systemd daemon
systemd:
daemon_reload: yes
- name: Enable and start system healthcheck timer
systemd:
name: "{{ systemd_service_name }}.timer"
enabled: yes
state: started
- name: Test system healthcheck script
command: "{{ monitoring_script_path }}"
register: script_test
changed_when: false
- name: Verify script execution
assert:
that:
- script_test.rc == 0
fail_msg: "System healthcheck script failed to execute properly"
- name: Display monitor information
debug:
msg: |
✓ System healthcheck monitoring deployed successfully!
Monitor Name: {{ monitor_friendly_name }}
Monitor Group: {{ uptime_kuma_monitor_group }}
Healthcheck Interval: Every {{ healthcheck_interval_seconds }} seconds (1 minute)
Timeout: {{ healthcheck_timeout_seconds }} seconds (90s)
Retries: {{ healthcheck_retries }}
The system will send a heartbeat ping every minute.
Uptime Kuma will alert if no ping is received within 90 seconds (with 1 retry).
- name: Clean up temporary Uptime Kuma setup script
file:
path: /tmp/setup_uptime_kuma_healthcheck_monitor.py
state: absent
delegate_to: localhost
become: no


@@ -0,0 +1,118 @@
- name: Join machine to headscale mesh network
hosts: all
become: yes
vars_files:
- ../infra_vars.yml
- ../services/headscale/headscale_vars.yml
vars:
headscale_domain: "https://{{ headscale_subdomain }}.{{ root_domain }}"
tasks:
- name: Set headscale host
set_fact:
headscale_host: "{{ groups['spacey'][0] }}"
- name: Set facts for headscale server connection
set_fact:
headscale_user: "{{ hostvars[headscale_host]['ansible_user'] }}"
headscale_key: "{{ hostvars[headscale_host]['ansible_ssh_private_key_file'] | default('') }}"
headscale_port: "{{ hostvars[headscale_host]['ansible_port'] | default(22) }}"
- name: Get user ID for namespace from headscale server via lapy
delegate_to: "{{ groups['lapy'][0] }}"
become: no
vars:
ssh_args: "{{ ('-i ' + headscale_key + ' ' if headscale_key else '') + '-p ' + headscale_port|string }}"
shell: >
ssh {{ ssh_args }}
{{ headscale_user }}@{{ headscale_host }}
"sudo headscale users list -o json"
register: users_list_result
changed_when: false
failed_when: users_list_result.rc != 0
- name: Extract user ID from users list
set_fact:
headscale_user_id: "{{ (users_list_result.stdout | from_json) | selectattr('name', 'equalto', headscale_namespace) | map(attribute='id') | first }}"
failed_when: headscale_user_id is not defined or headscale_user_id == ''
- name: Generate pre-auth key from headscale server via lapy
delegate_to: "{{ groups['lapy'][0] }}"
become: no
vars:
ssh_args: "{{ ('-i ' + headscale_key + ' ' if headscale_key else '') + '-p ' + headscale_port|string }}"
shell: >
ssh {{ ssh_args }}
{{ headscale_user }}@{{ headscale_host }}
"sudo headscale preauthkeys create --user {{ headscale_user_id }} --expiration 1m --output json"
register: preauth_key_result
changed_when: true
failed_when: preauth_key_result.rc != 0
- name: Extract auth key from preauth result
set_fact:
auth_key: "{{ (preauth_key_result.stdout | from_json).key }}"
failed_when: auth_key is not defined or auth_key == ''
- name: Install required packages for Tailscale
apt:
name:
- curl
- ca-certificates
- gnupg
state: present
update_cache: yes
- name: Create directory for GPG keyrings
file:
path: /etc/apt/keyrings
state: directory
mode: '0755'
- name: Download Tailscale GPG key
get_url:
url: https://pkgs.tailscale.com/stable/debian/bookworm.gpg
dest: /etc/apt/keyrings/tailscale.gpg
mode: '0644'
- name: Add Tailscale repository
apt_repository:
repo: "deb [signed-by=/etc/apt/keyrings/tailscale.gpg] https://pkgs.tailscale.com/stable/debian {{ ansible_lsb.codename }} main"
state: present
update_cache: yes
- name: Install Tailscale
apt:
name: tailscale
state: present
update_cache: yes
- name: Enable and start Tailscale service
systemd:
name: tailscaled
enabled: yes
state: started
- name: Configure Tailscale to use headscale server
command: >
tailscale up
--login-server {{ headscale_domain }}
--authkey {{ auth_key }}
--accept-dns=true
register: tailscale_up_result
changed_when: "'already authenticated' not in tailscale_up_result.stdout"
failed_when: tailscale_up_result.rc != 0 and 'already authenticated' not in tailscale_up_result.stdout
- name: Wait for Tailscale to be fully connected
pause:
seconds: 2
- name: Display Tailscale status
command: tailscale status
register: tailscale_status
changed_when: false
- name: Show Tailscale connection status
debug:
msg: "{{ tailscale_status.stdout_lines }}"


@@ -0,0 +1,128 @@
- name: Bootstrap Nodito SSH Key Access
hosts: nodito
become: true
vars_files:
- ../infra_vars.yml
tasks:
- name: Install sudo package
package:
name: sudo
state: present
- name: Ensure SSH directory exists for root
file:
path: /root/.ssh
state: directory
mode: "0700"
owner: root
group: root
- name: Install SSH public key for root
authorized_key:
user: root
key: "{{ lookup('file', ansible_ssh_private_key_file + '.pub') }}"
state: present
- name: Ensure SSH key-based authentication is enabled
lineinfile:
path: /etc/ssh/sshd_config
regexp: "^#?PubkeyAuthentication"
line: "PubkeyAuthentication yes"
state: present
backrefs: yes
- name: Ensure AuthorizedKeysFile is properly configured
lineinfile:
path: /etc/ssh/sshd_config
regexp: "^#?AuthorizedKeysFile"
line: "AuthorizedKeysFile .ssh/authorized_keys"
state: present
backrefs: yes
- name: Restart SSH service
service:
name: ssh
state: restarted
- name: Wait for SSH to be ready
wait_for:
port: "{{ ssh_port }}"
host: "{{ ansible_host }}"
delay: 2
timeout: 30
- name: Test SSH key authentication
command: whoami
register: ssh_key_test
changed_when: false
- name: Verify SSH key authentication works
assert:
that:
- ssh_key_test.stdout == "root"
fail_msg: "SSH key authentication failed - expected 'root', got '{{ ssh_key_test.stdout }}'"
- name: Create new user
user:
name: "{{ new_user }}"
groups: sudo
shell: /bin/bash
state: present
create_home: yes
- name: Set up SSH directory for new user
file:
path: "/home/{{ new_user }}/.ssh"
state: directory
mode: "0700"
owner: "{{ new_user }}"
group: "{{ new_user }}"
- name: Install SSH public key for new user
authorized_key:
user: "{{ new_user }}"
key: "{{ lookup('file', ansible_ssh_private_key_file + '.pub') }}"
state: present
- name: Allow new user to run sudo without password
copy:
dest: "/etc/sudoers.d/{{ new_user }}"
content: "{{ new_user }} ALL=(ALL) NOPASSWD:ALL"
owner: root
group: root
mode: "0440"
- name: Disable root login
lineinfile:
path: /etc/ssh/sshd_config
regexp: "^#?PermitRootLogin .*"
line: "PermitRootLogin no"
state: present
backrefs: yes
- name: Disable password authentication
lineinfile:
path: /etc/ssh/sshd_config
regexp: "^#?PasswordAuthentication .*"
line: "PasswordAuthentication no"
state: present
backrefs: yes
- name: Restart SSH service
service:
name: ssh
state: restarted
- name: Wait for SSH to be ready
wait_for:
port: "{{ ssh_port }}"
host: "{{ ansible_host }}"
delay: 2
timeout: 30
- name: Test connection with new user
command: whoami
become_user: "{{ new_user }}"
register: new_user_test
changed_when: false


@@ -0,0 +1,317 @@
- name: Switch Proxmox VE from Enterprise to Community Repositories
hosts: nodito
become: true
vars_files:
- ../infra_vars.yml
tasks:
- name: Check for deb822 sources format
find:
paths: /etc/apt/sources.list.d/
patterns: "*.sources"
file_type: file
register: deb822_sources
changed_when: false
- name: Check for legacy .list files
find:
paths: /etc/apt/sources.list.d/
patterns: "*.list"
file_type: file
register: legacy_list_files
changed_when: false
- name: Check main sources.list for Proxmox entries
command: grep -q "proxmox\|trixie" /etc/apt/sources.list
register: main_sources_check
failed_when: false
changed_when: false
- name: Display current repository status
debug:
msg: |
Repository status:
- deb822 sources files: {{ deb822_sources.matched }}
- legacy .list files: {{ legacy_list_files.matched }}
- Proxmox/Trixie entries in sources.list: {{ main_sources_check.rc == 0 }}
- name: Check for enterprise repository in deb822 format
shell: |
for file in /etc/apt/sources.list.d/*.sources; do
if grep -q "Components:.*pve-enterprise" "$file" 2>/dev/null; then
echo "$file"
break
fi
done
register: enterprise_deb822_check
failed_when: false
changed_when: false
- name: Check for enterprise repository in legacy format
shell: |
for file in /etc/apt/sources.list.d/*.list; do
if grep -q "enterprise.proxmox.com" "$file" 2>/dev/null; then
echo "$file"
break
fi
done
register: enterprise_legacy_check
failed_when: false
changed_when: false
- name: Check for Ceph enterprise repository in deb822 format
shell: |
for file in /etc/apt/sources.list.d/*.sources; do
if grep -q "enterprise.proxmox.com.*ceph" "$file" 2>/dev/null; then
echo "$file"
break
fi
done
register: ceph_enterprise_deb822_check
failed_when: false
changed_when: false
- name: Check for Ceph enterprise repository in legacy format
shell: |
for file in /etc/apt/sources.list.d/*.list; do
if grep -q "enterprise.proxmox.com.*ceph" "$file" 2>/dev/null; then
echo "$file"
break
fi
done
register: ceph_enterprise_legacy_check
failed_when: false
changed_when: false
- name: Backup enterprise repository files
copy:
src: "{{ item }}"
dest: "{{ item }}.backup"
remote_src: yes
backup: yes
loop: "{{ (enterprise_deb822_check.stdout_lines + enterprise_legacy_check.stdout_lines + ceph_enterprise_deb822_check.stdout_lines + ceph_enterprise_legacy_check.stdout_lines) | select('string') | list }}"
when: (enterprise_deb822_check.stdout_lines + enterprise_legacy_check.stdout_lines + ceph_enterprise_deb822_check.stdout_lines + ceph_enterprise_legacy_check.stdout_lines) | select('string') | list | length > 0
- name: Delete enterprise repository files (deb822 format)
file:
path: "{{ item }}"
state: absent
loop: "{{ enterprise_deb822_check.stdout_lines | select('string') | list }}"
when: enterprise_deb822_check.stdout_lines | select('string') | list | length > 0
- name: Delete enterprise repository files (legacy format)
file:
path: "{{ item }}"
state: absent
loop: "{{ enterprise_legacy_check.stdout_lines | select('string') | list }}"
when: enterprise_legacy_check.stdout_lines | select('string') | list | length > 0
- name: Delete Ceph enterprise repository files (deb822 format)
file:
path: "{{ item }}"
state: absent
loop: "{{ ceph_enterprise_deb822_check.stdout_lines | select('string') | list }}"
when: ceph_enterprise_deb822_check.stdout_lines | select('string') | list | length > 0
- name: Delete Ceph enterprise repository files (legacy format)
file:
path: "{{ item }}"
state: absent
loop: "{{ ceph_enterprise_legacy_check.stdout_lines | select('string') | list }}"
when: ceph_enterprise_legacy_check.stdout_lines | select('string') | list | length > 0
- name: Create community repository file (deb822 format)
copy:
dest: /etc/apt/sources.list.d/proxmox.sources
content: |
Types: deb
URIs: http://download.proxmox.com/debian/pve
Suites: trixie
Components: pve-no-subscription
Signed-By: /usr/share/keyrings/proxmox-archive-keyring.gpg
owner: root
group: root
mode: '0644'
backup: yes
when: deb822_sources.matched > 0
- name: Create community repository file (legacy format)
copy:
dest: /etc/apt/sources.list.d/pve-no-subscription.list
content: |
# PVE pve-no-subscription repository provided by proxmox.com,
# NOT recommended for production use
deb http://download.proxmox.com/debian/pve trixie pve-no-subscription
owner: root
group: root
mode: '0644'
backup: yes
when: deb822_sources.matched == 0
- name: Create Ceph community repository file (deb822 format)
copy:
dest: /etc/apt/sources.list.d/ceph.sources
content: |
Types: deb
URIs: http://download.proxmox.com/debian/ceph-squid
Suites: trixie
Components: no-subscription
Signed-By: /usr/share/keyrings/proxmox-archive-keyring.gpg
owner: root
group: root
mode: '0644'
backup: yes
when: deb822_sources.matched > 0
- name: Create Ceph community repository file (legacy format)
copy:
dest: /etc/apt/sources.list.d/ceph-no-subscription.list
content: |
# Ceph no-subscription repository provided by proxmox.com,
# NOT recommended for production use
deb http://download.proxmox.com/debian/ceph-squid trixie no-subscription
owner: root
group: root
mode: '0644'
backup: yes
when: deb822_sources.matched == 0
- name: Update package cache
apt:
update_cache: yes
cache_valid_time: 3600
- name: Verify community repository is working
command: apt-cache policy proxmox-ve
register: community_repo_verify
changed_when: false
- name: Display community repository verification
debug:
var: community_repo_verify.stdout_lines
- name: Update Proxmox packages from community repository
apt:
name: proxmox-ve
state: latest
update_cache: yes
- name: Verify Proxmox VE version
command: pveversion
register: proxmox_version
changed_when: false
- name: Display Proxmox VE version
debug:
msg: "Proxmox VE version: {{ proxmox_version.stdout }}"
- name: Check repository status
shell: apt-cache policy | grep -A 5 -B 5 proxmox
register: final_repo_status
changed_when: false
- name: Display final repository status
debug:
var: final_repo_status.stdout_lines
- name: Verify no enterprise repository warnings
command: apt update
register: apt_update_result
changed_when: false
- name: Check for enterprise repository warnings
fail:
msg: "Enterprise repository warnings detected. Check the output above."
when: "'enterprise.proxmox.com' in apt_update_result.stdout"
- name: Create subscription nag removal script
copy:
dest: /usr/local/bin/pve-remove-nag.sh
content: |
#!/bin/sh
WEB_JS=/usr/share/javascript/proxmox-widget-toolkit/proxmoxlib.js
if [ -s "$WEB_JS" ] && ! grep -q NoMoreNagging "$WEB_JS"; then
echo "Patching Web UI nag..."
sed -i -e "/data\.status/ s/!//" -e "/data\.status/ s/active/NoMoreNagging/" "$WEB_JS"
fi
MOBILE_TPL=/usr/share/pve-yew-mobile-gui/index.html.tpl
MARKER="<!-- MANAGED BLOCK FOR MOBILE NAG -->"
if [ -f "$MOBILE_TPL" ] && ! grep -q "$MARKER" "$MOBILE_TPL"; then
echo "Patching Mobile UI nag..."
printf "%s\n" \
"$MARKER" \
"<script>" \
" function removeSubscriptionElements() {" \
" // --- Remove subscription dialogs ---" \
" const dialogs = document.querySelectorAll('dialog.pwt-outer-dialog');" \
" dialogs.forEach(dialog => {" \
" const text = (dialog.textContent || '').toLowerCase();" \
" if (text.includes('subscription')) {" \
" dialog.remove();" \
" console.log('Removed subscription dialog');" \
" }" \
" });" \
"" \
" // --- Remove subscription cards, but keep Reboot/Shutdown/Console ---" \
" const cards = document.querySelectorAll('.pwt-card.pwt-p-2.pwt-d-flex.pwt-interactive.pwt-justify-content-center');" \
" cards.forEach(card => {" \
" const text = (card.textContent || '').toLowerCase();" \
" const hasButton = card.querySelector('button');" \
" if (!hasButton && text.includes('subscription')) {" \
" card.remove();" \
" console.log('Removed subscription card');" \
" }" \
" });" \
" }" \
"" \
" const observer = new MutationObserver(removeSubscriptionElements);" \
" observer.observe(document.body, { childList: true, subtree: true });" \
" removeSubscriptionElements();" \
" setInterval(removeSubscriptionElements, 300);" \
" setTimeout(() => {observer.disconnect();}, 10000);" \
"</script>" \
"" >> "$MOBILE_TPL"
fi
owner: root
group: root
mode: '0755'
- name: Create APT configuration for nag removal
copy:
dest: /etc/apt/apt.conf.d/no-nag-script
content: |
DPkg::Post-Invoke { "/usr/local/bin/pve-remove-nag.sh"; };
owner: root
group: root
mode: '0644'
- name: Run nag removal script immediately
command: /usr/local/bin/pve-remove-nag.sh
changed_when: false
- name: Reinstall proxmox-widget-toolkit to apply nag removal
apt:
name: proxmox-widget-toolkit
state: present
force: yes
- name: Clean up backup files
file:
path: "{{ item }}"
state: absent
loop:
- /etc/apt/sources.list.d/ceph.sources.backup
- /etc/apt/sources.list.d/pve-enterprise.sources.backup
ignore_errors: yes
- name: Success message
debug:
msg: |
Successfully switched from Proxmox Enterprise to Community repositories.
Enterprise repository has been disabled and community repository is now active.
Subscription nag messages have been disabled.
Proxmox VE version: {{ proxmox_version.stdout }}
IMPORTANT: Clear your browser cache or perform a hard reload (Ctrl+Shift+R)
before using the Proxmox VE Web UI to avoid UI display issues.

View file

@ -0,0 +1,172 @@
- name: Setup ZFS RAID 1 Pool for Proxmox Storage
hosts: nodito
become: true
vars_files:
- ../infra_vars.yml
- nodito_vars.yml
tasks:
- name: Verify Proxmox VE is running
command: pveversion
register: pve_version_check
changed_when: false
failed_when: pve_version_check.rc != 0
- name: Update package cache
apt:
update_cache: yes
cache_valid_time: 3600
- name: Install ZFS utilities
package:
name:
- zfsutils-linux
- zfs-initramfs
state: present
- name: Load ZFS kernel module
modprobe:
name: zfs
- name: Ensure ZFS module loads at boot
lineinfile:
path: /etc/modules
line: zfs
state: present
- name: Check if ZFS pool already exists
command: zpool list {{ zfs_pool_name }}
register: zfs_pool_exists
failed_when: false
changed_when: false
- name: Check if disks are in use
shell: |
for disk in {{ zfs_disk_1 }} {{ zfs_disk_2 }}; do
if mount | grep -q "^$disk"; then
echo "ERROR: $disk is mounted"
exit 1
fi
if lsblk -n -o MOUNTPOINT "$disk" | grep -v "^$" | grep -q .; then
echo "ERROR: $disk has mounted partitions"
exit 1
fi
done
register: disk_usage_check
failed_when: disk_usage_check.rc != 0
changed_when: false
- name: Create ZFS RAID 1 pool with optimized settings
command: >
zpool create {{ zfs_pool_name }}
-o ashift=12
-O mountpoint=none
mirror {{ zfs_disk_1 }} {{ zfs_disk_2 }}
when: zfs_pool_exists.rc != 0
register: zfs_pool_create_result
- name: Check if ZFS dataset already exists
command: zfs list {{ zfs_pool_name }}/vm-storage
register: zfs_dataset_exists
failed_when: false
changed_when: false
- name: Create ZFS dataset for Proxmox storage
command: zfs create {{ zfs_pool_name }}/vm-storage
when: zfs_dataset_exists.rc != 0
register: zfs_dataset_create_result
- name: Set ZFS dataset properties for Proxmox
command: zfs set {{ item.property }}={{ item.value }} {{ zfs_pool_name }}/vm-storage
loop:
- { property: "mountpoint", value: "{{ zfs_pool_mountpoint }}" }
- { property: "compression", value: "lz4" }
- { property: "atime", value: "off" }
- { property: "xattr", value: "sa" }
- { property: "acltype", value: "posixacl" }
- { property: "dnodesize", value: "auto" }
when: zfs_dataset_exists.rc != 0
- name: Set ZFS pool properties for Proxmox
command: zpool set autotrim=off {{ zfs_pool_name }}
when: zfs_pool_exists.rc != 0
- name: Set ZFS pool mountpoint for Proxmox
command: zfs set mountpoint={{ zfs_pool_mountpoint }} {{ zfs_pool_name }}
when: zfs_pool_exists.rc == 0
- name: Export and re-import ZFS pool for Proxmox compatibility
shell: |
zpool export {{ zfs_pool_name }}
zpool import {{ zfs_pool_name }}
when: zfs_pool_exists.rc != 0
register: zfs_pool_import_result
- name: Ensure ZFS services are enabled
systemd:
name: "{{ item }}"
enabled: yes
state: started
loop:
- zfs-import-cache
- zfs-import-scan
- zfs-mount
- zfs-share
- zfs-zed
- name: Check if ZFS pool storage already exists in Proxmox config
stat:
path: /etc/pve/storage.cfg
register: storage_cfg_file
- name: Check if storage name exists in Proxmox config
shell: "grep -q '^zfspool: {{ zfs_pool_name }}' /etc/pve/storage.cfg"
register: storage_exists_check
failed_when: false
changed_when: false
when: storage_cfg_file.stat.exists
- name: Set storage not configured when config file doesn't exist
set_fact:
storage_exists_check:
rc: 1
when: not storage_cfg_file.stat.exists
- name: Debug storage configuration status
debug:
msg: |
Config file exists: {{ storage_cfg_file.stat.exists }}
Storage check result: {{ storage_exists_check.rc }}
Pool exists: {{ zfs_pool_exists.rc == 0 }}
Will remove storage: {{ zfs_pool_exists.rc == 0 and storage_exists_check.rc == 0 }}
Will add storage: {{ zfs_pool_exists.rc == 0 and storage_exists_check.rc != 0 }}
- name: Remove existing storage if it exists
command: pvesm remove {{ zfs_pool_name }}
register: pvesm_remove_result
failed_when: false
when:
- zfs_pool_exists.rc == 0
- storage_exists_check.rc == 0
- name: Add ZFS pool storage to Proxmox using pvesm
command: >
pvesm add zfspool {{ zfs_pool_name }}
--pool {{ zfs_pool_name }}
--content rootdir,images
--sparse 1
when:
- zfs_pool_exists.rc == 0
- storage_exists_check.rc != 0
register: pvesm_add_result
- name: Verify ZFS pool is healthy
command: zpool status {{ zfs_pool_name }}
register: final_zfs_status
changed_when: false
- name: Fail if ZFS pool is not healthy
fail:
msg: "ZFS pool {{ zfs_pool_name }} is not in a healthy state"
when: "'ONLINE' not in final_zfs_status.stdout"

View file

@ -0,0 +1,186 @@
- name: Create Proxmox template from Debian cloud image (no VM clone)
hosts: nodito
become: true
vars_files:
- ../../infra_vars.yml
- nodito_vars.yml
vars:
# Defaults (override via vars_files or --extra-vars as needed)
debian_cloud_image_url: "https://cloud.debian.org/images/cloud/trixie/20251006-2257/debian-13-genericcloud-amd64-20251006-2257.qcow2"
debian_cloud_image_filename: "debian-13-genericcloud-amd64-20251006-2257.qcow2"
debian_cloud_image_dest_dir: "/var/lib/vz/template/iso"
debian_cloud_image_dest_path: "{{ debian_cloud_image_dest_dir }}/{{ debian_cloud_image_filename }}"
proxmox_template_vmid: 9001
proxmox_template_name: "debian-13-cloud-init"
proxmox_template_memory_mb: 1024
proxmox_template_sockets: 1
proxmox_template_cores: 1
proxmox_template_bridge: "vmbr0"
proxmox_template_cpu_type: "host"
proxmox_template_disk_size_gb: 10
# Cloud-init defaults applied at template level (optional). You can override per-VM later.
proxmox_ciuser: "counterweight" # Default login user to create; distro default may already exist
proxmox_sshkey_path: "/home/{{ new_user }}/.ssh/authorized_keys" # Path to pubkey file for cloud-init injection
proxmox_ci_upgrade: true # If true, run package upgrade on first boot
# Auto-install qemu-guest-agent in clones via cloud-init snippet
qemu_agent_snippet_filename: "user-data-qemu-agent.yaml"
# Storage to import disk into; use existing storage like local-lvm or your ZFS pool name
proxmox_image_storage: "{{ zfs_pool_name }}"
tasks:
- name: Verify Proxmox VE is running
command: pveversion
register: pve_version_check
changed_when: false
failed_when: pve_version_check.rc != 0
- name: Ensure destination directory exists for cloud image
file:
path: "{{ debian_cloud_image_dest_dir }}"
state: directory
mode: '0755'
- name: Check if Debian cloud image already present
stat:
path: "{{ debian_cloud_image_dest_path }}"
register: debian_image_stat
- name: Download Debian cloud image (qcow2)
get_url:
url: "{{ debian_cloud_image_url }}"
dest: "{{ debian_cloud_image_dest_path }}"
mode: '0644'
force: false
when: not debian_image_stat.stat.exists
- name: Ensure local storage allows snippets content (used for cloud-init snippets)
command: >
pvesm set local --content images,iso,vztmpl,snippets
failed_when: false
- name: Ensure snippets directory exists on storage mountpoint
file:
path: "{{ zfs_pool_mountpoint }}/snippets"
state: directory
mode: '0755'
- name: Read SSH public key content
slurp:
src: "{{ proxmox_sshkey_path }}"
register: ssh_key_content
- name: Extract SSH keys from authorized_keys file
set_fact:
ssh_keys_list: "{{ ssh_key_content.content | b64decode | split('\n') | select('match', '^ssh-') | list }}"
- name: Write cloud-init vendor-data snippet to install qemu-guest-agent
copy:
dest: "{{ zfs_pool_mountpoint }}/snippets/{{ qemu_agent_snippet_filename }}"
mode: '0644'
content: |
#cloud-config
# Vendor-data snippet: Proxmox will automatically set hostname from VM name when using vendor-data
# User info (ciuser/sshkeys) is set separately via Terraform/Proxmox parameters
package_update: true
package_upgrade: true
packages:
- qemu-guest-agent
runcmd:
- systemctl enable qemu-guest-agent
- systemctl start qemu-guest-agent
- name: Check if VMID already exists
command: qm config {{ proxmox_template_vmid }}
register: vmid_config_check
failed_when: false
changed_when: false
- name: Determine if VM is already a template
set_fact:
vm_already_template: "{{ 'template: 1' in vmid_config_check.stdout }}"
when: vmid_config_check.rc == 0
- name: Create base VM for template (no disk yet)
command: >
qm create {{ proxmox_template_vmid }}
--name {{ proxmox_template_name }}
--numa 0 --ostype l26
--cpu cputype={{ proxmox_template_cpu_type }}
--cores {{ proxmox_template_cores }}
--sockets {{ proxmox_template_sockets }}
--memory {{ proxmox_template_memory_mb }}
--net0 virtio,bridge={{ proxmox_template_bridge }}
when:
- vmid_config_check.rc != 0
- name: Import Debian cloud image as disk to storage
command: >
qm importdisk {{ proxmox_template_vmid }}
{{ debian_cloud_image_dest_path }}
{{ proxmox_image_storage }}
register: importdisk_result
changed_when: '"Successfully imported disk" in importdisk_result.stdout'
when:
- vmid_config_check.rc != 0 or not vm_already_template
- name: Check if ide2 (cloudinit) drive exists
command: qm config {{ proxmox_template_vmid }}
register: vm_config_check
failed_when: false
changed_when: false
when:
- vmid_config_check.rc == 0
- name: Remove existing ide2 (cloudinit) drive if it exists for idempotency
command: >
qm set {{ proxmox_template_vmid }} --delete ide2
register: ide2_removed
when:
- vmid_config_check.rc == 0
- "'ide2:' in vm_config_check.stdout"
- name: Build consolidated qm set argument list (simplified)
set_fact:
qm_set_args: >-
{{
[
'--scsihw virtio-scsi-pci',
'--scsi0 ' ~ proxmox_image_storage ~ ':vm-' ~ proxmox_template_vmid ~ '-disk-0',
'--ide2 ' ~ proxmox_image_storage ~ ':cloudinit',
'--ipconfig0 ip=dhcp',
'--boot c',
'--bootdisk scsi0',
'--serial0 socket',
'--vga serial0',
'--agent enabled=1',
'--ciuser ' ~ proxmox_ciuser,
'--sshkeys ' ~ proxmox_sshkey_path
]
+ (proxmox_ci_upgrade | bool
| ternary(['--ciupgrade 1'], []))
+ ['--cicustom vendor=local:snippets/' ~ qemu_agent_snippet_filename]
}}
when:
- vmid_config_check.rc != 0 or not vm_already_template | default(false) or ide2_removed.changed | default(false)
- name: Apply consolidated qm set
command: >
qm set {{ proxmox_template_vmid }} {{ qm_set_args | join(' ') }}
when:
- vmid_config_check.rc != 0 or not vm_already_template | default(false) or ide2_removed.changed | default(false)
- name: Resize primary disk to requested size
command: >
qm resize {{ proxmox_template_vmid }} scsi0 {{ proxmox_template_disk_size_gb }}G
when:
- vmid_config_check.rc != 0 or not vm_already_template
- name: Convert VM to template
command: qm template {{ proxmox_template_vmid }}
when:
- vmid_config_check.rc == 0 and not vm_already_template or vmid_config_check.rc != 0

View file

@ -0,0 +1,203 @@
- name: Deploy Nodito CPU Temperature Monitoring
hosts: nodito
become: yes
vars_files:
- ../../infra_vars.yml
- ./nodito_vars.yml
- ./nodito_secrets.yml
tasks:
- name: Validate Uptime Kuma URL is provided
assert:
that:
- nodito_uptime_kuma_cpu_temp_push_url != ""
fail_msg: "uptime_kuma_url must be set in nodito_secrets.yml"
- name: Install required packages for temperature monitoring
package:
name:
- lm-sensors
- curl
- jq
- bc
state: present
- name: Create monitoring script directory
file:
path: "{{ monitoring_script_dir }}"
state: directory
owner: root
group: root
mode: '0755'
- name: Create CPU temperature monitoring script
copy:
dest: "{{ monitoring_script_path }}"
content: |
#!/bin/bash
# CPU Temperature Monitoring Script for Nodito
# Monitors CPU temperature and sends alerts to Uptime Kuma
LOG_FILE="{{ log_file }}"
TEMP_THRESHOLD="{{ temp_threshold_celsius }}"
UPTIME_KUMA_URL="{{ nodito_uptime_kuma_cpu_temp_push_url }}"
# Function to log messages
log_message() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}
# Function to get CPU temperature
get_cpu_temp() {
# Try different methods to get CPU temperature
local temp=""
# Method 1: sensors command (most common)
if command -v sensors >/dev/null 2>&1; then
temp=$(sensors 2>/dev/null | grep -E "Core 0|Package id 0|Tdie|Tctl" | head -1 | grep -oE '[0-9]+\.[0-9]+°C' | grep -oE '[0-9]+\.[0-9]+')
fi
# Method 2: thermal zone (fallback)
if [ -z "$temp" ] && [ -f /sys/class/thermal/thermal_zone0/temp ]; then
temp=$(cat /sys/class/thermal/thermal_zone0/temp)
temp=$(echo "scale=1; $temp/1000" | bc -l 2>/dev/null || echo "$temp")
fi
# Method 3: acpi (fallback)
if [ -z "$temp" ] && command -v acpi >/dev/null 2>&1; then
temp=$(acpi -t 2>/dev/null | grep -oE '[0-9]+\.[0-9]+' | head -1)
fi
echo "$temp"
}
# Function to send alert to Uptime Kuma
send_uptime_kuma_alert() {
local temp="$1"
local message="CPU Temperature Alert: ${temp}°C (Threshold: ${TEMP_THRESHOLD}°C)"
log_message "ALERT: $message"
# Send push notification to Uptime Kuma
encoded_message=$(printf '%s\n' "$message" | sed 's/ /%20/g; s/°/%C2%B0/g; s/(/%28/g; s/)/%29/g; s/:/%3A/g')
curl "$UPTIME_KUMA_URL?status=up&msg=$encoded_message"
if [ $? -eq 0 ]; then
log_message "Alert sent successfully to Uptime Kuma"
else
log_message "ERROR: Failed to send alert to Uptime Kuma"
fi
}
# Main monitoring logic
main() {
log_message "Starting CPU temperature check"
# Get current CPU temperature
current_temp=$(get_cpu_temp)
if [ -z "$current_temp" ]; then
log_message "ERROR: Could not read CPU temperature"
exit 1
fi
log_message "Current CPU temperature: ${current_temp}°C"
# Check if temperature exceeds threshold
if (( $(echo "$current_temp > $TEMP_THRESHOLD" | bc -l) )); then
log_message "WARNING: CPU temperature ${current_temp}°C exceeds threshold ${TEMP_THRESHOLD}°C"
send_uptime_kuma_alert "$current_temp"
else
log_message "CPU temperature is within normal range"
fi
}
# Run main function
main
owner: root
group: root
mode: '0755'
- name: Create systemd service for CPU temperature monitoring
copy:
dest: "/etc/systemd/system/{{ systemd_service_name }}.service"
content: |
[Unit]
Description=Nodito CPU Temperature Monitor
After=network.target
[Service]
Type=oneshot
ExecStart={{ monitoring_script_path }}
User=root
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
owner: root
group: root
mode: '0644'
- name: Create systemd timer for CPU temperature monitoring
copy:
dest: "/etc/systemd/system/{{ systemd_service_name }}.timer"
content: |
[Unit]
Description=Run Nodito CPU Temperature Monitor every {{ temp_check_interval_minutes }} minute(s)
Requires={{ systemd_service_name }}.service
[Timer]
OnBootSec={{ temp_check_interval_minutes }}min
OnUnitActiveSec={{ temp_check_interval_minutes }}min
Persistent=true
[Install]
WantedBy=timers.target
owner: root
group: root
mode: '0644'
- name: Reload systemd daemon
systemd:
daemon_reload: yes
- name: Enable and start CPU temperature monitoring timer
systemd:
name: "{{ systemd_service_name }}.timer"
enabled: yes
state: started
- name: Test CPU temperature monitoring script
command: "{{ monitoring_script_path }}"
register: script_test
changed_when: false
- name: Verify script execution
assert:
that:
- script_test.rc == 0
fail_msg: "CPU temperature monitoring script failed to execute properly"
- name: Check if sensors are available
command: sensors
register: sensors_check
changed_when: false
failed_when: false
- name: Display sensor information
debug:
msg: "Sensor information: {{ sensors_check.stdout_lines if sensors_check.rc == 0 else 'Sensors not available - using fallback methods' }}"
- name: Show monitoring configuration
debug:
msg:
- "CPU Temperature Monitoring configured successfully"
- "Temperature threshold: {{ temp_threshold_celsius }}°C"
- "Check interval: {{ temp_check_interval_minutes }} minute(s)"
- "Uptime Kuma URL: {{ nodito_uptime_kuma_cpu_temp_push_url }}"
- "Monitoring script: {{ monitoring_script_path }}"
- "Log file: {{ log_file }}"
- "Service: {{ systemd_service_name }}.service"
- "Timer: {{ systemd_service_name }}.timer"

View file

@ -0,0 +1,19 @@
# Nodito CPU Temperature Monitoring Configuration
# Temperature Monitoring Configuration
temp_threshold_celsius: 80
temp_check_interval_minutes: 1
# Script Configuration
monitoring_script_dir: /opt/nodito-monitoring
monitoring_script_path: "{{ monitoring_script_dir }}/cpu_temp_monitor.sh"
log_file: "{{ monitoring_script_dir }}/cpu_temp_monitor.log"
# System Configuration
systemd_service_name: nodito-cpu-temp-monitor
# ZFS Pool Configuration
zfs_pool_name: "proxmox-tank-1"
zfs_disk_1: "/dev/disk/by-id/ata-ST4000NT001-3M2101_WX11TN0Z" # First disk for RAID 1 mirror
zfs_disk_2: "/dev/disk/by-id/ata-ST4000NT001-3M2101_WX11TN2P" # Second disk for RAID 1 mirror
zfs_pool_mountpoint: "/var/lib/vz"

View file

@ -0,0 +1,11 @@
# Uptime Kuma login credentials
# Used by the disk monitoring playbook to create monitors automatically
uptime_kuma_username: "admin"
uptime_kuma_password: "your_password_here"
# ntfy credentials
# Used for notification channel setup in Uptime Kuma
ntfy_username: "your_ntfy_username"
ntfy_password: "your_ntfy_password"

View file

@ -1,3 +1,6 @@
# Infrastructure Variables
# Generated by setup_layer_0.sh
new_user: counterweight
ssh_port: 22
allow_ssh_from: "any"

11
ansible/requirements.yml Normal file
View file

@ -0,0 +1,11 @@
---
# Ansible Galaxy Collections Requirements
# Install with: ansible-galaxy collection install -r requirements.yml
collections:
# Uptime Kuma Ansible Collection
# Used by: infra/41_disk_usage_alerts.yml
# Provides modules to manage Uptime Kuma monitors programmatically
- name: lucasheld.uptime_kuma
version: ">=1.0.0"

View file

@ -1,5 +1,5 @@
- name: Install and configure Caddy on Debian 12
hosts: vipy,watchtower
hosts: vipy,watchtower,spacey
become: yes
tasks:

View file

@ -3,9 +3,14 @@
become: yes
vars_files:
- ../../infra_vars.yml
- ../../services_config.yml
- ../../infra_secrets.yml
- ./forgejo_vars.yml
vars:
forgejo_subdomain: "{{ subdomains.forgejo }}"
caddy_sites_dir: "{{ caddy_sites_dir }}"
forgejo_domain: "{{ forgejo_subdomain }}.{{ root_domain }}"
uptime_kuma_api_url: "https://{{ subdomains.uptime_kuma }}.{{ root_domain }}"
tasks:
- name: Ensure required packages are installed
@ -98,3 +103,109 @@
service:
name: caddy
state: reloaded
- name: Create Uptime Kuma monitor setup script for Forgejo
delegate_to: localhost
become: no
copy:
dest: /tmp/setup_forgejo_monitor.py
content: |
#!/usr/bin/env python3
import sys
import yaml
from uptime_kuma_api import UptimeKumaApi, MonitorType
try:
with open('/tmp/ansible_config.yml', 'r') as f:
config = yaml.safe_load(f)
url = config['uptime_kuma_url']
username = config['username']
password = config['password']
monitor_url = config['monitor_url']
monitor_name = config['monitor_name']
api = UptimeKumaApi(url, timeout=30)
api.login(username, password)
# Get all monitors
monitors = api.get_monitors()
# Find or create "services" group
group = next((m for m in monitors if m.get('name') == 'services' and m.get('type') == 'group'), None)
if not group:
group_result = api.add_monitor(type='group', name='services')
# Refresh to get the group with id
monitors = api.get_monitors()
group = next((m for m in monitors if m.get('name') == 'services' and m.get('type') == 'group'), None)
# Check if monitor already exists
existing_monitor = None
for monitor in monitors:
if monitor.get('name') == monitor_name:
existing_monitor = monitor
break
# Get ntfy notification ID
notifications = api.get_notifications()
ntfy_notification_id = None
for notif in notifications:
if notif.get('type') == 'ntfy':
ntfy_notification_id = notif.get('id')
break
if existing_monitor:
print(f"Monitor '{monitor_name}' already exists (ID: {existing_monitor['id']})")
print("Skipping - monitor already configured")
else:
print(f"Creating monitor '{monitor_name}'...")
api.add_monitor(
type=MonitorType.HTTP,
name=monitor_name,
url=monitor_url,
parent=group['id'],
interval=60,
maxretries=3,
retryInterval=60,
notificationIDList={ntfy_notification_id: True} if ntfy_notification_id else {}
)
api.disconnect()
print("SUCCESS")
except Exception as e:
print(f"ERROR: {str(e)}", file=sys.stderr)
sys.exit(1)
mode: '0755'
- name: Create temporary config for monitor setup
delegate_to: localhost
become: no
copy:
dest: /tmp/ansible_config.yml
content: |
uptime_kuma_url: "{{ uptime_kuma_api_url }}"
username: "{{ uptime_kuma_username }}"
password: "{{ uptime_kuma_password }}"
monitor_url: "https://{{ forgejo_domain }}/api/healthz"
monitor_name: "Forgejo"
mode: '0644'
- name: Run Uptime Kuma monitor setup
command: python3 /tmp/setup_forgejo_monitor.py
delegate_to: localhost
become: no
register: monitor_setup
changed_when: "'SUCCESS' in monitor_setup.stdout"
ignore_errors: yes
- name: Clean up temporary files
delegate_to: localhost
become: no
file:
path: "{{ item }}"
state: absent
loop:
- /tmp/setup_forgejo_monitor.py
- /tmp/ansible_config.yml

View file

@ -9,9 +9,7 @@ forgejo_url: "https://codeberg.org/forgejo/forgejo/releases/download/v{{ forgejo
forgejo_bin_path: "/usr/local/bin/forgejo"
forgejo_user: "git"
# Caddy
caddy_sites_dir: /etc/caddy/sites-enabled
forgejo_subdomain: forgejo
# (caddy_sites_dir and subdomain now in services_config.yml)
# Remote access
remote_host: "{{ groups['vipy'][0] }}"

View file

@ -1,11 +1,17 @@
- name: Deploy headscale and configure Caddy reverse proxy
hosts: vipy
hosts: spacey
become: no
vars_files:
- ../../infra_vars.yml
- ../../services_config.yml
- ../../infra_secrets.yml
- ./headscale_vars.yml
vars:
headscale_subdomain: "{{ subdomains.headscale }}"
caddy_sites_dir: "{{ caddy_sites_dir }}"
headscale_domain: "{{ headscale_subdomain }}.{{ root_domain }}"
headscale_base_domain: "tailnet.{{ root_domain }}"
uptime_kuma_api_url: "https://{{ subdomains.uptime_kuma }}.{{ root_domain }}"
tasks:
- name: Install required packages
@ -34,6 +40,16 @@
path: /tmp/headscale.deb
state: absent
- name: Ensure headscale user exists
become: yes
user:
name: headscale
system: yes
shell: /usr/sbin/nologin
home: /var/lib/headscale
create_home: yes
state: present
- name: Create headscale data directory
become: yes
file:
@ -50,17 +66,7 @@
state: directory
owner: headscale
group: headscale
mode: '0755'
- name: Ensure headscale user exists
become: yes
user:
name: headscale
system: yes
shell: /usr/sbin/nologin
home: /var/lib/headscale
create_home: yes
state: present
mode: '0770'
- name: Ensure headscale user owns data directory
become: yes
@ -69,6 +75,14 @@
owner: headscale
group: headscale
recurse: yes
mode: '0750'
- name: Add counterweight user to headscale group
become: yes
user:
name: counterweight
groups: headscale
append: yes
- name: Create ACL policies file
become: yes
@ -84,7 +98,7 @@
}
owner: headscale
group: headscale
mode: '0644'
mode: '0640'
notify: Restart headscale
- name: Deploy headscale configuration file
@ -135,17 +149,17 @@
path: /etc/headscale/acl.json
dns:
base_domain: tailnet.contrapeso.xyz
base_domain: {{ headscale_base_domain | quote }}
magic_dns: true
search_domains:
- tailnet.contrapeso.xyz
- {{ headscale_base_domain | quote }}
nameservers:
global:
- 1.1.1.1
- 1.0.0.1
owner: root
group: root
mode: '0644'
group: headscale
mode: '0640'
notify: Restart headscale
- name: Test headscale configuration
@ -158,12 +172,47 @@
debug:
msg: "{{ headscale_config_test.stdout }}"
- name: Ensure headscale data directory has correct ownership before starting service
become: yes
file:
path: /var/lib/headscale
state: directory
owner: headscale
group: headscale
mode: '0750'
recurse: yes
- name: Ensure headscale run directory has correct ownership
become: yes
file:
path: /var/run/headscale
state: directory
owner: headscale
group: headscale
mode: '0770'
- name: Enable and start headscale service
become: yes
systemd:
name: headscale
enabled: yes
state: started
daemon_reload: yes
- name: Wait for headscale unix socket to be ready
become: yes
wait_for:
path: /var/run/headscale/headscale.sock
state: present
timeout: 60
delay: 2
- name: Create headscale namespace if it doesn't exist
become: yes
command: headscale users create {{ headscale_namespace }}
register: create_namespace_result
failed_when: create_namespace_result.rc != 0 and 'already exists' not in create_namespace_result.stderr and 'UNIQUE constraint' not in create_namespace_result.stderr
changed_when: create_namespace_result.rc == 0
- name: Allow HTTPS through UFW
become: yes
@ -220,6 +269,111 @@
become: yes
command: systemctl reload caddy
- name: Create Uptime Kuma monitor setup script for Headscale
delegate_to: localhost
become: no
copy:
dest: /tmp/setup_headscale_monitor.py
content: |
#!/usr/bin/env python3
import sys
import yaml
from uptime_kuma_api import UptimeKumaApi, MonitorType
try:
with open('/tmp/ansible_config.yml', 'r') as f:
config = yaml.safe_load(f)
url = config['uptime_kuma_url']
username = config['username']
password = config['password']
monitor_url = config['monitor_url']
monitor_name = config['monitor_name']
api = UptimeKumaApi(url, timeout=30)
api.login(username, password)
# Get all monitors
monitors = api.get_monitors()
# Find or create "services" group
group = next((m for m in monitors if m.get('name') == 'services' and m.get('type') == 'group'), None)
if not group:
group_result = api.add_monitor(type='group', name='services')
# Refresh to get the group with id
monitors = api.get_monitors()
group = next((m for m in monitors if m.get('name') == 'services' and m.get('type') == 'group'), None)
# Check if monitor already exists
existing_monitor = None
for monitor in monitors:
if monitor.get('name') == monitor_name:
existing_monitor = monitor
break
# Get ntfy notification ID
notifications = api.get_notifications()
ntfy_notification_id = None
for notif in notifications:
if notif.get('type') == 'ntfy':
ntfy_notification_id = notif.get('id')
break
if existing_monitor:
print(f"Monitor '{monitor_name}' already exists (ID: {existing_monitor['id']})")
print("Skipping - monitor already configured")
else:
print(f"Creating monitor '{monitor_name}'...")
api.add_monitor(
type=MonitorType.HTTP,
name=monitor_name,
url=monitor_url,
parent=group['id'],
interval=60,
maxretries=3,
retryInterval=60,
notificationIDList={ntfy_notification_id: True} if ntfy_notification_id else {}
)
api.disconnect()
print("SUCCESS")
except Exception as e:
print(f"ERROR: {str(e)}", file=sys.stderr)
sys.exit(1)
mode: '0755'
- name: Create temporary config for monitor setup
delegate_to: localhost
become: no
copy:
dest: /tmp/ansible_config.yml
content: |
uptime_kuma_url: "{{ uptime_kuma_api_url }}"
username: "{{ uptime_kuma_username }}"
password: "{{ uptime_kuma_password }}"
monitor_url: "https://{{ headscale_domain }}/health"
monitor_name: "Headscale"
mode: '0644'
- name: Run Uptime Kuma monitor setup
command: python3 /tmp/setup_headscale_monitor.py
delegate_to: localhost
become: no
register: monitor_setup
changed_when: "'SUCCESS' in monitor_setup.stdout"
ignore_errors: yes
- name: Clean up temporary files
delegate_to: localhost
become: no
file:
path: "{{ item }}"
state: absent
loop:
- /tmp/setup_headscale_monitor.py
- /tmp/ansible_config.yml
handlers:
- name: Restart headscale
become: yes

View file

@ -1,19 +1,20 @@
# Headscale service configuration
headscale_subdomain: headscale
# (subdomain and caddy_sites_dir now in services_config.yml)
headscale_port: 8080
headscale_grpc_port: 50443
# Version
headscale_version: "0.26.1"
# Caddy
caddy_sites_dir: /etc/caddy/sites-enabled
# Namespace for devices (users in headscale terminology)
headscale_namespace: counter-net
# Data directory
headscale_data_dir: /var/lib/headscale
# Remote access
remote_host: "{{ groups['vipy'][0] }}"
remote_host: "{{ groups['spacey'][0] }}"
remote_user: "{{ hostvars[remote_host]['ansible_user'] }}"
remote_key_file: "{{ hostvars[remote_host]['ansible_ssh_private_key_file'] | default('') }}"

View file

@ -3,9 +3,14 @@
become: yes
vars_files:
- ../../infra_vars.yml
- ../../services_config.yml
- ../../infra_secrets.yml
- ./lnbits_vars.yml
vars:
lnbits_subdomain: "{{ subdomains.lnbits }}"
caddy_sites_dir: "{{ caddy_sites_dir }}"
lnbits_domain: "{{ lnbits_subdomain }}.{{ root_domain }}"
uptime_kuma_api_url: "https://{{ subdomains.uptime_kuma }}.{{ root_domain }}"
tasks:
- name: Create lnbits directory
@ -19,38 +24,41 @@
- name: Install required system packages
apt:
name:
- python3.11
- python3.11-venv
- python3
- python3-pip
- python3-venv
- python3-dev
- git
- curl
- build-essential
- pkg-config
- build-essential
- libsecp256k1-dev
- libffi-dev
- libgmp-dev
- libpq-dev
- automake
- autoconf
- libtool
- m4
- gawk
state: present
update_cache: yes
- name: Install Poetry
- name: Install uv packaging tool
shell: |
curl -sSL https://install.python-poetry.org | python3 -
curl -LsSf https://astral.sh/uv/install.sh | sh
args:
creates: "{{ lookup('env', 'HOME') }}/.local/bin/poetry"
become: yes
become_user: "{{ ansible_user }}"
- name: Add Poetry to PATH
lineinfile:
path: "{{ lookup('env', 'HOME') }}/.bashrc"
line: 'export PATH="$HOME/.local/bin:$PATH"'
state: present
creates: "/home/{{ ansible_user }}/.local/bin/uv"
become: yes
become_user: "{{ ansible_user }}"
environment:
HOME: "/home/{{ ansible_user }}"
- name: Clone LNBits repository
git:
repo: https://github.com/lnbits/lnbits.git
dest: "{{ lnbits_dir }}/lnbits"
version: main
version: "v1.3.1"
accept_hostkey: yes
- name: Change ownership of LNBits directory to user
@ -60,10 +68,19 @@
group: "{{ ansible_user }}"
recurse: yes
- name: Install LNBits dependencies
command: $HOME/.local/bin/poetry install --only main
- name: Install LNBits dependencies with uv (Python 3.12)
command: /home/{{ ansible_user }}/.local/bin/uv sync --python 3.12 --all-extras --no-dev
args:
chdir: "{{ lnbits_dir }}/lnbits"
become: yes
become_user: "{{ ansible_user }}"
environment:
HOME: "/home/{{ ansible_user }}"
PATH: "/home/{{ ansible_user }}/.local/bin:/usr/local/bin:/usr/bin:/bin"
SECP_BUNDLED: "0"
PKG_CONFIG_PATH: "/usr/lib/x86_64-linux-gnu/pkgconfig"
ACLOCAL: "aclocal"
AUTOMAKE: "automake"
- name: Copy .env.example to .env
copy:
@ -107,10 +124,12 @@
Type=simple
User={{ ansible_user }}
WorkingDirectory={{ lnbits_dir }}/lnbits
ExecStart=/home/{{ ansible_user }}/.local/bin/poetry run lnbits
ExecStart=/home/{{ ansible_user }}/.local/bin/uv run --python 3.12 lnbits
Restart=always
RestartSec=30
Environment=PYTHONUNBUFFERED=1
Environment="PATH=/home/{{ ansible_user }}/.local/bin:/usr/local/bin:/usr/bin:/bin"
Environment=SECP_BUNDLED=0
[Install]
WantedBy=multi-user.target
@ -143,6 +162,8 @@
insertafter: EOF
state: present
backup: yes
create: yes
mode: '0644'
- name: Create Caddy reverse proxy configuration for lnbits
copy:
@ -159,3 +180,109 @@
- name: Reload Caddy to apply new config
command: systemctl reload caddy
- name: Create Uptime Kuma monitor setup script for LNBits
delegate_to: localhost
become: no
copy:
dest: /tmp/setup_lnbits_monitor.py
content: |
#!/usr/bin/env python3
import sys
import yaml
from uptime_kuma_api import UptimeKumaApi, MonitorType
try:
with open('/tmp/ansible_config.yml', 'r') as f:
config = yaml.safe_load(f)
url = config['uptime_kuma_url']
username = config['username']
password = config['password']
monitor_url = config['monitor_url']
monitor_name = config['monitor_name']
api = UptimeKumaApi(url, timeout=30)
api.login(username, password)
# Get all monitors
monitors = api.get_monitors()
# Find or create "services" group
group = next((m for m in monitors if m.get('name') == 'services' and m.get('type') == 'group'), None)
if not group:
group_result = api.add_monitor(type='group', name='services')
# Refresh to get the group with id
monitors = api.get_monitors()
group = next((m for m in monitors if m.get('name') == 'services' and m.get('type') == 'group'), None)
# Check if monitor already exists
existing_monitor = None
for monitor in monitors:
if monitor.get('name') == monitor_name:
existing_monitor = monitor
break
# Get ntfy notification ID
notifications = api.get_notifications()
ntfy_notification_id = None
for notif in notifications:
if notif.get('type') == 'ntfy':
ntfy_notification_id = notif.get('id')
break
if existing_monitor:
print(f"Monitor '{monitor_name}' already exists (ID: {existing_monitor['id']})")
print("Skipping - monitor already configured")
else:
print(f"Creating monitor '{monitor_name}'...")
api.add_monitor(
type=MonitorType.HTTP,
name=monitor_name,
url=monitor_url,
parent=group['id'],
interval=60,
maxretries=3,
retryInterval=60,
notificationIDList={ntfy_notification_id: True} if ntfy_notification_id else {}
)
api.disconnect()
print("SUCCESS")
except Exception as e:
print(f"ERROR: {str(e)}", file=sys.stderr)
sys.exit(1)
mode: '0755'
- name: Create temporary config for monitor setup
delegate_to: localhost
become: no
copy:
dest: /tmp/ansible_config.yml
content: |
uptime_kuma_url: "{{ uptime_kuma_api_url }}"
username: "{{ uptime_kuma_username }}"
password: "{{ uptime_kuma_password }}"
monitor_url: "https://{{ lnbits_domain }}/api/v1/health"
monitor_name: "LNBits"
mode: '0644'
- name: Run Uptime Kuma monitor setup
command: python3 /tmp/setup_lnbits_monitor.py
delegate_to: localhost
become: no
register: monitor_setup
changed_when: "'SUCCESS' in monitor_setup.stdout"
ignore_errors: yes
- name: Clean up temporary files
delegate_to: localhost
become: no
file:
path: "{{ item }}"
state: absent
loop:
- /tmp/setup_lnbits_monitor.py
- /tmp/ansible_config.yml

View file

@ -3,9 +3,7 @@ lnbits_dir: /opt/lnbits
lnbits_data_dir: "{{ lnbits_dir }}/data"
lnbits_port: 8765
# Caddy
caddy_sites_dir: /etc/caddy/sites-enabled
lnbits_subdomain: wallet
# (caddy_sites_dir and subdomain now in services_config.yml)
# Remote access
remote_host: "{{ groups['vipy'][0] }}"

View file

@ -0,0 +1,175 @@
- name: Deploy memos and configure Caddy reverse proxy
hosts: memos-box
become: yes
vars_files:
- ../../infra_vars.yml
- ../../services_config.yml
- ./memos_vars.yml
vars:
memos_subdomain: "{{ subdomains.memos }}"
caddy_sites_dir: "{{ caddy_sites_dir }}"
memos_domain: "{{ memos_subdomain }}.{{ root_domain }}"
tasks:
- name: Install required packages
apt:
name:
- wget
- curl
- unzip
state: present
update_cache: yes
- name: Get latest memos release version
uri:
url: https://api.github.com/repos/usememos/memos/releases/latest
return_content: yes
register: memos_latest_release
- name: Set memos version and find download URL
set_fact:
memos_version: "{{ memos_latest_release.json.tag_name | regex_replace('^v', '') }}"
- name: Find linux-amd64 download URL
set_fact:
memos_download_url: "{{ memos_latest_release.json.assets | json_query('[?contains(name, `linux-amd64`) && (contains(name, `.tar.gz`) || contains(name, `.zip`))].browser_download_url') | first }}"
- name: Display memos version to install
debug:
msg: "Installing memos version {{ memos_version }} from {{ memos_download_url }}"
- name: Download memos binary
get_url:
url: "{{ memos_download_url }}"
dest: /tmp/memos_archive
mode: '0644'
register: memos_download
- name: Create temporary extraction directory
file:
path: /tmp/memos_extract
state: directory
mode: '0755'
- name: Extract memos binary
unarchive:
src: /tmp/memos_archive
dest: /tmp/memos_extract
remote_src: yes
creates: /tmp/memos_extract/memos
- name: Install memos binary
copy:
src: /tmp/memos_extract/memos
dest: /usr/local/bin/memos
mode: '0755'
remote_src: yes
notify: Restart memos
- name: Remove temporary files
file:
path: "{{ item }}"
state: absent
loop:
- /tmp/memos_archive
- /tmp/memos_extract
- name: Ensure memos user exists
user:
name: memos
system: yes
shell: /usr/sbin/nologin
home: /var/lib/memos
create_home: yes
state: present
- name: Create memos data directory
file:
path: "{{ memos_data_dir }}"
state: directory
owner: memos
group: memos
mode: '0750'
- name: Create memos systemd service file
copy:
dest: /etc/systemd/system/memos.service
content: |
[Unit]
Description=memos service
After=network.target
[Service]
Type=simple
User=memos
Group=memos
ExecStart=/usr/local/bin/memos --port {{ memos_port }} --data {{ memos_data_dir }}
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=multi-user.target
owner: root
group: root
mode: '0644'
notify: Restart memos
- name: Enable and start memos service
systemd:
name: memos
enabled: yes
state: started
daemon_reload: yes
- name: Wait for memos to be ready
uri:
url: "http://localhost:{{ memos_port }}/api/v1/status"
status_code: 200
register: memos_ready
until: memos_ready.status == 200
retries: 30
delay: 2
ignore_errors: yes
- name: Allow HTTPS through UFW
ufw:
rule: allow
port: '443'
proto: tcp
- name: Allow HTTP through UFW (for Let's Encrypt)
ufw:
rule: allow
port: '80'
proto: tcp
- name: Ensure Caddy sites-enabled directory exists
file:
path: "{{ caddy_sites_dir }}"
state: directory
owner: root
group: root
mode: '0755'
- name: Ensure Caddyfile includes import directive for sites-enabled
lineinfile:
path: /etc/caddy/Caddyfile
line: 'import sites-enabled/*'
insertafter: EOF
state: present
backup: yes
- name: Create Caddy reverse proxy configuration for memos
copy:
dest: "{{ caddy_sites_dir }}/memos.conf"
content: |
{{ memos_domain }} {
reverse_proxy localhost:{{ memos_port }}
}
owner: root
group: root
mode: '0644'
- name: Reload Caddy to apply new config
command: systemctl reload caddy
handlers:
- name: Restart memos
systemd:
name: memos
state: restarted

View file

@ -0,0 +1,16 @@
# General
memos_data_dir: /var/lib/memos
memos_port: 5230
# (caddy_sites_dir and subdomain now in services_config.yml)
# Remote access
remote_host: "{{ groups['memos_box'][0] }}"
remote_user: "{{ hostvars[remote_host]['ansible_user'] }}"
remote_key_file: "{{ hostvars[remote_host]['ansible_ssh_private_key_file'] | default('') }}"
# Local backup
local_backup_dir: "{{ lookup('env', 'HOME') }}/memos-backups"
backup_script_path: "{{ lookup('env', 'HOME') }}/.local/bin/memos_backup.sh"

View file

@ -3,8 +3,11 @@
become: yes
vars_files:
- ../../infra_vars.yml
- ../../services_config.yml
- ./ntfy_emergency_app_vars.yml
vars:
ntfy_emergency_app_subdomain: "{{ subdomains.ntfy_emergency_app }}"
caddy_sites_dir: "{{ caddy_sites_dir }}"
ntfy_emergency_app_domain: "{{ ntfy_emergency_app_subdomain }}.{{ root_domain }}"
tasks:

View file

@ -2,9 +2,7 @@
ntfy_emergency_app_dir: /opt/ntfy-emergency-app
ntfy_emergency_app_port: 3000
# Caddy
caddy_sites_dir: /etc/caddy/sites-enabled
ntfy_emergency_app_subdomain: avisame
# (caddy_sites_dir and subdomain now in services_config.yml)
# ntfy configuration
ntfy_emergency_app_topic: "emergencia"

View file

@ -3,8 +3,11 @@
become: yes
vars_files:
- ../../infra_vars.yml
- ../../services_config.yml
- ./ntfy_vars.yml
vars:
ntfy_subdomain: "{{ subdomains.ntfy }}"
caddy_sites_dir: "{{ caddy_sites_dir }}"
ntfy_domain: "{{ ntfy_subdomain }}.{{ root_domain }}"
tasks:

View file

@ -1,3 +1,2 @@
caddy_sites_dir: /etc/caddy/sites-enabled
ntfy_subdomain: ntfy
ntfy_port: 6674
ntfy_port: 6674
ntfy_topic: alerts # Topic for Uptime Kuma notifications

View file

@ -0,0 +1,155 @@
- name: Setup ntfy as Uptime Kuma Notification Channel
hosts: watchtower
become: no
vars_files:
- ../../infra_vars.yml
- ../../services_config.yml
- ../../infra_secrets.yml
- ./ntfy_vars.yml
- ../uptime_kuma/uptime_kuma_vars.yml
vars:
ntfy_subdomain: "{{ subdomains.ntfy }}"
uptime_kuma_subdomain: "{{ subdomains.uptime_kuma }}"
ntfy_domain: "{{ ntfy_subdomain }}.{{ root_domain }}"
ntfy_server_url: "https://{{ ntfy_domain }}"
ntfy_priority: 4 # 1=min, 2=low, 3=default, 4=high, 5=max
uptime_kuma_api_url: "https://{{ uptime_kuma_subdomain }}.{{ root_domain }}"
tasks:
- name: Validate Uptime Kuma configuration
assert:
that:
- uptime_kuma_api_url is defined
- uptime_kuma_api_url != ""
- uptime_kuma_username is defined
- uptime_kuma_username != ""
- uptime_kuma_password is defined
- uptime_kuma_password != ""
fail_msg: "uptime_kuma_api_url, uptime_kuma_username and uptime_kuma_password must be set"
- name: Validate ntfy configuration
assert:
that:
- ntfy_domain is defined
- ntfy_domain != ""
- ntfy_topic is defined
- ntfy_topic != ""
- ntfy_username is defined
- ntfy_username != ""
- ntfy_password is defined
- ntfy_password != ""
fail_msg: "ntfy_domain, ntfy_topic, ntfy_username and ntfy_password must be set"
- name: Create Uptime Kuma ntfy notification setup script
copy:
dest: /tmp/setup_uptime_kuma_ntfy_notification.py
content: |
#!/usr/bin/env python3
import sys
import json
from uptime_kuma_api import UptimeKumaApi
def main():
api_url = sys.argv[1]
username = sys.argv[2]
password = sys.argv[3]
notification_name = sys.argv[4]
ntfy_server_url = sys.argv[5]
ntfy_topic = sys.argv[6]
ntfy_username = sys.argv[7]
ntfy_password = sys.argv[8]
ntfy_priority = int(sys.argv[9])
api = UptimeKumaApi(api_url, timeout=60, wait_events=2.0)
api.login(username, password)
# Get all notifications
notifications = api.get_notifications()
# Find existing ntfy notification by name
existing_notification = next((n for n in notifications if n.get('name') == notification_name), None)
notification_data = {
'name': notification_name,
'type': 'ntfy',
'isDefault': True, # Apply to all monitors by default
'applyExisting': True, # Apply to existing monitors
'ntfyserverurl': ntfy_server_url,
'ntfytopic': ntfy_topic,
'ntfyusername': ntfy_username,
'ntfypassword': ntfy_password,
'ntfyPriority': ntfy_priority
}
if existing_notification:
notification = api.edit_notification(existing_notification['id'], **notification_data)
action = "updated"
else:
notification = api.add_notification(**notification_data)
action = "created"
# Output result as JSON
result = {
'notification_id': notification['id'],
'notification_name': notification_name,
'ntfy_server': ntfy_server_url,
'ntfy_topic': ntfy_topic,
'action': action
}
print(json.dumps(result))
api.disconnect()
if __name__ == '__main__':
main()
mode: '0755'
delegate_to: localhost
become: no
- name: Run Uptime Kuma ntfy notification setup script
command: >
{{ ansible_playbook_python }}
/tmp/setup_uptime_kuma_ntfy_notification.py
"{{ uptime_kuma_api_url }}"
"{{ uptime_kuma_username }}"
"{{ uptime_kuma_password }}"
"ntfy ({{ ntfy_topic }})"
"{{ ntfy_server_url }}"
"{{ ntfy_topic }}"
"{{ ntfy_username }}"
"{{ ntfy_password }}"
"{{ ntfy_priority }}"
register: notification_setup_result
delegate_to: localhost
become: no
changed_when: false
- name: Parse notification setup result
set_fact:
notification_info_parsed: "{{ notification_setup_result.stdout | from_json }}"
- name: Display notification information
debug:
msg: |
✓ ntfy notification channel {{ notification_info_parsed.action }} successfully!
Notification Name: ntfy ({{ ntfy_topic }})
ntfy Server: {{ ntfy_server_url }}
ntfy Topic: {{ ntfy_topic }}
Priority: {{ ntfy_priority }} (4=high)
Default for all monitors: Yes
Applied to existing monitors: Yes
All Uptime Kuma monitors will now send alerts to your ntfy server
on the "{{ ntfy_topic }}" topic.
You can subscribe to alerts at: {{ ntfy_server_url }}/{{ ntfy_topic }}
- name: Clean up temporary Uptime Kuma setup script
file:
path: /tmp/setup_uptime_kuma_ntfy_notification.py
state: absent
delegate_to: localhost
become: no

View file

@ -3,7 +3,12 @@
become: yes
vars_files:
- ../../infra_vars.yml
- ../../services_config.yml
- ./personal_blog_vars.yml
vars:
personal_blog_subdomain: "{{ subdomains.personal_blog }}"
caddy_sites_dir: "{{ caddy_sites_dir }}"
personal_blog_domain: "{{ personal_blog_subdomain }}.{{ root_domain }}"
tasks:
- name: Install git

View file

@ -1,6 +1,4 @@
caddy_sites_dir: /etc/caddy/sites-enabled
personal_blog_subdomain: pablohere
personal_blog_domain: pablohere.contrapeso.xyz
# (caddy_sites_dir and subdomain now in services_config.yml)
personal_blog_git_repo: https://forgejo.contrapeso.xyz/counterweight/pablohere.git
personal_blog_git_username: counterweight
personal_blog_source_dir: /opt/personal-blog

View file

@ -3,8 +3,11 @@
become: yes
vars_files:
- ../../infra_vars.yml
- ../../services_config.yml
- ./uptime_kuma_vars.yml
vars:
uptime_kuma_subdomain: "{{ subdomains.uptime_kuma }}"
caddy_sites_dir: "{{ caddy_sites_dir }}"
uptime_kuma_domain: "{{ uptime_kuma_subdomain }}.{{ root_domain }}"
tasks:

View file

@ -3,9 +3,7 @@ uptime_kuma_dir: /opt/uptime-kuma
uptime_kuma_data_dir: "{{ uptime_kuma_dir }}/data"
uptime_kuma_port: 3001
# Caddy
caddy_sites_dir: /etc/caddy/sites-enabled
uptime_kuma_subdomain: uptime
# (caddy_sites_dir and subdomain now in services_config.yml)
# Remote access
remote_host: "{{ groups['watchtower'][0] }}"

View file

@ -3,9 +3,14 @@
become: yes
vars_files:
- ../../infra_vars.yml
- ../../services_config.yml
- ../../infra_secrets.yml
- ./vaultwarden_vars.yml
vars:
vaultwarden_subdomain: "{{ subdomains.vaultwarden }}"
caddy_sites_dir: "{{ caddy_sites_dir }}"
vaultwarden_domain: "{{ vaultwarden_subdomain }}.{{ root_domain }}"
uptime_kuma_api_url: "https://{{ subdomains.uptime_kuma }}.{{ root_domain }}"
tasks:
- name: Create vaultwarden directory
@ -106,3 +111,110 @@
- name: Reload Caddy to apply new config
command: systemctl reload caddy
- name: Create Uptime Kuma monitor setup script for Vaultwarden
delegate_to: localhost
become: no
copy:
dest: /tmp/setup_vaultwarden_monitor.py
content: |
#!/usr/bin/env python3
import sys
import yaml
from uptime_kuma_api import UptimeKumaApi, MonitorType
try:
# Load configs
with open('/tmp/ansible_config.yml', 'r') as f:
config = yaml.safe_load(f)
url = config['uptime_kuma_url']
username = config['username']
password = config['password']
monitor_url = config['monitor_url']
monitor_name = config['monitor_name']
# Connect to Uptime Kuma
api = UptimeKumaApi(url, timeout=30)
api.login(username, password)
# Get all monitors
monitors = api.get_monitors()
# Find or create "services" group
group = next((m for m in monitors if m.get('name') == 'services' and m.get('type') == 'group'), None)
if not group:
group_result = api.add_monitor(type='group', name='services')
# Refresh to get the group with id
monitors = api.get_monitors()
group = next((m for m in monitors if m.get('name') == 'services' and m.get('type') == 'group'), None)
# Check if monitor already exists
existing_monitor = None
for monitor in monitors:
if monitor.get('name') == monitor_name:
existing_monitor = monitor
break
# Get ntfy notification ID
notifications = api.get_notifications()
ntfy_notification_id = None
for notif in notifications:
if notif.get('type') == 'ntfy':
ntfy_notification_id = notif.get('id')
break
if existing_monitor:
print(f"Monitor '{monitor_name}' already exists (ID: {existing_monitor['id']})")
print("Skipping - monitor already configured")
else:
print(f"Creating monitor '{monitor_name}'...")
api.add_monitor(
type=MonitorType.HTTP,
name=monitor_name,
url=monitor_url,
parent=group['id'],
interval=60,
maxretries=3,
retryInterval=60,
notificationIDList={ntfy_notification_id: True} if ntfy_notification_id else {}
)
api.disconnect()
print("SUCCESS")
except Exception as e:
print(f"ERROR: {str(e)}", file=sys.stderr)
sys.exit(1)
mode: '0755'
- name: Create temporary config for monitor setup
delegate_to: localhost
become: no
copy:
dest: /tmp/ansible_config.yml
content: |
uptime_kuma_url: "{{ uptime_kuma_api_url }}"
username: "{{ uptime_kuma_username }}"
password: "{{ uptime_kuma_password }}"
monitor_url: "https://{{ vaultwarden_domain }}/alive"
monitor_name: "Vaultwarden"
mode: '0644'
- name: Run Uptime Kuma monitor setup
command: python3 /tmp/setup_vaultwarden_monitor.py
delegate_to: localhost
become: no
register: monitor_setup
changed_when: "'SUCCESS' in monitor_setup.stdout"
ignore_errors: yes
- name: Clean up temporary files
delegate_to: localhost
become: no
file:
path: "{{ item }}"
state: absent
loop:
- /tmp/setup_vaultwarden_monitor.py
- /tmp/ansible_config.yml

View file

@ -3,9 +3,7 @@ vaultwarden_dir: /opt/vaultwarden
vaultwarden_data_dir: "{{ vaultwarden_dir }}/data"
vaultwarden_port: 8222
# Caddy
caddy_sites_dir: /etc/caddy/sites-enabled
vaultwarden_subdomain: vault
# (caddy_sites_dir and subdomain now in services_config.yml)
# Remote access
remote_host: "{{ groups['vipy'][0] }}"

View file

@ -0,0 +1,26 @@
# Centralized Services Configuration
# Subdomains and Caddy settings for all services
# Edit these subdomains to match your preferences
subdomains:
# Monitoring Services (on watchtower)
ntfy: test-ntfy
uptime_kuma: test-uptime
# VPN Infrastructure (on spacey)
headscale: test-headscale
# Core Services (on vipy)
vaultwarden: test-vault
forgejo: test-git
lnbits: test-lnbits
# Secondary Services (on vipy)
personal_blog: test-blog
ntfy_emergency_app: test-emergency
# Memos (on memos-box)
memos: test-memos
# Caddy configuration
caddy_sites_dir: /etc/caddy/sites-enabled

View file

@ -0,0 +1,26 @@
# Centralized Services Configuration
# Copy this to services_config.yml and customize
# Edit these subdomains to match your preferences
subdomains:
# Monitoring Services (on watchtower)
ntfy: ntfy
uptime_kuma: uptime
# VPN Infrastructure (on spacey)
headscale: headscale
# Core Services (on vipy)
vaultwarden: vault
forgejo: git
lnbits: lnbits
# Secondary Services (on vipy)
personal_blog: blog
ntfy_emergency_app: emergency
# Memos (on memos-box)
memos: memos
# Caddy configuration
caddy_sites_dir: /etc/caddy/sites-enabled

20
backup.inventory.ini Normal file
View file

@ -0,0 +1,20 @@
[vipy]
207.154.226.192 ansible_user=counterweight ansible_port=22 ansible_ssh_private_key_file=~/.ssh/counterganzua
[watchtower]
206.189.63.167 ansible_user=counterweight ansible_port=22 ansible_ssh_private_key_file=~/.ssh/counterganzua
[spacey]
165.232.73.4 ansible_user=counterweight ansible_port=22 ansible_ssh_private_key_file=~/.ssh/counterganzua
[nodito]
192.168.1.139 ansible_user=counterweight ansible_port=22 ansible_ssh_pass=noesfacilvivirenunmundocentralizado ansible_ssh_private_key_file=~/.ssh/counterganzua
[memos-box]
192.168.1.149 ansible_user=counterweight ansible_port=22 ansible_ssh_private_key_file=~/.ssh/counterganzua
# Local connection to laptop: this assumes you're running ansible commands from your personal laptop
# Make sure to adjust the username
[lapy]
localhost ansible_connection=local ansible_user=counterweight gpg_recipient=counterweightoperator@protonmail.com gpg_key_id=883EDBAA726BD96C

858
human_script.md Normal file
View file

@ -0,0 +1,858 @@
# Personal Infrastructure Setup Guide
This guide walks you through setting up your complete personal infrastructure, layer by layer. Each layer must be completed before moving to the next one.
**Automated Setup:** Each layer has a bash script that handles the setup process. The scripts will:
- Check prerequisites
- Prompt for required variables
- Set up configuration files
- Execute playbooks
- Verify completion
## Prerequisites
Before starting:
- You have a domain name
- You have VPS accounts ready
- You have nodito ready, with Proxmox installed and your SSH key in place
- You have SSH access to all machines
- You're running this from your laptop (lapy)
---
## Layer 0: Foundation Setup
**Goal:** Set up your laptop (lapy) as the Ansible control node and configure basic settings.
**Script:** `./scripts/setup_layer_0.sh`
### What This Layer Does:
1. Creates Python virtual environment
2. Installs Ansible and required Python packages
3. Installs Ansible Galaxy collections
4. Guides you through creating `inventory.ini` with your machine IPs
5. Guides you through creating `infra_vars.yml` with your domain
6. Creates `services_config.yml` with centralized subdomain settings
7. Creates `infra_secrets.yml` template for Uptime Kuma credentials
8. Validates SSH keys exist
9. Verifies everything is ready for Layer 1
### Required Information:
- Your domain name (e.g., `contrapeso.xyz`)
- SSH key path (default: `~/.ssh/counterganzua`)
- IP addresses for your infrastructure:
- vipy (main VPS)
- watchtower (monitoring VPS)
- spacey (headscale VPS)
- nodito (Proxmox server) - optional
- **Note:** VMs (like memos-box) will be created later on Proxmox and added to the `nodito-vms` group
### Manual Steps:
After running the script, you'll need to:
1. Ensure your SSH key is added to all VPS root users (usually done by VPS provider)
2. Ensure DNS is configured for your domain (nameservers pointing to your DNS provider)
### Centralized Configuration:
The script creates `ansible/services_config.yml` which contains all service subdomains in one place:
- Easy to review all subdomains at a glance
- No need to edit multiple vars files
- Consistent Caddy settings across all services
- **Edit this file to customize your subdomains before deploying services**
### Verification:
The script will verify:
- ✓ Python venv exists and activated
- ✓ Ansible installed
- ✓ Required Python packages installed
- ✓ Ansible Galaxy collections installed
- ✓ `inventory.ini` exists and formatted correctly
- ✓ `infra_vars.yml` exists with domain configured
- ✓ `services_config.yml` created with subdomain settings
- ✓ `infra_secrets.yml` template created
- ✓ SSH key file exists
### Run the Script:
```bash
cd /home/counterweight/personal_infra
./scripts/setup_layer_0.sh
```
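Before moving on, it can help to confirm that Ansible parses your inventory. These are generic Ansible checks, not part of the layer scripts; run them from the repo root with the venv active:
```bash
# Confirm the inventory parses and shows the expected groups/hosts
ansible-inventory -i ansible/inventory.ini --list

# Optional: try reaching the hosts. This may fail until Layer 1 creates
# the counterweight user on the machines, which is expected at this point.
ansible all -i ansible/inventory.ini -m ping
```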
---
## Layer 1A: VPS Basic Setup
**Goal:** Configure users, SSH access, firewall, and fail2ban on VPS machines.
**Script:** `./scripts/setup_layer_1a_vps.sh`
**Can be run independently** - doesn't require Nodito setup.
### What This Layer Does:
For VPSs (vipy, watchtower, spacey):
1. Creates the `counterweight` user with sudo access
2. Configures SSH key authentication
3. Disables root login (by design for security)
4. Sets up UFW firewall with SSH access
5. Installs and configures fail2ban
6. Installs and configures auditd for security logging
### Prerequisites:
- ✅ Layer 0 complete
- ✅ SSH key added to all VPS root users
- ✅ Root access to VPSs
### Verification:
The script will verify:
- ✓ Can SSH to all VPSs as root
- ✓ VPS playbooks complete successfully
- ✓ Can SSH to all VPSs as `counterweight` user
- ✓ Firewall is active and configured
- ✓ fail2ban is running
### Run the Script:
```bash
source venv/bin/activate
cd /home/counterweight/personal_infra
./scripts/setup_layer_1a_vps.sh
```
**Note:** After this layer, you will no longer be able to SSH as root to VPSs (by design for security).
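If you want to spot-check the result by hand, something along these lines works (illustrative; substitute your own IPs and key path):
```bash
# Root login should now be refused; the counterweight user should work
ssh -i ~/.ssh/counterganzua counterweight@<vipy-ip>

# On the VPS, confirm the firewall and fail2ban are active
sudo ufw status verbose
sudo systemctl is-active fail2ban
```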
---
## Layer 1B: Nodito (Proxmox) Setup
**Goal:** Configure the Nodito Proxmox server.
**Script:** `./scripts/setup_layer_1b_nodito.sh`
**Can be run independently** - doesn't require VPS setup.
### What This Layer Does:
For Nodito (Proxmox server):
1. Bootstraps SSH key access for root
2. Creates the `counterweight` user
3. Updates and secures the system
4. Disables root login and password authentication
5. Switches to Proxmox community repositories
6. Optionally sets up ZFS storage pool (if disks configured)
7. Optionally creates Debian cloud template
### Prerequisites:
- ✅ Layer 0 complete
- ✅ Root password for nodito
- ✅ Nodito configured in inventory.ini
### Optional: ZFS Setup
For ZFS storage pool (optional):
1. SSH into nodito: `ssh root@<nodito-ip>`
2. List disk IDs: `ls -la /dev/disk/by-id/ | grep -E "(ata-|scsi-|nvme-)"`
3. Note the disk IDs you want to use
4. The script will help you create `ansible/infra/nodito/nodito_vars.yml` with disk configuration
⚠️ **Warning:** ZFS setup will DESTROY ALL DATA on specified disks!
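For reference, the disk-related section of `nodito_vars.yml` ends up looking roughly like this minimal sketch. The pool name and mountpoint match the repo defaults; the disk IDs are placeholders you must replace with the `by-id` paths from your own machine:
```yaml
# ZFS Pool Configuration (disk IDs below are placeholders)
zfs_pool_name: "proxmox-tank-1"
zfs_disk_1: "/dev/disk/by-id/ata-EXAMPLE-SERIAL-1"  # first disk of the RAID 1 mirror
zfs_disk_2: "/dev/disk/by-id/ata-EXAMPLE-SERIAL-2"  # second disk of the RAID 1 mirror
zfs_pool_mountpoint: "/var/lib/vz"
```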
### Verification:
The script will verify:
- ✓ Nodito bootstrap successful
- ✓ Community repos configured
- ✓ Can SSH to nodito as `counterweight` user
### Run the Script:
```bash
source venv/bin/activate
cd /home/counterweight/personal_infra
./scripts/setup_layer_1b_nodito.sh
```
**Note:** After this layer, you will no longer be able to SSH as root to nodito (by design for security).
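To sanity-check the result by hand (illustrative; substitute your nodito IP):
```bash
ssh -i ~/.ssh/counterganzua counterweight@<nodito-ip>

pveversion            # Proxmox VE should still report its version
sudo apt update       # should pull from the no-subscription repos without enterprise warnings
sudo zpool status     # only meaningful if you ran the optional ZFS setup
```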
---
## Layer 2: General Infrastructure Tools
**Goal:** Install common utilities needed by various services.
**Script:** `./scripts/setup_layer_2.sh`
### What This Layer Does:
Installs essential tools on machines that need them:
#### rsync
- **Purpose:** Required for backup operations
- **Deployed to:** vipy, watchtower, lapy (and optionally other hosts)
- **Playbook:** `infra/900_install_rsync.yml`
#### Docker + Docker Compose
- **Purpose:** Required for containerized services
- **Deployed to:** vipy, watchtower (and optionally other hosts)
- **Playbook:** `infra/910_docker_playbook.yml`
### Prerequisites:
- ✅ Layer 0 complete
- ✅ Layer 1A complete (for VPSs) OR Layer 1B complete (for nodito)
- ✅ SSH access as counterweight user
### Services That Need These Tools:
- **rsync:** All backup operations (Uptime Kuma, Vaultwarden, LNBits, etc.)
- **docker:** Uptime Kuma, Vaultwarden, ntfy-emergency-app
### Verification:
The script will verify:
- ✓ rsync installed on specified hosts
- ✓ Docker and Docker Compose installed on specified hosts
- ✓ counterweight user added to docker group
- ✓ Docker service running
### Run the Script:
```bash
source venv/bin/activate
cd /home/counterweight/personal_infra
./scripts/setup_layer_2.sh
```
**Note:** This script is interactive and will let you choose which hosts get which tools.
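Once it finishes, you can spot-check any host that received the tools (placeholder IP; Compose may be installed as a plugin or as the standalone binary):

```bash
HOST="<host-ip>"               # placeholder - any host you installed the tools on
KEY=~/.ssh/counterganzua
ssh -i "$KEY" "counterweight@$HOST" "rsync --version | head -n1"
ssh -i "$KEY" "counterweight@$HOST" "docker --version && docker compose version"   # or: docker-compose --version
ssh -i "$KEY" "counterweight@$HOST" "groups; systemctl is-active docker"           # 'docker' should appear in groups
```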
---
## Layer 3: Reverse Proxy (Caddy)
**Goal:** Deploy Caddy reverse proxy for HTTPS termination and routing.
**Script:** `./scripts/setup_layer_3_caddy.sh`
### What This Layer Does:
Installs and configures Caddy web server on VPS machines:
- Installs Caddy from official repositories
- Configures Caddy to listen on ports 80/443
- Opens firewall ports for HTTP/HTTPS
- Creates `/etc/caddy/sites-enabled/` directory structure
- Sets up automatic HTTPS with Let's Encrypt
**Deployed to:** vipy, watchtower, spacey
### Why Caddy is Critical:
Caddy provides:
- **Automatic HTTPS** - Let's Encrypt certificates with auto-renewal
- **Reverse proxy** - Routes traffic to backend services
- **Simple configuration** - Each service adds its own config file
- **HTTP/2 support** - Modern protocol support
### Prerequisites:
- ✅ Layer 0 complete
- ✅ Layer 1A complete (VPS setup)
- ✅ SSH access as counterweight user
- ✅ Ports 80/443 available on VPSs
### Services That Need Caddy:
All web services depend on Caddy:
- Uptime Kuma (watchtower)
- ntfy (watchtower)
- Headscale (spacey)
- Vaultwarden (vipy)
- Forgejo (vipy)
- LNBits (vipy)
- Personal Blog (vipy)
- ntfy-emergency-app (vipy)
### Verification:
The script will verify:
- ✓ Caddy installed on all target hosts
- ✓ Caddy service running
- ✓ Ports 80/443 open in firewall
- ✓ Sites-enabled directory created
- ✓ Can reach Caddy default page
### Run the Script:
```bash
source venv/bin/activate
cd /home/counterweight/personal_infra
./scripts/setup_layer_3_caddy.sh
```
**Note:** Caddy starts with an empty configuration. Services will add their own config files in later layers.
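A quick way to confirm Caddy is up on each VPS before any service config exists (placeholder IP; the sites-enabled path is the one this layer creates):

```bash
HOST="<vps-ip>"                # placeholder - repeat for vipy, watchtower and spacey
KEY=~/.ssh/counterganzua
ssh -i "$KEY" "counterweight@$HOST" "systemctl is-active caddy; ls /etc/caddy/sites-enabled/; sudo ss -ltnp | grep -E ':(80|443) '"
curl -sI "http://$HOST" | head -n1   # should return an HTTP response (the Caddy default page)
```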
---
## Layer 4: Core Monitoring & Notifications
**Goal:** Deploy ntfy (notifications) and Uptime Kuma (monitoring platform).
**Script:** `./scripts/setup_layer_4_monitoring.sh`
### What This Layer Does:
Deploys core monitoring infrastructure on watchtower:
#### 4A: ntfy (Notification Service)
- Installs ntfy from official repositories
- Configures ntfy with authentication (deny-all by default)
- Creates admin user for sending notifications
- Sets up Caddy reverse proxy
- **Deployed to:** watchtower
#### 4B: Uptime Kuma (Monitoring Platform)
- Deploys Uptime Kuma via Docker
- Configures Caddy reverse proxy
- Sets up data persistence
- Optionally sets up backup to lapy
- **Deployed to:** watchtower
### Prerequisites (Complete BEFORE Running):
**1. Previous layers complete:**
- ✅ Layer 0, 1A, 2, 3 complete (watchtower must be fully set up)
- ✅ Docker installed on watchtower (from Layer 2)
- ✅ Caddy running on watchtower (from Layer 3)
**2. Configure subdomains (in centralized config):**
- ✅ Edit `ansible/services_config.yml` and customize subdomains under `subdomains:` section
- Set `ntfy:` to your preferred subdomain (e.g., `ntfy` or `notify`)
- Set `uptime_kuma:` to your preferred subdomain (e.g., `uptime` or `kuma`)
**3. Create DNS records that match your configured subdomains:**
- ✅ Create A record: `<ntfy_subdomain>.<yourdomain>` → watchtower IP
- ✅ Create A record: `<uptime_kuma_subdomain>.<yourdomain>` → watchtower IP
- ✅ Wait for DNS propagation (can take minutes to hours)
- ✅ Verify with `dig <subdomain>.<yourdomain>`; it should return the watchtower IP
**4. Prepare ntfy admin credentials:**
- ✅ Decide on username (default: `admin`)
- ✅ Decide on a secure password (script will prompt you)
### Run the Script:
```bash
source venv/bin/activate
cd /home/counterweight/personal_infra
./scripts/setup_layer_4_monitoring.sh
```
The script will prompt you for ntfy admin credentials during deployment.
### Post-Deployment Steps (Complete AFTER Running):
**The script will guide you through most of these, but here's what happens:**
#### Step 1: Set Up Uptime Kuma Admin Account (Manual)
1. Open browser and visit: `https://<uptime_kuma_subdomain>.<yourdomain>`
2. On first visit, you'll see the setup page
3. Create admin username and password
4. Save these credentials securely
#### Step 2: Update infra_secrets.yml (Manual)
1. Edit `ansible/infra_secrets.yml`
2. Add your Uptime Kuma credentials:
```yaml
uptime_kuma_username: "your-admin-username"
uptime_kuma_password: "your-admin-password"
```
3. Save the file
4. **This is required for automated ntfy setup and Layer 6**
#### Step 3: Configure ntfy Notification (Automated)
**The script will offer to do this automatically!** If you completed Steps 1 & 2, the script will:
- Connect to Uptime Kuma via API
- Create ntfy notification configuration
- Test the connection
- No manual UI configuration needed!
**Alternatively (Manual):**
1. In Uptime Kuma web UI, go to **Settings** → **Notifications**
2. Click **Setup Notification**, choose **ntfy**
3. Configure with your ntfy subdomain and credentials
#### Step 4: Final Verification (Automated)
**The script will automatically verify:**
- ✓ Uptime Kuma credentials in infra_secrets.yml
- ✓ Can connect to Uptime Kuma API
- ✓ ntfy notification is configured
- ✓ All post-deployment steps complete
If anything is missing, the script will tell you exactly what to do!
#### Step 5: Subscribe to Notifications on Your Phone (Optional - Manual)
1. Install ntfy app: https://github.com/binwiederhier/ntfy-android
2. Add subscription:
- Server: `https://<ntfy_subdomain>.<yourdomain>`
- Topic: `alerts` (same as configured in Uptime Kuma)
- Username: Your ntfy admin username
- Password: Your ntfy admin password
3. You'll now receive push notifications for all alerts!
**Pro tip:** Run the script again after completing Steps 1 & 2, and it will automatically configure ntfy and verify everything!
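To confirm ntfy works end to end, you can also publish a test message from your laptop with curl (the subdomain, password and `alerts` topic below are whatever you configured):

```bash
# Publish a test notification - your phone subscription and any web client should receive it
curl -u admin:'<your-ntfy-password>' -d "test alert from lapy" "https://<ntfy_subdomain>.<yourdomain>/alerts"
```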
### Verification:
The script will automatically verify:
- ✓ DNS records are configured correctly (using `dig`)
- ✓ ntfy service running
- ✓ Uptime Kuma container running
- ✓ Caddy configs created for both services
After post-deployment steps, you can test:
- Visit `https://<ntfy_subdomain>.<yourdomain>` (should load ntfy web UI)
- Visit `https://<uptime_kuma_subdomain>.<yourdomain>` (should load Uptime Kuma)
- Send test notification in Uptime Kuma
**Note:** DNS validation requires `dig` command. If not available, validation will be skipped (you can continue but SSL may fail).
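Running the DNS check yourself is just a couple of `dig` calls (replace with the subdomains you configured):

```bash
dig +short <uptime_kuma_subdomain>.<yourdomain>   # should print watchtower's public IP
dig +short <ntfy_subdomain>.<yourdomain>          # should print watchtower's public IP
```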
### Why This Layer is Critical:
- **All infrastructure monitoring** (Layer 6) depends on Uptime Kuma
- **All alerts** go through ntfy
- Service availability monitoring needs Uptime Kuma
- Without this layer, you won't know when things break!
---
## Layer 5: VPN Infrastructure (Headscale)
**Goal:** Deploy Headscale for secure mesh networking (like Tailscale, but self-hosted).
**Script:** `./scripts/setup_layer_5_headscale.sh`
**This layer is OPTIONAL** - Skip to Layer 6 if you don't need VPN mesh networking.
### What This Layer Does:
Deploys Headscale coordination server and optionally joins machines to the mesh:
#### 5A: Deploy Headscale Server
- Installs Headscale on spacey
- Configures with deny-all ACL policy (you customize later)
- Creates namespace/user for your network
- Sets up Caddy reverse proxy
- Configures embedded DERP server for NAT traversal
- **Deployed to:** spacey
#### 5B: Join Machines to Mesh (Optional)
- Installs Tailscale client on target machines
- Generates ephemeral pre-auth keys
- Automatically joins machines to your mesh
- Enables Magic DNS
- **Can join:** vipy, watchtower, nodito, lapy, etc.
### Prerequisites (Complete BEFORE Running):
**1. Previous layers complete:**
- ✅ Layer 0, 1A, 3 complete (spacey must be set up)
- ✅ Caddy running on spacey (from Layer 3)
**2. Configure subdomain (in centralized config):**
- ✅ Edit `ansible/services_config.yml` and customize `headscale:` under `subdomains:` section (e.g., `headscale` or `vpn`)
**3. Create DNS record that matches your configured subdomain:**
- ✅ Create A record: `<headscale_subdomain>.<yourdomain>` → spacey IP
- ✅ Wait for DNS propagation
- ✅ Verify with `dig <subdomain>.<yourdomain>`; it should return the spacey IP
**4. Decide on namespace name:**
- ✅ Choose a namespace for your network (default: `counter-net`)
- ✅ This is set in `headscale_vars.yml` as `headscale_namespace`
### Run the Script:
```bash
source venv/bin/activate
cd /home/counterweight/personal_infra
./scripts/setup_layer_5_headscale.sh
```
The script will:
1. Validate DNS configuration
2. Deploy Headscale server
3. Offer to join machines to the mesh
### Post-Deployment Steps:
#### Configure ACL Policies (Required for machines to communicate)
1. SSH into spacey: `ssh counterweight@<spacey-ip>`
2. Edit ACL file: `sudo nano /etc/headscale/acl.json`
3. Configure rules (example - allow all):
```json
{
"ACLs": [
{"action": "accept", "src": ["*"], "dst": ["*:*"]}
]
}
```
4. Restart Headscale: `sudo systemctl restart headscale`
**Default is deny-all for security** - you must configure ACLs for machines to talk!
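Once Headscale is restarted with your ACLs in place, a quick way to confirm the mesh actually works (hostnames below are whatever `headscale nodes list` shows for your machines):

```bash
# On spacey: every joined machine should appear in the node list
ssh counterweight@<spacey-ip> "sudo headscale nodes list"

# From any joined machine: check mesh status and reach another node over the VPN
tailscale status
tailscale ping <other-node-hostname>
```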
#### Join Additional Machines Manually
For machines not in inventory (mobile, desktop):
1. Install Tailscale client on device
2. Generate pre-auth key on spacey:
```bash
ssh counterweight@<spacey-ip>
sudo headscale preauthkeys create --user <namespace> --reusable
```
3. Connect using your Headscale server:
```bash
tailscale up --login-server https://<headscale_subdomain>.<yourdomain> --authkey <key>
```
### Automatic Uptime Kuma Monitor:
**The playbook will automatically create a monitor in Uptime Kuma:**
- ✅ **Headscale** - monitors `https://<subdomain>/health`
- Added to "services" monitor group
- Uses ntfy notification (if configured)
- Check every 60 seconds
**Prerequisites:** Uptime Kuma credentials must be in `infra_secrets.yml` (from Layer 4)
### Verification:
The script will automatically verify:
- ✓ DNS records configured correctly
- ✓ Headscale installed and running
- ✓ Namespace created
- ✓ Caddy config created
- ✓ Machines joined (if selected)
- ✓ Monitor created in Uptime Kuma "services" group
List connected devices:
```bash
ssh counterweight@<spacey-ip>
sudo headscale nodes list
```
### Why Use Headscale:
- **Secure communication** between all your machines
- **Magic DNS** - access machines by hostname
- **NAT traversal** - works even behind firewalls
- **Self-hosted** - full control of your VPN
- **Mobile support** - use official Tailscale apps
### Backup:
Optional backup to lapy:
```bash
ansible-playbook -i inventory.ini services/headscale/setup_backup_headscale_to_lapy.yml
```
---
## Layer 6: Infrastructure Monitoring
**Goal:** Deploy automated monitoring for disk usage, system health, and CPU temperature.
**Script:** `./scripts/setup_layer_6_infra_monitoring.sh`
### What This Layer Does:
Deploys monitoring scripts that report to Uptime Kuma:
#### 6A: Disk Usage Monitoring
- Monitors disk usage on specified mount points
- Sends alerts when usage exceeds threshold (default: 80%)
- Creates Uptime Kuma push monitors automatically
- Organizes monitors in host-specific groups
- **Deploys to:** All hosts (selectable)
#### 6B: System Healthcheck
- Sends regular heartbeat pings to Uptime Kuma
- Alerts if system stops responding
- "No news is good news" monitoring
- **Deploys to:** All hosts (selectable)
#### 6C: CPU Temperature Monitoring (Nodito only)
- Monitors CPU temperature on Proxmox server
- Alerts when temperature exceeds threshold (default: 80°C)
- **Deploys to:** nodito (if configured)
### Prerequisites (Complete BEFORE Running):
**1. Previous layers complete:**
- ✅ Layer 0, 1A/1B, 4 complete
- ✅ Uptime Kuma deployed and configured (Layer 4)
- ✅ **CRITICAL:** `infra_secrets.yml` has Uptime Kuma credentials
**2. Uptime Kuma API credentials ready:**
- ✅ Must have completed Layer 4 post-deployment steps
- ✅ `ansible/infra_secrets.yml` must contain:
```yaml
uptime_kuma_username: "your-username"
uptime_kuma_password: "your-password"
```
**3. Python dependencies installed:**
- ✅ `uptime-kuma-api` must be in requirements.txt
- ✅ Should already be installed from Layer 0
- ✅ Verify: `pip list | grep uptime-kuma-api`
### Run the Script:
```bash
source venv/bin/activate
cd /home/counterweight/personal_infra
./scripts/setup_layer_6_infra_monitoring.sh
```
The script will:
1. Verify Uptime Kuma credentials
2. Offer to deploy disk usage monitoring
3. Offer to deploy system healthchecks
4. Offer to deploy CPU temp monitoring (nodito only)
5. Test monitor creation and alerts
### What Gets Deployed:
**For each monitored host:**
- Push monitor in Uptime Kuma (upside-down mode)
- Monitor group named `{hostname} - infra`
- Systemd service for monitoring script
- Systemd timer for periodic execution
- Log file for monitoring history
**Default settings (customizable):**
- Disk usage threshold: 80%
- Disk check interval: 15 minutes
- Healthcheck interval: 60 seconds
- CPU temp threshold: 80°C
- Monitored mount point: `/` (root)
### Customization Options:
Change thresholds and intervals:
```bash
# Disk monitoring with custom settings
ansible-playbook -i inventory.ini infra/410_disk_usage_alerts.yml \
-e "disk_usage_threshold_percent=85" \
-e "disk_check_interval_minutes=10" \
-e "monitored_mount_point=/home"
# Healthcheck with custom interval
ansible-playbook -i inventory.ini infra/420_system_healthcheck.yml \
-e "healthcheck_interval_seconds=30"
# CPU temp with custom threshold
ansible-playbook -i inventory.ini infra/nodito/40_cpu_temp_alerts.yml \
-e "temp_threshold_celsius=75"
```
### Verification:
The script will automatically verify:
- ✓ Uptime Kuma API accessible
- ✓ Monitors created in Uptime Kuma
- ✓ Monitor groups created
- ✓ Systemd services running
- ✓ Can send test alerts
Check Uptime Kuma web UI:
- Monitors should appear organized by host
- Should receive test pings
- Alerts will show when thresholds exceeded
### Post-Deployment:
**Monitor your infrastructure:**
1. Open Uptime Kuma web UI
2. See all monitors organized by host groups
3. Configure notification rules per monitor
4. Set up status pages (optional)
**Test alerts:**
```bash
# Trigger a disk usage alert: temporarily fill the monitored mount point past the threshold
#   (e.g. create a large file with fallocate, then delete it)
# Trigger a healthcheck alert: stop the healthcheck systemd timer/service on one host
# Then watch ntfy for the resulting notifications
```
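Under the hood each monitoring script simply hits its Uptime Kuma push URL, so you can also exercise a monitor by hand. This is a sketch based on Uptime Kuma's standard push URL format; the push token is shown in the monitor's settings in the web UI. Whether a manual push keeps the monitor green or triggers an alert depends on whether that monitor runs in normal or upside-down mode:

```bash
# Send one push event to a monitor (replace the token with the one from the Uptime Kuma UI)
curl -fsS "https://<uptime_kuma_subdomain>.<yourdomain>/api/push/<push-token>?status=up&msg=manual-test&ping="
```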
### Why This Layer is Important:
- **Proactive monitoring** - Know about issues before users do
- **Disk space alerts** - Prevent services from failing
- **System health** - Detect crashed/frozen machines
- **Temperature monitoring** - Prevent hardware damage
- **Organized** - All monitors grouped by host
---
## Layer 7: Core Services
**Goal:** Deploy core applications: Vaultwarden, Forgejo, and LNBits.
**Script:** `./scripts/setup_layer_7_services.sh`
### What This Layer Does:
Deploys main services on vipy:
#### 7A: Vaultwarden (Password Manager)
- Deploys via Docker
- Configures Caddy reverse proxy
- Sets up fail2ban protection
- Enables sign-ups initially (disable after creating first user)
- **Deployed to:** vipy
#### 7B: Forgejo (Git Server)
- Installs Forgejo binary
- Creates git user and directories
- Configures Caddy reverse proxy
- Enables SSH cloning
- **Deployed to:** vipy
#### 7C: LNBits (Lightning Wallet)
- Installs system dependencies and uv (Python 3.12 tooling)
- Clones LNBits version v1.3.1
- Syncs dependencies with uv targeting Python 3.12
- Configures with FakeWallet backend (for testing)
- Creates systemd service
- Configures Caddy reverse proxy
- **Deployed to:** vipy
### Prerequisites (Complete BEFORE Running):
**1. Previous layers complete:**
- ✅ Layer 0, 1A, 2, 3 complete
- ✅ Docker installed on vipy (Layer 2)
- ✅ Caddy running on vipy (Layer 3)
**2. Configure subdomains (in centralized config):**
- ✅ Edit `ansible/services_config.yml` and customize subdomains under `subdomains:` section:
- Set `vaultwarden:` to your preferred subdomain (e.g., `vault` or `passwords`)
- Set `forgejo:` to your preferred subdomain (e.g., `git` or `code`)
- Set `lnbits:` to your preferred subdomain (e.g., `lnbits` or `wallet`)
**3. Create DNS records matching your subdomains:**
- ✅ Create A record: `<vaultwarden_subdomain>.<yourdomain>` → vipy IP
- ✅ Create A record: `<forgejo_subdomain>.<yourdomain>` → vipy IP
- ✅ Create A record: `<lnbits_subdomain>.<yourdomain>` → vipy IP
- ✅ Wait for DNS propagation
### Run the Script:
```bash
source venv/bin/activate
cd /home/counterweight/personal_infra
./scripts/setup_layer_7_services.sh
```
The script will:
1. Validate DNS configuration
2. Offer to deploy each service
3. Configure backups (optional)
### Post-Deployment Steps:
#### Vaultwarden:
1. Visit `https://<vaultwarden_subdomain>.<yourdomain>`
2. Create your first user account
3. **Important:** Disable sign-ups after first user:
```bash
ansible-playbook -i inventory.ini services/vaultwarden/disable_vaultwarden_sign_ups_playbook.yml
```
4. Optional: Set up backup to lapy
#### Forgejo:
1. Visit `https://<forgejo_subdomain>.<yourdomain>`
2. Create admin account on first visit
3. Default: registrations disabled for security
4. SSH cloning works automatically after adding SSH key
#### LNBits:
1. Visit `https://<lnbits_subdomain>.<yourdomain>`
2. Create superuser on first visit
3. **Important:** Default uses FakeWallet (testing only)
4. Configure real Lightning backend:
- Edit `/opt/lnbits/lnbits/.env` on vipy
- Or use the superuser UI to configure the backend
5. Disable new user registration for security
6. Optional: Set up encrypted backup to lapy
### Backup Configuration:
After services are stable, set up backups:
**Vaultwarden backup:**
```bash
ansible-playbook -i inventory.ini services/vaultwarden/setup_backup_vaultwarden_to_lapy.yml
```
**LNBits backup (GPG encrypted):**
```bash
ansible-playbook -i inventory.ini services/lnbits/setup_backup_lnbits_to_lapy.yml
```
**Note:** Forgejo backups are not automated - backup manually or set up your own solution.
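If you just want an occasional manual backup, one possible approach (a sketch, not part of the playbooks) is Forgejo's built-in `dump` subcommand, which bundles the repositories and database into a single archive. The paths and service user below are assumptions and may differ from what the playbook configured:

```bash
# On vipy (assumes the binary is on PATH as 'forgejo' and runs as the 'git' user)
ssh counterweight@<vipy-ip>
sudo -u git forgejo dump --file /tmp/forgejo-backup.zip
sudo chown counterweight /tmp/forgejo-backup.zip
exit

# Pull the archive to lapy
rsync -avz counterweight@<vipy-ip>:/tmp/forgejo-backup.zip ~/backups/
```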
### Automatic Uptime Kuma Monitors:
**The playbooks will automatically create monitors in Uptime Kuma for each service:**
- ✅ **Vaultwarden** - monitors `https://<subdomain>/alive`
- ✅ **Forgejo** - monitors `https://<subdomain>/api/healthz`
- ✅ **LNBits** - monitors `https://<subdomain>/api/v1/health`
All monitors:
- Added to "services" monitor group
- Use ntfy notification (if configured)
- Check every 60 seconds
- 3 retries before alerting
**Prerequisites:** Uptime Kuma credentials must be in `infra_secrets.yml` (from Layer 4)
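You can also hit the same health endpoints yourself to confirm each service answers through Caddy (using whatever subdomains you configured):

```bash
curl -fsS https://<vaultwarden_subdomain>.<yourdomain>/alive && echo " vaultwarden OK"
curl -fsS https://<forgejo_subdomain>.<yourdomain>/api/healthz && echo " forgejo OK"
curl -fsS https://<lnbits_subdomain>.<yourdomain>/api/v1/health && echo " lnbits OK"
```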
### Verification:
The script will automatically verify:
- ✓ DNS records configured
- ✓ Services deployed
- ✓ Docker containers running (Vaultwarden)
- ✓ Systemd services running (Forgejo, LNBits)
- ✓ Caddy configs created
Manual verification:
- Visit each service's subdomain
- Create admin/first user accounts
- Test functionality
- Check Uptime Kuma for new monitors in "services" group
### Why These Services:
- **Vaultwarden** - Self-hosted password manager (Bitwarden compatible)
- **Forgejo** - Self-hosted Git server (GitHub/GitLab alternative)
- **LNBits** - Lightning Network wallet and accounts system
---
## Layer 8: Secondary Services
**Status:** 🔒 Locked (Complete Layer 7 first)
---
## Troubleshooting
### Common Issues
#### SSH Connection Fails
- Verify VPS is running and accessible
- Check SSH key is in the correct location
- Ensure SSH key has correct permissions (600)
- Try manual SSH: `ssh -i ~/.ssh/counterganzua root@<ip>`
#### Ansible Not Found
- Make sure you've activated the venv: `source venv/bin/activate`
- Run Layer 0 script again
#### DNS Not Resolving
- DNS changes can take up to 24-48 hours to propagate
- Use `dig <subdomain>.<domain>` to check DNS status
- You can proceed with setup; services will work once DNS propagates
---
## Progress Tracking
Use this checklist to track your progress:
- [ ] Layer 0: Foundation Setup
- [ ] Layer 1A: VPS Basic Setup
- [ ] Layer 1B: Nodito (Proxmox) Setup
- [ ] Layer 2: General Infrastructure Tools
- [ ] Layer 3: Reverse Proxy (Caddy)
- [ ] Layer 4: Core Monitoring & Notifications
- [ ] Layer 5: VPN Infrastructure (Headscale)
- [ ] Layer 6: Infrastructure Monitoring
- [ ] Layer 7: Core Services
- [ ] Layer 8: Secondary Services
- [ ] Backups Configured


@ -8,3 +8,4 @@ packaging==25.0
pycparser==2.22
PyYAML==6.0.2
resolvelib==1.0.1
uptime-kuma-api>=1.2.1

140
scripts/README.md Normal file

@ -0,0 +1,140 @@
# Infrastructure Setup Scripts
This directory contains automated setup scripts for each layer of the infrastructure.
## Overview
Each script handles a complete layer of the infrastructure setup:
- Prompts for required variables
- Validates prerequisites
- Creates configuration files
- Executes playbooks
- Verifies completion
## Usage
Run scripts in order, completing one layer before moving to the next:
### Layer 0: Foundation Setup
```bash
./scripts/setup_layer_0.sh
```
Sets up Ansible control node on your laptop.
### Layer 1A: VPS Basic Setup
```bash
source venv/bin/activate
./scripts/setup_layer_1a_vps.sh
```
Configures users, SSH, firewall, and fail2ban on VPS machines (vipy, watchtower, spacey).
**Runs independently** - no Nodito required.
### Layer 1B: Nodito (Proxmox) Setup
```bash
source venv/bin/activate
./scripts/setup_layer_1b_nodito.sh
```
Configures Nodito Proxmox server: bootstrap, community repos, optional ZFS.
**Runs independently** - no VPS required.
### Layer 2: General Infrastructure Tools
```bash
source venv/bin/activate
./scripts/setup_layer_2.sh
```
Installs rsync and docker on hosts that need them.
- **rsync:** For backup operations (vipy, watchtower, lapy recommended)
- **docker:** For containerized services (vipy, watchtower recommended)
- Interactive: Choose which hosts get which tools
### Layer 3: Reverse Proxy (Caddy)
```bash
source venv/bin/activate
./scripts/setup_layer_3_caddy.sh
```
Deploys Caddy reverse proxy on VPS machines (vipy, watchtower, spacey).
- **Critical:** All web services depend on Caddy
- Automatic HTTPS with Let's Encrypt
- Opens firewall ports 80/443
- Creates sites-enabled directory structure
### Layer 4: Core Monitoring & Notifications
```bash
source venv/bin/activate
./scripts/setup_layer_4_monitoring.sh
```
Deploys ntfy and Uptime Kuma on watchtower.
- **ntfy:** Notification service for alerts
- **Uptime Kuma:** Monitoring platform for all services
- **Critical:** All infrastructure monitoring depends on these
- Sets up backups (optional)
- **Post-deploy:** Create Uptime Kuma admin user and update infra_secrets.yml
### Layer 5: VPN Infrastructure (Headscale)
```bash
source venv/bin/activate
./scripts/setup_layer_5_headscale.sh
```
Deploys Headscale VPN mesh networking on spacey.
- **OPTIONAL** - Skip to Layer 6 if you don't need VPN
- Secure mesh networking between all machines
- Magic DNS for hostname resolution
- NAT traversal support
- Can join machines automatically or manually
- Post-deploy: Configure ACL policies for machine communication
### Layer 6: Infrastructure Monitoring
```bash
source venv/bin/activate
./scripts/setup_layer_6_infra_monitoring.sh
```
Deploys automated monitoring for infrastructure.
- **Requires:** Uptime Kuma credentials in infra_secrets.yml (Layer 4)
- Disk usage monitoring with auto-created push monitors
- System healthcheck (heartbeat) monitoring
- CPU temperature monitoring (nodito only)
- Interactive selection of which hosts to monitor
- All monitors organized by host groups
### Layer 7: Core Services
```bash
source venv/bin/activate
./scripts/setup_layer_7_services.sh
```
Deploys core services on vipy: Vaultwarden, Forgejo, LNBits.
- Password manager (Vaultwarden) with /alive endpoint
- Git server (Forgejo) with /api/healthz endpoint
- Lightning wallet (LNBits) with /api/v1/health endpoint
- **Automatic:** Creates Uptime Kuma monitors in "services" group
- **Requires:** Uptime Kuma credentials in infra_secrets.yml
- Optional: Configure backups to lapy
### Layer 8+
More scripts will be added as we build out each layer.
## Important Notes
1. **Centralized Configuration:**
- All service subdomains are configured in `ansible/services_config.yml`
- Edit this ONE file instead of multiple vars files
- Created automatically in Layer 0
- DNS records must match the subdomains you configure
2. **Always activate the venv first** (except for Layer 0):
```bash
source venv/bin/activate
```
3. **Complete each layer fully** before moving to the next
4. **Scripts are idempotent** - safe to run multiple times
5. **Review changes** before confirming actions
## Getting Started
1. Read `../human_script.md` for the complete guide
2. Start with Layer 0
3. Follow the prompts
4. Proceed layer by layer

494
scripts/setup_layer_0.sh Executable file

@ -0,0 +1,494 @@
#!/bin/bash
###############################################################################
# Layer 0: Foundation Setup
#
# This script sets up your laptop (lapy) as the Ansible control node.
# It prepares all the prerequisites needed for the infrastructure deployment.
###############################################################################
set -e # Exit on error
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Project root directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
###############################################################################
# Helper Functions
###############################################################################
print_header() {
echo -e "\n${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}\n"
}
print_success() {
echo -e "${GREEN}✓${NC} $1"
}
print_error() {
echo -e "${RED}✗${NC} $1"
}
print_warning() {
echo -e "${YELLOW}⚠${NC} $1"
}
print_info() {
echo -e "${BLUE}ℹ${NC} $1"
}
prompt_user() {
local prompt="$1"
local default="$2"
local result
if [ -n "$default" ]; then
read -p "$(echo -e ${BLUE}${prompt}${NC} [${default}]: )" result
result="${result:-$default}"
else
read -p "$(echo -e ${BLUE}${prompt}${NC}: )" result
fi
echo "$result"
}
confirm_action() {
local prompt="$1"
local response
read -p "$(echo -e ${YELLOW}${prompt}${NC} [y/N]: )" response
[[ "$response" =~ ^[Yy]$ ]]
}
###############################################################################
# Main Setup Functions
###############################################################################
check_prerequisites() {
print_header "Checking Prerequisites"
# Check if we're in the right directory
if [ ! -f "$PROJECT_ROOT/README.md" ] || [ ! -d "$PROJECT_ROOT/ansible" ]; then
print_error "Not in the correct project directory"
echo "Expected: $PROJECT_ROOT"
exit 1
fi
print_success "Running from correct directory: $PROJECT_ROOT"
# Check if Python 3 is installed
if ! command -v python3 &> /dev/null; then
print_error "Python 3 is not installed. Please install Python 3 first."
exit 1
fi
print_success "Python 3 found: $(python3 --version)"
# Check if git is installed
if ! command -v git &> /dev/null; then
print_warning "Git is not installed. Some features may not work."
else
print_success "Git found: $(git --version | head -n1)"
fi
}
setup_python_venv() {
print_header "Setting Up Python Virtual Environment"
cd "$PROJECT_ROOT"
if [ -d "venv" ]; then
print_info "Virtual environment already exists"
if confirm_action "Recreate virtual environment?"; then
rm -rf venv
python3 -m venv venv
print_success "Virtual environment recreated"
else
print_success "Using existing virtual environment"
fi
else
python3 -m venv venv
print_success "Virtual environment created"
fi
# Activate venv
source venv/bin/activate
print_success "Virtual environment activated"
# Upgrade pip
print_info "Upgrading pip..."
pip install --upgrade pip > /dev/null 2>&1
print_success "pip upgraded"
}
install_python_requirements() {
print_header "Installing Python Requirements"
cd "$PROJECT_ROOT"
if [ ! -f "requirements.txt" ]; then
print_error "requirements.txt not found"
exit 1
fi
print_info "Installing packages from requirements.txt..."
pip install -r requirements.txt
print_success "Python requirements installed"
# Verify Ansible installation
if ! command -v ansible &> /dev/null; then
print_error "Ansible installation failed"
exit 1
fi
print_success "Ansible installed: $(ansible --version | head -n1)"
}
install_ansible_collections() {
print_header "Installing Ansible Galaxy Collections"
cd "$PROJECT_ROOT/ansible"
if [ ! -f "requirements.yml" ]; then
print_warning "requirements.yml not found, skipping Ansible collections"
return
fi
print_info "Installing Ansible Galaxy collections..."
ansible-galaxy collection install -r requirements.yml
print_success "Ansible Galaxy collections installed"
}
setup_inventory_file() {
print_header "Setting Up Inventory File"
cd "$PROJECT_ROOT/ansible"
if [ -f "inventory.ini" ]; then
print_info "inventory.ini already exists"
cat inventory.ini
echo ""
if ! confirm_action "Do you want to update it?"; then
print_success "Using existing inventory.ini"
return
fi
fi
print_info "Let's configure your infrastructure hosts"
echo ""
# Collect information
echo -e -n "${BLUE}SSH key path${NC} [~/.ssh/counterganzua]: "
read ssh_key
ssh_key="${ssh_key:-~/.ssh/counterganzua}"
echo ""
echo "Enter the IP addresses for your infrastructure (VMs will be added later):"
echo ""
echo -e -n "${BLUE}vipy${NC} (main VPS) IP: "
read vipy_ip
echo -e -n "${BLUE}watchtower${NC} (monitoring VPS) IP: "
read watchtower_ip
echo -e -n "${BLUE}spacey${NC} (headscale VPS) IP: "
read spacey_ip
echo -e -n "${BLUE}nodito${NC} (Proxmox server) IP [optional]: "
read nodito_ip
echo ""
echo -e -n "${BLUE}Your username on lapy${NC} [$(whoami)]: "
read lapy_user
lapy_user="${lapy_user:-$(whoami)}"
echo -e -n "${BLUE}GPG recipient email${NC} [optional, for encrypted backups]: "
read gpg_email
echo -e -n "${BLUE}GPG key ID${NC} [optional, for encrypted backups]: "
read gpg_key
# Generate inventory.ini
cat > inventory.ini << EOF
# Ansible Inventory File
# Generated by setup_layer_0.sh
EOF
if [ -n "$vipy_ip" ]; then
cat >> inventory.ini << EOF
[vipy]
$vipy_ip ansible_user=counterweight ansible_port=22 ansible_ssh_private_key_file=$ssh_key
EOF
fi
if [ -n "$watchtower_ip" ]; then
cat >> inventory.ini << EOF
[watchtower]
$watchtower_ip ansible_user=counterweight ansible_port=22 ansible_ssh_private_key_file=$ssh_key
EOF
fi
if [ -n "$spacey_ip" ]; then
cat >> inventory.ini << EOF
[spacey]
$spacey_ip ansible_user=counterweight ansible_port=22 ansible_ssh_private_key_file=$ssh_key
EOF
fi
if [ -n "$nodito_ip" ]; then
cat >> inventory.ini << EOF
[nodito]
$nodito_ip ansible_user=counterweight ansible_port=22 ansible_ssh_private_key_file=$ssh_key
EOF
fi
# Add nodito-vms placeholder for VMs that will be created later
cat >> inventory.ini << EOF
# Nodito VMs - These don't exist yet and will be created on the Proxmox server
# Add them here once you create VMs on nodito (e.g., memos-box, etc.)
[nodito-vms]
# Example:
# 192.168.1.150 ansible_user=counterweight ansible_port=22 ansible_ssh_private_key_file=$ssh_key hostname=memos-box
EOF
# Add lapy (GPG vars, if provided, must live on the same host line for the INI inventory to parse them)
lapy_line="localhost ansible_connection=local ansible_user=$lapy_user"
if [ -n "$gpg_email" ] && [ -n "$gpg_key" ]; then
lapy_line="$lapy_line gpg_recipient=$gpg_email gpg_key_id=$gpg_key"
fi
cat >> inventory.ini << EOF
# Local connection to laptop: this assumes you're running ansible commands from your personal laptop
[lapy]
$lapy_line
EOF
print_success "inventory.ini created"
echo ""
print_info "Review your inventory file:"
cat inventory.ini
echo ""
}
setup_infra_vars() {
print_header "Setting Up Infrastructure Variables"
cd "$PROJECT_ROOT/ansible"
if [ -f "infra_vars.yml" ]; then
print_info "infra_vars.yml already exists"
cat infra_vars.yml
echo ""
if ! confirm_action "Do you want to update it?"; then
print_success "Using existing infra_vars.yml"
return
fi
fi
echo ""
echo -e -n "${BLUE}Your root domain${NC} (e.g., contrapeso.xyz): "
read domain
while [ -z "$domain" ]; do
print_warning "Domain cannot be empty"
echo -e -n "${BLUE}Your root domain${NC}: "
read domain
done
cat > infra_vars.yml << EOF
# Infrastructure Variables
# Generated by setup_layer_0.sh
new_user: counterweight
ssh_port: 22
allow_ssh_from: "any"
root_domain: $domain
EOF
print_success "infra_vars.yml created"
echo ""
print_info "Contents:"
cat infra_vars.yml
echo ""
}
setup_services_config() {
print_header "Setting Up Services Configuration"
cd "$PROJECT_ROOT/ansible"
if [ -f "services_config.yml" ]; then
print_info "services_config.yml already exists"
if ! confirm_action "Do you want to recreate it from template?"; then
print_success "Using existing services_config.yml"
return
fi
fi
if [ ! -f "services_config.yml.example" ]; then
print_error "services_config.yml.example not found"
return
fi
cp services_config.yml.example services_config.yml
print_success "services_config.yml created"
echo ""
print_info "This file centralizes all service subdomains and Caddy settings"
print_info "Customize subdomains in: ansible/services_config.yml"
echo ""
}
setup_infra_secrets() {
print_header "Setting Up Infrastructure Secrets"
cd "$PROJECT_ROOT/ansible"
if [ -f "infra_secrets.yml" ]; then
print_warning "infra_secrets.yml already exists"
if ! confirm_action "Do you want to recreate the template?"; then
print_success "Using existing infra_secrets.yml"
return
fi
fi
cat > infra_secrets.yml << EOF
# Infrastructure Secrets
# Generated by setup_layer_0.sh
#
# IMPORTANT: This file contains sensitive credentials
# It is already in .gitignore - DO NOT commit it to git
#
# You'll need to fill in the Uptime Kuma credentials after Layer 4
# when you deploy Uptime Kuma
# Uptime Kuma Credentials (fill these in after deploying Uptime Kuma in Layer 4)
uptime_kuma_username: ""
uptime_kuma_password: ""
EOF
print_success "infra_secrets.yml template created"
print_warning "You'll need to fill in Uptime Kuma credentials after Layer 4"
echo ""
}
validate_ssh_key() {
print_header "Validating SSH Key"
cd "$PROJECT_ROOT/ansible"
# Extract SSH key path from inventory
if [ -f "inventory.ini" ]; then
ssh_key=$(grep "ansible_ssh_private_key_file" inventory.ini | head -n1 | sed 's/.*ansible_ssh_private_key_file=\([^ ]*\).*/\1/')
# Expand tilde
ssh_key="${ssh_key/#\~/$HOME}"
if [ -f "$ssh_key" ]; then
print_success "SSH key found: $ssh_key"
# Check permissions
perms=$(stat -c "%a" "$ssh_key" 2>/dev/null || stat -f "%OLp" "$ssh_key" 2>/dev/null)
if [ "$perms" != "600" ]; then
print_warning "SSH key permissions are $perms (should be 600)"
if confirm_action "Fix permissions?"; then
chmod 600 "$ssh_key"
print_success "Permissions fixed"
fi
else
print_success "SSH key permissions are correct (600)"
fi
else
print_error "SSH key not found: $ssh_key"
print_warning "Make sure to create your SSH key before proceeding to Layer 1"
echo ""
echo "To generate a new SSH key:"
echo " ssh-keygen -t ed25519 -f $ssh_key -C \"your-email@example.com\""
fi
else
print_warning "inventory.ini not found, skipping SSH key validation"
fi
}
print_summary() {
print_header "Layer 0 Setup Complete! 🎉"
echo "Summary of what was configured:"
echo ""
print_success "Python virtual environment created and activated"
print_success "Ansible and dependencies installed"
print_success "Ansible Galaxy collections installed"
print_success "inventory.ini configured with your hosts"
print_success "infra_vars.yml configured with your domain"
print_success "services_config.yml created with subdomain settings"
print_success "infra_secrets.yml template created"
echo ""
print_info "Before proceeding to Layer 1:"
echo " 1. Ensure your SSH key is added to all VPS root users"
echo " 2. Verify you can SSH into each machine manually"
echo " 3. Configure DNS nameservers for your domain (if not done)"
echo ""
print_info "Note about inventory groups:"
echo " • [nodito-vms] group created as placeholder"
echo " • These VMs will be created later on Proxmox"
echo " • Add their IPs to inventory.ini once created"
echo ""
print_info "To test SSH access to a host:"
echo " ssh -i ~/.ssh/counterganzua root@<host-ip>"
echo ""
print_info "Next steps:"
echo " 1. Review the files in ansible/"
echo " 2. Test SSH connections to your hosts"
echo " 3. Proceed to Layer 1: ./scripts/setup_layer_1.sh"
echo ""
print_warning "Remember to activate the venv before running other commands:"
echo " source venv/bin/activate"
echo ""
}
###############################################################################
# Main Execution
###############################################################################
main() {
clear
print_header "🚀 Layer 0: Foundation Setup"
echo "This script will set up your laptop (lapy) as the Ansible control node."
echo "It will install all prerequisites and configure basic settings."
echo ""
if ! confirm_action "Continue with Layer 0 setup?"; then
echo "Setup cancelled."
exit 0
fi
check_prerequisites
setup_python_venv
install_python_requirements
install_ansible_collections
setup_inventory_file
setup_infra_vars
setup_services_config
setup_infra_secrets
validate_ssh_key
print_summary
}
# Run main function
main "$@"

359
scripts/setup_layer_1a_vps.sh Executable file

@ -0,0 +1,359 @@
#!/bin/bash
###############################################################################
# Layer 1A: VPS Basic Setup
#
# This script configures users, SSH, firewall, and fail2ban on VPS machines.
# Runs independently - can be executed without Nodito setup.
###############################################################################
set -e # Exit on error
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Project root directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
ANSIBLE_DIR="$PROJECT_ROOT/ansible"
###############################################################################
# Helper Functions
###############################################################################
print_header() {
echo -e "\n${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}\n"
}
print_success() {
echo -e "${GREEN}✓${NC} $1"
}
print_error() {
echo -e "${RED}✗${NC} $1"
}
print_warning() {
echo -e "${YELLOW}⚠${NC} $1"
}
print_info() {
echo -e "${BLUE}ℹ${NC} $1"
}
confirm_action() {
local prompt="$1"
local response
read -p "$(echo -e ${YELLOW}${prompt}${NC} [y/N]: )" response
[[ "$response" =~ ^[Yy]$ ]]
}
###############################################################################
# Verification Functions
###############################################################################
check_layer_0_complete() {
print_header "Verifying Layer 0 Prerequisites"
local errors=0
# Check if venv exists
if [ ! -d "$PROJECT_ROOT/venv" ]; then
print_error "Python venv not found. Run Layer 0 first."
((errors++))
else
print_success "Python venv exists"
fi
# Check if we're in a venv
if [ -z "$VIRTUAL_ENV" ]; then
print_error "Virtual environment not activated"
echo "Run: source venv/bin/activate"
((errors++))
else
print_success "Virtual environment activated"
fi
# Check if Ansible is installed
if ! command -v ansible &> /dev/null; then
print_error "Ansible not found"
((errors++))
else
print_success "Ansible found: $(ansible --version | head -n1)"
fi
# Check if inventory.ini exists
if [ ! -f "$ANSIBLE_DIR/inventory.ini" ]; then
print_error "inventory.ini not found"
((errors++))
else
print_success "inventory.ini exists"
fi
# Check if infra_vars.yml exists
if [ ! -f "$ANSIBLE_DIR/infra_vars.yml" ]; then
print_error "infra_vars.yml not found"
((errors++))
else
print_success "infra_vars.yml exists"
fi
if [ $errors -gt 0 ]; then
print_error "Layer 0 is not complete. Please run ./scripts/setup_layer_0.sh first"
exit 1
fi
print_success "Layer 0 prerequisites verified"
}
get_hosts_from_inventory() {
local group="$1"
cd "$ANSIBLE_DIR"
ansible-inventory -i inventory.ini --list | \
python3 -c "import sys, json; data=json.load(sys.stdin); print(' '.join(data.get('$group', {}).get('hosts', [])))" 2>/dev/null || echo ""
}
check_vps_configured() {
print_header "Checking VPS Configuration"
local has_vps=false
for group in vipy watchtower spacey; do
local hosts=$(get_hosts_from_inventory "$group")
if [ -n "$hosts" ]; then
print_success "$group configured: $hosts"
has_vps=true
else
print_info "$group not configured (skipping)"
fi
done
if [ "$has_vps" = false ]; then
print_error "No VPSs configured in inventory.ini"
print_info "Add at least one VPS (vipy, watchtower, or spacey) to proceed"
exit 1
fi
echo ""
}
check_ssh_connectivity() {
print_header "Testing SSH Connectivity as Root"
local ssh_key=$(grep "ansible_ssh_private_key_file" "$ANSIBLE_DIR/inventory.ini" | head -n1 | sed 's/.*ansible_ssh_private_key_file=\([^ ]*\).*/\1/')
ssh_key="${ssh_key/#\~/$HOME}"
print_info "Using SSH key: $ssh_key"
echo ""
local all_good=true
# Test VPSs (vipy, watchtower, spacey)
for group in vipy watchtower spacey; do
local hosts=$(get_hosts_from_inventory "$group")
if [ -n "$hosts" ]; then
for host in $hosts; do
print_info "Testing SSH to $host as root..."
if timeout 10 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes root@$host "echo 'SSH OK'" &>/dev/null; then
print_success "SSH to $host as root: OK"
else
print_error "Cannot SSH to $host as root"
print_warning "Make sure your SSH key is added to root on $host"
all_good=false
fi
done
fi
done
if [ "$all_good" = false ]; then
echo ""
print_error "SSH connectivity test failed"
print_info "To fix this:"
echo " 1. Ensure your VPS provider has added your SSH key to root"
echo " 2. Test manually: ssh -i $ssh_key root@<host>"
echo ""
if ! confirm_action "Continue anyway?"; then
exit 1
fi
fi
echo ""
print_success "SSH connectivity verified"
}
###############################################################################
# VPS Setup Functions
###############################################################################
setup_vps_users_and_access() {
print_header "Setting Up Users and SSH Access on VPSs"
cd "$ANSIBLE_DIR"
print_info "This will:"
echo " • Create the 'counterweight' user with sudo access"
echo " • Configure SSH key authentication"
echo " • Disable root login (optional, configured in playbook)"
echo ""
print_info "Running: ansible-playbook -i inventory.ini infra/01_user_and_access_setup_playbook.yml"
echo ""
if ! confirm_action "Proceed with user and access setup?"; then
print_warning "Skipped user and access setup"
return 1
fi
# Run the playbook with -e 'ansible_user="root"' to use root for this first run
if ansible-playbook -i inventory.ini infra/01_user_and_access_setup_playbook.yml -e 'ansible_user="root"'; then
print_success "User and access setup complete"
return 0
else
print_error "User and access setup failed"
return 1
fi
}
setup_vps_firewall_and_fail2ban() {
print_header "Setting Up Firewall and Fail2ban on VPSs"
cd "$ANSIBLE_DIR"
print_info "This will:"
echo " • Configure UFW firewall with SSH access"
echo " • Install and configure fail2ban for brute force protection"
echo " • Install and configure auditd for security logging"
echo ""
print_info "Running: ansible-playbook -i inventory.ini infra/02_firewall_and_fail2ban_playbook.yml"
echo ""
if ! confirm_action "Proceed with firewall and fail2ban setup?"; then
print_warning "Skipped firewall setup"
return 1
fi
# Now use the default counterweight user
if ansible-playbook -i inventory.ini infra/02_firewall_and_fail2ban_playbook.yml; then
print_success "Firewall and fail2ban setup complete"
return 0
else
print_error "Firewall setup failed"
return 1
fi
}
###############################################################################
# Verification Functions
###############################################################################
verify_layer_1a() {
print_header "Verifying Layer 1A Completion"
cd "$ANSIBLE_DIR"
local ssh_key=$(grep "ansible_ssh_private_key_file" "$ANSIBLE_DIR/inventory.ini" | head -n1 | sed 's/.*ansible_ssh_private_key_file=\([^ ]*\).*/\1/')
ssh_key="${ssh_key/#\~/$HOME}"
# Test SSH as counterweight user
print_info "Testing SSH as counterweight user..."
echo ""
local all_good=true
for group in vipy watchtower spacey; do
local hosts=$(get_hosts_from_inventory "$group")
if [ -n "$hosts" ]; then
for host in $hosts; do
if timeout 10 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$host "echo 'SSH OK'" &>/dev/null; then
print_success "SSH to $host as counterweight: OK"
else
print_error "Cannot SSH to $host as counterweight"
all_good=false
fi
done
fi
done
echo ""
if [ "$all_good" = true ]; then
print_success "All SSH connectivity verified"
else
print_warning "Some SSH tests failed - manual verification recommended"
print_info "Test manually: ssh -i $ssh_key counterweight@<host>"
fi
}
###############################################################################
# Summary Functions
###############################################################################
print_summary() {
print_header "Layer 1A: VPS Setup Complete! 🎉"
echo "Summary of what was configured:"
echo ""
print_success "counterweight user created on all VPSs"
print_success "SSH key authentication configured"
print_success "UFW firewall active and configured"
print_success "fail2ban protecting against brute force attacks"
print_success "auditd logging security events"
echo ""
print_warning "Important Security Changes:"
echo " • Root SSH login is now disabled (by design)"
echo " • Always use 'counterweight' user for SSH access"
echo " • Firewall is active - only SSH allowed by default"
echo ""
print_info "Next steps:"
echo " 1. Test SSH access: ssh -i ~/.ssh/counterganzua counterweight@<host>"
echo " 2. (Optional) Set up Nodito: ./scripts/setup_layer_1b_nodito.sh"
echo " 3. Proceed to Layer 2: ./scripts/setup_layer_2.sh"
echo ""
}
###############################################################################
# Main Execution
###############################################################################
main() {
clear
print_header "🔧 Layer 1A: VPS Basic Setup"
echo "This script will configure users, SSH, firewall, and fail2ban on VPS machines."
echo ""
print_info "Targets: vipy, watchtower, spacey"
echo ""
if ! confirm_action "Continue with Layer 1A setup?"; then
echo "Setup cancelled."
exit 0
fi
check_layer_0_complete
check_vps_configured
check_ssh_connectivity
# VPS Setup
local setup_failed=false
setup_vps_users_and_access || setup_failed=true
setup_vps_firewall_and_fail2ban || setup_failed=true
verify_layer_1a
if [ "$setup_failed" = true ]; then
print_warning "Some steps failed - please review errors above"
fi
print_summary
}
# Run main function
main "$@"

401
scripts/setup_layer_1b_nodito.sh Executable file

@ -0,0 +1,401 @@
#!/bin/bash
###############################################################################
# Layer 1B: Nodito (Proxmox) Setup
#
# This script configures the Nodito Proxmox server.
# Runs independently - can be executed without VPS setup.
###############################################################################
set -e # Exit on error
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Project root directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
ANSIBLE_DIR="$PROJECT_ROOT/ansible"
###############################################################################
# Helper Functions
###############################################################################
print_header() {
echo -e "\n${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}\n"
}
print_success() {
echo -e "${GREEN}✓${NC} $1"
}
print_error() {
echo -e "${RED}✗${NC} $1"
}
print_warning() {
echo -e "${YELLOW}⚠${NC} $1"
}
print_info() {
echo -e "${BLUE}ℹ${NC} $1"
}
confirm_action() {
local prompt="$1"
local response
read -p "$(echo -e ${YELLOW}${prompt}${NC} [y/N]: )" response
[[ "$response" =~ ^[Yy]$ ]]
}
###############################################################################
# Verification Functions
###############################################################################
check_layer_0_complete() {
print_header "Verifying Layer 0 Prerequisites"
local errors=0
# Check if venv exists
if [ ! -d "$PROJECT_ROOT/venv" ]; then
print_error "Python venv not found. Run Layer 0 first."
((errors++))
else
print_success "Python venv exists"
fi
# Check if we're in a venv
if [ -z "$VIRTUAL_ENV" ]; then
print_error "Virtual environment not activated"
echo "Run: source venv/bin/activate"
((errors++))
else
print_success "Virtual environment activated"
fi
# Check if Ansible is installed
if ! command -v ansible &> /dev/null; then
print_error "Ansible not found"
((errors++))
else
print_success "Ansible found: $(ansible --version | head -n1)"
fi
# Check if inventory.ini exists
if [ ! -f "$ANSIBLE_DIR/inventory.ini" ]; then
print_error "inventory.ini not found"
((errors++))
else
print_success "inventory.ini exists"
fi
if [ $errors -gt 0 ]; then
print_error "Layer 0 is not complete. Please run ./scripts/setup_layer_0.sh first"
exit 1
fi
print_success "Layer 0 prerequisites verified"
}
get_hosts_from_inventory() {
local group="$1"
cd "$ANSIBLE_DIR"
ansible-inventory -i inventory.ini --list | \
python3 -c "import sys, json; data=json.load(sys.stdin); print(' '.join(data.get('$group', {}).get('hosts', [])))" 2>/dev/null || echo ""
}
check_nodito_configured() {
print_header "Checking Nodito Configuration"
local nodito_hosts=$(get_hosts_from_inventory "nodito")
if [ -z "$nodito_hosts" ]; then
print_error "No nodito host configured in inventory.ini"
print_info "Add nodito to [nodito] group in inventory.ini to proceed"
exit 1
fi
print_success "Nodito configured: $nodito_hosts"
echo ""
}
###############################################################################
# Nodito Setup Functions
###############################################################################
setup_nodito_bootstrap() {
print_header "Bootstrapping Nodito (Proxmox Server)"
cd "$ANSIBLE_DIR"
print_info "This will:"
echo " • Set up SSH key access for root"
echo " • Create the counterweight user with SSH keys"
echo " • Update and secure the system"
echo " • Disable root login and password authentication"
echo ""
print_info "Running: ansible-playbook -i inventory.ini infra/nodito/30_proxmox_bootstrap_playbook.yml"
print_warning "You will be prompted for the root password"
echo ""
if ! confirm_action "Proceed with nodito bootstrap?"; then
print_warning "Skipped nodito bootstrap"
return 1
fi
# Run with root user and ask for password
if ansible-playbook -i inventory.ini infra/nodito/30_proxmox_bootstrap_playbook.yml -e 'ansible_user=root' --ask-pass; then
print_success "Nodito bootstrap complete"
return 0
else
print_error "Nodito bootstrap failed"
return 1
fi
}
setup_nodito_community_repos() {
print_header "Switching Nodito to Community Repositories"
cd "$ANSIBLE_DIR"
print_info "This will:"
echo " • Remove enterprise repository files"
echo " • Add community repository files"
echo " • Disable subscription nag messages"
echo " • Update Proxmox packages"
echo ""
print_info "Running: ansible-playbook -i inventory.ini infra/nodito/31_proxmox_community_repos_playbook.yml"
echo ""
if ! confirm_action "Proceed with community repos setup?"; then
print_warning "Skipped community repos setup"
return 1
fi
if ansible-playbook -i inventory.ini infra/nodito/31_proxmox_community_repos_playbook.yml; then
print_success "Community repositories configured"
print_warning "Clear browser cache before using Proxmox web UI (Ctrl+Shift+R)"
return 0
else
print_error "Community repos setup failed"
return 1
fi
}
setup_nodito_zfs() {
print_header "Setting Up ZFS Storage Pool on Nodito (Optional)"
cd "$ANSIBLE_DIR"
print_warning "⚠️ ZFS setup will DESTROY ALL DATA on the specified disks!"
echo ""
print_info "Before proceeding, you must:"
echo " 1. SSH into nodito: ssh root@<nodito-ip>"
echo " 2. List disks: ls -la /dev/disk/by-id/ | grep -E '(ata-|scsi-|nvme-)'"
echo " 3. Identify the two disk IDs you want to use for RAID 1"
echo " 4. Edit ansible/infra/nodito/nodito_vars.yml"
echo " 5. Set zfs_disk_1 and zfs_disk_2 to your disk IDs"
echo ""
print_info "Example nodito_vars.yml content:"
echo ' zfs_disk_1: "/dev/disk/by-id/ata-WDC_WD40EFRX-68N32N0_WD-WCC7K1234567"'
echo ' zfs_disk_2: "/dev/disk/by-id/ata-WDC_WD40EFRX-68N32N0_WD-WCC7K7654321"'
echo ""
if [ ! -f "$ANSIBLE_DIR/infra/nodito/nodito_vars.yml" ]; then
print_warning "nodito_vars.yml not found"
if confirm_action "Create nodito_vars.yml template?"; then
cat > "$ANSIBLE_DIR/infra/nodito/nodito_vars.yml" << 'EOF'
# Nodito Variables
# Configure these before running ZFS setup
# ZFS Storage Pool Configuration
# Uncomment and configure these lines after identifying your disk IDs:
# zfs_disk_1: "/dev/disk/by-id/ata-YOUR-DISK-1-ID-HERE"
# zfs_disk_2: "/dev/disk/by-id/ata-YOUR-DISK-2-ID-HERE"
# zfs_pool_name: "proxmox-storage"
# CPU Temperature Monitoring
monitoring_script_dir: /opt/cpu-temp-monitor
monitoring_script_path: "{{ monitoring_script_dir }}/cpu_temp_monitor.sh"
log_file: "{{ monitoring_script_dir }}/cpu_temp_monitor.log"
temp_threshold_celsius: 80
EOF
print_success "Created nodito_vars.yml template"
print_info "Edit this file and configure ZFS disks, then re-run this script"
fi
return 1
fi
# Check if ZFS disks are configured
if ! grep -q "^zfs_disk_1:" "$ANSIBLE_DIR/infra/nodito/nodito_vars.yml" 2>/dev/null; then
print_info "ZFS disks not configured in nodito_vars.yml"
print_info "Edit ansible/infra/nodito/nodito_vars.yml to configure disk IDs"
if ! confirm_action "Skip ZFS setup for now?"; then
print_info "Please configure ZFS disks first"
return 1
fi
print_warning "Skipped ZFS setup"
return 1
fi
print_info "Running: ansible-playbook -i inventory.ini infra/nodito/32_zfs_pool_setup_playbook.yml"
echo ""
if ! confirm_action "⚠️ Proceed with ZFS setup? (THIS WILL DESTROY DATA ON CONFIGURED DISKS)"; then
print_warning "Skipped ZFS setup"
return 1
fi
if ansible-playbook -i inventory.ini infra/nodito/32_zfs_pool_setup_playbook.yml; then
print_success "ZFS storage pool configured"
return 0
else
print_error "ZFS setup failed"
return 1
fi
}
setup_nodito_cloud_template() {
print_header "Creating Debian Cloud Template on Nodito (Optional)"
cd "$ANSIBLE_DIR"
print_info "This will:"
echo " • Download Debian cloud image"
echo " • Create a VM template (ID 9000)"
echo " • Configure cloud-init for easy VM creation"
echo ""
print_info "Running: ansible-playbook -i inventory.ini infra/nodito/33_proxmox_debian_cloud_template.yml"
echo ""
if ! confirm_action "Proceed with cloud template creation?"; then
print_warning "Skipped cloud template creation"
return 1
fi
if ansible-playbook -i inventory.ini infra/nodito/33_proxmox_debian_cloud_template.yml; then
print_success "Debian cloud template created (VM ID 9000)"
return 0
else
print_error "Cloud template creation failed"
return 1
fi
}
###############################################################################
# Verification Functions
###############################################################################
verify_layer_1b() {
print_header "Verifying Layer 1B Completion"
cd "$ANSIBLE_DIR"
local ssh_key=$(grep "ansible_ssh_private_key_file" "$ANSIBLE_DIR/inventory.ini" | head -n1 | sed 's/.*ansible_ssh_private_key_file=\([^ ]*\).*/\1/')
ssh_key="${ssh_key/#\~/$HOME}"
local nodito_hosts=$(get_hosts_from_inventory "nodito")
print_info "Testing SSH as counterweight user..."
echo ""
for host in $nodito_hosts; do
if timeout 10 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$host "echo 'SSH OK'" &>/dev/null; then
print_success "SSH to $host as counterweight: OK"
else
print_error "Cannot SSH to $host as counterweight"
print_info "Test manually: ssh -i $ssh_key counterweight@$host"
fi
done
echo ""
}
###############################################################################
# Summary Functions
###############################################################################
print_summary() {
print_header "Layer 1B: Nodito Setup Complete! 🎉"
echo "Summary of what was configured:"
echo ""
print_success "Nodito bootstrapped with SSH keys"
print_success "counterweight user created"
print_success "Community repositories configured"
print_success "Root login and password auth disabled"
if grep -q "^zfs_disk_1:" "$ANSIBLE_DIR/infra/nodito/nodito_vars.yml" 2>/dev/null; then
print_success "ZFS storage pool configured (if you ran it)"
fi
echo ""
print_warning "Important Security Changes:"
echo " • Root SSH login is now disabled"
echo " • Always use 'counterweight' user for SSH access"
echo " • Password authentication is disabled"
echo ""
print_info "Proxmox Web UI:"
local nodito_hosts=$(get_hosts_from_inventory "nodito")
echo " • Access at: https://$nodito_hosts:8006"
echo " • Clear browser cache (Ctrl+Shift+R) to avoid UI issues"
echo ""
print_info "Next steps:"
echo " 1. Test SSH: ssh -i ~/.ssh/counterganzua counterweight@<nodito-ip>"
echo " 2. Access Proxmox web UI and verify community repos"
echo " 3. Create VMs on Proxmox (if needed)"
echo " 4. Proceed to Layer 2: ./scripts/setup_layer_2.sh"
echo ""
}
###############################################################################
# Main Execution
###############################################################################
main() {
clear
print_header "🖥️ Layer 1B: Nodito (Proxmox) Setup"
echo "This script will configure your Nodito Proxmox server."
echo ""
print_info "Target: nodito (Proxmox server)"
echo ""
if ! confirm_action "Continue with Layer 1B setup?"; then
echo "Setup cancelled."
exit 0
fi
check_layer_0_complete
check_nodito_configured
# Nodito Setup
local setup_failed=false
setup_nodito_bootstrap || setup_failed=true
setup_nodito_community_repos || setup_failed=true
setup_nodito_zfs || setup_failed=true
setup_nodito_cloud_template || setup_failed=true
verify_layer_1b
if [ "$setup_failed" = true ]; then
print_warning "Some optional steps were skipped - this is normal"
fi
print_summary
}
# Run main function
main "$@"

397
scripts/setup_layer_2.sh Executable file
View file

@ -0,0 +1,397 @@
#!/bin/bash
###############################################################################
# Layer 2: General Infrastructure Tools
#
# This script installs rsync and docker on the machines that need them.
# Must be run after Layer 1A (VPS) or Layer 1B (Nodito) is complete.
###############################################################################
set -e # Exit on error
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Project root directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
ANSIBLE_DIR="$PROJECT_ROOT/ansible"
###############################################################################
# Helper Functions
###############################################################################
print_header() {
echo -e "\n${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}\n"
}
print_success() {
echo -e "${GREEN}✓${NC} $1"
}
print_error() {
echo -e "${RED}✗${NC} $1"
}
print_warning() {
echo -e "${YELLOW}⚠${NC} $1"
}
print_info() {
echo -e "${BLUE}ℹ${NC} $1"
}
confirm_action() {
local prompt="$1"
local response
read -p "$(echo -e ${YELLOW}${prompt}${NC} [y/N]: )" response
[[ "$response" =~ ^[Yy]$ ]]
}
###############################################################################
# Verification Functions
###############################################################################
check_layer_0_complete() {
print_header "Verifying Layer 0 Prerequisites"
local errors=0
if [ -z "$VIRTUAL_ENV" ]; then
print_error "Virtual environment not activated"
echo "Run: source venv/bin/activate"
errors=$((errors + 1))
else
print_success "Virtual environment activated"
fi
if ! command -v ansible &> /dev/null; then
print_error "Ansible not found"
errors=$((errors + 1))
else
print_success "Ansible found"
fi
if [ ! -f "$ANSIBLE_DIR/inventory.ini" ]; then
print_error "inventory.ini not found"
errors=$((errors + 1))
else
print_success "inventory.ini exists"
fi
if [ $errors -gt 0 ]; then
print_error "Layer 0 is not complete"
exit 1
fi
print_success "Layer 0 prerequisites verified"
}
get_hosts_from_inventory() {
local group="$1"
cd "$ANSIBLE_DIR"
ansible-inventory -i inventory.ini --list | \
python3 -c "import sys, json; data=json.load(sys.stdin); print(' '.join(data.get('$group', {}).get('hosts', [])))" 2>/dev/null || echo ""
}
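# Example (illustrative): if inventory.ini has a [vipy] group with a single host
# 203.0.113.10, then "get_hosts_from_inventory vipy" prints "203.0.113.10".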
check_ssh_connectivity() {
print_header "Testing SSH Connectivity"
local ssh_key=$(grep "ansible_ssh_private_key_file" "$ANSIBLE_DIR/inventory.ini" | head -n1 | sed 's/.*ansible_ssh_private_key_file=\([^ ]*\).*/\1/')
ssh_key="${ssh_key/#\~/$HOME}"
local all_good=true
for group in vipy watchtower spacey nodito; do
local hosts=$(get_hosts_from_inventory "$group")
if [ -n "$hosts" ]; then
for host in $hosts; do
print_info "Testing SSH to $host as counterweight..."
if timeout 10 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$host "echo 'SSH OK'" &>/dev/null; then
print_success "SSH to $host: OK"
else
print_error "Cannot SSH to $host as counterweight"
print_warning "Make sure Layer 1A or 1B is complete for this host"
all_good=false
fi
done
fi
done
if [ "$all_good" = false ]; then
echo ""
print_error "SSH connectivity test failed"
print_info "Ensure Layer 1A (VPS) or Layer 1B (Nodito) is complete"
echo ""
if ! confirm_action "Continue anyway?"; then
exit 1
fi
fi
echo ""
print_success "SSH connectivity verified"
}
###############################################################################
# rsync Installation
###############################################################################
install_rsync() {
print_header "Installing rsync"
cd "$ANSIBLE_DIR"
print_info "rsync is needed for backup operations"
print_info "Recommended hosts: vipy, watchtower, lapy"
echo ""
# Show available hosts
echo "Available hosts in inventory:"
for group in vipy watchtower spacey nodito lapy; do
local hosts=$(get_hosts_from_inventory "$group")
if [ -n "$hosts" ]; then
echo " [$group]: $hosts"
fi
done
echo ""
print_info "Installation options:"
echo " 1. Install on recommended hosts (vipy, watchtower, lapy)"
echo " 2. Install on all hosts"
echo " 3. Custom selection (specify groups)"
echo " 4. Skip rsync installation"
echo ""
echo -e -n "${BLUE}Choose option${NC} [1-4]: "
read option
local limit_hosts=""
case "$option" in
1)
limit_hosts="vipy,watchtower,lapy"
print_info "Installing rsync on: vipy, watchtower, lapy"
;;
2)
limit_hosts="all"
print_info "Installing rsync on: all hosts"
;;
3)
echo -e -n "${BLUE}Enter groups (comma-separated, e.g., vipy,watchtower,nodito)${NC}: "
read limit_hosts
print_info "Installing rsync on: $limit_hosts"
;;
4)
print_warning "Skipping rsync installation"
return 1
;;
*)
print_error "Invalid option"
return 1
;;
esac
echo ""
if ! confirm_action "Proceed with rsync installation?"; then
print_warning "Skipped rsync installation"
return 1
fi
print_info "Running: ansible-playbook -i inventory.ini infra/900_install_rsync.yml --limit $limit_hosts"
echo ""
if ansible-playbook -i inventory.ini infra/900_install_rsync.yml --limit "$limit_hosts"; then
print_success "rsync installation complete"
return 0
else
print_error "rsync installation failed"
return 1
fi
}
###############################################################################
# Docker Installation
###############################################################################
install_docker() {
print_header "Installing Docker and Docker Compose"
cd "$ANSIBLE_DIR"
print_info "Docker is needed for containerized services"
print_info "Recommended hosts: vipy, watchtower"
echo ""
# Show available hosts (exclude lapy - docker on laptop is optional)
echo "Available hosts in inventory:"
for group in vipy watchtower spacey nodito; do
local hosts=$(get_hosts_from_inventory "$group")
if [ -n "$hosts" ]; then
echo " [$group]: $hosts"
fi
done
echo ""
print_info "Installation options:"
echo " 1. Install on recommended hosts (vipy, watchtower)"
echo " 2. Install on all hosts"
echo " 3. Custom selection (specify groups)"
echo " 4. Skip docker installation"
echo ""
echo -e -n "${BLUE}Choose option${NC} [1-4]: "
read option
local limit_hosts=""
case "$option" in
1)
limit_hosts="vipy,watchtower"
print_info "Installing Docker on: vipy, watchtower"
;;
2)
limit_hosts="all"
print_info "Installing Docker on: all hosts"
;;
3)
echo -e -n "${BLUE}Enter groups (comma-separated, e.g., vipy,watchtower,nodito)${NC}: "
read limit_hosts
print_info "Installing Docker on: $limit_hosts"
;;
4)
print_warning "Skipping Docker installation"
return 1
;;
*)
print_error "Invalid option"
return 1
;;
esac
echo ""
if ! confirm_action "Proceed with Docker installation?"; then
print_warning "Skipped Docker installation"
return 1
fi
print_info "Running: ansible-playbook -i inventory.ini infra/910_docker_playbook.yml --limit $limit_hosts"
echo ""
if ansible-playbook -i inventory.ini infra/910_docker_playbook.yml --limit "$limit_hosts"; then
print_success "Docker installation complete"
print_warning "You may need to log out and back in for docker group to take effect"
return 0
else
print_error "Docker installation failed"
return 1
fi
}
###############################################################################
# Verification Functions
###############################################################################
verify_installations() {
print_header "Verifying Installations"
cd "$ANSIBLE_DIR"
local ssh_key=$(grep "ansible_ssh_private_key_file" "$ANSIBLE_DIR/inventory.ini" | head -n1 | sed 's/.*ansible_ssh_private_key_file=\([^ ]*\).*/\1/')
ssh_key="${ssh_key/#\~/$HOME}"
echo "Checking installed tools on hosts..."
echo ""
# Check all remote hosts
for group in vipy watchtower spacey nodito; do
local hosts=$(get_hosts_from_inventory "$group")
if [ -n "$hosts" ]; then
for host in $hosts; do
print_info "Checking $host..."
# Check rsync
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$host "command -v rsync" &>/dev/null; then
print_success "$host: rsync installed"
else
print_warning "$host: rsync not found (may not be needed)"
fi
# Check docker
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$host "command -v docker" &>/dev/null; then
print_success "$host: docker installed"
# Check docker service
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$host "sudo systemctl is-active docker" &>/dev/null; then
print_success "$host: docker service running"
else
print_warning "$host: docker service not running"
fi
else
print_warning "$host: docker not found (may not be needed)"
fi
echo ""
done
fi
done
}
###############################################################################
# Summary Functions
###############################################################################
print_summary() {
print_header "Layer 2 Setup Complete! 🎉"
echo "Summary:"
echo ""
print_success "Infrastructure tools installed on specified hosts"
echo ""
print_info "What was installed:"
echo " • rsync - for backup operations"
echo " • docker + docker compose - for containerized services"
echo ""
print_info "Next steps:"
echo " 1. Proceed to Layer 3: ./scripts/setup_layer_3_caddy.sh"
echo ""
}
###############################################################################
# Main Execution
###############################################################################
main() {
clear
print_header "🔧 Layer 2: General Infrastructure Tools"
echo "This script will install rsync and docker on your infrastructure."
echo ""
if ! confirm_action "Continue with Layer 2 setup?"; then
echo "Setup cancelled."
exit 0
fi
check_layer_0_complete
check_ssh_connectivity
# Install tools
install_rsync
echo ""
install_docker
verify_installations
print_summary
}
# Run main function
main "$@"

345
scripts/setup_layer_3_caddy.sh Executable file
View file

@ -0,0 +1,345 @@
#!/bin/bash
###############################################################################
# Layer 3: Reverse Proxy (Caddy)
#
# This script deploys Caddy reverse proxy on VPS machines.
# Must be run after Layer 1A (VPS setup) is complete.
###############################################################################
set -e # Exit on error
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Project root directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
ANSIBLE_DIR="$PROJECT_ROOT/ansible"
###############################################################################
# Helper Functions
###############################################################################
print_header() {
echo -e "\n${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}\n"
}
print_success() {
echo -e "${GREEN}✓${NC} $1"
}
print_error() {
echo -e "${RED}✗${NC} $1"
}
print_warning() {
echo -e "${YELLOW}⚠${NC} $1"
}
print_info() {
echo -e "${BLUE}ℹ${NC} $1"
}
confirm_action() {
local prompt="$1"
local response
read -p "$(echo -e ${YELLOW}${prompt}${NC} [y/N]: )" response
[[ "$response" =~ ^[Yy]$ ]]
}
###############################################################################
# Verification Functions
###############################################################################
check_layer_0_complete() {
print_header "Verifying Layer 0 Prerequisites"
local errors=0
if [ -z "$VIRTUAL_ENV" ]; then
print_error "Virtual environment not activated"
echo "Run: source venv/bin/activate"
errors=$((errors + 1))
else
print_success "Virtual environment activated"
fi
if ! command -v ansible &> /dev/null; then
print_error "Ansible not found"
errors=$((errors + 1))
else
print_success "Ansible found"
fi
if [ ! -f "$ANSIBLE_DIR/inventory.ini" ]; then
print_error "inventory.ini not found"
errors=$((errors + 1))
else
print_success "inventory.ini exists"
fi
if [ $errors -gt 0 ]; then
print_error "Layer 0 is not complete"
exit 1
fi
print_success "Layer 0 prerequisites verified"
}
get_hosts_from_inventory() {
local group="$1"
cd "$ANSIBLE_DIR"
ansible-inventory -i inventory.ini --list | \
python3 -c "import sys, json; data=json.load(sys.stdin); print(' '.join(data.get('$group', {}).get('hosts', [])))" 2>/dev/null || echo ""
}
check_target_hosts() {
print_header "Checking Target Hosts"
local has_hosts=false
print_info "Caddy will be deployed to these hosts:"
echo ""
for group in vipy watchtower spacey; do
local hosts=$(get_hosts_from_inventory "$group")
if [ -n "$hosts" ]; then
echo " [$group]: $hosts"
has_hosts=true
else
print_warning "[$group]: not configured (skipping)"
fi
done
echo ""
if [ "$has_hosts" = false ]; then
print_error "No target hosts configured for Caddy"
print_info "Caddy needs vipy, watchtower, or spacey in inventory.ini"
exit 1
fi
print_success "Target hosts verified"
}
check_ssh_connectivity() {
print_header "Testing SSH Connectivity"
local ssh_key=$(grep "ansible_ssh_private_key_file" "$ANSIBLE_DIR/inventory.ini" | head -n1 | sed 's/.*ansible_ssh_private_key_file=\([^ ]*\).*/\1/')
ssh_key="${ssh_key/#\~/$HOME}"
local all_good=true
for group in vipy watchtower spacey; do
local hosts=$(get_hosts_from_inventory "$group")
if [ -n "$hosts" ]; then
for host in $hosts; do
print_info "Testing SSH to $host as counterweight..."
if timeout 10 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$host "echo 'SSH OK'" &>/dev/null; then
print_success "SSH to $host: OK"
else
print_error "Cannot SSH to $host as counterweight"
print_warning "Make sure Layer 1A is complete for this host"
all_good=false
fi
done
fi
done
if [ "$all_good" = false ]; then
echo ""
print_error "SSH connectivity test failed"
print_info "Ensure Layer 1A (VPS setup) is complete"
echo ""
if ! confirm_action "Continue anyway?"; then
exit 1
fi
fi
echo ""
print_success "SSH connectivity verified"
}
###############################################################################
# Caddy Deployment
###############################################################################
deploy_caddy() {
print_header "Deploying Caddy"
cd "$ANSIBLE_DIR"
print_info "This will:"
echo " • Install Caddy from official repositories"
echo " • Configure Caddy service"
echo " • Open firewall ports 80/443"
echo " • Create sites-enabled directory structure"
echo " • Enable automatic HTTPS with Let's Encrypt"
echo ""
print_info "Target hosts: vipy, watchtower, spacey (if configured)"
echo ""
print_warning "Important:"
echo " • Caddy will start with empty configuration"
echo " • Services will add their own config files in later layers"
echo " • Ports 80/443 must be available on the VPSs"
echo ""
if ! confirm_action "Proceed with Caddy deployment?"; then
print_warning "Skipped Caddy deployment"
return 1
fi
print_info "Running: ansible-playbook -i inventory.ini services/caddy_playbook.yml"
echo ""
if ansible-playbook -i inventory.ini services/caddy_playbook.yml; then
print_success "Caddy deployment complete"
return 0
else
print_error "Caddy deployment failed"
return 1
fi
}
###############################################################################
# Verification Functions
###############################################################################
verify_caddy() {
print_header "Verifying Caddy Installation"
cd "$ANSIBLE_DIR"
local ssh_key=$(grep "ansible_ssh_private_key_file" "$ANSIBLE_DIR/inventory.ini" | head -n1 | sed 's/.*ansible_ssh_private_key_file=\([^ ]*\).*/\1/')
ssh_key="${ssh_key/#\~/$HOME}"
echo "Checking Caddy on each host..."
echo ""
for group in vipy watchtower spacey; do
local hosts=$(get_hosts_from_inventory "$group")
if [ -n "$hosts" ]; then
for host in $hosts; do
print_info "Checking $host..."
# Check if caddy is installed
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$host "command -v caddy" &>/dev/null; then
print_success "$host: Caddy installed"
else
print_error "$host: Caddy not found"
continue
fi
# Check if caddy service is running
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$host "sudo systemctl is-active caddy" &>/dev/null; then
print_success "$host: Caddy service running"
else
print_error "$host: Caddy service not running"
fi
# Check if sites-enabled directory exists
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$host "test -d /etc/caddy/sites-enabled" &>/dev/null; then
print_success "$host: sites-enabled directory exists"
else
print_warning "$host: sites-enabled directory not found"
fi
# Check if ports 80/443 are open
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$host "sudo ufw status | grep -E '80|443'" &>/dev/null; then
print_success "$host: Firewall ports 80/443 open"
else
print_warning "$host: Could not verify firewall ports"
fi
echo ""
done
fi
done
}
###############################################################################
# Summary Functions
###############################################################################
print_summary() {
print_header "Layer 3 Setup Complete! 🎉"
echo "Summary of what was configured:"
echo ""
print_success "Caddy installed on VPS hosts"
print_success "Caddy service running"
print_success "Firewall ports 80/443 opened"
print_success "Sites-enabled directory structure created"
echo ""
print_info "What Caddy provides:"
echo " • Automatic HTTPS with Let's Encrypt"
echo " • Reverse proxy for all web services"
echo " • HTTP/2 support"
echo " • Simple per-service configuration"
echo ""
print_info "How services use Caddy:"
echo " • Each service adds a config file to /etc/caddy/sites-enabled/"
echo " • Main Caddyfile imports all configs"
echo " • Caddy automatically manages SSL certificates"
echo ""
print_warning "Important Notes:"
echo " • Caddy is currently running with default/empty config"
echo " • Services deployed in later layers will add their configs"
echo " • DNS must point to your VPS IPs for SSL to work"
echo ""
print_info "Next steps:"
echo " 1. Verify Caddy is accessible (optional): curl http://<vps-ip>"
echo " 2. Proceed to Layer 4: ./scripts/setup_layer_4_monitoring.sh"
echo ""
}
###############################################################################
# Main Execution
###############################################################################
main() {
clear
print_header "🌐 Layer 3: Reverse Proxy (Caddy)"
echo "This script will deploy Caddy reverse proxy on your VPS machines."
echo ""
print_info "Targets: vipy, watchtower, spacey"
echo ""
if ! confirm_action "Continue with Layer 3 setup?"; then
echo "Setup cancelled."
exit 0
fi
check_layer_0_complete
check_target_hosts
check_ssh_connectivity
# Deploy Caddy
if deploy_caddy; then
verify_caddy
print_summary
else
print_error "Caddy deployment failed"
exit 1
fi
}
# Run main function
main "$@"

768
scripts/setup_layer_4_monitoring.sh Executable file
View file

@ -0,0 +1,768 @@
#!/bin/bash
###############################################################################
# Layer 4: Core Monitoring & Notifications
#
# This script deploys ntfy and Uptime Kuma on watchtower.
# Must be run after Layers 1A, 2, and 3 are complete.
###############################################################################
set -e # Exit on error
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Project root directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
ANSIBLE_DIR="$PROJECT_ROOT/ansible"
###############################################################################
# Helper Functions
###############################################################################
print_header() {
echo -e "\n${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}\n"
}
print_success() {
echo -e "${GREEN}✓${NC} $1"
}
print_error() {
echo -e "${RED}✗${NC} $1"
}
print_warning() {
echo -e "${YELLOW}⚠${NC} $1"
}
print_info() {
echo -e "${BLUE}ℹ${NC} $1"
}
confirm_action() {
local prompt="$1"
local response
read -p "$(echo -e ${YELLOW}${prompt}${NC} [y/N]: )" response
[[ "$response" =~ ^[Yy]$ ]]
}
###############################################################################
# Verification Functions
###############################################################################
check_prerequisites() {
print_header "Verifying Prerequisites"
local errors=0
if [ -z "$VIRTUAL_ENV" ]; then
print_error "Virtual environment not activated"
echo "Run: source venv/bin/activate"
errors=$((errors + 1))
else
print_success "Virtual environment activated"
fi
if ! command -v ansible &> /dev/null; then
print_error "Ansible not found"
errors=$((errors + 1))
else
print_success "Ansible found"
fi
if [ ! -f "$ANSIBLE_DIR/inventory.ini" ]; then
print_error "inventory.ini not found"
errors=$((errors + 1))
else
print_success "inventory.ini exists"
fi
# Check if watchtower is configured
if ! grep -q "^\[watchtower\]" "$ANSIBLE_DIR/inventory.ini"; then
print_error "watchtower not configured in inventory.ini"
print_info "Layer 4 requires watchtower VPS"
errors=$((errors + 1))
else
print_success "watchtower configured in inventory"
fi
if [ $errors -gt 0 ]; then
print_error "Prerequisites not met"
exit 1
fi
print_success "Prerequisites verified"
}
check_vars_files() {
print_header "Checking Configuration Files"
# Check services_config.yml
if [ ! -f "$ANSIBLE_DIR/services_config.yml" ]; then
print_error "services_config.yml not found"
print_info "This file should have been created in Layer 0"
exit 1
fi
print_success "services_config.yml exists"
# Show configured subdomains
local ntfy_sub=$(grep "^ ntfy:" "$ANSIBLE_DIR/services_config.yml" | awk '{print $2}' 2>/dev/null || echo "ntfy")
local uptime_sub=$(grep "^ uptime_kuma:" "$ANSIBLE_DIR/services_config.yml" | awk '{print $2}' 2>/dev/null || echo "uptime")
print_info "Configured subdomains:"
echo " • ntfy: $ntfy_sub"
echo " • uptime_kuma: $uptime_sub"
echo ""
}
check_dns_configuration() {
print_header "Validating DNS Configuration"
cd "$ANSIBLE_DIR"
# Get watchtower IP
local watchtower_ip=$(ansible-inventory -i inventory.ini --list | python3 -c "import sys, json; data=json.load(sys.stdin); hosts=data.get('watchtower', {}).get('hosts', []); print(hosts[0] if hosts else '')" 2>/dev/null)
if [ -z "$watchtower_ip" ]; then
print_error "Could not determine watchtower IP from inventory"
return 1
fi
print_info "Watchtower IP: $watchtower_ip"
echo ""
# Get domain from infra_vars.yml
local root_domain=$(grep "^root_domain:" "$ANSIBLE_DIR/infra_vars.yml" | awk '{print $2}' 2>/dev/null)
if [ -z "$root_domain" ]; then
print_error "Could not determine root_domain from infra_vars.yml"
return 1
fi
# Get subdomains from centralized config
local ntfy_subdomain="ntfy"
local uptime_subdomain="uptime"
if [ -f "$ANSIBLE_DIR/services_config.yml" ]; then
ntfy_subdomain=$(grep "^ ntfy:" "$ANSIBLE_DIR/services_config.yml" | awk '{print $2}' 2>/dev/null || echo "ntfy")
uptime_subdomain=$(grep "^ uptime_kuma:" "$ANSIBLE_DIR/services_config.yml" | awk '{print $2}' 2>/dev/null || echo "uptime")
fi
local ntfy_fqdn="${ntfy_subdomain}.${root_domain}"
local uptime_fqdn="${uptime_subdomain}.${root_domain}"
print_info "Checking DNS records..."
echo ""
local dns_ok=true
# Check ntfy DNS
print_info "Checking $ntfy_fqdn..."
if command -v dig &> /dev/null; then
local ntfy_resolved=$(dig +short "$ntfy_fqdn" | head -n1)
if [ "$ntfy_resolved" = "$watchtower_ip" ]; then
print_success "$ntfy_fqdn$ntfy_resolved"
elif [ -n "$ntfy_resolved" ]; then
print_error "$ntfy_fqdn$ntfy_resolved (expected $watchtower_ip)"
dns_ok=false
else
print_error "$ntfy_fqdn does not resolve"
dns_ok=false
fi
else
print_warning "dig command not found, skipping DNS validation"
print_info "Install dnsutils/bind-tools to enable DNS validation"
return 1
fi
# Check Uptime Kuma DNS
print_info "Checking $uptime_fqdn..."
if command -v dig &> /dev/null; then
local uptime_resolved=$(dig +short "$uptime_fqdn" | head -n1)
if [ "$uptime_resolved" = "$watchtower_ip" ]; then
print_success "$uptime_fqdn$uptime_resolved"
elif [ -n "$uptime_resolved" ]; then
print_error "$uptime_fqdn$uptime_resolved (expected $watchtower_ip)"
dns_ok=false
else
print_error "$uptime_fqdn does not resolve"
dns_ok=false
fi
fi
echo ""
if [ "$dns_ok" = false ]; then
print_error "DNS validation failed"
print_info "Please configure DNS records:"
echo "$ntfy_fqdn$watchtower_ip"
echo "$uptime_fqdn$watchtower_ip"
echo ""
print_warning "DNS changes can take time to propagate (up to 24-48 hours)"
echo ""
if ! confirm_action "Continue anyway? (SSL certificates will fail without proper DNS)"; then
exit 1
fi
else
print_success "DNS validation passed"
fi
}
###############################################################################
# ntfy Deployment
###############################################################################
deploy_ntfy() {
print_header "Deploying ntfy (Notification Service)"
cd "$ANSIBLE_DIR"
print_info "ntfy requires admin credentials for authentication"
echo ""
# Check if env vars are set
if [ -z "$NTFY_USER" ] || [ -z "$NTFY_PASSWORD" ]; then
print_warning "NTFY_USER and NTFY_PASSWORD environment variables not set"
echo ""
print_info "Please enter credentials for ntfy admin user:"
echo ""
echo -e -n "${BLUE}ntfy admin username${NC} [admin]: "
read ntfy_user
ntfy_user="${ntfy_user:-admin}"
echo -e -n "${BLUE}ntfy admin password${NC}: "
read -s ntfy_password
echo ""
if [ -z "$ntfy_password" ]; then
print_error "Password cannot be empty"
return 1
fi
export NTFY_USER="$ntfy_user"
export NTFY_PASSWORD="$ntfy_password"
else
print_success "Using NTFY_USER and NTFY_PASSWORD from environment"
fi
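# To run this step non-interactively, export the credentials before launching the
# script (illustrative values):
#   export NTFY_USER=admin
#   export NTFY_PASSWORD='change-me'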
echo ""
print_info "This will:"
echo " • Install ntfy from official repositories"
echo " • Configure ntfy with authentication (deny-all by default)"
echo " • Create admin user: $NTFY_USER"
echo " • Set up Caddy reverse proxy"
echo ""
if ! confirm_action "Proceed with ntfy deployment?"; then
print_warning "Skipped ntfy deployment"
return 1
fi
print_info "Running: ansible-playbook -i inventory.ini services/ntfy/deploy_ntfy_playbook.yml"
echo ""
if ansible-playbook -i inventory.ini services/ntfy/deploy_ntfy_playbook.yml; then
print_success "ntfy deployment complete"
echo ""
print_info "ntfy is now available at your configured subdomain"
print_info "Admin user: $NTFY_USER"
return 0
else
print_error "ntfy deployment failed"
return 1
fi
}
###############################################################################
# Uptime Kuma Deployment
###############################################################################
deploy_uptime_kuma() {
print_header "Deploying Uptime Kuma (Monitoring Platform)"
cd "$ANSIBLE_DIR"
print_info "This will:"
echo " • Deploy Uptime Kuma via Docker"
echo " • Configure Caddy reverse proxy"
echo " • Set up data persistence"
echo ""
if ! confirm_action "Proceed with Uptime Kuma deployment?"; then
print_warning "Skipped Uptime Kuma deployment"
return 1
fi
print_info "Running: ansible-playbook -i inventory.ini services/uptime_kuma/deploy_uptime_kuma_playbook.yml"
echo ""
if ansible-playbook -i inventory.ini services/uptime_kuma/deploy_uptime_kuma_playbook.yml; then
print_success "Uptime Kuma deployment complete"
echo ""
print_warning "IMPORTANT: First-time setup required"
echo " 1. Access Uptime Kuma at your configured subdomain"
echo " 2. Create admin user on first visit"
echo " 3. Update ansible/infra_secrets.yml with credentials"
return 0
else
print_error "Uptime Kuma deployment failed"
return 1
fi
}
###############################################################################
# Backup Configuration
###############################################################################
setup_uptime_kuma_backup() {
print_header "Setting Up Uptime Kuma Backup (Optional)"
cd "$ANSIBLE_DIR"
print_info "This will set up automated backups to lapy"
echo ""
if ! confirm_action "Set up Uptime Kuma backup to lapy?"; then
print_warning "Skipped backup setup"
return 0
fi
# Check if rsync is available
print_info "Verifying rsync is installed on watchtower and lapy..."
if ! ansible watchtower -i inventory.ini -m shell -a "command -v rsync" &>/dev/null; then
print_error "rsync not found on watchtower"
print_info "Run Layer 2 to install rsync"
print_warning "Backup setup skipped - rsync not available"
return 0
fi
print_info "Running: ansible-playbook -i inventory.ini services/uptime_kuma/setup_backup_uptime_kuma_to_lapy.yml"
echo ""
if ansible-playbook -i inventory.ini services/uptime_kuma/setup_backup_uptime_kuma_to_lapy.yml; then
print_success "Uptime Kuma backup configured"
print_info "Backups will run periodically via cron"
return 0
else
print_error "Backup setup failed"
return 1
fi
}
###############################################################################
# Post-Deployment Configuration
###############################################################################
setup_ntfy_notification() {
print_header "Setting Up ntfy Notification in Uptime Kuma (Optional)"
cd "$ANSIBLE_DIR"
print_info "This will automatically configure ntfy as a notification method in Uptime Kuma"
print_warning "Prerequisites:"
echo " • Uptime Kuma admin account must be created first"
echo " • infra_secrets.yml must have Uptime Kuma credentials"
echo ""
if ! confirm_action "Set up ntfy notification in Uptime Kuma?"; then
print_warning "Skipped ntfy notification setup"
print_info "You can set this up manually or run this script again later"
return 0
fi
# Check if infra_secrets.yml has Uptime Kuma credentials
if ! grep -q "uptime_kuma_username:" "$ANSIBLE_DIR/infra_secrets.yml" 2>/dev/null || \
! grep -q "uptime_kuma_password:" "$ANSIBLE_DIR/infra_secrets.yml" 2>/dev/null; then
print_error "Uptime Kuma credentials not found in infra_secrets.yml"
print_info "Please complete Step 1 and 2 of post-deployment steps first:"
echo " 1. Create admin user in Uptime Kuma web UI"
echo " 2. Add credentials to ansible/infra_secrets.yml"
print_warning "Skipped - you can run this script again after completing those steps"
return 0
fi
# Check credentials are not empty
local uk_user=$(grep "^uptime_kuma_username:" "$ANSIBLE_DIR/infra_secrets.yml" | awk '{print $2}' | tr -d '"' | tr -d "'")
local uk_pass=$(grep "^uptime_kuma_password:" "$ANSIBLE_DIR/infra_secrets.yml" | awk '{print $2}' | tr -d '"' | tr -d "'")
if [ -z "$uk_user" ] || [ -z "$uk_pass" ]; then
print_error "Uptime Kuma credentials are empty in infra_secrets.yml"
print_info "Please update ansible/infra_secrets.yml with your credentials"
return 0
fi
print_success "Found Uptime Kuma credentials in infra_secrets.yml"
print_info "Running playbook to configure ntfy notification..."
echo ""
if ansible-playbook -i inventory.ini services/ntfy/setup_ntfy_uptime_kuma_notification.yml; then
print_success "ntfy notification configured in Uptime Kuma"
print_info "You can now use ntfy for all your monitors!"
return 0
else
print_error "Failed to configure ntfy notification"
print_info "You can set this up manually or run the playbook again later:"
echo " ansible-playbook -i inventory.ini services/ntfy/setup_ntfy_uptime_kuma_notification.yml"
return 0
fi
}
###############################################################################
# Verification Functions
###############################################################################
verify_deployments() {
print_header "Verifying Deployments"
cd "$ANSIBLE_DIR"
local ssh_key=$(grep "ansible_ssh_private_key_file" "$ANSIBLE_DIR/inventory.ini" | head -n1 | sed 's/.*ansible_ssh_private_key_file=\([^ ]*\).*/\1/')
ssh_key="${ssh_key/#\~/$HOME}"
local watchtower_host=$(ansible-inventory -i inventory.ini --list | python3 -c "import sys, json; data=json.load(sys.stdin); print(' '.join(data.get('watchtower', {}).get('hosts', [])))" 2>/dev/null)
if [ -z "$watchtower_host" ]; then
print_error "Could not determine watchtower host"
return
fi
print_info "Checking services on watchtower ($watchtower_host)..."
echo ""
# Check ntfy
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$watchtower_host "systemctl is-active ntfy" &>/dev/null; then
print_success "ntfy service running"
else
print_warning "ntfy service not running or not installed"
fi
# Check Uptime Kuma docker container
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$watchtower_host "docker ps | grep uptime-kuma" &>/dev/null; then
print_success "Uptime Kuma container running"
else
print_warning "Uptime Kuma container not running"
fi
# Check Caddy configs
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$watchtower_host "test -f /etc/caddy/sites-enabled/ntfy.conf" &>/dev/null; then
print_success "ntfy Caddy config exists"
else
print_warning "ntfy Caddy config not found"
fi
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$watchtower_host "test -f /etc/caddy/sites-enabled/uptime-kuma.conf" &>/dev/null; then
print_success "Uptime Kuma Caddy config exists"
else
print_warning "Uptime Kuma Caddy config not found"
fi
echo ""
}
verify_final_setup() {
print_header "Final Verification - Post-Deployment Steps"
cd "$ANSIBLE_DIR"
print_info "Checking if all post-deployment steps were completed..."
echo ""
local all_ok=true
# Check 1: infra_secrets.yml has Uptime Kuma credentials
print_info "Checking infra_secrets.yml..."
if grep -q "^uptime_kuma_username:" "$ANSIBLE_DIR/infra_secrets.yml" 2>/dev/null && \
grep -q "^uptime_kuma_password:" "$ANSIBLE_DIR/infra_secrets.yml" 2>/dev/null; then
local uk_user=$(grep "^uptime_kuma_username:" "$ANSIBLE_DIR/infra_secrets.yml" | awk '{print $2}' | tr -d '"' | tr -d "'")
local uk_pass=$(grep "^uptime_kuma_password:" "$ANSIBLE_DIR/infra_secrets.yml" | awk '{print $2}' | tr -d '"' | tr -d "'")
if [ -n "$uk_user" ] && [ -n "$uk_pass" ] && [ "$uk_user" != '""' ] && [ "$uk_pass" != '""' ]; then
print_success "Uptime Kuma credentials configured in infra_secrets.yml"
else
print_error "Uptime Kuma credentials are empty in infra_secrets.yml"
print_info "Please complete Step 2: Update infra_secrets.yml"
all_ok=false
fi
else
print_error "Uptime Kuma credentials not found in infra_secrets.yml"
print_info "Please complete Step 2: Update infra_secrets.yml"
all_ok=false
fi
echo ""
# Check 2: Can connect to Uptime Kuma API
print_info "Checking Uptime Kuma API access..."
if [ -n "$uk_user" ] && [ -n "$uk_pass" ]; then
# Create a test Python script to check API access
local test_script=$(mktemp)
cat > "$test_script" << 'EOFPYTHON'
import sys
import yaml
from uptime_kuma_api import UptimeKumaApi
try:
# Load config
with open('infra_vars.yml', 'r') as f:
infra_vars = yaml.safe_load(f)
with open('services/uptime_kuma/uptime_kuma_vars.yml', 'r') as f:
uk_vars = yaml.safe_load(f)
with open('infra_secrets.yml', 'r') as f:
secrets = yaml.safe_load(f)
root_domain = infra_vars.get('root_domain')
subdomain = uk_vars.get('uptime_kuma_subdomain', 'uptime')
url = f"https://{subdomain}.{root_domain}"
username = secrets.get('uptime_kuma_username')
password = secrets.get('uptime_kuma_password')
# Try to connect
api = UptimeKumaApi(url)
api.login(username, password)
# Check if we can get monitors
monitors = api.get_monitors()
print(f"SUCCESS:{len(monitors)}")
api.disconnect()
sys.exit(0)
except Exception as e:
print(f"ERROR:{str(e)}", file=sys.stderr)
sys.exit(1)
EOFPYTHON
local result=$(cd "$ANSIBLE_DIR" && python3 "$test_script" 2>&1)
rm -f "$test_script"
if echo "$result" | grep -q "^SUCCESS:"; then
local monitor_count=$(echo "$result" | grep "^SUCCESS:" | cut -d: -f2)
print_success "Successfully connected to Uptime Kuma API"
print_info "Current monitors: $monitor_count"
else
print_error "Cannot connect to Uptime Kuma API"
print_warning "This usually means:"
echo " • Admin account not created yet (Step 1)"
echo " • Wrong credentials in infra_secrets.yml (Step 2)"
echo " • Uptime Kuma not accessible"
all_ok=false
fi
else
print_warning "Skipping API check - credentials not configured"
all_ok=false
fi
echo ""
# Check 3: ntfy notification configured in Uptime Kuma
print_info "Checking ntfy notification configuration..."
if [ -n "$uk_user" ] && [ -n "$uk_pass" ]; then
local test_notif=$(mktemp)
cat > "$test_notif" << 'EOFPYTHON'
import sys
import yaml
from uptime_kuma_api import UptimeKumaApi
try:
# Load config
with open('infra_vars.yml', 'r') as f:
infra_vars = yaml.safe_load(f)
with open('services/uptime_kuma/uptime_kuma_vars.yml', 'r') as f:
uk_vars = yaml.safe_load(f)
with open('infra_secrets.yml', 'r') as f:
secrets = yaml.safe_load(f)
root_domain = infra_vars.get('root_domain')
subdomain = uk_vars.get('uptime_kuma_subdomain', 'uptime')
url = f"https://{subdomain}.{root_domain}"
username = secrets.get('uptime_kuma_username')
password = secrets.get('uptime_kuma_password')
# Connect
api = UptimeKumaApi(url)
api.login(username, password)
# Check for ntfy notification
notifications = api.get_notifications()
ntfy_found = any(n.get('type') == 'ntfy' for n in notifications)
if ntfy_found:
print("SUCCESS:ntfy notification configured")
else:
print("NOTFOUND:No ntfy notification found")
api.disconnect()
sys.exit(0)
except Exception as e:
print(f"ERROR:{str(e)}", file=sys.stderr)
sys.exit(1)
EOFPYTHON
local notif_result=$(cd "$ANSIBLE_DIR" && python3 "$test_notif" 2>&1)
rm -f "$test_notif"
if echo "$notif_result" | grep -q "^SUCCESS:"; then
print_success "ntfy notification is configured in Uptime Kuma"
elif echo "$notif_result" | grep -q "^NOTFOUND:"; then
print_warning "ntfy notification not yet configured"
print_info "Run the script again and choose 'yes' for ntfy notification setup"
print_info "Or complete Step 3 manually"
all_ok=false
else
print_warning "Could not verify ntfy notification (API access issue)"
fi
else
print_warning "Skipping ntfy check - credentials not configured"
fi
echo ""
# Summary
if [ "$all_ok" = true ]; then
print_success "All post-deployment steps completed! ✓"
echo ""
print_info "Layer 4 is fully configured and ready to use"
print_info "You can now proceed to Layer 6 (infrastructure monitoring)"
return 0
else
print_warning "Some post-deployment steps are incomplete"
echo ""
print_info "Complete these steps:"
echo " 1. Access Uptime Kuma web UI and create admin account"
echo " 2. Update ansible/infra_secrets.yml with credentials"
echo " 3. Run this script again to configure ntfy notification"
echo ""
print_info "You can also complete manually and verify with:"
echo " ./scripts/setup_layer_4_monitoring.sh"
return 1
fi
}
###############################################################################
# Summary Functions
###############################################################################
print_summary() {
print_header "Layer 4 Setup Complete! 🎉"
echo "Summary of what was configured:"
echo ""
print_success "ntfy notification service deployed"
print_success "Uptime Kuma monitoring platform deployed"
print_success "Caddy reverse proxy configured for both services"
echo ""
print_warning "REQUIRED POST-DEPLOYMENT STEPS:"
echo ""
echo "MANUAL (do these first):"
echo " 1. Access Uptime Kuma Web UI and create admin account"
echo " 2. Update ansible/infra_secrets.yml with credentials"
echo ""
echo "AUTOMATED (script can do these):"
echo " 3. Configure ntfy notification - script will offer to set this up"
echo " 4. Final verification - script will check everything"
echo ""
print_info "After completing steps 1 & 2, the script will:"
echo " • Automatically configure ntfy in Uptime Kuma"
echo " • Verify all post-deployment steps"
echo " • Tell you if anything is missing"
echo ""
print_warning "You MUST complete steps 1 & 2 before proceeding to Layer 6!"
echo ""
print_info "What these services enable:"
echo " • ntfy: Push notifications to your devices"
echo " • Uptime Kuma: Monitor all services and infrastructure"
echo " • Together: Complete monitoring and alerting solution"
echo ""
print_info "Next steps:"
echo " 1. Complete the post-deployment steps above"
echo " 2. Test ntfy: Send a test notification"
echo " 3. Test Uptime Kuma: Create a test monitor"
echo " 4. Proceed to Layer 5: ./scripts/setup_layer_5_headscale.sh (optional)"
echo " OR Layer 6: ./scripts/setup_layer_6_infra_monitoring.sh"
echo ""
}
###############################################################################
# Main Execution
###############################################################################
main() {
clear
print_header "📊 Layer 4: Core Monitoring & Notifications"
echo "This script will deploy ntfy and Uptime Kuma on watchtower."
echo ""
print_info "Services to deploy:"
echo " • ntfy (notification service)"
echo " • Uptime Kuma (monitoring platform)"
echo ""
if ! confirm_action "Continue with Layer 4 setup?"; then
echo "Setup cancelled."
exit 0
fi
check_prerequisites
check_vars_files
check_dns_configuration
# Deploy services (don't fail if skipped)
deploy_ntfy || true
echo ""
deploy_uptime_kuma || true
echo ""
setup_uptime_kuma_backup || true
echo ""
verify_deployments
# Always show summary and offer ntfy configuration
print_summary
echo ""
# Always ask about ntfy notification setup (regardless of deployment status)
print_header "Configure ntfy Notification in Uptime Kuma"
print_info "After creating your Uptime Kuma admin account and updating infra_secrets.yml,"
print_info "the script can automatically configure ntfy as a notification method."
echo ""
print_warning "Prerequisites:"
echo " 1. Access Uptime Kuma web UI and create admin account"
echo " 2. Update ansible/infra_secrets.yml with your credentials"
echo ""
# Always offer to set up ntfy notification
setup_ntfy_notification
# Final verification
echo ""
verify_final_setup
}
# Run main function
main "$@"

494
scripts/setup_layer_5_headscale.sh Executable file
View file

@ -0,0 +1,494 @@
#!/bin/bash
###############################################################################
# Layer 5: VPN Infrastructure (Headscale)
#
# This script deploys Headscale and optionally joins machines to the mesh.
# Must be run after Layers 0, 1A, and 3 are complete.
# THIS LAYER IS OPTIONAL - skip to Layer 6 if you don't need VPN.
###############################################################################
set -e # Exit on error
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Project root directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
ANSIBLE_DIR="$PROJECT_ROOT/ansible"
###############################################################################
# Helper Functions
###############################################################################
print_header() {
echo -e "\n${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}\n"
}
print_success() {
echo -e "${GREEN}✓${NC} $1"
}
print_error() {
echo -e "${RED}✗${NC} $1"
}
print_warning() {
echo -e "${YELLOW}⚠${NC} $1"
}
print_info() {
echo -e "${BLUE}ℹ${NC} $1"
}
confirm_action() {
local prompt="$1"
local response
read -p "$(echo -e ${YELLOW}${prompt}${NC} [y/N]: )" response
[[ "$response" =~ ^[Yy]$ ]]
}
###############################################################################
# Verification Functions
###############################################################################
check_prerequisites() {
print_header "Verifying Prerequisites"
local errors=0
if [ -z "$VIRTUAL_ENV" ]; then
print_error "Virtual environment not activated"
echo "Run: source venv/bin/activate"
errors=$((errors + 1))
else
print_success "Virtual environment activated"
fi
if ! command -v ansible &> /dev/null; then
print_error "Ansible not found"
errors=$((errors + 1))
else
print_success "Ansible found"
fi
if [ ! -f "$ANSIBLE_DIR/inventory.ini" ]; then
print_error "inventory.ini not found"
errors=$((errors + 1))
else
print_success "inventory.ini exists"
fi
# Check if spacey is configured
if ! grep -q "^\[spacey\]" "$ANSIBLE_DIR/inventory.ini"; then
print_error "spacey not configured in inventory.ini"
print_info "Layer 5 requires spacey VPS for Headscale server"
errors=$((errors + 1))
else
print_success "spacey configured in inventory"
fi
if [ $errors -gt 0 ]; then
print_error "Prerequisites not met"
exit 1
fi
print_success "Prerequisites verified"
}
get_hosts_from_inventory() {
local group="$1"
cd "$ANSIBLE_DIR"
ansible-inventory -i inventory.ini --list | \
python3 -c "import sys, json; data=json.load(sys.stdin); print(' '.join(data.get('$group', {}).get('hosts', [])))" 2>/dev/null || echo ""
}
check_vars_files() {
print_header "Checking Configuration Files"
# Check services_config.yml
if [ ! -f "$ANSIBLE_DIR/services_config.yml" ]; then
print_error "services_config.yml not found"
print_info "This file should have been created in Layer 0"
exit 1
fi
print_success "services_config.yml exists"
# Show configured subdomain
local hs_sub=$(grep "^ headscale:" "$ANSIBLE_DIR/services_config.yml" | awk '{print $2}' 2>/dev/null || echo "headscale")
print_info "Configured subdomain: headscale: $hs_sub"
echo ""
}
check_dns_configuration() {
print_header "Validating DNS Configuration"
cd "$ANSIBLE_DIR"
# Get spacey IP
local spacey_ip=$(ansible-inventory -i inventory.ini --list | python3 -c "import sys, json; data=json.load(sys.stdin); hosts=data.get('spacey', {}).get('hosts', []); print(hosts[0] if hosts else '')" 2>/dev/null)
if [ -z "$spacey_ip" ]; then
print_error "Could not determine spacey IP from inventory"
return 1
fi
print_info "Spacey IP: $spacey_ip"
echo ""
# Get domain from infra_vars.yml
local root_domain=$(grep "^root_domain:" "$ANSIBLE_DIR/infra_vars.yml" | awk '{print $2}' 2>/dev/null)
if [ -z "$root_domain" ]; then
print_error "Could not determine root_domain from infra_vars.yml"
return 1
fi
# Get subdomain from centralized config
local headscale_subdomain="headscale"
if [ -f "$ANSIBLE_DIR/services_config.yml" ]; then
headscale_subdomain=$(grep "^ headscale:" "$ANSIBLE_DIR/services_config.yml" | awk '{print $2}' 2>/dev/null || echo "headscale")
fi
local headscale_fqdn="${headscale_subdomain}.${root_domain}"
print_info "Checking DNS record..."
echo ""
# Check Headscale DNS
print_info "Checking $headscale_fqdn..."
if command -v dig &> /dev/null; then
local resolved=$(dig +short "$headscale_fqdn" | head -n1)
if [ "$resolved" = "$spacey_ip" ]; then
print_success "$headscale_fqdn$resolved"
elif [ -n "$resolved" ]; then
print_error "$headscale_fqdn$resolved (expected $spacey_ip)"
print_warning "DNS changes can take time to propagate (up to 24-48 hours)"
echo ""
if ! confirm_action "Continue anyway? (SSL certificates will fail without proper DNS)"; then
exit 1
fi
else
print_error "$headscale_fqdn does not resolve"
print_warning "DNS changes can take time to propagate"
echo ""
if ! confirm_action "Continue anyway? (SSL certificates will fail without proper DNS)"; then
exit 1
fi
fi
else
print_warning "dig command not found, skipping DNS validation"
print_info "Install dnsutils/bind-tools to enable DNS validation"
fi
echo ""
print_success "DNS validation complete"
}
###############################################################################
# Headscale Deployment
###############################################################################
deploy_headscale() {
print_header "Deploying Headscale Server"
cd "$ANSIBLE_DIR"
print_info "This will:"
echo " • Install Headscale on spacey"
echo " • Configure with deny-all ACL policy (you customize later)"
echo " • Create namespace for your network"
echo " • Set up Caddy reverse proxy"
echo " • Configure embedded DERP server"
echo ""
print_warning "After deployment, you MUST configure ACL policies for machines to communicate"
echo ""
if ! confirm_action "Proceed with Headscale deployment?"; then
print_warning "Skipped Headscale deployment"
return 1
fi
print_info "Running: ansible-playbook -i inventory.ini services/headscale/deploy_headscale_playbook.yml"
echo ""
if ansible-playbook -i inventory.ini services/headscale/deploy_headscale_playbook.yml; then
print_success "Headscale deployment complete"
return 0
else
print_error "Headscale deployment failed"
return 1
fi
}
###############################################################################
# Join Machines to Mesh
###############################################################################
join_machines_to_mesh() {
print_header "Join Machines to Mesh (Optional)"
cd "$ANSIBLE_DIR"
print_info "This will install Tailscale client and join machines to your Headscale mesh"
echo ""
# Show available hosts
echo "Available hosts to join:"
for group in vipy watchtower nodito lapy; do
local hosts=$(get_hosts_from_inventory "$group")
if [ -n "$hosts" ]; then
echo " [$group]: $hosts"
fi
done
echo ""
print_info "Join options:"
echo " 1. Join recommended machines (vipy, watchtower, nodito)"
echo " 2. Join all machines"
echo " 3. Custom selection (specify groups)"
echo " 4. Skip - join machines later manually"
echo ""
echo -e -n "${BLUE}Choose option${NC} [1-4]: "
read option
local limit_hosts=""
case "$option" in
1)
limit_hosts="vipy,watchtower,nodito"
print_info "Joining: vipy, watchtower, nodito"
;;
2)
limit_hosts="all"
print_info "Joining: all hosts"
;;
3)
echo -e -n "${BLUE}Enter groups (comma-separated, e.g., vipy,watchtower)${NC}: "
read limit_hosts
print_info "Joining: $limit_hosts"
;;
4)
print_warning "Skipping machine join - you can join manually later"
print_info "To join manually:"
echo " ansible-playbook -i inventory.ini infra/920_join_headscale_mesh.yml --limit <host>"
return 0
;;
*)
print_error "Invalid option"
return 0
;;
esac
echo ""
if ! confirm_action "Proceed with joining machines?"; then
print_warning "Skipped joining machines"
return 0
fi
print_info "Running: ansible-playbook -i inventory.ini infra/920_join_headscale_mesh.yml --limit $limit_hosts"
echo ""
if ansible-playbook -i inventory.ini infra/920_join_headscale_mesh.yml --limit "$limit_hosts"; then
print_success "Machines joined to mesh"
return 0
else
print_error "Failed to join some machines"
print_info "You can retry or join manually later"
return 0
fi
}
###############################################################################
# Backup Configuration
###############################################################################
setup_headscale_backup() {
print_header "Setting Up Headscale Backup (Optional)"
cd "$ANSIBLE_DIR"
print_info "This will set up automated backups to lapy"
echo ""
if ! confirm_action "Set up Headscale backup to lapy?"; then
print_warning "Skipped backup setup"
return 0
fi
# Check if rsync is available
print_info "Verifying rsync is installed on spacey and lapy..."
if ! ansible spacey -i inventory.ini -m shell -a "command -v rsync" &>/dev/null; then
print_error "rsync not found on spacey"
print_info "Run Layer 2 to install rsync"
print_warning "Backup setup skipped - rsync not available"
return 0
fi
print_info "Running: ansible-playbook -i inventory.ini services/headscale/setup_backup_headscale_to_lapy.yml"
echo ""
if ansible-playbook -i inventory.ini services/headscale/setup_backup_headscale_to_lapy.yml; then
print_success "Headscale backup configured"
print_info "Backups will run periodically via cron"
return 0
else
print_error "Backup setup failed"
return 0
fi
}
###############################################################################
# Verification Functions
###############################################################################
verify_deployment() {
print_header "Verifying Headscale Deployment"
cd "$ANSIBLE_DIR"
local ssh_key=$(grep "ansible_ssh_private_key_file" "$ANSIBLE_DIR/inventory.ini" | head -n1 | sed 's/.*ansible_ssh_private_key_file=\([^ ]*\).*/\1/')
ssh_key="${ssh_key/#\~/$HOME}"
local spacey_host=$(get_hosts_from_inventory "spacey")
if [ -z "$spacey_host" ]; then
print_error "Could not determine spacey host"
return
fi
print_info "Checking Headscale on spacey ($spacey_host)..."
echo ""
# Check Headscale service
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$spacey_host "systemctl is-active headscale" &>/dev/null; then
print_success "Headscale service running"
else
print_warning "Headscale service not running"
fi
# Check Caddy config
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$spacey_host "test -f /etc/caddy/sites-enabled/headscale.conf" &>/dev/null; then
print_success "Headscale Caddy config exists"
else
print_warning "Headscale Caddy config not found"
fi
# Check ACL file
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$spacey_host "test -f /etc/headscale/acl.json" &>/dev/null; then
print_success "ACL policy file exists"
else
print_warning "ACL policy file not found"
fi
# List nodes
print_info "Attempting to list connected nodes..."
local nodes_output=$(timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$spacey_host "sudo headscale nodes list" 2>/dev/null || echo "")
if [ -n "$nodes_output" ]; then
echo "$nodes_output"
else
print_warning "Could not list nodes (this is normal if no machines joined yet)"
fi
echo ""
}
###############################################################################
# Summary Functions
###############################################################################
print_summary() {
print_header "Layer 5 Setup Complete! 🎉"
echo "Summary of what was configured:"
echo ""
print_success "Headscale VPN server deployed on spacey"
print_success "Caddy reverse proxy configured"
print_success "Namespace created for your network"
echo ""
print_warning "CRITICAL POST-DEPLOYMENT STEPS:"
echo ""
echo "1. Configure ACL Policies (REQUIRED for machines to communicate):"
echo " • SSH to spacey: ssh counterweight@<spacey-ip>"
echo " • Edit ACL: sudo nano /etc/headscale/acl.json"
echo " • Add rules to allow communication"
echo " • Restart: sudo systemctl restart headscale"
echo ""
echo "2. Verify machines joined (if you selected that option):"
echo " • SSH to spacey: ssh counterweight@<spacey-ip>"
echo " • List nodes: sudo headscale nodes list"
echo ""
echo "3. Join additional machines (mobile, desktop):"
echo " • Generate key: sudo headscale preauthkeys create --user <namespace> --reusable"
echo " • On device: tailscale up --login-server https://<headscale-domain> --authkey <key>"
echo ""
print_info "What Headscale enables:"
echo " • Secure mesh networking between all machines"
echo " • Magic DNS - access machines by hostname"
echo " • NAT traversal - works behind firewalls"
echo " • Self-hosted Tailscale alternative"
echo ""
print_info "Next steps:"
echo " 1. Configure ACL policies on spacey"
echo " 2. Verify nodes are connected"
echo " 3. Proceed to Layer 6: ./scripts/setup_layer_6_infra_monitoring.sh"
echo ""
}
###############################################################################
# Main Execution
###############################################################################
main() {
clear
print_header "🔐 Layer 5: VPN Infrastructure (Headscale)"
echo "This script will deploy Headscale for secure mesh networking."
echo ""
print_warning "THIS LAYER IS OPTIONAL"
print_info "Skip to Layer 6 if you don't need VPN mesh networking"
echo ""
if ! confirm_action "Continue with Layer 5 setup?"; then
echo "Setup skipped - proceeding to Layer 6 is fine!"
exit 0
fi
check_prerequisites
check_vars_files
check_dns_configuration
# Deploy Headscale
if deploy_headscale; then
echo ""
join_machines_to_mesh
echo ""
setup_headscale_backup
echo ""
verify_deployment
print_summary
else
print_error "Headscale deployment failed"
exit 1
fi
}
# Run main function
main "$@"

491
scripts/setup_layer_6_infra_monitoring.sh Executable file
View file

@@ -0,0 +1,491 @@
#!/bin/bash
###############################################################################
# Layer 6: Infrastructure Monitoring
#
# This script deploys disk usage, healthcheck, and CPU temp monitoring.
# Must be run after Layer 4 (Uptime Kuma) is complete with credentials set.
###############################################################################
set -e # Exit on error
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Project root directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
ANSIBLE_DIR="$PROJECT_ROOT/ansible"
###############################################################################
# Helper Functions
###############################################################################
print_header() {
echo -e "\n${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}\n"
}
print_success() {
echo -e "${GREEN}${NC} $1"
}
print_error() {
echo -e "${RED}${NC} $1"
}
print_warning() {
echo -e "${YELLOW}${NC} $1"
}
print_info() {
echo -e "${BLUE}${NC} $1"
}
confirm_action() {
local prompt="$1"
local response
read -p "$(echo -e ${YELLOW}${prompt}${NC} [y/N]: )" response
[[ "$response" =~ ^[Yy]$ ]]
}
###############################################################################
# Verification Functions
###############################################################################
check_prerequisites() {
print_header "Verifying Prerequisites"
local errors=0
if [ -z "$VIRTUAL_ENV" ]; then
print_error "Virtual environment not activated"
echo "Run: source venv/bin/activate"
errors=$((errors + 1))
else
print_success "Virtual environment activated"
fi
if ! command -v ansible &> /dev/null; then
print_error "Ansible not found"
errors=$((errors + 1))
else
print_success "Ansible found"
fi
if [ ! -f "$ANSIBLE_DIR/inventory.ini" ]; then
print_error "inventory.ini not found"
errors=$((errors + 1))
else
print_success "inventory.ini exists"
fi
# Check Python uptime-kuma-api
if ! python3 -c "import uptime_kuma_api" 2>/dev/null; then
print_error "uptime-kuma-api Python package not found"
print_info "Install with: pip install -r requirements.txt"
errors=$((errors + 1))
else
print_success "uptime-kuma-api package found"
fi
if [ $errors -gt 0 ]; then
print_error "Prerequisites not met"
exit 1
fi
print_success "Prerequisites verified"
}
check_uptime_kuma_credentials() {
print_header "Verifying Uptime Kuma Configuration"
cd "$ANSIBLE_DIR"
# Check if infra_secrets.yml has credentials
if ! grep -q "^uptime_kuma_username:" "$ANSIBLE_DIR/infra_secrets.yml" 2>/dev/null || \
! grep -q "^uptime_kuma_password:" "$ANSIBLE_DIR/infra_secrets.yml" 2>/dev/null; then
print_error "Uptime Kuma credentials not found in infra_secrets.yml"
print_info "You must complete Layer 4 post-deployment steps first:"
echo " 1. Create admin user in Uptime Kuma web UI"
echo " 2. Add credentials to ansible/infra_secrets.yml"
exit 1
fi
local uk_user=$(grep "^uptime_kuma_username:" "$ANSIBLE_DIR/infra_secrets.yml" | awk '{print $2}' | tr -d '"' | tr -d "'")
local uk_pass=$(grep "^uptime_kuma_password:" "$ANSIBLE_DIR/infra_secrets.yml" | awk '{print $2}' | tr -d '"' | tr -d "'")
if [ -z "$uk_user" ] || [ -z "$uk_pass" ]; then
print_error "Uptime Kuma credentials are empty in infra_secrets.yml"
exit 1
fi
print_success "Uptime Kuma credentials found"
# Test API connection
print_info "Testing Uptime Kuma API connection..."
local test_script=$(mktemp)
cat > "$test_script" << 'EOFPYTHON'
import sys
import yaml
from uptime_kuma_api import UptimeKumaApi
try:
with open('infra_vars.yml', 'r') as f:
infra_vars = yaml.safe_load(f)
with open('services_config.yml', 'r') as f:
services_config = yaml.safe_load(f)
with open('infra_secrets.yml', 'r') as f:
secrets = yaml.safe_load(f)
root_domain = infra_vars.get('root_domain')
subdomain = services_config.get('subdomains', {}).get('uptime_kuma', 'uptime')
url = f"https://{subdomain}.{root_domain}"
username = secrets.get('uptime_kuma_username')
password = secrets.get('uptime_kuma_password')
api = UptimeKumaApi(url)
api.login(username, password)
monitors = api.get_monitors()
print(f"SUCCESS:{len(monitors)}")
api.disconnect()
except Exception as e:
print(f"ERROR:{str(e)}", file=sys.stderr)
sys.exit(1)
EOFPYTHON
local result=$(cd "$ANSIBLE_DIR" && python3 "$test_script" 2>&1)
rm -f "$test_script"
if echo "$result" | grep -q "^SUCCESS:"; then
local monitor_count=$(echo "$result" | grep "^SUCCESS:" | cut -d: -f2)
print_success "Successfully connected to Uptime Kuma API"
print_info "Current monitors: $monitor_count"
else
print_error "Cannot connect to Uptime Kuma API"
print_info "Error: $result"
echo ""
print_info "Make sure:"
echo " • Uptime Kuma is running (Layer 4)"
echo " • Credentials are correct in infra_secrets.yml"
echo " • Uptime Kuma is accessible"
exit 1
fi
echo ""
print_success "Uptime Kuma configuration verified"
}
get_hosts_from_inventory() {
local group="$1"
cd "$ANSIBLE_DIR"
ansible-inventory -i inventory.ini --list | \
python3 -c "import sys, json; data=json.load(sys.stdin); print(' '.join(data.get('$group', {}).get('hosts', [])))" 2>/dev/null || echo ""
}
###############################################################################
# Disk Usage Monitoring
###############################################################################
deploy_disk_usage_monitoring() {
print_header "Deploying Disk Usage Monitoring"
cd "$ANSIBLE_DIR"
print_info "This will deploy disk usage monitoring on selected hosts"
print_info "Default settings:"
echo " • Threshold: 80%"
echo " • Check interval: 15 minutes"
echo " • Mount point: /"
echo ""
# Show available hosts
echo "Available hosts:"
for group in vipy watchtower spacey nodito lapy; do
local hosts=$(get_hosts_from_inventory "$group")
if [ -n "$hosts" ]; then
echo " [$group]: $hosts"
fi
done
echo ""
print_info "Deployment options:"
echo " 1. Deploy on all remote hosts (vipy, watchtower, spacey, nodito)"
echo " 2. Deploy on all hosts (including lapy)"
echo " 3. Custom selection (specify groups)"
echo " 4. Skip disk monitoring"
echo ""
echo -e -n "${BLUE}Choose option${NC} [1-4]: "
read option
local limit_hosts=""
case "$option" in
1)
limit_hosts="vipy,watchtower,spacey,nodito"
print_info "Deploying to remote hosts"
;;
2)
limit_hosts="all"
print_info "Deploying to all hosts"
;;
3)
echo -e -n "${BLUE}Enter groups (comma-separated)${NC}: "
read limit_hosts
print_info "Deploying to: $limit_hosts"
;;
4)
print_warning "Skipping disk usage monitoring"
return 0
;;
*)
print_error "Invalid option"
return 0
;;
esac
echo ""
if ! confirm_action "Proceed with disk usage monitoring deployment?"; then
print_warning "Skipped"
return 0
fi
print_info "Running: ansible-playbook -i inventory.ini infra/410_disk_usage_alerts.yml --limit $limit_hosts"
echo ""
if ansible-playbook -i inventory.ini infra/410_disk_usage_alerts.yml --limit "$limit_hosts"; then
print_success "Disk usage monitoring deployed"
return 0
else
print_error "Deployment failed"
return 0
fi
}
###############################################################################
# System Healthcheck Monitoring
###############################################################################
deploy_system_healthcheck() {
print_header "Deploying System Healthcheck Monitoring"
cd "$ANSIBLE_DIR"
print_info "This will deploy system healthcheck monitoring on selected hosts"
print_info "Default settings:"
echo " • Heartbeat interval: 60 seconds"
echo " • Upside-down mode (no news is good news)"
echo ""
# Show available hosts
echo "Available hosts:"
for group in vipy watchtower spacey nodito lapy; do
local hosts=$(get_hosts_from_inventory "$group")
if [ -n "$hosts" ]; then
echo " [$group]: $hosts"
fi
done
echo ""
print_info "Deployment options:"
echo " 1. Deploy on all remote hosts (vipy, watchtower, spacey, nodito)"
echo " 2. Deploy on all hosts (including lapy)"
echo " 3. Custom selection (specify groups)"
echo " 4. Skip healthcheck monitoring"
echo ""
echo -e -n "${BLUE}Choose option${NC} [1-4]: "
read option
local limit_hosts=""
case "$option" in
1)
limit_hosts="vipy,watchtower,spacey,nodito"
print_info "Deploying to remote hosts"
;;
2)
limit_hosts="all"
print_info "Deploying to all hosts"
;;
3)
echo -e -n "${BLUE}Enter groups (comma-separated)${NC}: "
read limit_hosts
print_info "Deploying to: $limit_hosts"
;;
4)
print_warning "Skipping healthcheck monitoring"
return 0
;;
*)
print_error "Invalid option"
return 0
;;
esac
echo ""
if ! confirm_action "Proceed with healthcheck monitoring deployment?"; then
print_warning "Skipped"
return 0
fi
print_info "Running: ansible-playbook -i inventory.ini infra/420_system_healthcheck.yml --limit $limit_hosts"
echo ""
if ansible-playbook -i inventory.ini infra/420_system_healthcheck.yml --limit "$limit_hosts"; then
print_success "System healthcheck monitoring deployed"
return 0
else
print_error "Deployment failed"
return 0
fi
}
###############################################################################
# CPU Temperature Monitoring (Nodito)
###############################################################################
deploy_cpu_temp_monitoring() {
print_header "Deploying CPU Temperature Monitoring (Nodito)"
cd "$ANSIBLE_DIR"
# Check if nodito is configured
local nodito_hosts=$(get_hosts_from_inventory "nodito")
if [ -z "$nodito_hosts" ]; then
print_info "Nodito not configured in inventory, skipping CPU temp monitoring"
return 0
fi
print_info "This will deploy CPU temperature monitoring on nodito (Proxmox)"
print_info "Default settings:"
echo " • Threshold: 80°C"
echo " • Check interval: 60 seconds"
echo ""
# Check if nodito_secrets.yml exists
if [ ! -f "$ANSIBLE_DIR/infra/nodito/nodito_secrets.yml" ]; then
print_warning "nodito_secrets.yml not found"
print_info "You need to create this file with Uptime Kuma push URL"
if confirm_action "Create nodito_secrets.yml now?"; then
# Get Uptime Kuma URL
local root_domain=$(grep "^root_domain:" "$ANSIBLE_DIR/infra_vars.yml" | awk '{print $2}' 2>/dev/null)
local uk_subdomain=$(grep "^uptime_kuma_subdomain:" "$ANSIBLE_DIR/services/uptime_kuma/uptime_kuma_vars.yml" | awk '{print $2}' 2>/dev/null || echo "uptime")
echo -e -n "${BLUE}Enter Uptime Kuma push URL${NC} (e.g., https://${uk_subdomain}.${root_domain}/api/push/xxxxx): "
read push_url
mkdir -p "$ANSIBLE_DIR/infra/nodito"
cat > "$ANSIBLE_DIR/infra/nodito/nodito_secrets.yml" << EOF
# Nodito Secrets
# DO NOT commit to git
# Uptime Kuma Push URL for CPU temperature monitoring
nodito_uptime_kuma_cpu_temp_push_url: "${push_url}"
EOF
print_success "Created nodito_secrets.yml"
else
print_warning "Skipping CPU temp monitoring"
return 0
fi
fi
echo ""
if ! confirm_action "Proceed with CPU temp monitoring deployment?"; then
print_warning "Skipped"
return 0
fi
print_info "Running: ansible-playbook -i inventory.ini infra/nodito/40_cpu_temp_alerts.yml"
echo ""
if ansible-playbook -i inventory.ini infra/nodito/40_cpu_temp_alerts.yml; then
print_success "CPU temperature monitoring deployed"
return 0
else
print_error "Deployment failed"
return 0
fi
}
###############################################################################
# Summary
###############################################################################
print_summary() {
print_header "Layer 6 Setup Complete! 🎉"
echo "Summary of what was deployed:"
echo ""
print_success "Infrastructure monitoring configured"
print_success "Monitors created in Uptime Kuma"
print_success "Systemd services and timers running"
echo ""
print_info "What you have now:"
echo " • Disk usage monitoring on selected hosts"
echo " • System healthcheck monitoring"
echo " • CPU temperature monitoring (if nodito configured)"
echo " • All organized in host-specific groups"
echo ""
print_info "Verify your monitoring:"
echo " 1. Open Uptime Kuma web UI"
echo " 2. Check monitors organized by host groups"
echo " 3. Verify monitors are receiving data"
echo " 4. Configure notification rules"
echo " 5. Watch for alerts via ntfy"
echo ""
print_info "Next steps:"
echo " 1. Customize thresholds if needed"
echo " 2. Proceed to Layer 7: Core Services deployment"
echo ""
}
###############################################################################
# Main Execution
###############################################################################
main() {
clear
print_header "📊 Layer 6: Infrastructure Monitoring"
echo "This script will deploy automated monitoring for your infrastructure."
echo ""
if ! confirm_action "Continue with Layer 6 setup?"; then
echo "Setup cancelled."
exit 0
fi
check_prerequisites
check_uptime_kuma_credentials
# Deploy monitoring
deploy_disk_usage_monitoring
echo ""
deploy_system_healthcheck
echo ""
deploy_cpu_temp_monitoring
echo ""
print_summary
}
# Run main function
main "$@"

494
scripts/setup_layer_7_services.sh Executable file
View file

@@ -0,0 +1,494 @@
#!/bin/bash
###############################################################################
# Layer 7: Core Services
#
# This script deploys Vaultwarden, Forgejo, and LNBits on vipy.
# Must be run after Layers 0, 1A, 2, and 3 are complete.
###############################################################################
set -e # Exit on error
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Project root directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
ANSIBLE_DIR="$PROJECT_ROOT/ansible"
###############################################################################
# Helper Functions
###############################################################################
print_header() {
echo -e "\n${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}\n"
}
print_success() {
echo -e "${GREEN}${NC} $1"
}
print_error() {
echo -e "${RED}${NC} $1"
}
print_warning() {
echo -e "${YELLOW}${NC} $1"
}
print_info() {
echo -e "${BLUE}${NC} $1"
}
confirm_action() {
local prompt="$1"
local response
read -p "$(echo -e ${YELLOW}${prompt}${NC} [y/N]: )" response
[[ "$response" =~ ^[Yy]$ ]]
}
###############################################################################
# Verification Functions
###############################################################################
check_prerequisites() {
print_header "Verifying Prerequisites"
local errors=0
if [ -z "$VIRTUAL_ENV" ]; then
print_error "Virtual environment not activated"
echo "Run: source venv/bin/activate"
errors=$((errors + 1))
else
print_success "Virtual environment activated"
fi
if ! command -v ansible &> /dev/null; then
print_error "Ansible not found"
errors=$((errors + 1))
else
print_success "Ansible found"
fi
if [ ! -f "$ANSIBLE_DIR/inventory.ini" ]; then
print_error "inventory.ini not found"
errors=$((errors + 1))
else
print_success "inventory.ini exists"
fi
# Check if vipy is configured
if ! grep -q "^\[vipy\]" "$ANSIBLE_DIR/inventory.ini"; then
print_error "vipy not configured in inventory.ini"
print_info "Layer 7 requires vipy VPS"
errors=$((errors + 1))
else
print_success "vipy configured in inventory"
fi
if [ $errors -gt 0 ]; then
print_error "Prerequisites not met"
exit 1
fi
print_success "Prerequisites verified"
}
get_hosts_from_inventory() {
local group="$1"
cd "$ANSIBLE_DIR"
ansible-inventory -i inventory.ini --list | \
python3 -c "import sys, json; data=json.load(sys.stdin); print(' '.join(data.get('$group', {}).get('hosts', [])))" 2>/dev/null || echo ""
}
check_dns_configuration() {
print_header "Validating DNS Configuration"
cd "$ANSIBLE_DIR"
# Get vipy IP
local vipy_ip=$(ansible-inventory -i inventory.ini --list | python3 -c "import sys, json; data=json.load(sys.stdin); hosts=data.get('vipy', {}).get('hosts', []); print(hosts[0] if hosts else '')" 2>/dev/null)
if [ -z "$vipy_ip" ]; then
print_error "Could not determine vipy IP from inventory"
return 1
fi
print_info "Vipy IP: $vipy_ip"
echo ""
# Get domain from infra_vars.yml
local root_domain=$(grep "^root_domain:" "$ANSIBLE_DIR/infra_vars.yml" | awk '{print $2}' 2>/dev/null)
if [ -z "$root_domain" ]; then
print_error "Could not determine root_domain from infra_vars.yml"
return 1
fi
# Get subdomains from centralized config
local vw_subdomain="vault"
local fg_subdomain="git"
local ln_subdomain="lnbits"
if [ -f "$ANSIBLE_DIR/services_config.yml" ]; then
vw_subdomain=$(grep "^ vaultwarden:" "$ANSIBLE_DIR/services_config.yml" | awk '{print $2}' 2>/dev/null || echo "vault")
fg_subdomain=$(grep "^ forgejo:" "$ANSIBLE_DIR/services_config.yml" | awk '{print $2}' 2>/dev/null || echo "git")
ln_subdomain=$(grep "^ lnbits:" "$ANSIBLE_DIR/services_config.yml" | awk '{print $2}' 2>/dev/null || echo "lnbits")
fi
print_info "Checking DNS records..."
echo ""
local dns_ok=true
if command -v dig &> /dev/null; then
# Check each subdomain
for service in "vaultwarden:$vw_subdomain" "forgejo:$fg_subdomain" "lnbits:$ln_subdomain"; do
local name=$(echo "$service" | cut -d: -f1)
local subdomain=$(echo "$service" | cut -d: -f2)
local fqdn="${subdomain}.${root_domain}"
print_info "Checking $fqdn..."
local resolved=$(dig +short "$fqdn" | head -n1)
if [ "$resolved" = "$vipy_ip" ]; then
print_success "$fqdn$resolved"
elif [ -n "$resolved" ]; then
print_error "$fqdn$resolved (expected $vipy_ip)"
dns_ok=false
else
print_error "$fqdn does not resolve"
dns_ok=false
fi
done
else
print_warning "dig command not found, skipping DNS validation"
print_info "Install dnsutils/bind-tools to enable DNS validation"
return 1
fi
echo ""
if [ "$dns_ok" = false ]; then
print_error "DNS validation failed"
print_info "Please configure DNS records for all services"
echo ""
print_warning "DNS changes can take time to propagate"
echo ""
if ! confirm_action "Continue anyway? (SSL certificates will fail without proper DNS)"; then
exit 1
fi
else
print_success "DNS validation passed"
fi
}
###############################################################################
# Service Deployment
###############################################################################
deploy_vaultwarden() {
print_header "Deploying Vaultwarden (Password Manager)"
cd "$ANSIBLE_DIR"
print_info "This will:"
echo " • Deploy Vaultwarden via Docker"
echo " • Configure Caddy reverse proxy"
echo " • Set up fail2ban protection"
echo " • Enable sign-ups (disable after first user)"
echo ""
if ! confirm_action "Proceed with Vaultwarden deployment?"; then
print_warning "Skipped Vaultwarden deployment"
return 0
fi
print_info "Running: ansible-playbook -i inventory.ini services/vaultwarden/deploy_vaultwarden_playbook.yml"
echo ""
if ansible-playbook -i inventory.ini services/vaultwarden/deploy_vaultwarden_playbook.yml; then
print_success "Vaultwarden deployed"
echo ""
print_warning "POST-DEPLOYMENT:"
echo " 1. Visit your Vaultwarden subdomain"
echo " 2. Create your first user account"
echo " 3. Run: ansible-playbook -i inventory.ini services/vaultwarden/disable_vaultwarden_sign_ups_playbook.yml"
return 0
else
print_error "Vaultwarden deployment failed"
return 0
fi
}
deploy_forgejo() {
print_header "Deploying Forgejo (Git Server)"
cd "$ANSIBLE_DIR"
print_info "This will:"
echo " • Install Forgejo binary"
echo " • Create git user and directories"
echo " • Configure Caddy reverse proxy"
echo " • Enable SSH cloning on port 22"
echo ""
if ! confirm_action "Proceed with Forgejo deployment?"; then
print_warning "Skipped Forgejo deployment"
return 0
fi
print_info "Running: ansible-playbook -i inventory.ini services/forgejo/deploy_forgejo_playbook.yml"
echo ""
if ansible-playbook -i inventory.ini services/forgejo/deploy_forgejo_playbook.yml; then
print_success "Forgejo deployed"
echo ""
print_warning "POST-DEPLOYMENT:"
echo " 1. Visit your Forgejo subdomain"
echo " 2. Create admin account on first visit"
echo " 3. Add your SSH key for git cloning"
return 0
else
print_error "Forgejo deployment failed"
return 0
fi
}
deploy_lnbits() {
print_header "Deploying LNBits (Lightning Wallet)"
cd "$ANSIBLE_DIR"
print_info "This will:"
echo " • Install system dependencies and uv (Python 3.12 tooling)"
echo " • Clone LNBits repository (version v1.3.1)"
echo " • Sync dependencies with uv targeting Python 3.12"
echo " • Configure with FakeWallet (testing)"
echo " • Create systemd service"
echo " • Configure Caddy reverse proxy"
echo ""
if ! confirm_action "Proceed with LNBits deployment?"; then
print_warning "Skipped LNBits deployment"
return 0
fi
print_info "Running: ansible-playbook -i inventory.ini services/lnbits/deploy_lnbits_playbook.yml"
echo ""
if ansible-playbook -i inventory.ini services/lnbits/deploy_lnbits_playbook.yml; then
print_success "LNBits deployed"
echo ""
print_warning "POST-DEPLOYMENT:"
echo " 1. Visit your LNBits subdomain"
echo " 2. Create superuser on first visit"
echo " 3. Configure real Lightning backend (FakeWallet is for testing only)"
echo " 4. Disable new user registration"
return 0
else
print_error "LNBits deployment failed"
return 0
fi
}
###############################################################################
# Backup Configuration
###############################################################################
setup_backups() {
print_header "Setting Up Backups (Optional)"
cd "$ANSIBLE_DIR"
print_info "Configure automated backups to lapy"
echo ""
# Vaultwarden backup
if confirm_action "Set up Vaultwarden backup to lapy?"; then
print_info "Running: ansible-playbook -i inventory.ini services/vaultwarden/setup_backup_vaultwarden_to_lapy.yml"
if ansible-playbook -i inventory.ini services/vaultwarden/setup_backup_vaultwarden_to_lapy.yml; then
print_success "Vaultwarden backup configured"
else
print_error "Vaultwarden backup setup failed"
fi
echo ""
fi
# LNBits backup
if confirm_action "Set up LNBits backup to lapy (GPG encrypted)?"; then
print_info "Running: ansible-playbook -i inventory.ini services/lnbits/setup_backup_lnbits_to_lapy.yml"
if ansible-playbook -i inventory.ini services/lnbits/setup_backup_lnbits_to_lapy.yml; then
print_success "LNBits backup configured"
else
print_error "LNBits backup setup failed"
fi
echo ""
fi
print_warning "Forgejo backups are not automated - set up manually if needed"
}
###############################################################################
# Verification
###############################################################################
verify_services() {
print_header "Verifying Service Deployments"
cd "$ANSIBLE_DIR"
local ssh_key=$(grep "ansible_ssh_private_key_file" "$ANSIBLE_DIR/inventory.ini" | head -n1 | sed 's/.*ansible_ssh_private_key_file=\([^ ]*\).*/\1/')
ssh_key="${ssh_key/#\~/$HOME}"
local vipy_host=$(get_hosts_from_inventory "vipy")
if [ -z "$vipy_host" ]; then
print_error "Could not determine vipy host"
return
fi
print_info "Checking services on vipy ($vipy_host)..."
echo ""
# Check Vaultwarden
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$vipy_host "docker ps | grep vaultwarden" &>/dev/null; then
print_success "Vaultwarden container running"
else
print_warning "Vaultwarden container not running (may not be deployed)"
fi
# Check Forgejo
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$vipy_host "systemctl is-active forgejo" &>/dev/null; then
print_success "Forgejo service running"
else
print_warning "Forgejo service not running (may not be deployed)"
fi
# Check LNBits
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$vipy_host "systemctl is-active lnbits" &>/dev/null; then
print_success "LNBits service running"
else
print_warning "LNBits service not running (may not be deployed)"
fi
# Check Caddy configs
if timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$vipy_host "ls /etc/caddy/sites-enabled/*.conf 2>/dev/null" &>/dev/null; then
print_success "Caddy configs exist"
local configs=$(timeout 5 ssh -i "$ssh_key" -o StrictHostKeyChecking=no -o BatchMode=yes counterweight@$vipy_host "ls /etc/caddy/sites-enabled/*.conf 2>/dev/null" | xargs -n1 basename)
print_info "Configured services:"
echo "$configs" | sed 's/^/ /'
else
print_warning "No Caddy configs found"
fi
echo ""
}
###############################################################################
# Summary
###############################################################################
print_summary() {
print_header "Layer 7 Setup Complete! 🎉"
echo "Summary of what was deployed:"
echo ""
print_success "Core services deployed on vipy"
echo ""
print_warning "CRITICAL POST-DEPLOYMENT STEPS:"
echo ""
echo "For each service you deployed, you MUST:"
echo ""
echo "1. Vaultwarden (if deployed):"
echo " • Visit web UI and create first user"
echo " • Disable sign-ups: ansible-playbook -i inventory.ini services/vaultwarden/disable_vaultwarden_sign_ups_playbook.yml"
echo " • Optional: Set up backup"
echo ""
echo "2. Forgejo (if deployed):"
echo " • Visit web UI and create admin account"
echo " • Add your SSH public key for git operations"
echo " • Test cloning: git clone git@<forgejo_subdomain>.<yourdomain>:username/repo.git"
echo ""
echo "3. LNBits (if deployed):"
echo " • Visit web UI and create superuser"
echo " • Configure real Lightning backend (currently FakeWallet)"
echo " • Disable new user registration"
echo " • Optional: Set up encrypted backup"
echo ""
print_info "Services are now accessible:"
echo " • Vaultwarden: https://<vaultwarden_subdomain>.<yourdomain>"
echo " • Forgejo: https://<forgejo_subdomain>.<yourdomain>"
echo " • LNBits: https://<lnbits_subdomain>.<yourdomain>"
echo ""
print_success "Uptime Kuma monitors automatically created:"
echo " • Check Uptime Kuma web UI"
echo " • Look in 'services' monitor group"
echo " • Monitors for Vaultwarden, Forgejo, LNBits should appear"
echo ""
print_info "Next steps:"
echo " 1. Complete post-deployment steps above"
echo " 2. Test each service"
echo " 3. Check Uptime Kuma monitors are working"
echo " 4. Proceed to Layer 8: ./scripts/setup_layer_8_secondary_services.sh"
echo ""
}
###############################################################################
# Main Execution
###############################################################################
main() {
clear
print_header "🚀 Layer 7: Core Services"
echo "This script will deploy core services on vipy:"
echo " • Vaultwarden (password manager)"
echo " • Forgejo (git server)"
echo " • LNBits (Lightning wallet)"
echo ""
if ! confirm_action "Continue with Layer 7 setup?"; then
echo "Setup cancelled."
exit 0
fi
check_prerequisites
check_dns_configuration
# Deploy services
deploy_vaultwarden
echo ""
deploy_forgejo
echo ""
deploy_lnbits
echo ""
verify_services
echo ""
setup_backups
print_summary
}
# Run main function
main "$@"

66
tofu/nodito/README.md Normal file
View file

@@ -0,0 +1,66 @@
## Nodito VMs with OpenTofu (Proxmox)
This directory lets you declare VMs on the `nodito` Proxmox node and apply with OpenTofu. It clones the Ansible-built template `debian-13-cloud-init` and places disks on the ZFS pool `proxmox-tank-1`.
### Prereqs
- Proxmox API token with VM privileges. Example: user `root@pam`, token name `tofu`.
- OpenTofu installed.
```
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://get.opentofu.org/opentofu.gpg | sudo tee /etc/apt/keyrings/opentofu.gpg >/dev/null
curl -fsSL https://packages.opentofu.org/opentofu/tofu/gpgkey | sudo gpg --no-tty --batch --dearmor -o /etc/apt/keyrings/opentofu-repo.gpg >/dev/null
sudo chmod a+r /etc/apt/keyrings/opentofu.gpg /etc/apt/keyrings/opentofu-repo.gpg
echo \
"deb [signed-by=/etc/apt/keyrings/opentofu.gpg,/etc/apt/keyrings/opentofu-repo.gpg] https://packages.opentofu.org/opentofu/tofu/any/ any main
deb-src [signed-by=/etc/apt/keyrings/opentofu.gpg,/etc/apt/keyrings/opentofu-repo.gpg] https://packages.opentofu.org/opentofu/tofu/any/ any main" | \
sudo tee /etc/apt/sources.list.d/opentofu.list > /dev/null
sudo chmod a+r /etc/apt/sources.list.d/opentofu.list
sudo apt-get update
sudo apt-get install -y tofu
tofu version
```
- The Ansible template exists: `debian-13-cloud-init` (VMID 9001 by default).
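If you want to confirm the template is present before planning, a quick check from the Proxmox shell looks like this (a sketch; assumes root SSH access to the node and the default VMID 9001):
```
# Hypothetical sanity check: the cloud-init template built by Ansible should show up here
ssh root@nodito "qm list | grep -E '9001|debian-13-cloud-init'"
```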
### Provider Auth
Create a `terraform.tfvars` (copy from `terraform.tfvars.example`) and set:
- `proxmox_api_url` (e.g. `https://nodito:8006/api2/json`)
- `proxmox_api_token_id` (e.g. `root@pam!tofu`)
- `proxmox_api_token_secret`
- `ssh_authorized_keys` (your public key content)
Alternatively, you can supply sensitive values through `TF_VAR_<name>` environment variables instead of writing them into the tfvars file, as sketched below.
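A minimal sketch of the environment-variable route (OpenTofu reads `TF_VAR_<name>` variables automatically; the names must match those declared in `variables.tf`):
```
# Keep secrets out of terraform.tfvars by exporting them instead
export TF_VAR_proxmox_api_token_secret='REPLACE_ME'
export TF_VAR_ssh_authorized_keys="$(cat ~/.ssh/id_ed25519.pub)"
tofu plan -var-file=terraform.tfvars
```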
### Declare VMs
Edit `terraform.tfvars` and fill the `vms` map. Example entry:
```
vms = {
web1 = {
name = "web1"
cores = 2
memory_mb = 2048
disk_size_gb = 20
ipconfig0 = "ip=dhcp" # or "ip=192.168.1.50/24,gw=192.168.1.1"
}
}
```
All VM disks are created on `zfs_storage_name` (defaults to `proxmox-tank-1`). Network attaches to `vmbr0`. VLAN can be set per-VM with `vlan_tag`.
### Usage
```
tofu init
tofu plan -var-file=terraform.tfvars
tofu apply -var-file=terraform.tfvars
```
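To act on a single VM from the `vms` map, you can target its `for_each` instance. A sketch using the `web1` key from the example above:
```
# Plan or destroy just one instance of the proxmox_vm_qemu.vm resource
tofu plan -var-file=terraform.tfvars -target='proxmox_vm_qemu.vm["web1"]'
tofu destroy -var-file=terraform.tfvars -target='proxmox_vm_qemu.vm["web1"]'
```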
### Notes
- Clones are full clones (`full_clone = true` is set for every VM).
- Cloud-init injects `cloud_init_user` and `ssh_authorized_keys`.
- Disks use `scsi0` on the ZFS storage; optional flags such as `discard`/`iothread` are left at provider defaults (see the comment in `main.tf`).
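Once a VM is up, you can ask the guest agent for its addresses from the Proxmox shell (a sketch; `1101` is the hypothetical `web1` VMID from `terraform.tfvars.example`):
```
# Requires qemu-guest-agent inside the guest (installed via the cloud-init snippet)
ssh root@nodito "qm agent 1101 network-get-interfaces"
```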

70
tofu/nodito/main.tf Normal file
View file

@@ -0,0 +1,70 @@
locals {
default_ipconfig0 = "ip=dhcp"
}
resource "proxmox_vm_qemu" "vm" {
for_each = var.vms
name = each.value.name
target_node = var.proxmox_node
vmid = try(each.value.vmid, null)
onboot = true
agent = 1
clone = var.template_name
full_clone = true
vga {
type = "serial0"
}
cpu {
sockets = 1
cores = each.value.cores
type = "host"
}
memory = each.value.memory_mb
scsihw = "virtio-scsi-pci"
boot = "c"
bootdisk = "scsi0"
serial {
id = 0
type = "socket"
}
# Network: bridge vmbr0, optional VLAN tag
network {
id = 0
model = "virtio"
bridge = "vmbr0"
tag = try(each.value.vlan_tag, 0)
}
# Cloud-init: user, ssh keys, IP, and custom snippet for qemu-guest-agent
# Note: Using vendor-data snippet (instead of user-data) allows Proxmox to automatically
# set the hostname from the VM name. User info is set separately via ciuser/sshkeys.
# Using 'local' storage for snippets (not ZFS) as ZFS storage doesn't properly support snippet paths
ciuser = var.cloud_init_user
sshkeys = var.ssh_authorized_keys
ipconfig0 = try(each.value.ipconfig0, local.default_ipconfig0)
cicustom = "vendor=local:snippets/user-data-qemu-agent.yaml"
# Disk on ZFS storage
disk {
slot = "scsi0"
type = "disk"
storage = var.zfs_storage_name
size = "${each.value.disk_size_gb}G"
# optional flags like iothread/ssd/discard differ by provider versions; keep minimal
}
# Cloud-init CD-ROM so ipconfig0/sshkeys apply
disk {
slot = "ide2"
type = "cloudinit"
storage = var.zfs_storage_name
}
}

8
tofu/nodito/provider.tf Normal file
View file

@@ -0,0 +1,8 @@
provider "proxmox" {
pm_api_url = var.proxmox_api_url
pm_api_token_id = var.proxmox_api_token_id
pm_api_token_secret = var.proxmox_api_token_secret
pm_tls_insecure = true
}

35
tofu/nodito/terraform.tfvars.example Normal file
View file

@@ -0,0 +1,35 @@
proxmox_api_url = "https://nodito:8006/api2/json"
proxmox_api_token_id = "root@pam!tofu"
proxmox_api_token_secret = "REPLACE_ME"
proxmox_node = "nodito"
zfs_storage_name = "proxmox-tank-1"
template_name = "debian-13-cloud-init"
cloud_init_user = "counterweight"
# paste your ~/.ssh/id_ed25519.pub or similar
ssh_authorized_keys = <<EOKEY
ssh-ed25519 AAAA... your-key
EOKEY
vms = {
web1 = {
name = "web1"
vmid = 1101
cores = 2
memory_mb = 2048
disk_size_gb = 20
ipconfig0 = "ip=dhcp"
}
db1 = {
name = "db1"
vmid = 1102
cores = 4
memory_mb = 4096
disk_size_gb = 40
ipconfig0 = "ip=dhcp"
}
}

62
tofu/nodito/variables.tf Normal file
View file

@@ -0,0 +1,62 @@
variable "proxmox_api_url" {
description = "Base URL for Proxmox API, e.g. https://nodito:8006/api2/json"
type = string
}
variable "proxmox_api_token_id" {
description = "Proxmox API token ID, e.g. root@pam!tofu"
type = string
sensitive = true
}
variable "proxmox_api_token_secret" {
description = "Proxmox API token secret"
type = string
sensitive = true
}
variable "proxmox_node" {
description = "Target Proxmox node name"
type = string
default = "nodito"
}
variable "zfs_storage_name" {
description = "Proxmox storage name backed by ZFS (from Ansible: zfs_pool_name)"
type = string
default = "proxmox-tank-1"
}
variable "template_name" {
description = "Cloud-init template to clone (created by Ansible)"
type = string
default = "debian-13-cloud-init"
}
variable "cloud_init_user" {
description = "Default cloud-init user"
type = string
default = "counterweight"
}
variable "ssh_authorized_keys" {
description = "SSH public key content to inject via cloud-init"
type = string
sensitive = true
}
variable "vms" {
description = "Map of VMs to create"
type = map(object({
name = string
vmid = optional(number)
cores = number
memory_mb = number
disk_size_gb = number
vlan_tag = optional(number)
ipconfig0 = optional(string) # e.g. "ip=dhcp" or "ip=192.168.1.50/24,gw=192.168.1.1"
}))
default = {}
}

12
tofu/nodito/versions.tf Normal file
View file

@@ -0,0 +1,12 @@
terraform {
required_version = ">= 1.6.0"
required_providers {
proxmox = {
source = "Telmate/proxmox"
version = "= 3.0.2-rc05"
}
}
}