Compare commits: master...valuatng-d
1 commit: 4de1436ade
.gitignore (14 lines changed, vendored)
@@ -1,14 +0,0 @@
# Deployment configuration (contains sensitive server details)
deploy.config

# OS files
.DS_Store
Thumbs.db

# Editor files
.vscode/
.idea/
*.swp
*.swo
*~

@@ -8,6 +8,4 @@ The `index.html` is ready in the `public` folder.

## How to deploy

1. Copy `deploy.config.example` to `deploy.config`
2. Fill in your server details in `deploy.config` (host, user, remote path)
3. Run `./deploy.sh` to sync the `public` folder to your remote webserver
Somehow get the `public` folder behind a webserver manually and sort out DNS.

@@ -1,21 +0,0 @@
# Deployment Configuration
# Copy this file to deploy.config and fill in your server details
# deploy.config is gitignored to keep your credentials safe

# Remote server hostname or IP address
REMOTE_HOST="example.com"

# SSH username for the remote server
REMOTE_USER="username"

# Remote path where the website should be deployed
# This should be the directory served by your webserver (e.g., /var/www/html, /home/username/public_html)
REMOTE_PATH="/var/www/html"

# Optional: Path to SSH private key (if not using default ~/.ssh/id_rsa)
# Leave empty to use default SSH key
SSH_KEY=""

# Optional: SSH port (defaults to 22 if not specified)
# SSH_PORT="22"

deploy.sh (34 lines changed)
@@ -1,34 +0,0 @@
#!/bin/bash

# Deployment script for pablohere website
# This script syncs the public folder to a remote webserver

set -e # Exit on error

# Load deployment configuration
if [ ! -f "deploy.config" ]; then
    echo "Error: deploy.config file not found!"
    echo "Please copy deploy.config.example to deploy.config and fill in your server details."
    exit 1
fi

source deploy.config

# Validate required variables
if [ -z "$REMOTE_HOST" ] || [ -z "$REMOTE_USER" ] || [ -z "$REMOTE_PATH" ]; then
    echo "Error: Required variables not set in deploy.config"
    echo "Please ensure REMOTE_HOST, REMOTE_USER, and REMOTE_PATH are set."
    exit 1
fi

# Use rsync to sync files
echo "Deploying public folder to $REMOTE_USER@$REMOTE_HOST:$REMOTE_PATH"
rsync -avz --delete \
    --exclude='.git' \
    --exclude='.DS_Store' \
    $SSH_OPTS \
    public/ \
    $REMOTE_USER@$REMOTE_HOST:$REMOTE_PATH

echo "Deployment complete!"

|
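Note on the removed deploy script: it passes `$SSH_OPTS` to rsync but never sets it, and the optional `SSH_KEY` and `SSH_PORT` values from the config file are never read. A minimal sketch of how they might be wired in, assuming rsync's `-e` option is used to supply the ssh command; the helper name `build_ssh_command` is hypothetical and not part of the original script:

```shell
#!/bin/bash
# Hypothetical helper (not in the original deploy.sh): build the ssh command
# that rsync should use, from the optional SSH_KEY and SSH_PORT settings.
build_ssh_command() {
    local cmd="ssh"
    # Only add -i when a key path was configured
    if [ -n "${SSH_KEY:-}" ]; then
        cmd="$cmd -i $SSH_KEY"
    fi
    # Only add -p when a port was configured; ssh itself defaults to 22
    if [ -n "${SSH_PORT:-}" ]; then
        cmd="$cmd -p $SSH_PORT"
    fi
    printf '%s\n' "$cmd"
}
```

With that in place, the `$SSH_OPTS \` line could become `-e "$(build_ssh_command)" \`. Key paths containing spaces would need extra quoting, which this sketch does not handle.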
|
@@ -1,230 +1,139 @@
|
|||
<!DOCTYPE html>
|
||||
<!DOCTYPE HTML>
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8" />
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
||||
<link rel="stylesheet" href="styles.css" />
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<link rel="stylesheet" href="styles.css">
|
||||
</head>
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>Hi, Pablo here</h1>
|
||||
<p>
|
||||
Welcome to my website. Here I discuss thoughts and ideas. This is mostly
|
||||
professional.
|
||||
</p>
|
||||
<hr />
|
||||
<h2>What you'll find here:</h2>
|
||||
<ul>
|
||||
<li><a href="#about-me-header">About me</a></li>
|
||||
<li><a href="#contact-header">Contact</a></li>
|
||||
<li><a href="#my-projects-header">My projects</a></li>
|
||||
<li><a href="#writings-header">Writings</a></li>
|
||||
</ul>
|
||||
<hr />
|
||||
<section>
|
||||
<h2 id="about-me-header">About me</h2>
|
||||
<p>A few facts you might care about:</p>
|
||||
<ul>
|
||||
<li>
|
||||
I'm based in Barcelona, although I'm happy working for anyone
|
||||
located anywhere (as long as we can find a time to meet).
|
||||
</li>
|
||||
<li>
|
||||
My career has focused on Data teams and positions, playing roles
|
||||
such as Data Lead, Data Engineer or Data Science Researcher. I've
|
||||
also tinkered quite a bit with many areas and technologies outside
|
||||
of data, but not in professional, production-grade settings.
|
||||
</li>
|
||||
<li>
|
||||
Having said that, I have a lot of weird interests that might mix
|
||||
somehow, including:
|
||||
<ul>
|
||||
<li>
|
||||
Austrian economics and its societal and political implications
|
||||
</li>
|
||||
<li>Bitcoin</li>
|
||||
<li>P2P and privacy friendly applications</li>
|
||||
<li>
|
||||
Self-hosting and lowering the cost of people using advanced IT
|
||||
on a personal level
|
||||
</li>
|
||||
<li>Riding motorcycles</li>
|
||||
<li>BBQ-ing</li>
|
||||
<li>Being annoyingly contrarian</li>
|
||||
<li>3D printing maps</li>
|
||||
<li>Teaching</li>
|
||||
<li>Film photography</li>
|
||||
<li>Tinkering with bicycles</li>
|
||||
<li>Calisthenics</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
<hr />
|
||||
<section>
|
||||
<h2 id="contact-header">Contact</h2>
|
||||
<p>You can contact me on:</p>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="https://www.linkedin.com/in/pablomartincalvo/">On LinkedIn</a>
|
||||
for professional matters.
|
||||
</li>
|
||||
<li>
|
||||
On keybase: <a href="https://keybase.io/pablomartincalvo">https://keybase.io/pablomartincalvo</a>.
|
||||
</li>
|
||||
<li>
|
||||
On Nostr. My npub is:
|
||||
npub1a29gdc6p7c05az2ka3qwwpl9kfcqmws3xlwmjefmtkulfhgd7u6shuqatg
|
||||
</li>
|
||||
</ul>
|
||||
<p>
|
||||
If you are looking for my CV, no need to reach out,
|
||||
<a href="my_cv.pdf" target="_blank">you can fetch it yourself here.</a>
|
||||
</p>
|
||||
<p>Good reasons to reach out include:</p>
|
||||
<ul>
|
||||
<li>You want to work with me.</li>
|
||||
<li>
|
||||
Some of my interests, projects or writings caught your attention and
|
||||
you want to discuss them with me.
|
||||
</li>
|
||||
<li>Something fun!</li>
|
||||
</ul>
|
||||
<p>Bad reasons to reach out include:</p>
|
||||
<ul>
|
||||
<li>
|
||||
You want to sell something to me, and you mostly care about selling
|
||||
that thing to me, not me loving the thing.
|
||||
</li>
|
||||
<li>
|
||||
You don't like something posted here and want to let me know your
|
||||
feelings.
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
<hr />
|
||||
<section>
|
||||
<h2 id="my-projects-header">My projects</h2>
|
||||
<p>Some of the projects I've shared publicly:</p>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="https://www.meetup.com/bitcoin-barcelona" target="_blank" rel="noopener noreferrer">Barcelona Bitcoin
|
||||
Only, a local Bitcoin meetup and community I've
|
||||
helped organize and run</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="https://github.com/pmartincalvo/ntfy-emergency-app" target="_blank" rel="noopener noreferrer">A micro
|
||||
webapp to let your loved ones grab your attention via ntfy</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="https://github.com/pmartincalvo/dni" target="_blank" rel="noopener noreferrer">My Python package to
|
||||
handle Spanish DNIs better</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="https://bitcoininfra.contrapeso.xyz" target="_blank" rel="noopener noreferrer">My open access Bitcoin infrastructure that you can use freely.</a> It includes access to the peer port of my Bitcoin node, an Electrum server and a mempool.space instance.
|
||||
</li>
|
||||
</ul>
|
||||
<p>
|
||||
There are also some other projects that I generally keep private but
|
||||
might disclose under the right circumstances. Some notable hints:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
That one time I made a lot of money doing something everyone said
|
||||
was stupid
|
||||
</li>
|
||||
<li>
|
||||
Some work around helping people ignore EU regulations around
|
||||
exchanging Bitcoin and Fiat
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
<hr />
|
||||
<section>
|
||||
<h2 id="writings-header">Writings</h2>
|
||||
<p>Sometimes I like to jot down ideas and drop them here.</p>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="writings/fixing-a-degraded-zfs-mirror.html" target="_blank"
|
||||
rel="noopener noreferrer">Fixing a Degraded ZFS Mirror: Reseat, Resilver, and Scrub</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/a-degraded-pool-with-a-healthy-disk.html" target="_blank"
|
||||
rel="noopener noreferrer">A degraded pool with a healthy disk</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/why-i-put-my-vms-on-a-zfs-mirror.html" target="_blank"
|
||||
rel="noopener noreferrer">Why I Put My VMs on a ZFS Mirror</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/a-note-for-the-future-the-tax-bleeding-in-2025.html" target="_blank"
|
||||
rel="noopener noreferrer">A note for the future: the tax bleeding in 2025</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/notes-and-lessons-from-my-departure-from-superhog.html" target="_blank"
|
||||
rel="noopener noreferrer">Notes and lessons from my departure from Superhog</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/is-your-drug-dealer-a-homophobic-socialist.html" target="_blank"
|
||||
rel="noopener noreferrer">Is your drug dealer a homophobic socialist?</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/gresham-law-has-nothing-to-do-with-bitcoin.html" target="_blank"
|
||||
rel="noopener noreferrer">Gresham's Law has nothing to do with Bitcoin</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/my-tips-and-tricks-when-using-postgres-as-a-dwh.html" target="_blank"
|
||||
rel="noopener noreferrer">My tips and tricks when using Postgres as a DWH</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/dont-hide-it-make-it-beautiful.html" target="_blank" rel="noopener noreferrer">Don't hide
|
||||
it, make it beautiful</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/the-roi-of-toilets.html" target="_blank" rel="noopener noreferrer">The ROI of toilets</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/your-customers-dont-care-that-your-bathroom-is-dirty.html" target="_blank"
|
||||
rel="noopener noreferrer">Your customers don't care that your bathroom is dirty</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/if-i-started-a-data-team-again.html" target="_blank" rel="noopener noreferrer">If I started
|
||||
a Data team again</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/one-efective-but-risky-way-to-find-the-top-budget-for-the-vacancy.html" target="_blank"
|
||||
rel="noopener noreferrer">One effective but risky way to find the top budget for the
|
||||
vacancy</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/bitcoin-mining-is-like-adding-the-final-piece-to-a-puzzle.html" target="_blank"
|
||||
rel="noopener noreferrer">Bitcoin mining is like adding the final piece to a puzzle</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/credit-cards-affairs-and-chatgpt.html" target="_blank" rel="noopener noreferrer">Credit
|
||||
cards, affairs and ChatGPT</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/when-new-is-not-better.html" target="_blank" rel="noopener noreferrer">When new is not
|
||||
better</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/i-want-code-defined-dashboards-so-badly.html" target="_blank" rel="noopener noreferrer">I
|
||||
want code defined dashboards so badly</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/a-simple-solution-to-spam.html" target="_blank" rel="noopener noreferrer">A simple solution
|
||||
to spam</a>
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
</main>
|
||||
<footer>
|
||||
<p>Pablo Martín Calvo</p>
|
||||
</footer>
|
||||
<main>
|
||||
<h1>
|
||||
Hi, Pablo here
|
||||
</h1>
|
||||
<p>
|
||||
Welcome to my website. Here I discuss thoughts and ideas. This is mostly professional.
|
||||
</p>
|
||||
<hr>
|
||||
<h2>What you'll find here:</h2>
|
||||
<ul>
|
||||
<li><a href="#about-me-header">About me</a></li>
|
||||
<li><a href="#contact-header">Contact</a></li>
|
||||
<li><a href="#my-projects-header">My projects</a></li>
|
||||
<li><a href="#writings-header">Writings</a></li>
|
||||
</ul>
|
||||
<hr>
|
||||
<section>
|
||||
<h2 id="about-me-header">About me</h2>
|
||||
<p>A few facts you might care about:</p>
|
||||
<ul>
|
||||
<li>I'm based in Barcelona, although I'm happy working for anyone located anywhere (as long as we can
|
||||
find a time to meet).</li>
|
||||
<li>My career has focused on Data teams and positions, playing roles such as Data Lead and Data Engineer.
|
||||
</li>
|
||||
<li>Having said that, I have a lot of weird interests that might mix somehow, including:
|
||||
<ul>
|
||||
<li>Austrian economics and its societal and political implications</li>
|
||||
<li>Bitcoin</li>
|
||||
<li>P2P and privacy friendly applications</li>
|
||||
<li>Self-hosting and lowering the cost of people using advanced IT on a personal level</li>
|
||||
<li>Riding motorcycles</li>
|
||||
<li>Being annoyingly contrarian</li>
|
||||
<li>3D printing maps</li>
|
||||
<li>Teaching</li>
|
||||
<li>Film photography</li>
|
||||
<li>Calisthenics</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
<hr>
|
||||
<section>
|
||||
<h2 id="contact-header">Contact</h2>
|
||||
<p>You can contact me on:</p>
|
||||
<ul>
|
||||
<li>
|
||||
<p><a href="https://www.linkedin.com/in/pablomartincalvo/">On LinkedIn</a> for professional matters.
|
||||
</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>At this stage I'm not open to other contacts.</p>
|
||||
</li>
|
||||
</ul>
|
||||
<p>If you are looking for my CV, no need to reach out, <a href="my_cv.pdf" target="_blank">you can fetch it
|
||||
yourself here.</a></p>
|
||||
<p>Good reasons to reach out include:</p>
|
||||
<ul>
|
||||
<li>You want to work with me.</li>
|
||||
<li>Some of my interests, projects or writings caught your attention and you want to discuss them with
|
||||
me.</li>
|
||||
<li>Something fun!</li>
|
||||
</ul>
|
||||
<p>Bad reasons to reach out include:</p>
|
||||
<ul>
|
||||
<li>You want to sell me something, and you mostly care about selling that thing to me, not me loving the
|
||||
thing.</li>
|
||||
<li>You don't like something posted here and want to let me know your feelings.</li>
|
||||
</ul>
|
||||
</section>
|
||||
<hr>
|
||||
<section>
|
||||
<h2 id="my-projects-header">My projects</h2>
|
||||
<p>Some of the projects I've shared publicly:</p>
|
||||
<ul>
|
||||
<li><a href="https://github.com/pmartincalvo/dni" target="_blank" rel="noopener noreferrer">My Python
|
||||
package to handle Spanish DNIs better</a></li>
|
||||
<li><a href="https://www.meetup.com/bitcoin-barcelona" target="_blank"
|
||||
rel="noopener noreferrer">Barcelona Bitcoin Only, a local Bitcoin meetup and community I've
|
||||
helped organize and run</a></li>
|
||||
</ul>
|
||||
<p>There are also some other projects that I generally keep private but might disclose under the right
|
||||
circumstances. Some notable hints:</p>
|
||||
<ul>
|
||||
<li>That one time I made a lot of money doing something everyone said was stupid</li>
|
||||
<li>Some work around helping people ignore EU regulations around exchanging Bitcoin and Fiat
|
||||
</li>
|
||||
</ul>
|
||||
</section>
|
||||
<hr>
|
||||
<section>
|
||||
<h2 id="writings-header">Writings</h2>
|
||||
<p>Sometimes I like to jot down ideas and drop them here.</p>
|
||||
<ul>
|
||||
<li>
|
||||
<a href="writings/one-efective-but-risky-way-to-find-the-top-budget-for-the-vacancy.html"
|
||||
target="_blank" rel="noopener noreferrer">One effective but risky way to find the top budget for
|
||||
the vacancy</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/bitcoin-mining-is-like-adding-the-final-piece-to-a-puzzle.html" target="_blank"
|
||||
rel="noopener noreferrer">Bitcoin mining is like adding the final piece to a puzzle</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/credit-cards-affairs-and-chatgpt.html" target="_blank"
|
||||
rel="noopener noreferrer">Credit cards, affairs and ChatGPT</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/when-new-is-not-better.html" target="_blank" rel="noopener noreferrer">When new is
|
||||
not better</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/i-want-code-defined-dashboards-so-badly.html" target="_blank"
|
||||
rel="noopener noreferrer">I want code defined dashboards so badly</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="writings/a-simple-solution-to-spam.html" target="_blank" rel="noopener noreferrer">A
|
||||
simple solution to spam</a>
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</section>
|
||||
</main>
|
||||
<footer>
|
||||
<p>Pablo Martín Calvo</p>
|
||||
</footer>
|
||||
</body>
|
||||
|
||||
</html>
|
||||
|
|
@@ -1,56 +0,0 @@
|
|||
==================================================================
|
||||
https://keybase.io/pablomartincalvo
|
||||
--------------------------------------------------------------------
|
||||
|
||||
I hereby claim:
|
||||
|
||||
* I am an admin of https://pablohere.contrapeso.xyz
|
||||
* I am pablomartincalvo (https://keybase.io/pablomartincalvo) on keybase.
|
||||
* I have a public key ASDgHxztDlU_R4hjxbkO21-rS4Iv1gABa3BPb_Aff7aNAgo
|
||||
|
||||
To do so, I am signing this object:
|
||||
|
||||
{
|
||||
"body": {
|
||||
"key": {
|
||||
"eldest_kid": "0120d9bde13d9012e681cef2edd668d70426f1f6ef69ce7dfae20b404096eca5b06f0a",
|
||||
"host": "keybase.io",
|
||||
"kid": "0120e01f1ced0e553f478863c5b90edb5fab4b822fd600016b704f6ff01f7fb68d020a",
|
||||
"uid": "8e71277fbc0fb1fea28d60308f495d19",
|
||||
"username": "pablomartincalvo"
|
||||
},
|
||||
"merkle_root": {
|
||||
"ctime": 1755635067,
|
||||
"hash": "4f91af0b9c674e0f1d74a7cfad7abd15a7065cded92b96ac8a6abeb5c8553318599aa1bf7b065a3312e303506256b729b8b60b3a5dd06b68694423f4341a6a14",
|
||||
"hash_meta": "6472dbf2ed33341fb30b6a0c5c5c7fb39c219dd0ffd03c6e08b68c788e0de60a",
|
||||
"seqno": 27031070
|
||||
},
|
||||
"service": {
|
||||
"entropy": "LEFJJ4FMmlJQWPPFEO4xHE5y",
|
||||
"hostname": "pablohere.contrapeso.xyz",
|
||||
"protocol": "https:"
|
||||
},
|
||||
"type": "web_service_binding",
|
||||
"version": 2
|
||||
},
|
||||
"client": {
|
||||
"name": "keybase.io go client",
|
||||
"version": "6.5.1"
|
||||
},
|
||||
"ctime": 1755635082,
|
||||
"expire_in": 504576000,
|
||||
"prev": "37f12270050ab037897ccf6ef9451b1911cb505eca7c3842993b0b8925bc79b8",
|
||||
"seqno": 31,
|
||||
"tag": "signature"
|
||||
}
|
||||
|
||||
which yields the signature:
|
||||
|
||||
hKRib2R5hqhkZXRhY2hlZMOpaGFzaF90eXBlCqNrZXnEIwEg4B8c7Q5VP0eIY8W5Dttfq0uCL9YAAWtwT2/wH3+2jQIKp3BheWxvYWTESpcCH8QgN/EicAUKsDeJfM9u+UUbGRHLUF7KfDhCmTsLiSW8ebjEIAnIWTmufZ017e9WLdI1LhKBPaZ3HzmTrgyASDvY3PwoAgHCo3NpZ8RA9a3xgkSTU6Ht7M7DCsy4ClMmoWFtDEqzX9/dqskeoH2DrJUZYVymBQE1nyB0p1GuXiZA1cP5WY5SDURWZ5bBC6hzaWdfdHlwZSCkaGFzaIKkdHlwZQildmFsdWXEIEJZ4g4HC5qXcqbFf6sJ8XuZyMtoppazFqr1zPu0LH5co3RhZ80CAqd2ZXJzaW9uAQ==
|
||||
|
||||
And finally, I am proving ownership of this host by posting or
|
||||
appending to this document.
|
||||
|
||||
View my publicly-auditable identity here: https://keybase.io/pablomartincalvo
|
||||
|
||||
==================================================================
|
||||
BIN public/my_cv.pdf
Before size: 7.5 MiB
Before size: 246 KiB
Before size: 3.2 MiB
Before size: 10 MiB
Before size: 15 MiB
Before size: 15 MiB
Before size: 10 MiB
Before size: 15 MiB
@@ -7,8 +7,7 @@ body {

h1,
h2,
h3,
h4 {
h3 {
    text-align: center;
}

@@ -22,8 +21,7 @@ img {
    display: block;
}

figcaption {
figcaption a {
    font-style: italic;
    font-size: small;
    text-align: center;
}

@@ -1,133 +0,0 @@
|
|||
<!DOCTYPE HTML>
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<link rel="stylesheet" href="../styles.css">
|
||||
</head>
|
||||
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>
|
||||
Hi, Pablo here
|
||||
</h1>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
<section>
|
||||
<h2>A degraded pool with a healthy disk</h2>
|
||||
<p><em>Part 2 of 3 in my "First ZFS Degradation" series. See also <a href="why-i-put-my-vms-on-a-zfs-mirror.html">Part 1: The Setup</a> and <a href="fixing-a-degraded-zfs-mirror.html">Part 3: The Fix</a>.</em></p>
|
||||
<h3>The "Oh Shit" Moment</h3>
|
||||
<p>I wasn't even looking for trouble. I was clicking around the Proxmox web UI, exploring some storage views I hadn't noticed before, when I saw it: my ZFS pool was in <strong>DEGRADED</strong> state.</p>
|
||||
<p>I opened the details. One of my two mirrored drives was listed as <strong>FAULTED</strong>.</p>
|
||||
<p>I was very surprised. The box and its disks were brand new, with less than three months of runtime on them. I was not expecting hardware issues to come at me that fast. I SSH'd into the server and ran the command that would become my best friend over the next 24 hours:</p>
|
||||
<pre><code>zpool status -v proxmox-tank-1</code></pre>
|
||||
<p>No glitch. The pool was degraded. The drive had racked up over 100 read errors, 600+ write errors, and 129 checksum errors. ZFS had given up on it.</p>
|
||||
<pre><code> NAME STATE READ WRITE CKSUM
|
||||
proxmox-tank-1 DEGRADED 0 0 0
|
||||
mirror-0 DEGRADED 0 0 0
|
||||
ata-ST4000NT001-3M2101_WX11TN0Z FAULTED 108 639 129 too many errors
|
||||
ata-ST4000NT001-3M2101_WX11TN2P ONLINE 0 0 0</code></pre>
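<p>A minimal sketch, not part of the original troubleshooting, of how output like the above could be filtered automatically. The function name <code>flag_unhealthy_devices</code> is hypothetical, and the sketch assumes the five-column <code>NAME STATE READ WRITE CKSUM</code> layout shown here:</p>

```shell
#!/bin/bash
# Hypothetical helper: read `zpool status` output on stdin and print every
# device row that is not ONLINE or that has non-zero READ/WRITE/CKSUM counters.
flag_unhealthy_devices() {
    # "count" matches plain counters (0, 108) and abbreviated ones (1.2K)
    awk -v count='^[0-9.]+[KMGT]?$' '
        NF >= 5 && $1 != "NAME" && $3 ~ count && $4 ~ count && $5 ~ count {
            if ($2 != "ONLINE" || $3 + $4 + $5 > 0) {
                printf "%s %s read=%s write=%s cksum=%s\n", $1, $2, $3, $4, $5
            }
        }
    '
}
```

<p>On a live system it would be fed with <code>zpool status proxmox-tank-1 | flag_unhealthy_devices</code>. For the output above it would list the pool, the mirror and the faulted disk, and skip the healthy one.</p>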
|
||||
<p>The good news: <code>errors: No known data errors</code>. ZFS was serving all my data from the healthy drive. Nothing was lost yet.</p>
|
||||
<p>The bad news: I was running on a single point of failure. If AGAPITO2 decided to have a bad day too, I'd be in real trouble.</p>
|
||||
<p>I tried the classic IT move: rebooting. The system came back up and ZFS immediately started trying to resilver (rebuild) the degraded drive. But within minutes, the errors started piling up again and the resilver stalled.</p>
|
||||
<p>Time to actually figure out what was wrong.</p>
|
||||
<h3>The Diagnostic Toolbox</h3>
|
||||
<p>When a ZFS drive acts up, you have two main sources of truth: what the <strong>kernel</strong> sees happening at the hardware level, and what the <strong>drive itself</strong> reports about its health. You can inspect the first with <code>dmesg</code> and the second with <code>smartctl</code>.</p>
|
||||
<h4>dmesg: The Kernel's Diary</h4>
|
||||
<p>The Linux kernel maintains a ring buffer of messages about hardware events, driver activities, and system operations. The <code>dmesg</code> command lets you read it. For disk issues, you want to grep for SATA-related keywords:</p>
|
||||
<pre><code>dmesg -T | grep -Ei 'ata[0-9]|sata|reset|link|i/o error' | tail -100</code></pre>
|
||||
<p>The <code>-T</code> flag gives you human-readable timestamps instead of seconds-since-boot.</p>
|
||||
<p>What I saw was... weird. Here's an excerpt:</p>
|
||||
<pre><code>[Fri Jan 2 22:25:13 2026] ata4.00: exception Emask 0x50 SAct 0x70220001 SErr 0xe0802 action 0x6 frozen
|
||||
[Fri Jan 2 22:25:13 2026] ata4.00: irq_stat 0x08000000, interface fatal error
|
||||
[Fri Jan 2 22:25:13 2026] ata4.00: failed command: READ FPDMA QUEUED
|
||||
[Fri Jan 2 22:25:13 2026] ata4: hard resetting link
|
||||
[Fri Jan 2 22:25:14 2026] ata4: SATA link down (SStatus 0 SControl 300)</code></pre>
|
||||
<p>Let me translate: the kernel tried to read from the drive on <code>ata4</code>, got a "fatal error," and responded by doing a hard reset of the SATA link. Then the link went down entirely. The drive just... disappeared.</p>
|
||||
<p>But it didn't stay gone. A few seconds later:</p>
|
||||
<pre><code>[Fri Jan 2 22:25:24 2026] ata4: link is slow to respond, please be patient (ready=0)
|
||||
[Fri Jan 2 22:25:24 2026] ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)</code></pre>
|
||||
<p>The drive came back! At full speed! But then...</p>
|
||||
<pre><code>[Fri Jan 2 22:25:29 2026] ata4.00: qc timeout after 5000 msecs (cmd 0xec)
|
||||
[Fri Jan 2 22:25:29 2026] ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
|
||||
[Fri Jan 2 22:25:29 2026] ata4: limiting SATA link speed to 3.0 Gbps</code></pre>
|
||||
<p>It failed again. The kernel, trying to be helpful, dropped the link speed from 6.0 Gbps to 3.0 Gbps. Maybe a slower speed would be more stable?</p>
|
||||
<p>It wasn't. The pattern repeated: connect, fail, reset, reconnect at a slower speed. 6.0 Gbps, then 3.0 Gbps, then 1.5 Gbps. Eventually:</p>
|
||||
<pre><code>[Fri Jan 2 22:27:06 2026] ata4.00: disable device</code></pre>
|
||||
<p>The kernel gave up entirely.</p>
|
||||
<p>This wasn't what a dying drive looks like. A dying drive throws read errors on specific bad sectors. This drive was connecting and disconnecting like someone was jiggling the cable. The kernel was calling it "interface fatal error", emphasis on <em>interface</em>.</p>
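<p>A minimal sketch of how that connect, fail, reset pattern could be tallied per port instead of read by eye. The function name <code>summarize_ata_link_events</code> is hypothetical, and the matched phrases are only the ones that appear in the excerpts above:</p>

```shell
#!/bin/bash
# Hypothetical helper: read `dmesg -T` output on stdin and count, per ATA port,
# how often the kernel reset the link, saw it go down, or lowered its speed.
summarize_ata_link_events() {
    awk '
        match($0, /ata[0-9]+/) {
            # The first "ataN" token on the line identifies the port
            port = substr($0, RSTART, RLENGTH)
            seen[port] = 1
            if ($0 ~ /hard resetting link/)      resets[port]++
            if ($0 ~ /SATA link down/)           downs[port]++
            if ($0 ~ /limiting SATA link speed/) slowdowns[port]++
        }
        END {
            for (port in seen) {
                printf "%s resets=%d downs=%d slowdowns=%d\n", port, resets[port], downs[port], slowdowns[port]
            }
        }
    '
}
```

<p>It would be used as <code>dmesg -T | summarize_ata_link_events</code>. A port with many resets and slowdowns, next to other ports with none, points at that port's connection rather than at the kernel or the controller as a whole.</p>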
|
||||
<h4>smartctl: Asking the Drive Directly</h4>
|
||||
<p>Every modern hard drive has S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) — basically a built-in health monitor. The <code>smartctl</code> command lets you get info out of it.</p>
|
||||
<p>First, the overall health check:</p>
|
||||
<pre><code>smartctl -H /dev/sdb</code></pre>
|
||||
<pre><code>SMART overall-health self-assessment test result: PASSED</code></pre>
|
||||
<p>Okay, that looks great. But if the disk is healthy, what the hell is going on, and where are all those errors that ZFS was spotting coming from?</p>
|
||||
<p>Let's dig deeper with the extended info:</p>
|
||||
<pre><code>smartctl -x /dev/sdb</code></pre>
|
||||
<p>The key attributes I was looking for:</p>
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Attribute</th>
|
||||
<th>Value</th>
|
||||
<th>What it means</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td>Reallocated_Sector_Ct</td>
|
||||
<td>0</td>
|
||||
<td>Bad sectors the drive has swapped out. Zero is good.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Current_Pending_Sector</td>
|
||||
<td>0</td>
|
||||
<td>Unstable sectors waiting to be remapped. Zero is good.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>UDMA_CRC_Error_Count</td>
|
||||
<td>0</td>
|
||||
<td>Data corruption during transfer. Zero is good.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Number of Hardware Resets</td>
|
||||
<td>39</td>
|
||||
<td>Times the connection has been reset. Uh...</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<p>All the sector-level health metrics looked perfect. No bad blocks, no pending errors, no CRC errors. The drive's magnetic platters and read/write heads were fine.</p>
|
||||
<p>But 39 hardware resets? That's not normal. That's the drive (or its connection) getting reset nearly 40 times.</p>
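<p>A minimal sketch for pulling the three sector-level attributes out automatically. It assumes the ten-column attribute table printed by <code>smartctl -A</code> (the <code>-x</code> output uses a different layout), and the function name <code>check_smart_attributes</code> is hypothetical:</p>

```shell
#!/bin/bash
# Hypothetical helper: read `smartctl -A` output on stdin and report the raw
# value of the three attributes discussed above, flagging non-zero ones.
check_smart_attributes() {
    awk '
        $2 == "Reallocated_Sector_Ct" || $2 == "Current_Pending_Sector" || $2 == "UDMA_CRC_Error_Count" {
            # In the ten-column attribute table the raw value is the last column
            status = ($10 + 0 == 0) ? "ok" : "CHECK"
            printf "%s raw=%s %s\n", $2, $10, status
        }
    '
}
```

<p>It would be run as <code>smartctl -A /dev/sdb | check_smart_attributes</code>. The hardware reset counter is not part of that table, so it still has to be read from the extended output.</p>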
|
||||
<p>I ran the short self-test to be sure:</p>
|
||||
<pre><code>smartctl -t short /dev/sdb
|
||||
# Wait a minute...
|
||||
smartctl -l selftest /dev/sdb</code></pre>
|
||||
<pre><code># 1 Short offline Completed without error 00%</code></pre>
|
||||
<p>The drive passed its own self-test. The platters spin, the heads move, the firmware works, and it can read its own data just fine.</p>
|
||||
<h3>Hypothesis</h3>
|
||||
<p>At this point, the evidence was pointing clearly away from "the drive is dying" and toward "something is wrong with the connection."</p>
|
||||
<p>What the kernel logs told me: the drive keeps connecting and disconnecting. Each time it reconnects, the kernel tries slower speeds. Eventually it gives up entirely. This is what you see with an unstable physical connection.</p>
|
||||
<p>What SMART told me: the drive itself is healthy. No bad sectors, no media errors, no signs of wear. But there have been dozens of hardware resets — the connection keeps getting interrupted.</p>
|
||||
<p>The suspects, in order of likelihood:</p>
|
||||
<ol>
|
||||
<li><strong>SATA data cable</strong>: the most common culprit for intermittent connection issues. Cables go bad, or weren't seated properly in the first place.</li>
|
||||
<li><strong>Power connection</strong>: if the drive isn't getting stable power, it might brown out intermittently.</li>
|
||||
<li><strong>SATA port on the motherboard</strong>: less likely, but possible.</li>
|
||||
<li><strong>PSU</strong>: power supply issues could affect the power rail feeding the drive. Unlikely, since both disks were fed from the same power cable, but still an option.</li>
|
||||
</ol>
|
||||
<p>Given that I had just built this server a few weeks earlier, and a good part of that happened after midnight... I was beginning to suspect that I simply had not plugged in the disk properly.</p>
|
||||
<h3>The Verdict</h3>
|
||||
<p>I was pretty confident now: the drive was fine, but the connection was bad. Most likely the culprit was the SATA data cable, and most probably it simply was not seated properly.</p>
|
||||
<p>The fix would require shutting down the server, opening the case, and reseating (or replacing) cables. Before doing that, I wanted to take the drive offline cleanly and document everything.</p>
|
||||
<p>In <a href="fixing-a-degraded-zfs-mirror.html">Part 3</a>, I'll walk through exactly how I fixed it: the ZFS commands, the physical work, and the validation to make sure everything was actually okay afterward.</p>
|
||||
<p><em>Continue to <a href="fixing-a-degraded-zfs-mirror.html">Part 3: The Fix</a> →</em></p>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
</section>
|
||||
</main>
|
||||
|
||||
</body>
|
||||
|
||||
</html>
|
||||
|
||||
|
|
@@ -1,188 +0,0 @@
|
|||
<!doctype html>
|
||||
<html>
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8" />
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
||||
<link rel="stylesheet" href="../styles.css" />
|
||||
</head>
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>Hi, Pablo here</h1>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
<hr />
|
||||
<section>
|
||||
<h2>A note for the future: the tax bleeding in 2025</h2>
|
||||
<p>
|
||||
I hate taxes deeply. I fell through the rabbit hole of libertarian and
|
||||
anarcho-capitalist ideas some years ago, and taxes have been repulsive
|
||||
to me ever since. I go to great lengths to not pay them, and feel
|
||||
deeply hurt every time they sting my wallet against my will.
|
||||
</p>
|
||||
<p>
|
||||
I know life goes by fast, and what today is vivid in your memory fades
|
||||
away bit by bit until it's gone. I'm truly hoping that, some day in
|
||||
the future, the world will have changed for the better and people won't
|
||||
be paying as much tax as we're doing today in the West. Since in that
|
||||
bright, utopian future I'm dreaming of, I might have forgotten about
|
||||
how bad things were on this matter in 2025, I've decided to make a
|
||||
little entry here estimating how much tax I'm
|
||||
theoretically bleeding on a yearly basis right now. So that we can
|
||||
someday look back in time and wonder: "how the fuck did we tolerate
|
||||
that pillaging".
|
||||
</p>
|
||||
<h3>Inventory</h3>
|
||||
<p>
|
||||
Before going hard into the number crunching let's list all the tax
|
||||
items I'm aware of being subject to:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
Income Tax: for the sin of making money, the state takes a hefty
|
||||
bite of my salary.
|
||||
</li>
|
||||
<li>
|
||||
Social Security: the state runs a compulsory Social Security
|
||||
programme. If you work, it is illegal to not pay for it. It is
|
||||
especially unnerving since it is quite literally a Ponzi scheme. At
|
||||
least Madoff lured you into it with pretty words, not violence.
|
||||
</li>
|
||||
<li>
|
||||
VAT: for the sin of buying stuff, the state takes another hefty
|
||||
bite.
|
||||
</li>
|
||||
<li>
|
||||
Real Estate Tax: for the sin of owning an apartment, the state
|
||||
charges me rent. Do I own it actually?
|
||||
</li>
|
||||
<li>
|
||||
Vehicle Tax: for the sin of owning a motorcycle, the state charges
|
||||
me a yearly fee.
|
||||
</li>
|
||||
<li>
|
||||
Wealth Transfer Tax: when you buy real estate, you must pay 10% of
|
||||
its value in taxes. This is a one-off fee if you only buy one house
|
||||
in your lifetime, but it is such a slap in the face that it would be
|
||||
dishonest to not consider it.
|
||||
</li>
|
||||
<li>
|
||||
Inheritance tax: you thought you were going to keep daddy's loot all
|
||||
for yourself? When you inherit, you'll go through the register
|
||||
again. Like the wealth transfer tax, it's not a frequent one, but it's
|
||||
big so let's consider it.
|
||||
</li>
|
||||
</ul>
|
||||
<p>
|
||||
There may be some other small, less frequent taxes that I'm not
|
||||
considering. These are the ones that will hit most people in my
|
||||
country.
|
||||
</p>
|
||||
<h3>The numbers</h3>
|
||||
<p>
|
||||
Okay, let's go compute the hideous bill. I'll make a hypothetical
|
||||
profile that's roughly close to mine, with a few assumptions along the
|
||||
way.
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
<em>Salary</em>: online sources say the typical salary for my job
|
||||
position in my area is 70K€ yearly. Including the Social Security
|
||||
paid by the company, the sum rises to ~85K€. I consider this way of
|
||||
measuring honest, since I think that all the money paid out by the
|
||||
employer reflects the true salary and value of the employee.
|
||||
I read it as, "the company is willing to pay 85K€ for this. What
|
||||
ends up in the employee's pocket, and what in the State's, they don't
|
||||
mind".
|
||||
</li>
|
||||
<li><em>Expenses</em>: I'll assume I spend half of my salary.</li>
|
||||
<li>
|
||||
<em>Home Purchase</em>: I'll assume that, during my adult life, I
|
||||
would buy the average home in my town once. From what I could find
|
||||
online, that's somewhere around 500K€.
|
||||
</li>
|
||||
<li>
|
||||
<em>Vehicles</em>: I own a motorcycle and share the expenses of a
|
||||
car with my partner, so I'll count 1.5 vehicles.
|
||||
</li>
|
||||
<li>
|
||||
<em>Inheritance tax</em>: I found a figure stating the average
|
||||
windfall in my country is 250K€. We'll go with that.
|
||||
</li>
|
||||
</ul>
|
||||
<p>With those clear, let's see the actual figures:</p>
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Tax</th>
|
||||
<th>€/year</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td>Income Tax (IRPF)</td>
|
||||
<td>22,401 €</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Social Security (worker + employer)</td>
|
||||
<td>
|
||||
25,375 €
|
||||
<small
|
||||
>(worker 4,445 € + employer 20,930 €)</small
|
||||
>
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>VAT (blended basket)</td>
|
||||
<td>5,250 €</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Real Estate Tax (IBI)</td>
|
||||
<td>1,000 €</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Vehicle Tax (1.5 vehicles)</td>
|
||||
<td>225 €</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Wealth Transfer (10% home, spread 50y)</td>
|
||||
<td>1,000 €</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Inheritance (7% of 250k, spread 50y)</td>
|
||||
<td>350 €</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
<tfoot>
|
||||
<tr>
|
||||
<th>Total</th>
|
||||
<th>55,601 €</th>
|
||||
</tr>
|
||||
</tfoot>
|
||||
</table>
|
||||
<p>
|
||||
So there you go. A peaceful existence as a tech professional living a
|
||||
normal life leads to bleeding at least 55K€ per year, all while
|
||||
getting an 85K€ salary. The tax rate sits at a wonderful 65%. How far
|
||||
away is this from hardcore USSR-grade communism?
|
||||
</p>
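<p>A quick sanity check of the arithmetic, as a minimal shell sketch. The figures are copied from the table and the assumptions list; the variable names are mine, and the 85K€ denominator is the post's own estimate of total employer cost. Adding the rows exactly as printed gives 55,601 € and a rate of roughly 65%; a euro of difference against a total built from unrounded inputs would just be rounding.</p>

```shell
#!/bin/sh
# Yearly tax figures in euros, copied from the table above.
income_tax=22401
social_security=$((4445 + 20930))   # worker + employer
vat=5250
real_estate=1000
vehicles=225

# One-off taxes spread over an assumed 50 years of adult life.
wealth_transfer=$((500000 * 10 / 100 / 50))   # 10% of a 500K home
inheritance=$((250000 * 7 / 100 / 50))        # 7% of a 250K windfall

# Total cost to the employer, as assumed in the post (~85K).
gross_cost=85000

total=$((income_tax + social_security + vat + real_estate + vehicles + wealth_transfer + inheritance))

# Integer percentage, rounded to the nearest point.
rate=$(( (total * 100 + gross_cost / 2) / gross_cost ))

echo "total: ${total} EUR/year"
echo "effective rate: ${rate}%"
```

<p>Changing <code>gross_cost</code> or any row shows how sensitive the headline rate is to each assumption.</p>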
|
||||
<p>
|
||||
And this is generous, since I didn't model (1) what gets stolen
|
||||
through inflation diluting savings and (2) any capital gains that this
|
||||
profile might end up paying for whatever investments he is doing with
|
||||
his savings.
|
||||
</p>
|
||||
<p>
|
||||
Then you'll see mainstream media puppets discussing why young people
|
||||
don't have children. As if it were some kind of mystery. They're being
|
||||
robbed of their children's bread left and right, while getting hypnotized
|
||||
into believing that protecting themselves against this outrageous
|
||||
robbery is somehow morally despicable.
|
||||
</p>
|
||||
<p>Motherfuckers.</p>
|
||||
<hr />
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
</section>
|
||||
</main>
|
||||
</body>
|
||||
</html>
|
||||
|
|
@ -1,125 +0,0 @@
|
|||
<!DOCTYPE HTML>
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<link rel="stylesheet" href="../styles.css">
|
||||
</head>
|
||||
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>
|
||||
Hi, Pablo here
|
||||
</h1>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
<hr>
|
||||
<section>
|
||||
<h2>Don't hide it, make it beautiful</h2>
|
||||
<p>I'm currently living in a flat, and my internet connection physically comes in through my living room. That's where
|
||||
my home router is placed. However, my main workspace is not in my living room but in my working room,
|
||||
which is a few meters away. I would love to have a wired internet connection for my laptop, but unfortunately, with
|
||||
the router being so far away, setting it up would require running a lot of cable through the walls and
|
||||
ceilings. I could either leave the cable visible or go through some serious construction work to poke holes through walls and fake ceilings and
|
||||
tunnel the cable through there. The latter is off the table, since I don't even know where I would start.</p>
|
||||
|
||||
<p>The first option being the only available one, there is one fundamental and unavoidable reason I don't do this: aesthetics. My partner is very conscious about
|
||||
keeping our home visually pleasing. I care too, though she probably values aesthetics even more than I
|
||||
do. She likely doesn't find a wired internet connection to be as essential as I do. So, for now, I have to rely on
|
||||
wifi to connect from my workspace to the home router.</p>
|
||||
|
||||
<p>When I was on holiday in Thailand a few years ago, I noticed that Thai homes are far more practical than
|
||||
European ones in such matters. In Thailand, plumbing, electrical systems, and other maintenance-requiring
|
||||
installations are typically very visible, just out there on the wall. They don't hide these things behind
|
||||
fake walls or ceilings. I believe they do this because they highly value the ability to access and work
|
||||
on their home's systems themselves. Many Thai people build and maintain their own homes, so they leave
|
||||
everything exposed for easy access.</p>
|
||||
|
||||
<p>I sometimes envy this approach. Which is funny because I don't think they do it for pleasure but out of necessity.
|
||||
Still, when I saw a Thai homeowner fixing their plumbing outside their house, I thought to myself: "Damn, you're so
|
||||
in control of your home". If something bad happens—like a fallen tree damaging the plumbing—they can fix
|
||||
it themselves. Meanwhile, if that happened to me, I wouldn't even know where to start. I don't even know
|
||||
where my plumbing is because it's all hidden behind walls.</p>
|
||||
|
||||
<p>That makes me wonder: Is there a way to make these essential systems both accessible and aesthetically
|
||||
pleasing? Could we have the convenience of exposed infrastructure without it looking ugly? I believe we
|
||||
can.</p>
|
||||
|
||||
<p>I find the problem is that we have decided certain things—plumbing, electrical wiring, visible
|
||||
infrastructure—are inherently ugly. But they don't have to be. Some household items, like lamps, must be
|
||||
visible by their very nature. Since they can't be hidden, we put effort into making them look good. We choose stylish designs
|
||||
that complement our home's aesthetics. Why can't we do the same for cables and pipes?</p>
|
||||
|
||||
<p>Imagine if all the wiring in your home was encased in beautifully braided, colorful ropes, arranged in
|
||||
elegant geometric patterns. The connections, junction boxes, and fittings could be crafted from
|
||||
high-quality materials like metal and wood with artistic designs. Wouldn't that be nice?</p>
|
||||
|
||||
<p>Now, you might think I'm crazy—that these things are just ugly by nature. But they're not. In fact, many
|
||||
aspects of modern design have become uglier over time, and we've just accepted it.</p>
|
||||
|
||||
<p>Consider street lamps. In most cities today, they are dull, industrial-looking poles—rusty, ugly,
|
||||
and purely functional. Yet, in older parts of my city, we still have beautiful, ornate lamp posts from
|
||||
over a hundred years ago. They were designed with care, meant to serve a purpose, to be visually
|
||||
appealing, and to last ages. Take a look:</p>
|
||||
|
||||
<figure style="width: 75%; margin: 10px auto;">
|
||||
<img width="100%" height="auto" src="../static/streetlamps.png" alt="">
|
||||
<figcaption>On the left, your ugly, could-be-anywhere post-1971 streetlamp. On the right, a 19th-century bad boy from Gaudí.</figcaption>
|
||||
</figure>
|
||||
|
||||
<p>The same goes for train stations. Modern stations are bleak, sterile spaces—metal, plastic, and harsh
|
||||
lighting. They resemble hospital emergency rooms. But look at the older ones, like this one.
|
||||
Those stations are masterpieces, designed like grand halls with chandeliers and intricate details.</p>
|
||||
|
||||
<figure style="width: 75%; margin: 10px auto;">
|
||||
<img width="100%" height="auto" src="../static/stations.png" alt="">
|
||||
<figcaption>On the left, Sants Station, built in 1975. On the right, France Station, built in 1848.</figcaption>
|
||||
</figure>
|
||||
|
||||
<p>And talking about hospitals, they are also a good example. Most modern hospitals have the same white, cold, spaceship-like
|
||||
aesthetic. While cleanliness is important, there's no reason they have to be so uninviting. In my city,
|
||||
there's a hospital built over a hundred years ago that's so beautiful people visit it as a tourist
|
||||
attraction. On the other hand, the hospitals I visit personally are plainly depressing, Soviet-style
|
||||
atrocities.</p>
|
||||
|
||||
<figure style="width: 75%; margin: 10px auto;">
|
||||
<img width="100%" height="auto" src="../static/hospitals-outside.png" alt="">
|
||||
<figcaption>A random modern clinic in Barcelona vs A small section of the outside of Hospital de Sant Pau. I can skip the left and right thing now, right?</figcaption>
|
||||
</figure>
|
||||
<figure style="width: 75%; margin: 10px auto;">
|
||||
<img width="100%" height="auto" src="../static/hospitals-inside.png" alt="">
|
||||
<figcaption> Some random room in that same modern clinic vs Your regular corridor in Sant Pau.</figcaption>
|
||||
</figure>
|
||||
|
||||
|
||||
|
||||
<p>I think we can bring things back, if we care enough.</p>
|
||||
|
||||
<p>Look at computers. Most office desktop cases are dull, gray boxes—uninspired and purely functional.
|
||||
Naturally, many of them end up buried inside desks, or if they are small enough, simply hidden behind
|
||||
the screen on a VESA mount. But gamers, who deeply care about their PCs, go the extra mile to make their
|
||||
setups look amazing. They invest in custom cases, LED lighting, and stylish cooling systems. They turn
|
||||
their computers into art. They are a testament to the fact that we can make practical things
|
||||
beautiful if we choose to.</p>
|
||||
|
||||
<figure style="width: 75%; margin: 10px auto;">
|
||||
<img width="100%" height="auto" src="../static/computers.png" alt="">
|
||||
<figcaption> The ever-present ugly office OptiPlex vs A beautiful case from a passionate man.</figcaption>
|
||||
</figure>
|
||||
|
||||
<p>If we put the same effort into our homes, we wouldn't need to hide cables and pipes. We could proudly
|
||||
display them as part of our interior design. Infrastructure could be both functional and beautiful,
|
||||
giving us accessibility without sacrificing aesthetics. </p>
|
||||
|
||||
<p>I guess the point I want to make is... Don't hide it. Instead, make it beautiful.</p>
|
||||
|
||||
<hr>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
</section>
|
||||
</main>
|
||||
|
||||
</body>
|
||||
|
||||
</html>
|
||||
|
|
@ -1,188 +0,0 @@
|
|||
<!DOCTYPE HTML>
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<link rel="stylesheet" href="../styles.css">
|
||||
</head>
|
||||
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>
|
||||
Hi, Pablo here
|
||||
</h1>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
<section>
|
||||
<h2>Fixing a Degraded ZFS Mirror: Reseat, Resilver, and Scrub</h2>
|
||||
<p><em>Part 3 of 3 in my "First ZFS Degradation" series. See also <a href="why-i-put-my-vms-on-a-zfs-mirror.html">Part 1: The Setup</a> and <a href="a-degraded-pool-with-a-healthy-disk.html">Part 2: Diagnosing the Problem</a>.</em></p>
|
||||
<h3>The Game Plan</h3>
|
||||
<p>By now I was pretty confident about what was wrong: not a dying drive, but a flaky SATA connection. The fix should be straightforward. Just take the drive offline, shut down, reseat the cables, bring it back up, and let ZFS heal itself.</p>
|
||||
<p>But I wanted to do this methodically. ZFS is forgiving, but I didn't want to make things worse by rushing.</p>
|
||||
<p>Here was my plan:</p>
|
||||
<ol>
|
||||
<li>Take the faulty drive offline in ZFS (tell ZFS "stop trying to use this drive")</li>
|
||||
<li>Power down the server</li>
|
||||
<li>Open the case, inspect and reseat cables</li>
|
||||
<li>Boot up, verify the drive is detected</li>
|
||||
<li>Bring the drive back online in ZFS</li>
|
||||
<li>Let the resilver complete</li>
|
||||
<li>Run a scrub to verify data integrity</li>
|
||||
<li>Check SMART one more time</li>
|
||||
</ol>
|
||||
<p>Let's walk through each step.</p>
|
||||
<h3>Step 1: Taking the Drive Offline</h3>
|
||||
<p>Before touching hardware, I wanted ZFS to stop trying to use the problematic drive.</p>
|
||||
<p>First, I set up some variables to avoid typos with that long disk ID:</p>
|
||||
<pre><code>DISKID="ata-ST4000NT001-3M2101_WX11TN0Z"
|
||||
DISKPATH="/dev/disk/by-id/$DISKID"</code></pre>
|
||||
<p>Then I took it offline:</p>
|
||||
<pre><code>zpool offline proxmox-tank-1 "$DISKID"</code></pre>
|
||||
<p>Checking the status afterward:</p>
|
||||
<pre><code>zpool status -v proxmox-tank-1</code></pre>
|
||||
<pre><code> NAME STATE READ WRITE CKSUM
|
||||
proxmox-tank-1 DEGRADED 0 0 0
|
||||
mirror-0 DEGRADED 0 0 0
|
||||
ata-ST4000NT001-3M2101_WX11TN0Z OFFLINE 108 639 129
|
||||
ata-ST4000NT001-3M2101_WX11TN2P ONLINE 0 0 0</code></pre>
|
||||
<p>The state changed from FAULTED to OFFLINE. ZFS knows I intentionally took it offline rather than it failing on its own. The error counts are still there as a historical record, but ZFS isn't actively trying to use the drive anymore.</p>
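<p>If you want to check that state from a script instead of by eye, here is a minimal sketch. The function name <code>device_state</code> is mine, and it assumes the plain-text table layout shown above (device name in the first column, state in the second).</p>

```shell
#!/bin/sh
# device_state DEVICE_ID
# Reads `zpool status` output on stdin and prints the STATE column
# for the first line whose first field is exactly DEVICE_ID.
device_state() {
    awk -v id="$1" '$1 == id { print $2; exit }'
}

# Usage (needs a real pool, so it is left commented out):
#   zpool status proxmox-tank-1 | device_state "$DISKID"
```

<p>Matching on the exact first field avoids picking up the other mirror member, whose ID differs only in the last characters.</p>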
|
||||
<p>Time to shut down and get my hands dirty.</p>
|
||||
<h3>Step 2: Opening the Case</h3>
|
||||
<p>I powered down the server and opened up the Fractal Node 804. This case has drive bays accessible from the side, which I love. No reaching into weird corners of the case: just undo a couple of screws, slide the drive bay out, and there they are, handy and reachable.</p>
|
||||
<p>I located AGAPITO1 (I had handwritten labels on the drives, lesson learned after many sessions of playing "which drive is which") and inspected the connections.</p>
|
||||
<p>Here's the honest truth: everything looked fine. The SATA data cable was plugged in. The power connector was plugged in. Nothing was obviously loose or damaged. There was a bit of tension in the cable as it moved from one area of the case (where the motherboard is) to the drives area, but I really didn't think that was affecting the connection to either the drive or the motherboard itself.</p>
|
||||
<p>But "looks fine" doesn't mean "is fine". So I did a full reseat:</p>
|
||||
<ul>
|
||||
<li>Unplugged and firmly replugged the SATA data cable at both ends (drive and motherboard).</li>
|
||||
<li>Unplugged and firmly replugged the power connector.</li>
|
||||
<li>While I was in there, checked the connections on the other disk of the mirror as well.</li>
|
||||
</ul>
|
||||
<p>I made sure each connector clicked in solidly. Then I closed up the case and hit the power button.</p>
|
||||
<h3>Step 3: Verifying Detection</h3>
|
||||
<p>The server booted up. Would Linux see the drive?</p>
|
||||
<pre><code>ls -l /dev/disk/by-id/ | grep WX11TN0Z</code></pre>
|
||||
<pre><code>lrwxrwxrwx 1 root root 9 Jan 2 23:15 ata-ST4000NT001-3M2101_WX11TN0Z -> ../../sdb</code></pre>
|
||||
<p>The drive was there, mapped to <code>/dev/sdb</code>.</p>
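<p>The same detection check can be wrapped in a small function, sketched below. The name <code>resolve_disk</code> and the optional second argument are my own additions; the second argument only exists so the function can be tried against a scratch directory. It relies on <code>readlink -f</code>, which is available on a typical Linux box.</p>

```shell
#!/bin/sh
# resolve_disk DISK_ID [BASE_DIR]
# Prints the device node a by-id symlink points to, or fails with a
# non-zero status if the link does not exist. BASE_DIR defaults to
# /dev/disk/by-id and is only overridable to make the function testable.
resolve_disk() {
    base="${2:-/dev/disk/by-id}"
    link="$base/$1"
    if [ ! -e "$link" ]; then
        echo "not detected: $link" >&2
        return 1
    fi
    readlink -f "$link"
}

# Usage:
#   resolve_disk "$DISKID" || echo "drive missing, check the cables again"
```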
|
||||
<p>I opened a second terminal and started watching the kernel log in real time:</p>
|
||||
<pre><code>dmesg -Tw</code></pre>
|
||||
<p>This would show me immediately if the connection started acting flaky again. For now, it was quiet, showing just normal boot messages, the drive being detected successfully, etc. Nothing alarming.</p>
|
||||
<h3>Step 4: Bringing It Back Online</h3>
|
||||
<p>Moment of truth. I told ZFS to start using the drive again:</p>
|
||||
<pre><code>zpool online proxmox-tank-1 "$DISKID"</code></pre>
|
||||
<p>Immediately checked the status:</p>
|
||||
<pre><code>zpool status -v proxmox-tank-1</code></pre>
|
||||
<pre><code> pool: proxmox-tank-1
|
||||
state: DEGRADED
|
||||
status: One or more devices is currently being resilvered.
|
||||
action: Wait for the resilver to complete.
|
||||
scan: resilver in progress since Fri Jan 2 23:17:35 2026
|
||||
0B resilvered, 0.00% done, no estimated completion time
|
||||
|
||||
NAME STATE READ WRITE CKSUM
|
||||
proxmox-tank-1 DEGRADED 0 0 0
|
||||
mirror-0 DEGRADED 0 0 0
|
||||
ata-ST4000NT001-3M2101_WX11TN0Z DEGRADED 0 0 0 too many errors
|
||||
ata-ST4000NT001-3M2101_WX11TN2P ONLINE 0 0 0</code></pre>
|
||||
<p>Two things to notice: the drive's error counters are now at zero (we're starting fresh), and ZFS immediately started resilvering. It shows "too many errors" as the reason for the degraded state, which is historical: it remembers why the drive was marked bad before.</p>
|
||||
<p>I kept watching both the status and the kernel log. No errors, no link resets.</p>
|
||||
<h3>Step 5: The Resilver</h3>
|
||||
<p>Resilvering is ZFS's term for rebuilding redundancy: copying data from the healthy drive to the one that fell behind. In my case, the drive had been desynchronized for who knows how long (the pool had drifted 524GB out of sync before I noticed), so there was a lot to copy.</p>
|
||||
<p>I shut down my VMs to reduce I/O contention and let the resilver have the disk bandwidth. Progress:</p>
|
||||
<pre><code>scan: resilver in progress since Fri Jan 2 23:17:35 2026
|
||||
495G / 618G scanned, 320G / 618G issued at 100M/s
|
||||
320G resilvered, 51.78% done, 00:50:12 to go</code></pre>
|
||||
<p>The kernel log stayed quiet the whole time. Everything was indicating the cable reseat had worked.</p>
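<p>To keep an eye on progress without rereading the whole status output, something like the following sketch can pull out just the percentage. The function name <code>resilver_percent</code> is hypothetical, and it assumes the "NN.NN% done" wording shown in the output above.</p>

```shell
#!/bin/sh
# resilver_percent
# Reads `zpool status` output on stdin and prints the number from the
# "... NN.NN% done ..." progress line, or nothing if no such line exists.
resilver_percent() {
    sed -n 's/.*[[:space:]]\([0-9][0-9.]*\)% done.*/\1/p' | head -n 1
}

# Usage (needs a real pool, so it is left commented out):
#   zpool status proxmox-tank-1 | resilver_percent
```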
|
||||
<p>I went to bed and let it run overnight. The next morning:</p>
|
||||
<pre><code>scan: resilvered 495G in 01:07:58 with 0 errors on Sat Jan 3 00:25:33 2026</code></pre>
|
||||
<p>495 gigabytes resilvered in about an hour, zero errors. But the pool still showed DEGRADED with a warning about "unrecoverable error." I was very confused about this, but I solved that with some research. Apparently, ZFS is cautious and wants human acknowledgement before declaring everything healthy again.</p>
|
||||
<pre><code>zpool clear proxmox-tank-1 ata-ST4000NT001-3M2101_WX11TN0Z</code></pre>
|
||||
<p>This command clears the error flags. Immediately:</p>
|
||||
<pre><code> pool: proxmox-tank-1
|
||||
state: ONLINE
|
||||
scan: resilvered 495G in 01:07:58 with 0 errors on Sat Jan 3 00:25:33 2026
|
||||
|
||||
NAME STATE READ WRITE CKSUM
|
||||
proxmox-tank-1 ONLINE 0 0 0
|
||||
mirror-0 ONLINE 0 0 0
|
||||
ata-ST4000NT001-3M2101_WX11TN0Z ONLINE 0 0 0
|
||||
ata-ST4000NT001-3M2101_WX11TN2P ONLINE 0 0 0</code></pre>
|
||||
<p>Damn, seeing this felt nice.</p>
|
||||
<h3>Step 6: The Scrub</h3>
|
||||
<p>A resilver copies data to bring the drives back in sync, but it doesn't verify that all the existing data is still good. For that, you run a scrub. ZFS reads every block on the pool, verifies checksums, and repairs anything that doesn't match.</p>
|
||||
<pre><code>zpool scrub proxmox-tank-1</code></pre>
|
||||
<p>I let this run while I brought my VMs back up (scrubs can run in the background without blocking normal operations, though performance takes a hit). A few hours later:</p>
|
||||
<pre><code>scan: scrub repaired 13.0M in 02:14:22 with 0 errors on Sat Jan 3 11:03:54 2026
|
||||
|
||||
NAME STATE READ WRITE CKSUM
|
||||
proxmox-tank-1 ONLINE 0 0 0
|
||||
mirror-0 ONLINE 0 0 0
|
||||
ata-ST4000NT001-3M2101_WX11TN0Z ONLINE 0 0 992
|
||||
ata-ST4000NT001-3M2101_WX11TN2P ONLINE 0 0 0</code></pre>
|
||||
<p>Interesting. The scrub repaired 13MB of data and found 992 checksum mismatches on AGAPITO1. From what I read, checksum errors are typically a sign of the disk being in terrible shape and needing a replacement ASAP. That sounds scary, but I took the risk and assumed those were blocks that had been written incorrectly (or not at all) during the period when the connection was flaky, and not an issue with the disk itself. ZFS detected the bad checksums and healed them using the good copies from AGAPITO2.</p>
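<p>A small sketch for spotting this kind of thing after a scrub: it lists every device whose CKSUM column is not zero. The function name is mine, and it assumes the five-column table layout shown above, with a vdev state in the second column.</p>

```shell
#!/bin/sh
# devices_with_cksum_errors
# Reads `zpool status` output on stdin and prints "DEVICE COUNT" for each
# table row that has a vdev state in column 2 and a non-zero CKSUM value
# in column 5. Header and scan lines are skipped by the state check.
devices_with_cksum_errors() {
    awk '$2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ && $5 != "" && $5 != "0" { print $1, $5 }'
}

# Usage (needs a real pool, so it is left commented out):
#   zpool status proxmox-tank-1 | devices_with_cksum_errors
```

<p>An empty result means every device reported zero checksum errors.</p>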
|
||||
<p>I cleared the errors again and the pool was clean:</p>
|
||||
<pre><code>zpool clear proxmox-tank-1 ata-ST4000NT001-3M2101_WX11TN0Z</code></pre>
|
||||
<h3>Step 7: Final Validation with SMART</h3>
|
||||
<p>One more check. I wanted to see if SMART had anything new to say about the drive after all that activity:</p>
|
||||
<pre><code>smartctl -x /dev/sdb | grep -Ei 'overall|Reallocated|Pending|CRC|Hardware Resets'</code></pre>
|
||||
<pre><code>SMART overall-health self-assessment test result: PASSED
|
||||
5 Reallocated_Sector_Ct PO--CK 100 100 010 - 0
|
||||
197 Current_Pending_Sector -O--C- 100 100 000 - 0
|
||||
199 UDMA_CRC_Error_Count -OSRCK 200 200 000 - 0
|
||||
0x06 0x008 4 41 --- Number of Hardware Resets</code></pre>
|
||||
<p>Still passing. The hardware reset count went from 39 to 41 — just the reboots I did during this process.</p>
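<p>For a scripted version of the same check, here is a rough sketch that only speaks up when one of the three attributes above has a non-zero raw value. The function name <code>smart_red_flags</code> is made up, and it assumes the raw value is the last field on the attribute line, as in the output shown.</p>

```shell
#!/bin/sh
# smart_red_flags
# Reads smartctl attribute output on stdin and prints the name and raw
# value of the three attributes watched in this post when the raw value
# (last field on the line) is not zero. Prints nothing when all are zero.
smart_red_flags() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|UDMA_CRC_Error_Count)$/ && $NF != "0" { print $2, $NF }'
}

# Usage (needs a real disk, so it is left commented out):
#   smartctl -x /dev/sdb | smart_red_flags
```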
|
||||
<p>For completeness, I ran the long self-test. The short test only takes a minute and does basic checks; the long test actually reads every sector on the disk, which for a 4TB drive takes... a while.</p>
|
||||
<pre><code>smartctl -t long /dev/sdb</code></pre>
|
||||
<p>The estimated time was about 6 hours. In practice, it took closer to 12. Running VMs in parallel probably didn't help.</p>
|
||||
<p>But eventually:</p>
|
||||
<pre><code>SMART Self-test log structure revision number 1
|
||||
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
|
||||
# 1 Extended offline Completed without error 00% 1563 -
|
||||
# 2 Short offline Completed without error 00% 1551 -
|
||||
# 3 Short offline Completed without error 00% 1462 -</code></pre>
|
||||
<p>The extended test passed. Every sector on the disk is readable. The drive is genuinely healthy — it was just the connection that was bad.</p>
|
||||
<h3>Lessons Learned</h3>
|
||||
<ul>
|
||||
<li><strong>ZFS did exactly what it's supposed to do:</strong> Despite 524+ gigabytes of desync and nearly a thousand checksum errors, I lost zero data and was back in action while keeping my VMs running. The healthy drive kept serving everything while the flaky drive was acting up, and once the connection was fixed, ZFS healed itself automatically. Also, I was operating for an unknown amount of time with only one drive. In this case it seems it was down to stupid me messing up cable management, but I'm very happy knowing that if the disk had been genuinely faulty, services would have continued just fine.</li>
|
||||
<li><strong>Physical connections matter:</strong> It's easy to not pay that much attention when building a new box. Well, it bites back.</li>
|
||||
<li><strong>Monitor your pools.</strong> I only found this issue by accident, clicking around in the Proxmox UI. The pool had been degraded for who knows how long before I noticed. I'm already working on setting up a monitor in my Uptime Kuma instance so that the next time the pool status stops being ONLINE I get notified immediately.</li>
|
||||
</ul>
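<p>As a starting point for that monitoring, here is a rough sketch of a push-style check. Everything in it is an assumption rather than my final setup: the function names are made up, the push URL is a placeholder for whatever your Uptime Kuma push monitor shows, and it assumes the monitor accepts <code>status</code> and <code>msg</code> query parameters.</p>

```shell
#!/bin/sh
# kuma_status HEALTH
# Maps a pool health string to the status value sent to the monitor:
# "up" only when the pool is ONLINE, "down" for anything else.
kuma_status() {
    if [ "$1" = "ONLINE" ]; then
        echo up
    else
        echo down
    fi
}

# report_pool POOL PUSH_URL
# Reads the pool health and pings the push URL with the mapped status.
# PUSH_URL is hypothetical, e.g. https://kuma.example.org/api/push/<token>
report_pool() {
    health=$(zpool list -H -o health "$1") || health="UNKNOWN"
    curl -fsS -G "$2" \
        --data-urlencode "status=$(kuma_status "$health")" \
        --data-urlencode "msg=$1 is $health" >/dev/null
}

# Example cron entry (every 5 minutes), assuming this file is saved as
# /usr/local/bin/zpool-kuma.sh and ends with a call to report_pool:
#   */5 * * * * /usr/local/bin/zpool-kuma.sh
```

<p>Treating anything other than ONLINE as "down" is deliberately strict: a DEGRADED pool still serves data, but that is exactly the state I failed to notice.</p>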
|
||||
<p>I'm happy I was able to test out recovering from a faulty disk with such a tiny issue. I learned a lot fixing it, and now I'm even happier than before about having decided to go for this ZFS pool setup.</p>
|
||||
<h3>Quick Reference: The Commands</h3>
|
||||
<p>For future me (and anyone else who ends up here with a degraded pool):</p>
|
||||
<pre><code># Check pool status
|
||||
zpool status -v &lt;pool&gt;
|
||||
|
||||
# Watch kernel logs in real time
|
||||
dmesg -Tw
|
||||
|
||||
# Check SMART health
|
||||
smartctl -H /dev/sdX
|
||||
smartctl -x /dev/sdX
|
||||
|
||||
# Take a drive offline before physical work
|
||||
zpool offline &lt;pool&gt; &lt;device-id&gt;
|
||||
|
||||
# Bring a drive back online
|
||||
zpool online &lt;pool&gt; &lt;device-id&gt;
|
||||
|
||||
# Clear error flags after recovery
|
||||
zpool clear &lt;pool&gt; &lt;device-id&gt;
|
||||
|
||||
# Run a scrub to verify all data
|
||||
zpool scrub &lt;pool&gt;
|
||||
|
||||
# Run SMART self-tests
|
||||
smartctl -t short /dev/sdX # Quick test (~1 min)
|
||||
smartctl -t long /dev/sdX # Full surface scan (hours)
|
||||
smartctl -l selftest /dev/sdX # Check test results</code></pre>
|
||||
<p><em>Thanks for reading! This was <a href="fixing-a-degraded-zfs-mirror.html">Part 3: The Fix</a>. You might also enjoy <a href="why-i-put-my-vms-on-a-zfs-mirror.html">Part 1: The Setup</a> and <a href="a-degraded-pool-with-a-healthy-disk.html">Part 2: Diagnosing the Problem</a>.</em></p>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
</section>
|
||||
</main>
|
||||
|
||||
</body>
|
||||
|
||||
</html>
|
||||
|
||||
|
|
@ -1,138 +0,0 @@
|
|||
<!DOCTYPE html>
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8" />
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
||||
<link rel="stylesheet" href="../styles.css" />
|
||||
</head>
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>Hi, Pablo here</h1>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
<hr />
|
||||
<section>
|
||||
<h2>Gresham's Law has nothing to do with Bitcoin</h2>
|
||||
<p>
|
||||
This is going to be a thorough explanation for a simple thing, but we
|
||||
will take it slow since this topic somehow causes loads of confusion.
|
||||
</p>
|
||||
<p>
|
||||
Okay, so there are a lot of people in Bitcoin circles who talk about
|
||||
<a href="https://en.wikipedia.org/wiki/Gresham%27s_law" target="_blank" rel="noopener noreferrer">Gresham's
|
||||
Law</a>. They often say, “Gresham's Law states that bad money drives out
|
||||
good money”, then relate it to Bitcoin and the USD, and finally
|
||||
proceed to reason all sorts of things on top of that. But here's
|
||||
some very much needed clarification: Gresham's law has nothing to do
|
||||
with Bitcoin's relationship to the USD. In fact, it actually has
|
||||
nothing to do with Bitcoin, or with the current USD for that matter.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Gresham's Law is relevant to a very specific type of monetary system:
|
||||
when we used coins that contained precious metals (spoiler: we don't
|
||||
live in that period of history anymore). The law states that bad money
|
||||
drives out good money, but what a lot of Bitcoiners seem to miss is
|
||||
the actual meaning of “good” and “bad” in this context. People tend to
|
||||
interpret “good” and “bad” as meaning “hard” and “easy” money, so they
|
||||
reason something like: “Because Bitcoin is harder than the USD,
|
||||
Gresham's law applies here.” But that is not what Gresham's law is
|
||||
about at all.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
In the context of Gresham's law, “good” and “bad” refer to face value
|
||||
versus commodity value. That doesn't ring a bell? Let me explain:
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Imagine a magic land where there is only one type of coin. There's no
|
||||
other money, just this one coin. These coins state on their face that
|
||||
they contain one gram of gold, and right now, they really do contain
|
||||
one gram of gold. Everyone uses them, and everyone is happy. There's no
|
||||
“bad” money, no “good” money — it's all nice and simple.
|
||||
</p>
|
||||
|
||||
<p>Now, let's spice it up a bit.</p>
|
||||
|
||||
<p>
|
||||
After some time, a cheeky bastard (typically, a king) comes along and
|
||||
starts making coins that look exactly like the original coins. I'll
|
||||
call these the bad coins. The original coins will be the good coins.
|
||||
Both types of coins say on them “one gram of gold,” but the bad coins
|
||||
only have half a gram of gold actually in them (hence why they are
|
||||
bad).
|
||||
</p>
|
||||
|
||||
<p>
|
||||
So, to recap:<br />
|
||||
- Good coins: one gram of gold on the coin, and actually one gram of
|
||||
gold inside.<br />
|
||||
- Bad coins: one gram of gold on the coin, but only 0.5 grams of gold
|
||||
inside.
|
||||
</p>
|
||||
|
||||
<p>This is where Gresham's Law applies.</p>
|
||||
|
||||
<p>
|
||||
People in this coiny fantasy land are not stupid — they know that the
|
||||
gold content is what matters. At some point, someone will realize the
|
||||
bad coins don't have as much gold as they claim and will develop a
|
||||
preference for the good ones. So, if I'm John the Blacksmith and I
|
||||
want to buy some iron, and I have a stash of coins — some good, some
|
||||
bad — I would rather keep the good coins and spend the bad coins. Why?
|
||||
Because I want to keep as much gold as possible, of course.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
What happens eventually is that people grow into the habit of trying to get
|
||||
rid of the bad coins and hold on to the good coins. They exploit the
|
||||
confusion created by the fact that all coins have the same face value
|
||||
(it says “one gram” on all coins, so everyone assumes they're worth
|
||||
the same), even though the actual commodity value (the gold inside)
|
||||
differs.<a href="#footnote-1">[1]</a>
|
||||
</p>
|
||||
|
||||
<p>That is the quick explanation of Gresham's law.</p>
|
||||
|
||||
<p>
|
||||
Now, back to the original point: what are the face value and commodity
|
||||
value of Bitcoin?
|
||||
</p>
|
||||
|
||||
<p>
|
||||
That makes no sense! Bitcoin is not a physical coin with metal in
|
||||
it. It has no concept of face and commodity value. And neither does the
|
||||
USD nowadays. Therefore, Gresham's law has absolutely nothing to do
|
||||
with Bitcoin, the USD and any preferences the world might develop
|
||||
between the two.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Hopefully, this explanation helps make things clear. From now on, if
|
||||
you want to keep your public image intact, please refrain from
|
||||
invoking Gresham's law when discussing Bitcoin and USD, because doing so
|
||||
shows you don't know what Gresham's Law is actually about. Don't feel
|
||||
too bad if it happened to you though: it can happen even to
|
||||
<a href="https://river.com/learn/terms/g/greshams-law/" target="_blank" rel="noopener noreferrer">massive
|
||||
exchanges with a great reputation.</a>
|
||||
</p>
|
||||
|
||||
<p id="footnote-1" class="footnote">
|
||||
<em>[1] Not relevant to the point of this post, but it's worth noting
|
||||
that the Gresham's Law outcome is not guaranteed in the
|
||||
described scenario. If the difference between the good and bad coins
|
||||
is massive, and no force opposes it, the market might jump into
|
||||
<a href="https://en.wikipedia.org/wiki/Gresham%27s_law#Reverse_of_Gresham's_law_(Thiers'_law)" target="_blank"
|
||||
rel="noopener noreferrer">Thiers' Law</a>
|
||||
instead.</em>
|
||||
</p>
|
||||
<hr />
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
</section>
|
||||
</main>
|
||||
</body>
|
||||
|
||||
</html>
|
||||
|
|
@ -1,150 +0,0 @@
|
|||
<!DOCTYPE HTML>
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<link rel="stylesheet" href="../styles.css">
|
||||
</head>
|
||||
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>
|
||||
Hi, Pablo here
|
||||
</h1>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
<hr>
|
||||
<section>
|
||||
<h2>If I started a Data team again</h2>
|
||||
<p>
|
||||
In November 2023, I joined <a href="https://truvi.com/">Truvi</a> (<a
|
||||
href="https://truvi.com/blog/superhog-becomes-truvi/">back then called Superhog</a>) as the first
|
||||
member of its new Data team, effectively making my hiring and the team's birth the same thing. The CEO
|
||||
and COO brought me in to help a young, upcoming and quite chaotic UK SaaS company make it out of the
|
||||
void of Data darkness, where no one knew shit, figures came in with a three-month delay and VLOOKUP
|
||||
was the closest thing there was to integrating data from two systems.
|
||||
</p>
|
||||
<p>
|
||||
The team is nowadays very much established and critical for the business. We've built our spot within
|
||||
the org, and many colleagues wonder how life could ever have existed before. We can tell many tales of
|
||||
delivering value to Truvi, and we have ambitious plans to keep doing more and better. And I'm having a
|
||||
blast leading it.
|
||||
</p>
|
||||
<p>
|
||||
For me, it was quite a journey: it was the first time I started a Data team from a greenfield situation.
|
||||
Some day I'll write a longish piece on the whole experience, but today, I wanted to focus on a few
|
||||
regrets I have. I personally think I've done good work, I'm proud of where the team is at today, and
|
||||
the words (and actions) of the company board show that they are aligned with this. But still, there's
|
||||
always room to fuck up, and I do have a few parts of the story where, looking back, I
|
||||
think I took the wrong turn.
|
||||
</p>
|
||||
<p>
|
||||
The first thing I regret is not hiring faster which, taking personal responsibility, means I didn't
|
||||
focus enough on it and I wasn't as effective as I should have been. When I started the team, I quickly
|
||||
secured consensus from the founders to go hire two more members for it, and we all agreed it made sense
|
||||
to spend those bullets to find two Data Analysts. We found <a
|
||||
href="https://es.linkedin.com/in/oriol-roqu%C3%A9-paniagua-708946158" target="_blank"
|
||||
rel="noopener noreferrer">Uri</a> and <a href="https://es.linkedin.com/in/joaquin-ossa"
|
||||
target="_blank" rel="noopener noreferrer">Joaquín</a>, who are still with us today doing a
|
||||
terrific job.
|
||||
</p>
|
||||
<p>
|
||||
The bad news is they joined in May 2024, which means it took a whopping ~6 months from decision to
|
||||
result. It didn't feel terrible at that moment. I was very much busy with the onslaught of work that
|
||||
comes with kickstarting the team (meeting everyone, aligning with other teams, designing and starting
|
||||
out infra, doing basic, urgent starting deliveries, etc.). But now I realise some of those things would
|
||||
have been easier to do with the guys already around. So there goes regret number one. Lesson: if you
|
||||
have agreement to get more hands on the team, make it priority #1.
|
||||
</p>
|
||||
<p>
|
||||
The second thing I'm not happy about is not completely dropping and rebuilding a few legacy bits I
|
||||
inherited when joining. Truvi had an insane amount of unmet needs in data and reporting right before I
|
||||
joined, so a bit of capacity from the development teams was derailed at some point to create a couple of
|
||||
basic reporting and data exporting tools to support other teams. Those were small and done in a rather
|
||||
crude way, as a result of (1) being done by backend guys with no previous experience in the Data domain,
|
||||
and (2) in parallel to other very important product development work. I don't blame them, <a
|
||||
href="https://retrospectivewiki.org/index.php?title=The_Prime_Directive" target="_blank"
|
||||
rel="noopener noreferrer">I assume they did the best they could given the circumstances.</a>
|
||||
</p>
|
||||
<p>
|
||||
My mistake was to decide early to inherit and continue those, instead of bulldozing them and building
|
||||
things from scratch with better tools and practices. It would have been quite easy to do at the time
|
||||
given that those data products were rather small (and coming back to my first regret, it would probably
|
||||
have been even easier to bulldoze them if the new hires had been around already). At the time, I
|
||||
chose to leave those be and keep building on them incrementally in order to get
|
||||
going with some other fresh new scopes faster. I'm not fully sure that the call was wrong (perhaps in a
|
||||
parallel universe, I would think we shouldn't have rebuilt those from scratch because it delayed
|
||||
delivering other important things). But I do feel it was wrong right now. As a consequence, today we
|
||||
have several data products and architectural constraints which are a royal pain in the ass to maintain
|
||||
and grow with the company. For example, we're currently stuck with a lot of reporting in Power BI, which
|
||||
I hate for multiple reasons, one of them being <a href="i-want-code-defined-dashboards-so-badly.html"
|
||||
target="_blank" rel="noopener noreferrer">how badly I want to have code defined dashboards</a>.
|
||||
Bulldozing and replacing now will be much more painful than it would have been back then, so the
|
||||
velocity of the team is now paying for this mistake.
|
||||
</p>
|
||||
<p>
|
||||
There goes another lesson: if you're inheriting a few, small legacy products that don't fit in your view
|
||||
and you can afford to remove or re-build them to allow your plans to be pristine, do it now. Don't wait.
|
||||
</p>
|
||||
<p>
|
||||
A third thing I would have done differently is to start working on Data literacy across the company
|
||||
earlier and more intensely. In part of the work we do in the Data team, we come in with a very polished,
|
||||
mature result: this is exactly what you need to know, or the exact pristine, read-only grade data you
|
||||
had to check, or the informed decision you are after. But in many other cases, our delivery ends at the
|
||||
start of what I like to call the analytical last-mile: we produce some curated piece of insight and/or
|
||||
data that a colleague will grab and work a bit more before getting to the final business outcome. For
|
||||
example, I might export a set of KPIs around certain client accounts, and an account manager will pivot
|
||||
and fuck around with them to decide certain things around how he will handle some comms with his
|
||||
accounts.
|
||||
</p>
|
||||
<figure>
|
||||
<img src="../static/data-alley-oop.jpeg" width="75%">
|
||||
</figure>
|
||||
<p>
|
||||
These two-step deliveries are sometimes insanely valuable: there are some analyses where
|
||||
the final user of the data is the most capable of juicing it the right way, such as when someone takes
|
||||
the data to use it in real time in a negotiation meeting.
|
||||
</p>
|
||||
<p>
|
||||
The bad news is that, if John from Marketing fucking sucks at Power BI, Excel, or whatever tooling you
|
||||
rely on to interact with your colleagues, the whole plot falls down. I'm not expecting John to crank out
|
||||
a monster workbook with thirty layers of business logic and seven modules of VBA, but pivoting a bit,
|
||||
filtering, multiply this col by that row, etc. You get the gist. If you're thinking to yourself: "anyone
|
||||
can do such basic things!", I would kindly invite you to sit down 15 min with some rando in your company
|
||||
and see them use Excel live, potentially having chugged some tranquilizer down.
|
||||
</p>
|
||||
<p>
|
||||
Training more colleagues outside the Data team to be more proficient working with data (be it
|
||||
working on Excel, writing SQL or just reasoning) is something we're actively working on now, but I think
|
||||
we should have started out earlier. I think this for two reasons: first, the ROI, measured in
|
||||
time, is tremendous. 5 hours of working with Jane from Finance to skill her up from 0 to Excel basics
|
||||
is probably much more impactful than that extra dashboard you are building, which you already sense
|
||||
won't be that used or valuable. The second is, this is one of the things that has a capped max pace to
|
||||
it, and won't happen faster than that no matter how hard you would like it to, or how many resources and
|
||||
money you want to throw at it. People learn at a certain pace, so it's better to start early and slow,
|
||||
than to trick yourself into believing you'll be able to turn Jimmy into a Power BI God in 2 days if you
|
||||
try hard enough when the time comes.
|
||||
</p>
|
||||
<p>
|
||||
So, final lesson: I would start running Data literacy initiatives early. It doesn't need to take a big
|
||||
chunk of your capacity: sprinkling a few open sessions here and there, and running some 1:1 or small
|
||||
group ones with bright people, should be more than enough. It will compound over time, leading to the
|
||||
company better leveraging your data and tools, and the central Data team having more capacity to keep
|
||||
building since more end-users will be more independent.
|
||||
</p>
|
||||
<p>
|
||||
There goes my non-exhaustive list of things I would do differently when starting a Data team from
|
||||
scratch. I hope it serves you if you are in a similar spot. I'm personally delighted with the idea of
|
||||
not screwing up in these ways if I ever find myself starting another team from scratch. I would very
|
||||
much rather screw up in new ways.
|
||||
</p>
|
||||
<hr>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
</section>
|
||||
</main>
|
||||
|
||||
</body>
|
||||
|
||||
</html>
|
||||
|
|
@ -1,94 +0,0 @@
|
|||
<!DOCTYPE html>
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8" />
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
||||
<link rel="stylesheet" href="../styles.css" />
|
||||
</head>
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>Hi, Pablo here</h1>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
<hr />
|
||||
<section>
|
||||
<h2>Is your drug dealer a homophobic socialist?</h2>
|
||||
<p>
|
||||
Lately, I've noticed a branch of
|
||||
<a href="https://en.wikipedia.org/wiki/Cancel_culture" target="_blank" rel="noopener noreferrer">cancel
|
||||
culture</a>
|
||||
I've come to find quite disturbing. I think it has mainly spread in
|
||||
the US, though I think it's starting to happen in Europe too. It's
|
||||
this tendency for people at companies to politically and morally judge
|
||||
business counterparties and come to the conclusion that business
|
||||
shouldn't be done with them because of it.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
I experienced this first hand during some after-work beers, and for
|
||||
some reason the scene got burned into my retina. A colleague of mine,
|
||||
beer in hand, said something like, “We're working with this customer,
|
||||
and they're unbearable because they complain a lot and challenge us
|
||||
all the time when we run the monthly reconciliation. Plus, they're
|
||||
from Israel.” I was mindblown at how casually that was dropped, with
|
||||
not even a footnote-like explanation deemed necessary. I played my
|
||||
5-year-old child attitude card and asked, "What's the problem with them
|
||||
being in Israel?" She said, "Well, you know, they're in Israel and the
|
||||
whole thing is happening. It's terrible. We shouldn't deal with them."
|
||||
</p>
|
||||
|
||||
<p>
|
||||
I couldn't hold it in: I asked her if her hairdresser was from Israel.
|
||||
She looked at me completely puzzled: “I don't know. Why does that
|
||||
matter?” I told her, “I don't know. Apparently, you're upset about
|
||||
dealing with people from Israel, so I'm assuming you need to check if
|
||||
everyone you do business with is from there to not do it if that's the
|
||||
case.” Silence fell and the air got thick. Someone jumped in with a
|
||||
nervous joke to break up the tension that my childlike questions had
|
||||
somehow brought to the room, and the conversation moved on.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Ever since that day, I've seen this kind of
|
||||
social-justice-business-censor thinking pop up a lot. Since that fun
|
||||
first encounter, whenever someone points out how business should
|
||||
not be done with &lt;whatever ideology/country/demographic they don't
|
||||
like&gt;, I've started jokingly triggering them by asking, “Actually, are
|
||||
you making sure your drug dealer isn't a homophobic socialist?” They
|
||||
generally laugh, not grasping how their stances on politically
|
||||
deciding to do or not do business with someone sound just as ridiculous to
|
||||
me.
|
||||
</p>
|
||||
|
||||
<img src="../static/homophobic-socialist-drug-dealer.png" alt="" style="width: 50%" />
|
||||
|
||||
<p>
|
||||
Here's what disturbs me: trade is a very civilized act. When we
|
||||
trade—whether it's goods, services, or anything else—we're putting
|
||||
aside our differences and doing something mutually beneficial. We both
|
||||
walk away better off. We hurt no one. We make things a tiny bit better
|
||||
overall. Deciding not to trade with someone because of some political
|
||||
detail which is completely irrelevant to the trade itself is
|
||||
backwards. Even if I didn't like communists, I wouldn't care if a
|
||||
communist is selling me bananas. It just doesn't matter.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Seeing people blow up trade over politics makes me sad. I think it's
|
||||
ignorant and hateful. And I don't think they realize where that kind
|
||||
of thinking can lead.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
In the end, I just hope people can leave politics out of business.
|
||||
Let's do business and all be better off thanks to it.
|
||||
</p>
|
||||
<hr />
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
</section>
|
||||
</main>
|
||||
</body>
|
||||
|
||||
</html>
|
||||
|
|
@ -1,173 +0,0 @@
|
|||
<!DOCTYPE HTML>
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<link rel="stylesheet" href="../styles.css">
|
||||
</head>
|
||||
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>
|
||||
Hi, Pablo here
|
||||
</h1>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
<hr>
|
||||
<section>
|
||||
<h2>My tips and tricks when using Postgres as a DWH</h2>
|
||||
<p>In November 2023, I joined Superhog (now called Truvi) to start out the Data team. As part of that, I
|
||||
also drafted and deployed the first version of its data platform.
|
||||
</p>
|
||||
<p>The context led me to choose Postgres for our DWH. In a time of Snowflakes, Bigqueries and Redshifts,
|
||||
this might surprise some. But I can confidently say Postgres has done a great job for us, and I can even
|
||||
dare to say it has provided a better experience than other, more trendy alternatives could have. I'll
|
||||
jot down my rationale for picking Postgres one of these days.</p>
|
||||
<p>
|
||||
Back to the topic: Postgres is not intended to act as a DWH, so using it as such might feel a bit hacky
|
||||
at times. There are multiple ways to make your life better with it, as well as related tools and
|
||||
practices that you might enjoy, which I'll try to list here.
|
||||
</p>
|
||||
<h3>Use <code>unlogged</code> tables</h3>
|
||||
<p>The <a href="https://www.postgresql.org/docs/current/wal-intro.html" target="_blank"
|
||||
rel="noopener noreferrer">Write Ahead Log</a> is active by default for the tables you create, and
|
||||
for good reasons. But in the context of an ELT DWH, it is probably a good idea to deactivate it by
|
||||
making your tables <code>unlogged</code>. <a
|
||||
href="https://www.crunchydata.com/blog/postgresl-unlogged-tables" target="_blank">Unlogged
|
||||
tables</a> will provide you with much faster writes (roughly, twice as fast) which will make data
|
||||
loading and transformation jobs inside your DWH much faster.
|
||||
</p>
|
||||
<p>You pay a price for this with a few trade-offs, the most notable being that if your Postgres server
|
||||
crashes, <a href="https://www.postgresql.org/docs/current/sql-createtable.html#SQL-CREATETABLE-UNLOGGED"
|
||||
target="_blank" rel="noopener noreferrer">the contents of the unlogged tables will be lost</a>. But,
|
||||
again, if you have an ELT DWH, you can survive by running a backfill. At Truvi, we made the decision to
|
||||
have the landing area for our DWH be logged, and everything else unlogged. This means if we experienced
|
||||
a crash (which still hasn't happened, btw), we would recover by running a full-refresh dbt run.</p>
|
||||
<p>If you are using dbt, you can easily apply this by adding this bit in your <code>dbt_project.yml</code>
|
||||
:</p>
|
||||
<pre><code>
|
||||
models:
|
||||
  +unlogged: true
|
||||
</code></pre>
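<p>Outside of dbt, the same switch can be flipped with plain SQL. A minimal sketch, assuming a hypothetical
<code>int_orders</code> table (the table and its columns are illustrative, not taken from our project):</p>
<pre><code>
-- Create a table that skips the WAL from the start (hypothetical table and columns).
CREATE UNLOGGED TABLE int_orders (
    order_id   bigint PRIMARY KEY,
    ordered_at timestamptz,
    amount     numeric
);

-- Convert an existing table in either direction.
-- Both statements rewrite the table, so expect them to be slow on big ones.
ALTER TABLE int_orders SET LOGGED;
ALTER TABLE int_orders SET UNLOGGED;

-- Check the result: relpersistence is 'u' for unlogged and 'p' for permanent.
SELECT relname, relpersistence
FROM pg_class
WHERE relkind = 'r' AND relname = 'int_orders';
</code></pre>
<p>Keep in mind that unlogged tables are also not replicated to standby servers, which is one more reason to
keep anything you can't rebuild in logged tables.</p>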
|
||||
|
||||
<h3>Tuning your server's parameters</h3>
|
||||
<p><a href="https://www.postgresql.org/docs/current/runtime-config.html" target="_blank"
|
||||
rel="noopener noreferrer">Postgres has many parameters you can fiddle with</a>, with plenty of
|
||||
chances to either improve or destroy your server's performance.</p>
|
||||
<p>Postgres ships with some default values for them, which are almost surely not the optimal ones for
|
||||
your needs, <em>especially</em> if you are going to use it as a DWH. Simple changes like adjusting the
|
||||
<code>work_mem</code> will do wonders to speed up some of your heavier queries.
|
||||
</p>
|
||||
<p>There are many parameters to get familiar with and proper adjustment must be done taking your specific
|
||||
context and needs into account. If you have no clue at all, <a href="https://pgtune.leopard.in.ua"
|
||||
target="_blank" rel="noopener noreferrer">this little web app</a> can give you some suggestions you
|
||||
can start from.
|
||||
</p>
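<p>As a rough sketch of what adjusting <code>work_mem</code> looks like in practice (the values below are
placeholders for illustration, not recommendations):</p>
<pre><code>
-- Check the current value (the stock default is 4MB).
SHOW work_mem;

-- Raise it only for the current session, e.g. before a heavy transformation.
SET work_mem = '256MB';

-- Or change the server-wide default and reload the configuration.
-- Requires superuser privileges; work_mem does not need a restart.
ALTER SYSTEM SET work_mem = '64MB';
SELECT pg_reload_conf();
</code></pre>
<p>Bear in mind that <code>work_mem</code> applies per sort or hash operation, not per query or per connection,
so a single complex query can use several multiples of it. Setting it per session for the heavy jobs is
usually the safer first step.</p>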
|
||||
<h3>Running <code>VACUUM ANALYZE</code> right after building your tables</h3>
|
||||
<p>Out of the box, Postgres will run
|
||||
<code><a href="https://www.postgresql.org/docs/current/sql-vacuum.html" target="_blank" rel="noopener noreferrer">VACUUM</a></code>
|
||||
and
|
||||
<code><a href="https://www.postgresql.org/docs/current/sql-analyze.html" target="_blank" rel="noopener noreferrer">ANALYZE</a></code>
|
||||
jobs <a href="https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM" target="_blank"
|
||||
rel="noopener noreferrer">automatically</a>. The thresholds that determine when each of those gets
|
||||
triggered can be adjusted with a few server parameters. If you follow an ELT pattern, almost surely
|
||||
re-building your non-staging tables will cause Postgres to run them.
|
||||
</p>
|
||||
<p>But there's a detail that is easy to overlook. Postgres' automatic triggers will start those quite fast,
|
||||
but not right after you build each table. This poses a performance issue: if your intermediate sections
|
||||
of the DWH have tables that build upon tables, rebuilding a table and then trying to rebuild a dependent
|
||||
without having run <code>ANALYZE</code> on the first one might hurt you.</p>
|
||||
<p>Let me describe this with an example, because this one is a bit of a tongue twister: let's assume we have
|
||||
tables <code>int_orders</code> and <code>int_order_kpis</code>. <code>int_orders</code> holds all of our
|
||||
orders, and <code>int_order_kpis</code> derives some KPIs from them. Naturally, first you will
|
||||
materialize <code>int_orders</code> from some upstream staging tables, and once that is complete, you
|
||||
will use its contents to build <code>int_order_kpis</code>.
|
||||
</p>
|
||||
<p>
|
||||
Having <code>int_orders</code> <code>ANALYZE</code>-d before you start building
|
||||
<code>int_order_kpis</code> is highly beneficial for your performance in building
|
||||
<code>int_order_kpis</code>. Why? Because having perfectly updated statistics and metadata on
|
||||
<code>int_orders</code> will help Postgres' query optimizer better plan the necessary query to
|
||||
materialize <code>int_order_kpis</code>. This can improve performance by orders of magnitude in some
|
||||
queries by allowing Postgres to pick the right kind of join strategy for the specific data you have, for
|
||||
example.
|
||||
</p>
|
||||
<p>Now, will Postgres auto <code>VACUUM ANALYZE</code> the freshly built <code>int_orders</code> before you
|
||||
start building <code>int_order_kpis</code>? Hard to tell. It depends on how you build your DWH, and how
|
||||
you've tuned your server's parameters. And the most dangerous bit is you're not in full control: it can
|
||||
be that <em>sometimes</em> it happens, and other times it doesn't. Flaky and annoying. Some day I'll
|
||||
write a post on how this behaviour drove me mad for two months because it made a model sometimes build
|
||||
in a few seconds, and other times in more than 20 minutes.
|
||||
</p>
|
||||
<p>
|
||||
My advice is to make sure you always <code>VACUUM ANALYZE</code> right after building your tables. If
|
||||
you're using dbt, you can easily achieve this by adding this to your project's
|
||||
<code>dbt_project.yml</code>:
|
||||
</p>
|
||||
<pre><code>
|
||||
models:
|
||||
  +post-hook:
|
||||
    sql: "VACUUM ANALYZE {{ this }}"
|
||||
    transaction: false
|
||||
  # ^ This makes dbt run a VACUUM ANALYZE on the models after building each.
|
||||
  # It's pointless for views, but harmless: Postgres skips them with a warning
|
||||
  # instead of raising an error.
|
||||
</code></pre>
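<p>If you want to check whether a table actually has fresh statistics, the
<code>pg_stat_user_tables</code> view can tell you. A small sketch, reusing the hypothetical
<code>int_orders</code> table from the example above:</p>
<pre><code>
-- When was the table last vacuumed and analyzed, manually or by autovacuum?
SELECT relname,
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze,
       n_mod_since_analyze  -- rows changed since the last ANALYZE
FROM pg_stat_user_tables
WHERE relname = 'int_orders';

-- What the dbt post-hook boils down to for a single table.
VACUUM ANALYZE int_orders;
</code></pre>
<p>Running the first query right after a build, and before the dependent model starts, is a cheap way to
confirm whether you are relying on autovacuum's timing or not.</p>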
|
||||
<h3>Monitor queries with <code>pg_stat_statements</code></h3>
|
||||
<p><a href="https://www.postgresql.org/docs/current/pgstatstatements.html" target="_blank"
|
||||
rel="noopener noreferrer">pg_stat_statements</a> is an extension that nowadays ships with Postgres
|
||||
by default. If activated, it will log info on the queries executed in the server which you can check
|
||||
afterward. This includes many details, with how frequently the query gets called and the min,
|
||||
max and mean execution time being the ones you probably care about the most. Looking at those allows you
|
||||
to find queries that take long each time they run, and queries that get run a lot.
|
||||
</p>
|
||||
<p>Another important piece of info that gets recorded is <em>who</em> ran the query. This is helpful
|
||||
because, if you use users in a smart way, it can help you isolate expensive queries on different use
|
||||
cases or areas. For example, if you use different users to build the DWH and to give your BI tool read
|
||||
access (you do that... right?), you can easily tell apart dashboard related queries from internal, DWH
|
||||
transformation ones. Another example could be internal reporting vs embedded analytics in your product:
|
||||
you might have stricter performance SLAs for product-embedded, customer-facing queries than for internal
|
||||
dashboards. Using different users and <code>pg_stat_statements</code> makes it possible for you to
|
||||
dissect performance issues on those separate areas independently.</p>
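<p>A sketch of how that per-user breakdown can be queried. It assumes the extension is listed in
<code>shared_preload_libraries</code> (which requires a server restart) and Postgres 13 or later; older
versions name the timing columns <code>total_time</code>, <code>mean_time</code>, etc.:</p>
<pre><code>
-- Make the view available in the current database.
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 20 statements by total execution time, with the role that ran them.
SELECT r.rolname,
       s.calls,
       round(s.mean_exec_time::numeric, 1)  AS mean_ms,
       round(s.max_exec_time::numeric, 1)   AS max_ms,
       round(s.total_exec_time::numeric, 1) AS total_ms,
       left(s.query, 80)                    AS query_start
FROM pg_stat_statements AS s
JOIN pg_roles AS r ON r.oid = s.userid
ORDER BY s.total_exec_time DESC
LIMIT 20;
</code></pre>
<p>Adding a <code>WHERE r.rolname = '...'</code> filter with the role of your BI tool or your dbt runner
narrows the list down to one of the areas described above.</p>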
|
||||
<h3>Dalibo's wonderful execution plan visualizer</h3>
|
||||
<p>Sometimes you'll have some nasty query you just need to sit down with and optimize. In my experience, in
|
||||
a DWH this ends up happening with queries that involve many large tables in sequential joining and
|
||||
aggregation steps (as in, you join a few tables, group to some granularity, join some more, group again,
|
||||
etc).
|
||||
</p>
|
||||
<p>You can get the query's real execution details with <code>EXPLAIN ANALYZE</code>, but the output's
|
||||
readability is on par with morse-encoded regex patterns. I always had headaches dealing with them until
|
||||
I came across <a href="https://dalibo.com/" target="_blank" rel="noopener noreferrer">Dalibo</a>'s <a
|
||||
href="https://explain.dalibo.com/" target="_blank" rel="noopener noreferrer">execution plan
|
||||
visualizer</a>. You can paste the output of <code>EXPLAIN ANALYZE</code> there and see the query
|
||||
execution presented as a diagram. No amount of words will portray accurately how awesome the UX is, so
|
||||
I encourage you to try the tool with some nasty query and see for yourself.</p>
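<p>For reference, a sketch of producing a plan to paste there, using the hypothetical
<code>int_orders</code> table from earlier (the column names are made up for illustration). Note that
<code>EXPLAIN ANALYZE</code> actually executes the query, so be careful with statements that modify data:</p>
<pre><code>
-- BUFFERS adds I/O details to the plan; FORMAT JSON is optional.
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT o.customer_id,
       count(*)      AS orders,
       sum(o.amount) AS revenue
FROM int_orders AS o
GROUP BY o.customer_id;
</code></pre>
<p>The visualizer takes the plain text output as well, so the JSON format is a matter of preference.</p>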
|
||||
<h3>Local dev env + Foreign Data Wrapper</h3>
|
||||
<p>One of the awesome things of using Postgres is how trivial it is to spin up an instance. This makes
|
||||
goofing around much simpler than when setting up a new instance means paperwork, $$$, etc.</p>
|
||||
<p>Data team members at Truvi have a dockerized Postgres running on their laptops that they can use when
|
||||
they are developing on our DWH dbt project. In the early days, you could grab some production dump with
|
||||
some subset of tables from our staging layer and run significant chunks of our dbt DAG on your laptop if
|
||||
you were patient.</p>
|
||||
<p>A few hundred models later, this became increasingly difficult and finally impossible.
|
||||
</p>
|
||||
<p>Luckily, we came across Postgres' <a
|
||||
href="https://www.postgresql.org/docs/current/postgres-fdw.html">Foreign Data Wrapper</a>. There's
|
||||
quite a bit to it, but to keep it short here, just be aware that FDW allows you to make a Postgres
|
||||
server give access to some table in a different Postgres server while pretending they are local. So, you
|
||||
query table X in Postgres server A, even though table X is actually stored in Postgres server B. But
|
||||
your query works just the same as if it was a local genuine table.</p>
|
||||
<p>Setting these up is fairly trivial, and has allowed our dbt project contributors to be able to execute
|
||||
hybrid dbt runs where some data and tables are local to their laptop, whereas some upstream data is being
|
||||
read from the production server. The approach has been great so far, enabling them to actually test models
|
||||
before committing them to master in a convenient way.</p>
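<p>To give an idea of what the setup involves, here is a minimal sketch meant to be run on the local (laptop)
instance. The server name, host, credentials, schema and table names are all placeholders, not our actual
configuration:</p>
<pre><code>
-- Run on the local instance.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Describe how to reach the remote server (placeholder host and database).
CREATE SERVER prod_dwh
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'dwh.example.com', port '5432', dbname 'dwh');

-- Map the local user to a remote (ideally read-only) user.
CREATE USER MAPPING FOR CURRENT_USER
    SERVER prod_dwh
    OPTIONS (user 'readonly_user', password 'change-me');

-- Expose selected remote tables as foreign tables in a local schema.
CREATE SCHEMA IF NOT EXISTS staging_remote;
IMPORT FOREIGN SCHEMA staging
    LIMIT TO (orders, customers)
    FROM SERVER prod_dwh
    INTO staging_remote;

-- Queries now read from the remote server transparently.
SELECT count(*) FROM staging_remote.orders;
</code></pre>
<p>Using a read-only role on the remote side is a sensible precaution, since foreign tables can otherwise be
written to from the local instance.</p>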
|
||||
<hr>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
</section>
|
||||
</main>
|
||||
|
||||
</body>
|
||||
|
||||
</html>
|
||||
|
|
@ -1,203 +0,0 @@
|
|||
<!DOCTYPE HTML>
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<link rel="stylesheet" href="../styles.css">
|
||||
</head>
|
||||
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>
|
||||
Hi, Pablo here
|
||||
</h1>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
<hr>
|
||||
<section>
|
||||
<h2>Notes for myself during my departure from Superhog</h2>
|
||||
<p>I'm writing this a few days before my last day at Superhog (now called Truvi). Having a few company
|
||||
departures under my belt already, I know a bit about what will come next. I know one part of the drill is
|
||||
that 99% of the details of what happened during my tenure at the company will completely disappear from
|
||||
my memory for the most part, only triggered by eerily coincidental cues here and there every few years.
|
||||
I will remember clearly a few crucial, exciting days and situations. I will also hold well the names and
|
||||
faces of those with whom I worked closely, as well as my personal impression and judgement of them. I
|
||||
will remember the office, and some details of how my daily life was when I went there.</p>
|
||||
<p>But most other things will be gone from my brain, surprisingly fast.</p>
|
||||
<p>Knowing that experience is a great teacher, and regretting not doing this in the past, I've decided to
|
||||
collect a few notes from my time at Superhog, hoping they will serve me in making the lessons I've
|
||||
learnt here stick properly.</p>
|
||||
<ul>
|
||||
<li>Growing an organization really fast without an incredibly solid vision you're going to stick to is
|
||||
terrible. Time, money and effort will be wasted left and right, and unless you have some magic trick
|
||||
up your sleeve, you'll run out of money and panic. Growth is about having a great vision, going for
|
||||
it and hoping for the best, not about collecting resources and hoping that they will somehow align
|
||||
themselves towards making money.</li>
|
||||
<li>
|
||||
If you're in the leadership of a company, you make decisions, and then things go badly because of
|
||||
them, people are going to think you've fucked up and won't be happy about it. Should you publicly
|
||||
retrospect in an intelligent way and clearly show you've learnt the lesson, you have a chance at
|
||||
some degree of redemption, and you might even make some of the employees hopeful again that there's
|
||||
still a second chance to go for success. If you don't retrospect at all and pretend the mishaps have
|
||||
nothing to do with your management, they won't just think you're incompetent: they simply won't
|
||||
take you seriously anymore, and won't be honest with you because smart people don't invest calories in
|
||||
arguing with people who they consider idiots.
|
||||
</li>
|
||||
<li>
|
||||
If you're in a B2B business where customers will have a long-term relationship with you, and you
|
||||
have sales people, giving them incentives that are all about getting people onboard, and not about
|
||||
long term performance, might be an expensive mistake. I've observed sales people who only care about
|
||||
scoring deals engage in undesirable behaviours such as:
|
||||
<ul>
|
||||
<li>Sell to anyone, regardless of whether there's a good fit between your offering and the
|
||||
customer needs.</li>
|
||||
<li>Cut corners, surely do and say things that are in moral gray areas. If you're unlucky, even
|
||||
clearly cross moral red lines.</li>
|
||||
<li>
|
||||
Drive the people who build and deliver your offering crazy. Since they have no incentive to
|
||||
care about what happens after a deal is signed, they don't care if their actions in the
|
||||
sales pipeline turn into landmines during the long-term business relationship and execution
|
||||
of the service.
|
||||
</li>
|
||||
</ul>
|
||||
I think many of these issues get solved by structuring compensation so that sales people do well
|
||||
once the leads they convert have been doing well for some time, however you want to measure that.
|
||||
Not only can nasty behaviour be avoided, but even new, good and constructive actions might arise.
|
||||
For example, your sales people will care more about building a great product, and so they'll
|
||||
regularly give feedback to engineers and operations and care deeply about collaborating in improving
|
||||
things.
|
||||
</li>
|
||||
<li>
|
||||
If you're lucky to find talented employees, go crazy about retaining them.
|
||||
</li>
|
||||
<li>
|
||||
The unexpected death of colleagues can be a great blow to the business.
|
||||
</li>
|
||||
<li>Non-technical founders need CTOs with strong characters nearby to protect them from themselves.
|
||||
Soft-hearted CTOs with a pleasing attitude and aversion to conflict will feel sweet at first, but
|
||||
their lack of fighting back on certain topics will lead to sour consequences down the line.</li>
|
||||
<li>During my tenure, my team and I managed to deliver astonishing amounts of value with extremely
|
||||
simple tooling that was just enough for what we needed. This had many advantages and was a silent
|
||||
win. It's not sexy, but I think it should be.</li>
|
||||
<li>If you are silently efficient budget wise, as in you manage to achieve something consuming way less
|
||||
money than whatever is average for your context, but you don't explain it or make noise about
|
||||
it, nobody will give a damn. Even worse, your levels of efficiency may be taken for granted and you
|
||||
might encounter trouble when asking for more bucks, even if you're still way below average.</li>
|
||||
<li>When there's a feeling that a ship is going down, I've observed there's a direct correlation between
|
||||
how talented an employee is and the chances he departs early. The less gifted will stay until the
|
||||
end.</li>
|
||||
<li>
|
||||
If you're a SaaS and want to scale, don't leave your Finance team starved of IT resources. Invoicing,
|
||||
gathering customer payment details, the most frequent accounting journals, etc. should be treated as
|
||||
first class requirements of your architecture, not as an afterthought. Your finance team needs to
|
||||
grow in engineers, not accountants. And if you have the feeling that the number of accountants is
|
||||
growing linearly with the volume of the business, you are in serious trouble and need to do
|
||||
something. Failing to do this will lead to some very nasty tech debt that will kill your speed and
|
||||
potentially make you lose a lot of money.
|
||||
</li>
|
||||
<li>If you've had employees rotating through various departments in your org, doing very different jobs,
|
||||
their views and opinions are worth solid gold and should be valued as such.</li>
|
||||
<li>Right before starting in this company, I had just read the book It doesn't have to be crazy at work
|
||||
by Jason Fried and DHH. I already believed by then that it's worth creating a calm
|
||||
environment to think clearly, since doing the right thing is way more important than executing fast,
|
||||
and fast-paced environments are not great for keeping your head clear. After my time here, I'm 100
|
||||
times more of a believer.</li>
|
||||
<li>Giving people in the business some basics on SQL is really useful, but that usefulness gets
|
||||
multiplied by the tidiness and documentation of your DWH. If they need to call you up every time
|
||||
because there's no way they can find and understand what they need in the DWH, teaching them SQL is
|
||||
pointless and only leads to frustration.</li>
|
||||
<li>
|
||||
If you were part of the decision to hire someone, and then they decide to leave, you should talk
|
||||
with them. Even if you're not working together every day anymore and the org has changed quite a
|
||||
bit. You had a stake in this person's entrance, they remember it vividly, and not calling them to
|
||||
grab a coffee and say bye properly will disappoint them.
|
||||
</li>
|
||||
<li>If a manager gets fired and their direct reports now report to you, and you know they had
|
||||
strong respect for him, make sure to acknowledge that feeling first thing. Saying something along
|
||||
the lines of "Guys, I know you respected X and were fond of working with him, and that you might not
|
||||
be happy with his departure and having to report to me instead. I understand that and think it's
|
||||
natural", will go a long way in helping with the grieving and making them feel more comfortable.
|
||||
</li>
|
||||
<li>
|
||||
Engineering leadership is quite a bit like parenting when it comes to mirroring. Regardless of what
|
||||
you say should be done, people will ignore that a lot and tend to do what you do. If senior
|
||||
engineers do patchy shit on the database, don't document a thing, cut corners instead of building
|
||||
properly, mindlessly submit to absurd requests instead of collaborating productively with their
|
||||
non-tech colleagues, etc., the rest of the engineers will do it as well, regardless of how many training
|
||||
sessions on best practices you run. Conversely, if you focus on quality, give time and room to do
|
||||
things right, reward ingenious solutions to problems, treat incidents in professional and serious
|
||||
ways, push back on stupid managerial situations to work things out in a way that is good
|
||||
for everyone, document your work properly, etc. you will soon find the rest of your colleagues
|
||||
(especially the most junior ones) following your lead, often without you even needing to
|
||||
insist on good practices.
|
||||
</li>
|
||||
<li>People care little about having an office on the beachfront.</li>
|
||||
<li>If you manage to raise a team to have a team-ownership mentality (as in, they know their high
|
||||
level goal and will always strive for it, even if you don't provide strict guidance or are not
|
||||
around at all), you might get an ego hit because it suddenly feels like you are not needed. Rest
|
||||
assured, even if you're not immediately needed, your team values you, and other managers and
|
||||
executives will notice the work you've done with the team, even if they don't voice it.
|
||||
Unfortunately, it's likely that the appreciation for your achievement only becomes visible when you
|
||||
decide to leave and people panic.</li>
|
||||
<li>People with cowardly and anxious characters might seem harmless. But don't underestimate their
|
||||
ability to allow terrible things to happen precisely because stepping up and acting would require
|
||||
courage and they have none. Basically the way <a href="https://www.youtube.com/watch?v=uW9Q1cm_Tnw"
|
||||
target="_blank" rel="noopener noreferrer">Upham gets Mellish killed in Saving Private Ryan</a>.
|
||||
The atrocities that can be tolerated or even supported by cowards in this manner can be terrible,
|
||||
and even more depressing than the loud acts of bold, evil men. Plus, if there are many cowards
|
||||
clustered together, groupthink will give them comfort and normalize the behaviour in the fashion
|
||||
of the bystander effect.
|
||||
</li>
|
||||
<li>
|
||||
Only badmouth a colleague in front of others if you are going to eventually raise the issue to
|
||||
either him or his superiors. Don't ever pointlessly complain to third parties if you will never
|
||||
act on the issue. Especially with your direct reports. It generates a nasty victim culture, and once
|
||||
you get it started, it's hard to stop.
|
||||
</li>
|
||||
<li>
|
||||
I have found that the lack of easy access to data and skills like SQL and data analysis are orders
|
||||
of magnitude less of an issue to organizational data literacy compared to managers who don't expect
|
||||
a data-driven approach from their reports. People start caring about data when their boss demands
|
||||
they do their work using data. People continue ignoring data when their boss tolerates the lack of
|
||||
data. And when observing this, it's vital to also keep in mind that reports adjust their behaviour
|
||||
to what their manager actually rewards/punishes, not what their manager <em>says</em> he will
|
||||
reward/punish.
|
||||
</li>
|
||||
<li>Joel Spolsky wrote <a
|
||||
href="https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/">this thing
|
||||
you should never do</a>. The company decided to do it anyway. We fucked up and we paid for the
|
||||
consequences exactly as Spolsky lays out.</li>
|
||||
<li>Employees with little retrospective capacity are dangerous. Partly related to the previous point.
|
||||
This quote from Spolsky hits the nail on the head: <em>It's important to remember that when you start
|
||||
from scratch there is absolutely no reason to believe that you are going to do a better job than
|
||||
you did the first time. First of all, you probably don't even have the same programming team
|
||||
that worked on version one, so you don't actually have “more experience”. You're just going to
|
||||
make most of the old mistakes again, and introduce some new problems that weren't in the
|
||||
original version.</em> I would only nuance that if you can retrospect deeply and learn from the
|
||||
mistakes you made on the first run, perhaps there's some hope you may do a better job starting from
|
||||
scratch. But I guess if you manage to retrospect and learn properly, you could also do a better job
|
||||
working your way out incrementally from the fucked up situation you failed yourself into.</li>
|
||||
<li>If you set up a variable/bonus scheme, refrain from changing the structure frequently, even if you
|
||||
are not really making it more stingy. Too much shuffling in that area will get employees thinking
|
||||
you're playing games with them, even if it's not the case.</li>
|
||||
<li>When an engineer who designed, deployed, and since then operated a non-trivial production system is
|
||||
about to leave, ask him to finish his handover and lock himself out days before his actual last
|
||||
working day. Challenge him to a nice treat by making it clear that, once he's locked out, he doesn't
|
||||
have any duties other than helping his colleagues should something not work after his departure.
|
||||
This way, you provide him with an incentive to hand over as perfectly and quickly as possible (paid
|
||||
holidays), and you make sure that you get a chance to try to operate the system without him
|
||||
<em>before</em> he leaves. Not doing this, and instead simply waiting for his last day to remove his
|
||||
creds and users from the infra, is dangerous.
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
<hr>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
</section>
|
||||
</main>
|
||||
|
||||
</body>
|
||||
|
||||
</html>
|
||||
|
|
@ -1,78 +0,0 @@
|
|||
<!DOCTYPE HTML>
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<link rel="stylesheet" href="../styles.css">
|
||||
</head>
|
||||
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>
|
||||
Hi, Pablo here
|
||||
</h1>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
<hr>
|
||||
<section>
|
||||
<h2>The ROI of toilets</h2>
|
||||
<p>Years ago I worked under the organizational umbrella of this COO. He was my boss' boss. Sometimes we
|
||||
bumped into each other for big meetings and presentations.</p>
|
||||
<p>The COO had a background in finance and audit, which gave him certain management quirks that coupled in
|
||||
rather funny ways with the nature of our data and analytics departments. There was this specific one
|
||||
that was always itchy to me. At the time I was still a very junior and inexperienced professional, and
|
||||
my default stance on things was to humble out, shut the fuck up and listen. But I always had my opinions
|
||||
locked in my brain, and in cases like this one, I couldn't hold them back.</p>
|
||||
<p>
|
||||
The quirk this gentleman had was to try to measure the ROI of every little thing. He would ask for the
|
||||
ROI of projects, the ROI of developments, the ROI of acquiring licenses, the ROI of going out for a
|
||||
smoke. It was an understandable quirk for a financier who had never actually built or serviced anything,
|
||||
but rather always looked, judged and measured from the outside. He wasn't that interested in the things
|
||||
themselves, but rather in measuring them in units that would fit in his Excel sheets.
|
||||
</p>
|
||||
<p>
|
||||
I generally thought (and still think) that assessing ROI is a good thing to aim for. But intuitively, I
|
||||
found his obsession with it misplaced and counterproductive. I now have much better words to critique
|
||||
and argue against his stance, but at the time I lacked those and only had a gut feeling of "this is
|
||||
stupid".
|
||||
</p>
|
||||
<p>
|
||||
One day we were in one of those meetings where he would start asking about the ROI of something while I
|
||||
thought to myself: "We just need this thing and it's obviously more valuable than the money it will
|
||||
cost, why are we having this conversation uuuugh". As I spiritually (not physically) rolled my eyes, I
|
||||
couldn't hold it in anymore and just shot: "What's the ROI of the office toilets?"
|
||||
</p>
|
||||
<p>
|
||||
The COO and my boss suddenly stared at me, mouths open, eyebrows pressed down as they squinted their
|
||||
eyes: "What?"
|
||||
</p>
|
||||
<p>
|
||||
"We have toilets. We have to pay for them. We could use them for desk space, but instead we put toilets.
|
||||
Then we have to do plumbing and stuff. We need to pay people to clean them. It's a nuisance. How can we
|
||||
know that they are the best use of shareholder funds? Who has measured the ROI of those toilets?". It
|
||||
all came out naturally out of the blue. I had a great relationship with these people, but I still was
|
||||
clenching my butt, wondering if I had gone a bit too hard. Oh how nice it is to be young.
|
||||
</p>
|
||||
<p>
|
||||
They chuckled and got my point. The COO stopped insisting on specific figures for the cost
|
||||
element we had at hand, although he didn't surrender a good old "write a business case for this so we
|
||||
can refer to it later", which probably was a wise thing to do.
|
||||
</p>
|
||||
<p>
|
||||
I've faced similar situations a few times since then, and I've found myself in many others where it was
|
||||
up to me whether and how precisely the ROI of something should be measured. I now have a much clearer mental
|
||||
model and opinion of when it should and shouldn't be done. But that's for another day.
|
||||
</p>
|
||||
<p>
|
||||
Nobody ever told me what the ROI of the toilets was, though. Perhaps we should remove them?
|
||||
</p>
|
||||
<hr>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
</section>
|
||||
</main>
|
||||
|
||||
</body>
|
||||
|
||||
</html>
|
||||
132
public/writings/valuing-data-teams-output.html
Normal file
|
|
@ -0,0 +1,132 @@
|
|||
<!DOCTYPE HTML>
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<link rel="stylesheet" href="../styles.css">
|
||||
</head>
|
||||
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>
|
||||
Hi, Pablo here
|
||||
</h1>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
<hr>
|
||||
<section>
|
||||
<h2>Valuing data teams' output</h2>
|
||||
<p>
|
||||
- Freedom and fucking up
|
||||
- What work looks like
|
||||
- This is an economics problem
|
||||
- Fairy tale organizational designs
|
||||
</p>
|
||||
<p>
|
||||
In 2023, I had the chance to do something not a lot of people get to do: I started a Data team in a
|
||||
startup (<a href="https://truvi.com/">Truvi, formerly Superhog</a>) from scratch.
|
||||
</p>
|
||||
<p>
|
||||
Being in a greenfield situation, both in organizational and technical terms, was equally challenging and
|
||||
rewarding. It gave me the right space and craving to spend time thinking about stuff I hadn't before. This
|
||||
included very foundational questions such as... what should the Data team do? The kind of stuff you
|
||||
don't think about much when you land in a cruise ship that's already been rolling for a while, and you
|
||||
get told your job is to pull that lever up and down when the light tells you to. Ever since, I've had
|
||||
the
|
||||
chance to learn and think a lot about embedding a Data team in a small SaaS company.
|
||||
</p>
|
||||
<p>
|
||||
One of the hard and interesting topics is how you measure the success of the team. How do you look at
|
||||
what the team has done and answer the following questions:
|
||||
</p>
|
||||
<ul>
|
||||
<li>How valuable is this thing we delivered?</li>
|
||||
<li>Was it the most valuable thing we could have done?</li>
|
||||
</ul>
|
||||
<p>
|
||||
These are not trivial questions. Because it's easy to fuck up. Being a nimble team in a small company,
|
||||
the amount of flexibility you enjoy is ecstatic. You can (and usually need to) pivot a lot, very fast.
|
||||
But with freedom comes responsibility, and the pleasure of having many choices comes with the pain of
|
||||
wondering if you're screwing up in what you choose.
|
||||
</p>
|
||||
<h2>
|
||||
What work looks like
|
||||
</h2>
|
||||
<p>
|
||||
I find experience and real situations make abstract rants like this one much more interesting, so let me
|
||||
explain a bit what the Data team at Truvi faces on a daily basis to give some context.
|
||||
</p>
|
||||
<p>
|
||||
Truvi is a SaaS company that services short-term rental (STR) hosts and guests. Our goal is to help
|
||||
both parties reduce and manage risk in their bookings. Risk here means, for either party, the other
|
||||
party doing something nasty to you (e.g. your guest burns down your BnB's kitchen, or your host lets you
|
||||
know the property you booked is flooded right when you show up at the door on a Monday night at
|
||||
11:30PM). We offer multiple services, like screening and protection, to help both parties manage this,
|
||||
and we charge fees for it.
|
||||
</p>
|
||||
<p>
|
||||
We deliver our services through a couple of in-house developed applications and some API integrations
|
||||
with <a href="https://en.wikipedia.org/wiki/Property_management_system">PMSs</a>, <a
|
||||
href="https://en.wikipedia.org/wiki/Online_travel_agency">OTAs</a> and other funky acronym-named
|
||||
types of companies involved in the STR industry.
|
||||
</p>
|
||||
<p>
|
||||
The Data team's main responsibility, as defined by me, is to ensure people in the company know what they
|
||||
need to know. We deliver this in multiple ways:
|
||||
</p>
|
||||
<ul>
|
||||
<li>We maintain a lot of reporting. Some of it might be company-wide KPIs all management looks at,
|
||||
the rest is more operational detail that only affects certain teams or functions.</li>
|
||||
<li>
|
||||
We keep ourselves available for ad hoc, quick and dirty, one-off requests. We rotate this through the
|
||||
different members of the team since it's quite disruptive for one's agenda and focus.
|
||||
</li>
|
||||
<li>
|
||||
We deliver ad hoc, slow and steady, brainy reports whenever people not only need Data, but someone
|
||||
who knows what he's doing because the analysis requires above average data literacy.
|
||||
</li>
|
||||
<li>
|
||||
We support data-heavy projects, such as A/B testing or the acquisition of external data sources.
|
||||
</li>
|
||||
</ul>
|
||||
<p>
|
||||
Even if this categorization looks neat, the reality is more of a barrage of a million different things,
|
||||
coming through the door all at once without any order.
|
||||
</p>
|
||||
<p>
|
||||
Given our humble capacity for delivering and our colleagues' heavy appetite for asking, only a sliver
|
||||
of what gets requested will be done soon. One of my jobs is to decide, together with the company
|
||||
leadership, what makes it in. It's a tough job at Truvi, and it's been a tough job at previous companies
|
||||
I've been at. I think that is the case because of poor organizational design. And I think we have a lot
|
||||
to learn from economics.
|
||||
</p>
|
||||
<h2>Economic calculation</h2>
|
||||
<p>
|
||||
The situation we have in my team is an economic one. We have lots of needs and we can't satisfy all
|
||||
of them.
|
||||
</p>
|
||||
<p>
|
||||
This is the same situation society faces at scale: there's plenty of capital and man hours we can put
|
||||
to good use, but we have infinite options. What do we make more of: hospitals, schools or beers?
|
||||
</p>
|
||||
<p>
|
||||
In society, despite what statists and bureaucrats would like, these decisions are not made by a bunch of
|
||||
all-knowing intellectuals in their party's office. They are made on the streets, by individuals that
|
||||
decide how to spend their own money and time.
|
||||
</p>
|
||||
<p>
|
||||
People spend their own resources very wisely. Even if it might look like they do stupid stuff, they don't. They do
|
||||
what's good for them, with their resources and preferences. Even if we don't share their choices. Even
|
||||
if we think we know better than them.
|
||||
</p>
|
||||
|
||||
<hr>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
</section>
|
||||
</main>
|
||||
|
||||
</body>
|
||||
|
||||
</html>
|
||||
|
|
@ -1,120 +0,0 @@
|
|||
<!DOCTYPE HTML>
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<link rel="stylesheet" href="../styles.css">
|
||||
</head>
|
||||
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>
|
||||
Hi, Pablo here
|
||||
</h1>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
<section>
|
||||
<h2>Why I Put My VMs on a ZFS Mirror</h2>
|
||||
<p><em>Part 1 of 3 in my "First ZFS Degradation" series. Also read <a href="a-degraded-pool-with-a-healthy-disk.html">Part 2: Diagnosing the Problem</a> and <a href="fixing-a-degraded-zfs-mirror.html">Part 3: The Fix</a>.</em></p>
|
||||
<h3>Why This Series Exists</h3>
|
||||
<p>A few weeks into running my new homelab server, I stumbled upon something I wasn't expecting to see that early: my ZFS pool was in "DEGRADED" state. One of my two mirrored drives had gone FAULTED.</p>
|
||||
<p>This was the first machine I had set up with a ZFS mirror, precisely to be able to deal with disk issues smoothly, without losing data or having downtime. Although it felt like a pain in the ass to spot the problem, I was also happy because it gave me a chance to drill the kind of disk maintenance I was hoping to do in this new server.</p>
|
||||
<p>But here's the thing: when I was in the middle of it, I couldn't find a single resource that walked through the whole experience in detail. Plenty of docs explain what ZFS <em>is</em>. Plenty of forum posts have people asking "help my pool is degraded." But nothing that said "here's what it actually feels like to go through this, step by step, with all the commands and logs and reasoning behind the decisions."</p>
|
||||
<p>So I wrote it down. I took a lot of notes during the process and crafted a more or less organized story from them. This three-part series is for fellow amateur homelabbers who are curious about ZFS, maybe a little intimidated by it, and want to know what happens when things go sideways. I wish I had found a very detailed log like this when I was researching ZFS initially. Hope it helps you.</p>
|
||||
<h3>The server and disks</h3>
|
||||
<p>My homelab server is a modest but capable box I built in late 2025. It has decent consumer hardware, but nothing remarkable. I'll only specify that it currently has three disks:</p>
|
||||
<ul>
|
||||
<li><strong>OS Drive</strong>: Kingston KC3000 512GB NVMe. Proxmox lives here.</li>
|
||||
<li><strong>Data Drives</strong>: Two Seagate IronWolf Pro 4TB drives (ST4000NT001). This is where my Proxmox VMs get their disks stored.</li>
|
||||
</ul>
|
||||
<p>The two IronWolf drives are where this story takes place. I labeled them AGAPITO1 and AGAPITO2 because... well, every pair of drives deserves a silly name. I have issues remembering serial numbers.</p>
|
||||
<p>The server runs Proxmox and hosts most of my self-hosted life: personal services, testing VMs, and my Bitcoin infrastructure (which I share over at <a href="https://bitcoininfra.contrapeso.xyz" target="_blank" rel="noopener noreferrer">bitcoininfra.contrapeso.xyz</a>). If this pool goes down, everything goes down.</p>
|
||||
<h3>Why ZFS?</h3>
|
||||
<p>I'll be honest: I didn't overthink this decision. ZFS is the default storage recommendation for Proxmox, it has a reputation for being rock-solid, and I'd heard enough horror stories about silent data corruption to want something with checksumming built in.</p>
|
||||
<p>What I was most interested in was the ability to define RAID setups in software and deal easily with disks going in and out of them. I had never gone beyond the naive "one disk for the OS, one disk for data" setup in previous servers. After having disks fail on me in previous boxes, I decided it was time to gear up and do it properly this time. My main concern initially was just saving time: it's messy when a "simple" host has disk issues, and I hoped mirroring would allow me to invest less time in cleaning up disasters.</p>
|
||||
<h3>Why a Mirror?</h3>
|
||||
<p>When I set up the pool, I had two 4TB drives. That gave me a few options:</p>
|
||||
<ol>
|
||||
<li><strong>Stripe (no redundancy)</strong>: Maximum space (8TB usable), zero redundancy. One bad sector and you're crying.</li>
|
||||
<li><strong>Mirror</strong>: Half the space (4TB usable from 8TB raw), but everything is written to both drives. One drive can completely die and you lose nothing.</li>
|
||||
<li><strong>RAIDZ</strong>: Needs at least 3 drives, gives you parity-based redundancy. More space-efficient than mirrors at scale.</li>
|
||||
</ol>
|
||||
<p>I went with the mirror for a few reasons.</p>
|
||||
<p>First, I only had two drives to start with, so RAIDZ wasn't even an option yet.</p>
|
||||
<p>Second, mirrors are <em>simple</em>. Data goes to both drives. If one dies, the other has everything. No parity calculations, no write penalties, no complexity.</p>
|
||||
<p>Third (and this is the one that sold me), <strong>mirrors let you expand incrementally</strong>. With ZFS, you can add more mirror pairs (each one a "vdev") to your pool later. You can even mix sizes: start with two 4TB drives, add two 8TB drives later, and ZFS will use all of it. RAIDZ has traditionally not given you that flexibility: until RAIDZ expansion arrived in OpenZFS 2.3, once you set your vdev width you were stuck with it.</p>
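<p>To make the capacity arithmetic concrete, here is a minimal Python sketch. It assumes that each mirror vdev contributes the size of its smallest disk and that the pool's capacity is the sum over its vdevs. It ignores filesystem overhead, and the function name <code>mirror_pool_usable_tb</code> is made up for illustration.</p>

```python
# Sketch: usable capacity of a pool built from mirror vdevs.
# Assumptions: each mirror contributes the size of its smallest member,
# the pool adds up its vdevs, and filesystem overhead is ignored.

def mirror_pool_usable_tb(vdevs: list[list[float]]) -> float:
    """Sum the usable size (in TB) of each mirror vdev in the pool."""
    if any(len(disks) < 2 for disks in vdevs):
        raise ValueError("every mirror vdev needs at least two disks")
    # A mirror can only store as much as its smallest disk holds.
    return sum(min(disks) for disks in vdevs)


if __name__ == "__main__":
    today = mirror_pool_usable_tb([[4, 4]])          # the current two-drive pool
    later = mirror_pool_usable_tb([[4, 4], [8, 8]])  # after adding an 8TB pair
    print(f"today: {today}TB, later: {later}TB")
```

<p>Under those assumptions, the current pool offers roughly 4TB, and adding a pair of 8TB drives would bring it to roughly 12TB.</p>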
|
||||
<h4>When Would RAIDZ Make More Sense?</h4>
|
||||
<p>If you're starting with 4+ drives and you want to maximize usable space, RAIDZ starts looking attractive:</p>
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Configuration</th>
|
||||
<th>Drives</th>
|
||||
<th>Usable Space</th>
|
||||
<th>Fault Tolerance</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td>Mirror</td>
|
||||
<td>2</td>
|
||||
<td>50%</td>
|
||||
<td>1 drive</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>RAIDZ1</td>
|
||||
<td>3</td>
|
||||
<td>~67%</td>
|
||||
<td>1 drive</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>RAIDZ1</td>
|
||||
<td>4</td>
|
||||
<td>75%</td>
|
||||
<td>1 drive</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>RAIDZ2</td>
|
||||
<td>4</td>
|
||||
<td>50%</td>
|
||||
<td>2 drives</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>RAIDZ2</td>
|
||||
<td>6</td>
|
||||
<td>~67%</td>
|
||||
<td>2 drives</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
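<p>The percentages in the table follow a simple rule of thumb: a two-way mirror keeps one drive's worth of data, while RAIDZ1, RAIDZ2 and RAIDZ3 give up one, two and three drives' worth to parity. A rough Python sketch, assuming equal-sized drives and ignoring padding and metadata overhead (the function name <code>usable_fraction</code> is hypothetical):</p>

```python
# Rough usable-capacity estimate for the layouts in the table.
# Assumptions: equal-sized drives; padding, metadata and reserved space
# are ignored, so a real pool will report somewhat less.

PARITY_DRIVES = {"raidz1": 1, "raidz2": 2, "raidz3": 3}


def usable_fraction(layout: str, drives: int) -> float:
    """Return the approximate fraction of raw capacity that is usable."""
    if layout == "mirror":
        if drives < 2:
            raise ValueError("a mirror needs at least two drives")
        # An n-way mirror stores n copies of the same data.
        return 1 / drives
    if layout in PARITY_DRIVES:
        parity = PARITY_DRIVES[layout]
        if drives <= parity:
            raise ValueError(f"{layout} needs more than {parity} drives")
        return (drives - parity) / drives
    raise ValueError(f"unknown layout: {layout}")


if __name__ == "__main__":
    rows = [("mirror", 2), ("raidz1", 3), ("raidz1", 4), ("raidz2", 4), ("raidz2", 6)]
    for layout, drives in rows:
        print(f"{layout:7} {drives} drives: {usable_fraction(layout, drives):.0%} usable")
```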
|
||||
<p>RAIDZ2 is popular for larger arrays because it can survive <em>two</em> drive failures, which matters more as you add drives (more drives = higher chance of one failing during a resilver).</p>
|
||||
<p>But for a two-drive homelab that might grow to four drives someday, I felt a mirror was the right call. I can always add another mirror pair later.</p>
|
||||
<h3>The Pool: proxmox-tank-1</h3>
|
||||
<p>My ZFS pool is called <code>proxmox-tank-1</code>. Here's what it looks like when everything is healthy:</p>
|
||||
<pre><code> pool: proxmox-tank-1
|
||||
state: ONLINE
|
||||
config:
|
||||
|
||||
NAME STATE READ WRITE CKSUM
|
||||
proxmox-tank-1 ONLINE 0 0 0
|
||||
mirror-0 ONLINE 0 0 0
|
||||
ata-ST4000NT001-3M2101_WX11TN0Z ONLINE 0 0 0
|
||||
ata-ST4000NT001-3M2101_WX11TN2P ONLINE 0 0 0</code></pre>
|
||||
<p>That's it. One pool, one mirror vdev, two drives. The drives are identified by their serial numbers (the <code>WX11TN0Z</code> and <code>WX11TN2P</code> parts), which is important — ZFS uses stable identifiers so it doesn't get confused if Linux decides to shuffle around <code>/dev/sda</code> and <code>/dev/sdb</code>.</p>
|
||||
<p>All my Proxmox VMs store their virtual disks on this pool. When I create a new VM, I point its storage at <code>proxmox-tank-1</code> and ZFS handles the rest.</p>
|
||||
<h3>What Could Possibly Go Wrong?</h3>
|
||||
<p>Everything was humming along nicely. VMs were running fine and I was feeling pretty good about my setup.</p>
|
||||
<p>Then, a few weeks in, I was poking around the Proxmox web UI and noticed something that caught my eye.</p>
|
||||
<p>The ZFS pool was DEGRADED. One of my drives — AGAPITO1, serial <code>WX11TN0Z</code> — was FAULTED.</p>
|
||||
<p>In <a href="a-degraded-pool-with-a-healthy-disk.html">Part 2</a>, I'll walk through how I diagnosed what was actually wrong. Spoiler: the drive itself was fine. The problem was much dumber than that.</p>
|
||||
<p><em>Continue to <a href="a-degraded-pool-with-a-healthy-disk.html">Part 2: Diagnosing the Problem</a></em></p>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
</section>
|
||||
</main>
|
||||
|
||||
</body>
|
||||
|
||||
</html>
|
||||
|
||||
|
|
@ -1,93 +0,0 @@
|
|||
<!DOCTYPE HTML>
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>Pablo here</title>
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<link rel="stylesheet" href="../styles.css">
|
||||
</head>
|
||||
|
||||
|
||||
<body>
|
||||
<main>
|
||||
<h1>
|
||||
Hi, Pablo here
|
||||
</h1>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
<hr>
|
||||
<section>
|
||||
<h2>Your customers don't care that your bathroom is dirty</h2>
|
||||
<p>The other night I went out with the missus and we went to a fancy-pants restaurant, which is unusual for
|
||||
us. We prefer neighbourhood, simple places.</p>
|
||||
<p>
|
||||
During the dinner, she went to the bathroom and came back horrified: "God, their bathroom is fucking
|
||||
disgusting". "Much worse than the usual one?", I asked. And she said: "No, but I would expect an upscale
|
||||
place like this to have it squeaky clean".
|
||||
</p>
|
||||
<p>
|
||||
I then laid down my thesis on why all restaurant bathrooms, even in really posh places, are always
|
||||
terrible: "They don't care because you don't really care". "I do care!", she hit me back. "No, you
|
||||
don't. You think you do: you obviously don't like it, and you would love to see it clean instead of all
|
||||
filthy. But the truth is, when next month you're thinking about where to go out for dinner, you'll judge
|
||||
this place and remember the meal, the waiters, how you felt. But not the bathroom. When was the last
|
||||
time you ruled out a restaurant because the bathroom was gross?". At this point she agreed, and quickly
|
||||
turned her attention to the dessert menu. Sometimes I invest too much energy and talk in things people
|
||||
find boring.
|
||||
</p>
|
||||
<h2>The bathrooms of products</h2>
|
||||
<p>
|
||||
There are a couple of things we can learn here.
|
||||
</p>
|
||||
<p>
|
||||
Your product surely has <em>bathrooms</em>. Those little corners that are not the main course, and
|
||||
your customers don't care about much. You need them. Not having them would be problematic. I don't fuss
|
||||
over a dirty bathroom in a restaurant, but I'm pretty confident I would remember a restaurant not having
|
||||
a bathroom at all if it was responsible for some desperate run-for-it trip in search of a place to drop
|
||||
my bombs.
|
||||
</p>
|
||||
<p>
|
||||
Your product's bathrooms are those secondary features your customers kind of need, but don't care much
|
||||
about. It's that export to CSV button. Your customer John needs it to push the data into his accounting
|
||||
books. The formats of the date columns are weird, and the column names are confusing, and the fact that
|
||||
you send a link to his email to download it instead of just triggering a download in his browser the
|
||||
moment he hits the button, make it all quite cumbersome. But, all in all, it's minor pain. The moment he
|
||||
uploads it into the accounting software, he forgets about it.
|
||||
</p>
|
||||
<p>
|
||||
I think it's important to be aware of what those are in your product, so you can prioritise accordingly
|
||||
and avoid some feature-prioritisation bike shedding. Theoretically, it should be obvious, because you
|
||||
know what's important (right? Right?!?), and whatever is not important, is probably not important. But
|
||||
then somehow I still see mistakes made around this type of feature.
|
||||
</p>
|
||||
<p>
|
||||
I recently had a conversation with my company's CTO about a situation like this. I had some frustration
|
||||
to vent. We had invested so much time and effort in improving the UI of one of our applications. And it
|
||||
was so pointless. "There's a good chunk of our customer base that pretty much never go into this UI", I
|
||||
told him. "They only contact us through a form when they need the service they hired. I don't think they
|
||||
care about this, and I don't think the nicer UI is going to bring any value to them, nor any money to
|
||||
us".
|
||||
</p>
|
||||
<p>
|
||||
That UI has to be there. It's where they check some settings. Reset their password. The boring stuff.
|
||||
But once it is functional, there isn't much more value to gain by improving it.
|
||||
</p>
|
||||
<p>
|
||||
I think it's important to identify what your bathrooms are and make sure you act accordingly. I find
|
||||
it's not enough to only care about making the important stuff top priority: it helps to also make it
|
||||
clear what's not important, and be explicit about it being low priority. Just like when I define the
|
||||
scope for something, I like to both think in terms of what we are including, and also make an explicit
|
||||
list of what we are NOT including for the sake of clarity. Theoretically, just listing the positive list
|
||||
should be enough. In reality, my experience tells me making the negative one helps a lot.
|
||||
</p>
|
||||
<p>
|
||||
So, what are your bathrooms? Are you cleaning them with a toothbrush? Or do you keep them nice and dirty?
|
||||
</p>
|
||||
<hr>
|
||||
<p><a href="../index.html">back to home</a></p>
|
||||
</section>
|
||||
</main>
|
||||
|
||||
</body>
|
||||
|
||||
</html>
|
||||