<section>
<h2>Replacing a Failed Disk in a ZFS Mirror</h2>
<p>If you've been following along, you know the story: I set up a <a href="why-i-put-my-vms-on-a-zfs-mirror.html">ZFS mirror for my Proxmox VMs</a>, then one of the drives <a href="a-degraded-pool-with-a-healthy-disk.html">started acting flaky</a>, and I <a href="fixing-a-degraded-zfs-mirror.html">diagnosed and fixed what turned out to be a bad SATA connection</a>.</p>
<p>Well, the connection wasn't the whole story. A few weeks after that fix, the same drive, AGAPITO1, started dropping off again. Same symptoms: link resets, speed downgrades, kernel giving up on the connection. I went through the cable swap dance again, tried different SATA ports on the motherboard, tried different cables. Nothing helped. The SATA PHY on the drive itself was failing.</p>
<p>I contacted PcComponentes (where I bought it), RMA'd the drive, and ran degraded on AGAPITO2 alone for about two weeks. Then the replacement arrived. This article covers the process of physically installing a new drive and getting it into the ZFS mirror, from "box on the desk" to "pool healthy, mirror whole."</p>
<h3>The starting point</h3>
<p>Before doing anything, this is what the pool looked like:</p>
<pre><code> pool: proxmox-tank-1
state: DEGRADED
config:

NAME STATE READ WRITE CKSUM
proxmox-tank-1 DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
ata-ST4000NT001-3M2101_WX11TN0Z REMOVED 0 0 0
ata-ST4000NT001-3M2101_WX11TN2P ONLINE 0 0 0
errors: No known data errors</code></pre>
<p><code>DEGRADED</code> with one drive <code>REMOVED</code>. The old drive (WX11TN0Z) was physically gone, shipped back to PcComponentes. AGAPITO2 (WX11TN2P) was holding down the fort alone.</p>
<p>This is the beauty and the terror of a degraded mirror: everything works fine. Your VMs keep running, your data is intact, reads and writes happen normally. But you have zero redundancy. If that surviving drive has a bad day, you lose everything. Two weeks of running like this was two weeks of hoping AGAPITO2 stayed healthy.</p>
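<p>A degraded pool doesn't announce itself unless you ask, so for a stretch like this an automated nag is worth having. As a minimal sketch (the helper name and cron wiring are mine, not part of this setup), <code>zpool status -x</code> prints a one-line healthy message you can test against:</p>

```shell
#!/bin/sh
# check_status: classify `zpool status -x` output. Illustrative sketch;
# assumes the usual one-line healthy message.
check_status() {
  if [ "$1" = "all pools are healthy" ]; then
    echo "OK"
  else
    echo "ALERT"
  fi
}

# Intended cron use (not run here):
#   check_status "$(zpool status -x)" | logger -t zfs-check
```

<p>Wire the ALERT case to whatever notification actually reaches you.</p>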
<h3>Before you touch hardware</h3>
<p>Before doing anything physical, I wanted to capture the current state. When things go wrong during maintenance, you want to be able to compare "before" and "after."</p>
<p>Three things to record while the server is still running:</p>
<p><strong>Pool status</strong>, the <code>zpool status</code> output above. You want to know exactly what ZFS thinks the world looks like right now.</p>
<p><strong>SATA layout</strong>, which drive is on which port:</p>
<pre><code>dmesg -T | grep -E 'ata[0-9]+\.[0-9]+: ATA-|ata[0-9]+: SATA link up'</code></pre>
<p>In my case, AGAPITO2 was on ata4 and ata3 was empty (the old drive's port). This matters because after you install the new drive, you want to confirm it shows up on the expected port.</p>
<p><strong>Surviving drive health</strong>, to make sure the drive you're depending on is actually healthy before you start:</p>
<pre><code>smartctl -H /dev/disk/by-id/ata-ST4000NT001-3M2101_WX11TN2P</code></pre>
<pre><code>SMART overall-health self-assessment test result: PASSED</code></pre>
<p>If this says anything other than <code>PASSED</code>, stop and deal with that first. You don't want to discover your only remaining copy of data is on a failing drive while you're in the middle of hardware work.</p>
<p>Once you've got your reference snapshots, shut down the server gracefully:</p>
<pre><code>shutdown -h now</code></pre>
<h3>Physical installation</h3>
<p>I won't write a hardware installation tutorial, every case and drive bay is different. But a few practical tips for homelabbers doing this for the first time:</p>
<ul>
<li><strong>Inspect your cables before connecting them.</strong> If the SATA data cable has been sitting disconnected in the case, check the connector pins. Bent pins or dust can cause exactly the kind of intermittent issues that started this whole saga.</li>
<li><strong>Label the new drive.</strong> I labeled mine "TOMMY" with its serial number (WX120LHQ) written on a sticker. Yes, I name my drives. It makes debugging much easier than squinting at serial numbers.</li>
</ul>

<p>After booting with the new drive installed, the kernel log (the same <code>dmesg</code> filter as before) showed both drives:</p>

<pre><code>[Fri Feb 20 22:57:06 2026] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Fri Feb 20 22:57:06 2026] ata3.00: ATA-11: ST4000NT001-3M2101, EN01, max UDMA/133
[Fri Feb 20 22:57:07 2026] ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Fri Feb 20 22:57:07 2026] ata4.00: ATA-11: ST4000NT001-3M2101, EN01, max UDMA/133</code></pre>
<p>Both drives detected at full 6.0 Gbps: TOMMY on ata3, AGAPITO2 on ata4.</p>
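<p>If you'd rather script that check than eyeball <code>dmesg</code>, counting full-speed link-up messages is enough for a two-disk mirror. A sketch, assuming the standard <code>SATA link up 6.0 Gbps</code> kernel message format (the helper name is mine):</p>

```shell
#!/bin/sh
# count_full_speed_links: count SATA links that negotiated 6.0 Gbps,
# reading kernel log text on stdin. Illustrative helper.
count_full_speed_links() {
  # `|| true`: a zero count is a finding, not an error
  grep -c 'SATA link up 6.0 Gbps' || true
}

# Intended use (not run here); expect 2 for this mirror:
#   dmesg -T | count_full_speed_links
```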
<p>Next, verify it shows up with its expected serial in <code>/dev/disk/by-id/</code>:</p>
<pre><code>ls -l /dev/disk/by-id/ | grep WX120LHQ</code></pre>
<pre><code>ata-ST4000NT001-3M2101_WX120LHQ -> ../../sda</code></pre>
<p>Then confirm the drive's identity with <code>smartctl -i</code>:</p>

<pre><code>Device Model: ST4000NT001-3M2101
Serial Number: WX120LHQ
Firmware Version: EN01
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)</code></pre>
<p>Correct model, serial, firmware, and running at full speed.</p>
<p>One more critical check: look for SATA errors in the kernel log.</p>
<pre><code>dmesg -T | grep -E 'ata[0-9]' | grep -iE 'error|fatal|reset|link down|slow|limiting'</code></pre>
<p>I saw <code>ata1: SATA link down</code> and <code>ata2: SATA link down</code>, which are just unused ports. Nothing on ata3 or ata4. If you see errors on the port your new drive is on, <strong>stop</strong>. A brand new drive throwing SATA errors on a known-good cable is likely dead on arrival.</p>
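<p>That per-port check is easy to automate too. A sketch that counts suspicious messages for one ATA port, reusing the grep pattern above (the helper name and structure are mine):</p>

```shell
#!/bin/sh
# sata_errors_on_port: count suspicious kernel messages for a given ATA
# port number, reading log text on stdin. Illustrative helper.
sata_errors_on_port() {
  # `|| true`: a zero count is a finding, not a failure
  grep "ata$1" | grep -icE 'error|fatal|reset|link down|slow|limiting' || true
}

# Intended use (not run here); 0 means the new drive's port stayed quiet:
#   dmesg -T | sata_errors_on_port 3
```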
<h3>Health-check before trusting it</h3>
<p>A drive can be detected and still be dead on arrival. Before resilvering 1.3 terabytes of data onto it, I wanted to know it was actually healthy.</p>
<p><strong>SMART overall health:</strong></p>
<pre><code>smartctl -H /dev/disk/by-id/ata-ST4000NT001-3M2101_WX120LHQ</code></pre>
<pre><code>SMART overall-health self-assessment test result: PASSED</code></pre>
<p><strong>Baseline SMART attributes</strong>, the important ones to check on a new drive:</p>
<pre><code>smartctl -A /dev/disk/by-id/ata-ST4000NT001-3M2101_WX120LHQ | grep -E 'Reallocated|Pending|Offline_Uncorrect|CRC'</code></pre>
<pre><code> 5 Reallocated_Sector_Ct ... - 0
197 Current_Pending_Sector ... - 0
198 Offline_Uncorrectable ... - 0
199 UDMA_CRC_Error_Count ... - 0</code></pre>

<p><strong>Short self-test:</strong></p>

<pre><code>smartctl -t short /dev/disk/by-id/ata-ST4000NT001-3M2101_WX120LHQ
# Wait ~2 minutes...
smartctl -l selftest /dev/disk/by-id/ata-ST4000NT001-3M2101_WX120LHQ</code></pre>
<pre><code># 1 Short offline Completed without error 00% 0 -</code></pre>
<p>Passed with 0 power-on hours, a fresh drive. If any of these checks fail, don't proceed. Contact the seller and get another replacement.</p>
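<p>These attribute checks are scriptable as well, so every replacement drive gets the same gate. A sketch, assuming <code>smartctl -A</code>'s usual attribute table with the raw value in the last column (the helper name is mine):</p>

```shell
#!/bin/sh
# check_smart_attrs: print OK if every critical SMART attribute has a raw
# value of zero, FAIL otherwise. Reads smartctl -A output on stdin.
check_smart_attrs() {
  awk '/Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count/ {
         if ($NF != 0) bad = 1
       }
       END { print (bad ? "FAIL" : "OK") }'
}

# Intended use (not run here):
#   smartctl -A /dev/disk/by-id/<new-disk-id> | check_smart_attrs
```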
<h3>The replacement: <code>zpool replace</code></h3>
<p>This is the moment. One command:</p>
<pre><code>zpool replace proxmox-tank-1 ata-ST4000NT001-3M2101_WX11TN0Z ata-ST4000NT001-3M2101_WX120LHQ</code></pre>
<p>This tells ZFS "the drive identified as WX11TN0Z (currently <code>REMOVED</code>) is being replaced by WX120LHQ." ZFS starts resilvering immediately, copying all data from the surviving drive (AGAPITO2) onto the new one (TOMMY).</p>
<p>Checking status right after:</p>
<pre><code> pool: proxmox-tank-1
state: DEGRADED
config:

NAME STATE READ WRITE CKSUM
proxmox-tank-1 DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
replacing-0 DEGRADED 0 0 0
ata-ST4000NT001-3M2101_WX11TN0Z REMOVED 0 0 0
ata-ST4000NT001-3M2101_WX120LHQ ONLINE 0 0 7.73K
ata-ST4000NT001-3M2101_WX11TN2P ONLINE 0 0 0</code></pre>
<p>Notice the <code>replacing-0</code> vdev. That's a temporary structure ZFS creates during the replacement, showing both the old (<code>REMOVED</code>) and new (<code>ONLINE</code>) drive while the resilver is in progress.</p>
<p>The 7.73K cksum count on the new drive might look alarming, but it's expected during a resilver. Those are blocks that haven't been written yet. ZFS is aware of them and they'll clear up as the resilver progresses.</p>
<p>I monitored progress with:</p>
<pre><code>watch -n 30 "zpool status -v proxmox-tank-1"</code></pre>
<p>I also kept <code>dmesg -Tw</code> running in another terminal, watching for any SATA errors. The kernel log stayed quiet the entire time.</p>
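<p>If you'd rather not babysit a <code>watch</code> window, the same status output can drive a polling loop. A sketch, assuming <code>zpool status</code> prints <code>resilver in progress</code> while the resilver runs (the helper name is mine):</p>

```shell
#!/bin/sh
# resilver_done: read zpool status text on stdin; print "yes" once the
# "resilver in progress" marker is gone. Illustrative helper.
resilver_done() {
  if grep -q 'resilver in progress'; then
    echo "no"
  else
    echo "yes"
  fi
}

# Intended polling loop (not run here):
#   until zpool status proxmox-tank-1 | resilver_done | grep -q yes; do
#     sleep 60
#   done
```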
<p>When the resilver finished, <code>zpool status</code> showed:</p>

<pre><code> pool: proxmox-tank-1
state: ONLINE
config:

NAME STATE READ WRITE CKSUM
proxmox-tank-1 ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-ST4000NT001-3M2101_WX120LHQ ONLINE 0 0 7.73K
ata-ST4000NT001-3M2101_WX11TN2P ONLINE 0 0 0
errors: No known data errors</code></pre>
<p><code>ONLINE</code>. The <code>replacing-0</code> vdev is gone and the mirror now has the new drive in place. The 7.73K cksum on TOMMY is a residual counter from the resilver, so let's clear it:</p>
<pre><code>zpool clear proxmox-tank-1</code></pre>
<p>Now for the real test. A resilver copies data to rebuild the mirror, but a <strong>scrub</strong> reads every block on the pool, verifies all checksums, and repairs any mismatches. This is the definitive integrity check:</p>
<pre><code>zpool scrub proxmox-tank-1</code></pre>
<p>Hours later, the scrub finished:</p>

<pre><code>scan: scrub repaired 0B ... with 0 errors
config:

NAME STATE READ WRITE CKSUM
proxmox-tank-1 ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-ST4000NT001-3M2101_WX120LHQ ONLINE 0 0 0
ata-ST4000NT001-3M2101_WX11TN2P ONLINE 0 0 0
errors: No known data errors</code></pre>
<p>Zero bytes repaired, zero errors, both drives at 0/0/0. Clean.</p>
<p>One last thing: a post-I/O SMART check on the new drive. After hours of heavy writes during the resilver and reads during the scrub, any hardware weakness should have surfaced:</p>
<pre><code>smartctl -x /dev/disk/by-id/ata-ST4000NT001-3M2101_WX120LHQ | grep -E 'Reallocated|Pending|Offline_Uncorrect|CRC|Hardware Resets|COMRESET|Interface'</code></pre>
<pre><code>Reallocated_Sector_Ct ... 0
Current_Pending_Sector ... 0
Offline_Uncorrectable ... 0
UDMA_CRC_Error_Count ... 0
Number of Hardware Resets ... 2
Number of Interface CRC Errors ... 0
COMRESET ... 2</code></pre>
<p>All clean. The 2 hardware resets and 2 COMRESETs are just from the server booting, perfectly normal.</p>
<h3>The commands, all in one place</h3>
<p>For future me and anyone else replacing a disk in a ZFS mirror:</p>
<pre><code># --- Before shutdown ---
# Record pool status
zpool status -v <pool>

# Check SATA layout
dmesg -T | grep -E 'ata[0-9]+\.[0-9]+: ATA-|ata[0-9]+: SATA link up'
# Check surviving drive health
smartctl -H /dev/disk/by-id/<surviving-disk-id>
# Shut down
shutdown -h now
# --- After boot with new drive ---
# ...

# Scrub to verify integrity
zpool scrub <pool>
# Post-I/O SMART check
smartctl -x /dev/disk/by-id/<new-disk-id> | grep -E 'Reallocated|Pending|Offline_Uncorrect|CRC'</code></pre>
<p>The mirror degradation that started on February 8th is resolved. Two weeks of running on a single drive, an RMA, and one evening of work later, the pool is whole again. Full redundancy restored, zero data lost throughout the entire saga. ZFS did exactly what it was designed to do.</p>
<p><em>This is the fourth and final article in this series. If you're just arriving, start with <a href="why-i-put-my-vms-on-a-zfs-mirror.html">Part 1: Why I Put My VMs on a ZFS Mirror</a>, then <a href="a-degraded-pool-with-a-healthy-disk.html">Part 2: A Degraded Pool with a Healthy Disk</a>, and <a href="fixing-a-degraded-zfs-mirror.html">Part 3: Fixing a Degraded ZFS Mirror</a>.</em></p>
<p><a href="../index.html">back to home</a></p>
</section>