diff --git a/20251230_first_zfs_degradation.md b/20251230_first_zfs_degradation.md index 6710968..41896a3 100644 --- a/20251230_first_zfs_degradation.md +++ b/20251230_first_zfs_degradation.md @@ -5,7 +5,7 @@ On 2025-12-30 I was snooping around the Proxmox UI where I accidentally bumped into a storage view that showed that my ZFS pool (which I use for the disks of all my VMs) was in degraded state. Opening up the detail, it appeared one of the disks was in FAULTED state. -I attempted rebooting the host, which trigger an attempt at resilvering. But the disk remained in the same state. +I attempted rebooting the host, which triggered an attempt at resilvering. But the disk remained in the same state. ## First diagnostic @@ -399,12 +399,788 @@ Rough plan: - The writing side of the resilvering is running at ~50MB/s. I'll shut down all the VMs in hopes of preventing contention for the disk IO. - After around 30min, speed has increased to 100MB/s. - The resilvering will take a long time and it's late, so I'l go to sleep and continue tomorrow. + - The next morning, status read this: + ``` + pool: proxmox-tank-1 + state: DEGRADED + status: One or more devices has experienced an unrecoverable error. An + attempt was made to correct the error. Applications are unaffected. + action: Determine if the device needs to be replaced, and clear the errors + using 'zpool clear' or replace the device with 'zpool replace'. + see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P + scan: resilvered 495G in 01:07:58 with 0 errors on Sat Jan 3 00:25:33 2026 + config: + + NAME STATE READ WRITE CKSUM + proxmox-tank-1 DEGRADED 0 0 0 + mirror-0 DEGRADED 0 0 0 + ata-ST4000NT001-3M2101_WX11TN0Z DEGRADED 0 0 0 too many errors + ata-ST4000NT001-3M2101_WX11TN2P ONLINE 0 0 0 + + errors: No known data errors + ``` + - Apparently, after a little crisis like the one this disk had, ZFS will only mark it clear with human acknowledgement. 
+ - To do that, I run: `zpool clear proxmox-tank-1 ata-ST4000NT001-3M2101_WX11TN0Z` + - Status immediately becomes: + ``` + pool: proxmox-tank-1 + state: ONLINE + scan: resilvered 495G in 01:07:58 with 0 errors on Sat Jan 3 00:25:33 2026 + config: + + NAME STATE READ WRITE CKSUM + proxmox-tank-1 ONLINE 0 0 0 + mirror-0 ONLINE 0 0 0 + ata-ST4000NT001-3M2101_WX11TN0Z ONLINE 0 0 0 + ata-ST4000NT001-3M2101_WX11TN2P ONLINE 0 0 0 + + errors: No known data errors + + ``` +- Scrubbing + - I trigger the scrub with: `sudo zpool scrub proxmox-tank-1` + - This is the final status once the scrub finished: + ``` + pool: proxmox-tank-1 + state: ONLINE + status: One or more devices has experienced an unrecoverable error. An + attempt was made to correct the error. Applications are unaffected. + action: Determine if the device needs to be replaced, and clear the errors + using 'zpool clear' or replace the device with 'zpool replace'. + see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P + scan: scrub repaired 13.0M in 02:14:22 with 0 errors on Sat Jan 3 11:03:54 2026 + config: + + NAME STATE READ WRITE CKSUM + proxmox-tank-1 ONLINE 0 0 0 + mirror-0 ONLINE 0 0 0 + ata-ST4000NT001-3M2101_WX11TN0Z ONLINE 0 0 992 + ata-ST4000NT001-3M2101_WX11TN2P ONLINE 0 0 0 + + errors: No known data errors + ``` + - I clear the error counters with `zpool clear proxmox-tank-1 ata-ST4000NT001-3M2101_WX11TN0Z` + - Nevertheless, the 992 checksum errors the scrub reported for WX11TN0Z might be of concern.
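A minimal sketch of how the per-device error counters in `zpool status` output could be checked by a script instead of by eye. The names `SAMPLE_STATUS` and `devices_with_errors` are made up for this illustration, and the sample text is a trimmed copy of the scrub result with the column spacing restored by hand. It assumes plain integer counters; abbreviated values such as `1.2K` are not handled (`zpool status -p` should print exact numbers).

```python
import re

# Trimmed copy of the scrub result; spacing restored by hand (assumption).
SAMPLE_STATUS = """\
  pool: proxmox-tank-1
 state: ONLINE
config:

        NAME                                 STATE     READ WRITE CKSUM
        proxmox-tank-1                       ONLINE       0     0     0
          mirror-0                           ONLINE       0     0     0
            ata-ST4000NT001-3M2101_WX11TN0Z  ONLINE       0     0   992
            ata-ST4000NT001-3M2101_WX11TN2P  ONLINE       0     0     0

errors: No known data errors
"""

# One config row: device name, state, then the READ, WRITE and CKSUM counters.
# Only plain integers are matched; rows with abbreviated counters are skipped.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)")


def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with any nonzero counter."""
    flagged = {}
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, blank line, or non-config text
        name, _state, read, write, cksum = match.groups()
        counters = (int(read), int(write), int(cksum))
        if any(counters):
            flagged[name] = counters
    return flagged


for name, (read, write, cksum) in devices_with_errors(SAMPLE_STATUS).items():
    print(f"{name}: read={read} write={write} cksum={cksum}")
```

Feeding it the live output (for example captured with `subprocess.run(["zpool", "status", "-p", "proxmox-tank-1"], capture_output=True, text=True)`) instead of the sample would be the obvious next step, but I have not wired that up here.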
+- Checking disk with smartctl + - I run `smartctl -x "$(readlink -f "$DISKPATH")" | egrep -i 'Reallocated|Pending|Offline_Uncorrect|CRC|Hardware Resets|COMRESET|SATA Phy'` and get back: + ``` + 5 Reallocated_Sector_Ct PO--CK 100 100 010 - 0 + 197 Current_Pending_Sector -O--C- 100 100 000 - 0 + 198 Offline_Uncorrectable ----C- 100 100 000 - 0 + 199 UDMA_CRC_Error_Count -OSRCK 200 200 000 - 0 + 0x0c GPL R/O 2048 Pending Defects log + 0x11 GPL R/O 1 SATA Phy Event Counters log + If Selective self-test is pending on power-up, resume after 0 minute delay. + 0x03 0x020 4 0 --- Number of Reallocated Logical Sectors + 0x06 0x008 4 41 --- Number of Hardware Resets + 0x06 0x018 4 0 --- Number of Interface CRC Errors + Pending Defects log (GP Log 0x0c) + SATA Phy Event Counters (GP Log 0x11) + 0x000a 2 2 Device-to-host register FISes sent due to a COMRESET + 0x0001 2 0 Command failed due to ICRC error + + ``` + - The full output of `smartctl -x /dev/sdb`: + ``` + smartctl 7.4 2024-10-15 r5620 [x86_64-linux-6.14.8-2-pve] (local build) + Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org + + === START OF INFORMATION SECTION === + Device Model: ST4000NT001-3M2101 + Serial Number: WX11TN0Z + LU WWN Device Id: 5 000c50 0fb8869af + Firmware Version: EN01 + User Capacity: 4,000,787,030,016 bytes [4.00 TB] + Sector Sizes: 512 bytes logical, 4096 bytes physical + Rotation Rate: 7200 rpm + Form Factor: 3.5 inches + Device is: Not in smartctl database 7.3/5528 + ATA Version is: ACS-4 (minor revision not indicated) + SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s) + Local Time is: Sat Jan 3 11:16:38 2026 CET + SMART support is: Available - device has SMART capability. 
+ SMART support is: Enabled + AAM feature is: Unavailable + APM feature is: Unavailable + Rd look-ahead is: Enabled + Write cache is: Enabled + DSN feature is: Disabled + ATA Security is: Disabled, NOT FROZEN [SEC1] + Write SCT (Get) Feature Control Command failed: scsi error unsupported field in scsi command + Wt Cache Reorder: Unknown (SCT Feature Control command failed) + + === START OF READ SMART DATA SECTION === + SMART overall-health self-assessment test result: PASSED + + General SMART Values: + Offline data collection status: (0x82) Offline data collection activity + was completed without error. + Auto Offline Data Collection: Enabled. + Self-test execution status: ( 0) The previous self-test routine completed + without error or no self-test has ever + been run. + Total time to complete Offline + data collection: ( 567) seconds. + Offline data collection + capabilities: (0x7b) SMART execute Offline immediate. + Auto Offline data collection on/off support. + Suspend Offline collection upon new + command. + Offline surface scan supported. + Self-test supported. + Conveyance Self-test supported. + Selective Self-test supported. + SMART capabilities: (0x0003) Saves SMART data before entering + power-saving mode. + Supports SMART auto save timer. + Error logging capability: (0x01) Error logging supported. + General Purpose Logging supported. + Short self-test routine + recommended polling time: ( 1) minutes. + Extended self-test routine + recommended polling time: ( 372) minutes. + Conveyance self-test routine + recommended polling time: ( 2) minutes. + SCT capabilities: (0x50bd) SCT Status supported. + SCT Error Recovery Control supported. + SCT Feature Control supported. + SCT Data Table supported. 
+ + SMART Attributes Data Structure revision number: 10 + Vendor Specific SMART Attributes with Thresholds: + ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE + 1 Raw_Read_Error_Rate POSR-- 080 064 044 - 89842680 + 3 Spin_Up_Time PO---- 097 093 000 - 0 + 4 Start_Stop_Count -O--CK 100 100 020 - 237 + 5 Reallocated_Sector_Ct PO--CK 100 100 010 - 0 + 7 Seek_Error_Rate POSR-- 078 060 045 - 58464314 + 9 Power_On_Hours -O--CK 099 099 000 - 1551 + 10 Spin_Retry_Count PO--C- 100 100 097 - 0 + 12 Power_Cycle_Count -O--CK 100 100 020 - 237 + 18 Unknown_Attribute PO-R-- 100 100 050 - 0 + 187 Reported_Uncorrect -O--CK 100 100 000 - 0 + 188 Command_Timeout -O--CK 100 100 000 - 0 + 190 Airflow_Temperature_Cel -O---K 060 054 000 - 40 (Min/Max 26/43) + 192 Power-Off_Retract_Count -O--CK 100 100 000 - 229 + 193 Load_Cycle_Count -O--CK 100 100 000 - 1964 + 194 Temperature_Celsius -O---K 040 046 000 - 40 (0 23 0 0 0) + 197 Current_Pending_Sector -O--C- 100 100 000 - 0 + 198 Offline_Uncorrectable ----C- 100 100 000 - 0 + 199 UDMA_CRC_Error_Count -OSRCK 200 200 000 - 0 + 240 Head_Flying_Hours ------ 100 100 000 - 516 (189 160 0) + 241 Total_LBAs_Written ------ 100 253 000 - 19110016931 + 242 Total_LBAs_Read ------ 100 253 000 - 9057450849 + ||||||_ K auto-keep + |||||__ C event count + ||||___ R error rate + |||____ S speed/performance + ||_____ O updated online + |______ P prefailure warning + + General Purpose Log Directory Version 1 + SMART Log Directory Version 1 [multi-sector log support] + Address Access R/W Size Description + 0x00 GPL,SL R/O 1 Log Directory + 0x01 SL R/O 1 Summary SMART error log + 0x02 SL R/O 5 Comprehensive SMART error log + 0x03 GPL R/O 5 Ext. 
Comprehensive SMART error log + 0x04 GPL R/O 256 Device Statistics log + 0x04 SL R/O 8 Device Statistics log + 0x06 SL R/O 1 SMART self-test log + 0x07 GPL R/O 1 Extended self-test log + 0x08 GPL R/O 2 Power Conditions log + 0x09 SL R/W 1 Selective self-test log + 0x0a GPL R/W 8 Device Statistics Notification + 0x0c GPL R/O 2048 Pending Defects log + 0x10 GPL R/O 1 NCQ Command Error log + 0x11 GPL R/O 1 SATA Phy Event Counters log + 0x13 GPL R/O 1 SATA NCQ Send and Receive log + 0x21 GPL R/O 1 Write stream error log + 0x22 GPL R/O 1 Read stream error log + 0x24 GPL R/O 768 Current Device Internal Status Data log + 0x2f GPL R/O 1 Set Sector Configuration + 0x30 GPL,SL R/O 9 IDENTIFY DEVICE data log + 0x80-0x9f GPL,SL R/W 16 Host vendor specific log + 0xa1 GPL,SL VS 160 Device vendor specific log + 0xa2 GPL VS 16320 Device vendor specific log + 0xa4 GPL,SL VS 160 Device vendor specific log + 0xa6 GPL VS 192 Device vendor specific log + 0xa8-0xa9 GPL,SL VS 136 Device vendor specific log + 0xab GPL VS 1 Device vendor specific log + 0xad GPL VS 16 Device vendor specific log + 0xb1 GPL,SL VS 160 Device vendor specific log + 0xb6 GPL VS 1920 Device vendor specific log + 0xbe-0xbf GPL VS 65535 Device vendor specific log + 0xc1 GPL,SL VS 8 Device vendor specific log + 0xc3 GPL,SL VS 24 Device vendor specific log + 0xc6 GPL VS 5184 Device vendor specific log + 0xc7 GPL,SL VS 8 Device vendor specific log + 0xc9 GPL,SL VS 8 Device vendor specific log + 0xca GPL,SL VS 16 Device vendor specific log + 0xcd GPL,SL VS 1 Device vendor specific log + 0xce GPL VS 1 Device vendor specific log + 0xcf GPL VS 512 Device vendor specific log + 0xd1 GPL VS 656 Device vendor specific log + 0xd2 GPL VS 10256 Device vendor specific log + 0xd4 GPL VS 2048 Device vendor specific log + 0xda GPL,SL VS 1 Device vendor specific log + 0xe0 GPL,SL R/W 1 SCT Command/Status + 0xe1 GPL,SL R/W 1 SCT Data Transfer + + SMART Extended Comprehensive Error Log Version: 1 (5 sectors) + No Errors Logged + + SMART 
Extended Self-test Log Version: 1 (1 sectors) + Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error + # 1 Short offline Completed without error 00% 1462 - + + SMART Selective self-test log data structure revision number 1 + SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS + 1 0 0 Not_testing + 2 0 0 Not_testing + 3 0 0 Not_testing + 4 0 0 Not_testing + 5 0 0 Not_testing + Selective self-test flags (0x0): + After scanning selected spans, do NOT read-scan remainder of disk. + If Selective self-test is pending on power-up, resume after 0 minute delay. + + SCT Status Version: 3 + SCT Version (vendor specific): 522 (0x020a) + Device State: Active (0) + Current Temperature: 40 Celsius + Power Cycle Min/Max Temperature: 26/43 Celsius + Lifetime Min/Max Temperature: 23/46 Celsius + Under/Over Temperature Limit Count: 0/2 + SMART Status: 0xc24f (PASSED) + Vendor specific: + 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 + 00 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 + + SCT Temperature History Version: 2 + Temperature Sampling Period: 4 minutes + Temperature Logging Interval: 59 minutes + Min/Max recommended Temperature: 10/40 Celsius + Min/Max Temperature Limit: 5/60 Celsius + Temperature History Size (Index): 128 (123) + + Index Estimated Time Temperature Celsius + 124 2025-12-29 05:52 34 *************** + 125 2025-12-29 06:51 34 *************** + 126 2025-12-29 07:50 33 ************** + ... ..( 3 skipped). .. ************** + 2 2025-12-29 11:46 33 ************** + 3 2025-12-29 12:45 35 **************** + 4 2025-12-29 13:44 35 **************** + 5 2025-12-29 14:43 35 **************** + 6 2025-12-29 15:42 ? - + 7 2025-12-29 16:41 36 ***************** + 8 2025-12-29 17:40 ? - + 9 2025-12-29 18:39 36 ***************** + 10 2025-12-29 19:38 ? - + 11 2025-12-29 20:37 36 ***************** + 12 2025-12-29 21:36 36 ***************** + 13 2025-12-29 22:35 35 **************** + ... ..( 4 skipped). .. 
**************** + 18 2025-12-30 03:30 35 **************** + 19 2025-12-30 04:29 34 *************** + ... ..( 2 skipped). .. *************** + 22 2025-12-30 07:26 34 *************** + 23 2025-12-30 08:25 33 ************** + 24 2025-12-30 09:24 33 ************** + 25 2025-12-30 10:23 33 ************** + 26 2025-12-30 11:22 32 ************* + ... ..( 3 skipped). .. ************* + 30 2025-12-30 15:18 32 ************* + 31 2025-12-30 16:17 33 ************** + 32 2025-12-30 17:16 33 ************** + 33 2025-12-30 18:15 32 ************* + ... ..( 10 skipped). .. ************* + 44 2025-12-31 05:04 32 ************* + 45 2025-12-31 06:03 31 ************ + ... ..( 5 skipped). .. ************ + 51 2025-12-31 11:57 31 ************ + 52 2025-12-31 12:56 30 *********** + ... ..( 8 skipped). .. *********** + 61 2025-12-31 21:47 30 *********** + 62 2025-12-31 22:46 31 ************ + ... ..( 3 skipped). .. ************ + 66 2026-01-01 02:42 31 ************ + 67 2026-01-01 03:41 30 *********** + ... ..( 10 skipped). .. *********** + 78 2026-01-01 14:30 30 *********** + 79 2026-01-01 15:29 29 ********** + 80 2026-01-01 16:28 29 ********** + 81 2026-01-01 17:27 29 ********** + 82 2026-01-01 18:26 30 *********** + 83 2026-01-01 19:25 29 ********** + 84 2026-01-01 20:24 29 ********** + 85 2026-01-01 21:23 29 ********** + 86 2026-01-01 22:22 30 *********** + 87 2026-01-01 23:21 30 *********** + 88 2026-01-02 00:20 32 ************* + 89 2026-01-02 01:19 33 ************** + 90 2026-01-02 02:18 33 ************** + 91 2026-01-02 03:17 33 ************** + 92 2026-01-02 04:16 32 ************* + 93 2026-01-02 05:15 31 ************ + 94 2026-01-02 06:14 ? - + 95 2026-01-02 07:13 30 *********** + 96 2026-01-02 08:12 ? - + 97 2026-01-02 09:11 30 *********** + 98 2026-01-02 10:10 ? - + 99 2026-01-02 11:09 30 *********** + 100 2026-01-02 12:08 ? - + 101 2026-01-02 13:07 30 *********** + 102 2026-01-02 14:06 ? - + 103 2026-01-02 15:05 30 *********** + 104 2026-01-02 16:04 ? 
- + 105 2026-01-02 17:03 30 *********** + 106 2026-01-02 18:02 ? - + 107 2026-01-02 19:01 31 ************ + 108 2026-01-02 20:00 ? - + 109 2026-01-02 20:59 31 ************ + 110 2026-01-02 21:58 ? - + 111 2026-01-02 22:57 26 ******* + 112 2026-01-02 23:56 38 ******************* + 113 2026-01-03 00:55 36 ***************** + 114 2026-01-03 01:54 34 *************** + 115 2026-01-03 02:53 33 ************** + ... ..( 4 skipped). .. ************** + 120 2026-01-03 07:48 33 ************** + 121 2026-01-03 08:47 37 ****************** + 122 2026-01-03 09:46 42 *********************** + 123 2026-01-03 10:45 43 ************************ + + SCT Error Recovery Control: + Read: 70 (7.0 seconds) + Write: 70 (7.0 seconds) + + Device Statistics (GP Log 0x04) + Page Offset Size Value Flags Description + 0x01 ===== = = === == General Statistics (rev 1) == + 0x01 0x008 4 237 --- Lifetime Power-On Resets + 0x01 0x010 4 1551 --- Power-on Hours + 0x01 0x018 6 18855234811 --- Logical Sectors Written + 0x01 0x020 6 38962968 --- Number of Write Commands + 0x01 0x028 6 9004753896 --- Logical Sectors Read + 0x01 0x030 6 148517033 --- Number of Read Commands + 0x01 0x038 6 - --- Date and Time TimeStamp + 0x03 ===== = = === == Rotating Media Statistics (rev 1) == + 0x03 0x008 4 1203 --- Spindle Motor Power-on Hours + 0x03 0x010 4 516 --- Head Flying Hours + 0x03 0x018 4 1964 --- Head Load Events + 0x03 0x020 4 0 --- Number of Reallocated Logical Sectors + 0x03 0x028 4 0 --- Read Recovery Attempts + 0x03 0x030 4 0 --- Number of Mechanical Start Failures + 0x03 0x038 4 0 --- Number of Realloc. 
Candidate Logical Sectors + 0x03 0x040 4 229 --- Number of High Priority Unload Events + 0x04 ===== = = === == General Errors Statistics (rev 1) == + 0x04 0x008 4 0 --- Number of Reported Uncorrectable Errors + 0x04 0x010 4 0 --- Resets Between Cmd Acceptance and Completion + 0x04 0x018 4 0 -D- Physical Element Status Changed + 0x05 ===== = = === == Temperature Statistics (rev 1) == + 0x05 0x008 1 40 --- Current Temperature + 0x05 0x010 1 32 --- Average Short Term Temperature + 0x05 0x018 1 34 --- Average Long Term Temperature + 0x05 0x020 1 46 --- Highest Temperature + 0x05 0x028 1 27 --- Lowest Temperature + 0x05 0x030 1 43 --- Highest Average Short Term Temperature + 0x05 0x038 1 30 --- Lowest Average Short Term Temperature + 0x05 0x040 1 34 --- Highest Average Long Term Temperature + 0x05 0x048 1 34 --- Lowest Average Long Term Temperature + 0x05 0x050 4 0 --- Time in Over-Temperature + 0x05 0x058 1 60 --- Specified Maximum Operating Temperature + 0x05 0x060 4 0 --- Time in Under-Temperature + 0x05 0x068 1 5 --- Specified Minimum Operating Temperature + 0x06 ===== = = === == Transport Statistics (rev 1) == + 0x06 0x008 4 41 --- Number of Hardware Resets + 0x06 0x010 4 8 --- Number of ASR Events + 0x06 0x018 4 0 --- Number of Interface CRC Errors + 0xff ===== = = === == Vendor Specific Statistics (rev 1) == + 0xff 0x010 7 0 --- Vendor Specific + 0xff 0x018 7 0 --- Vendor Specific + |||_ C monitored condition met + ||__ D supports DSN + |___ N normalized value + + Pending Defects log (GP Log 0x0c) + No Defects Logged + + SATA Phy Event Counters (GP Log 0x11) + ID Size Value Description + 0x000a 2 2 Device-to-host register FISes sent due to a COMRESET + 0x0001 2 0 Command failed due to ICRC error + 0x0003 2 0 R_ERR response for device-to-host data FIS + 0x0004 2 0 R_ERR response for host-to-device data FIS + 0x0006 2 0 R_ERR response for device-to-host non-data FIS + 0x0007 2 0 R_ERR response for host-to-device non-data FIS + + Seagate FARM log (GP Log 0xa6) 
supported [try: -l farm] + ``` + - `smartctl -l error` shows no errors. + - `smartctl -l selftest` after triggering the short test: + ``` + smartctl 7.4 2024-10-15 r5620 [x86_64-linux-6.14.8-2-pve] (local build) + Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org + + === START OF READ SMART DATA SECTION === + SMART Self-test log structure revision number 1 + Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error + # 1 Short offline Completed without error 00% 1551 - + # 2 Short offline Completed without error 00% 1462 - + ``` + - Then I run the long (extended) self-test. It is expected to finish at 18:00. + - While the test runs, I start bringing the host's VMs back up, since they can run in parallel with it. + - The self-test is taking way longer than initially estimated, but it's progressing. It's 21:00, 20% remaining. + - It finished eventually. Here's the full output: + ``` + smartctl 7.4 2024-10-15 r5620 [x86_64-linux-6.14.8-2-pve] (local build) + Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org + + === START OF INFORMATION SECTION === + Device Model: ST4000NT001-3M2101 + Serial Number: WX11TN0Z + LU WWN Device Id: 5 000c50 0fb8869af + Firmware Version: EN01 + User Capacity: 4,000,787,030,016 bytes [4.00 TB] + Sector Sizes: 512 bytes logical, 4096 bytes physical + Rotation Rate: 7200 rpm + Form Factor: 3.5 inches + Device is: Not in smartctl database 7.3/5528 + ATA Version is: ACS-4 (minor revision not indicated) + SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s) + Local Time is: Sat Jan 3 23:48:15 2026 CET + SMART support is: Available - device has SMART capability. 
+ SMART support is: Enabled + AAM feature is: Unavailable + APM feature is: Unavailable + Rd look-ahead is: Enabled + Write cache is: Enabled + DSN feature is: Disabled + ATA Security is: Disabled, NOT FROZEN [SEC1] + Write SCT (Get) Feature Control Command failed: scsi error unsupported field in scsi command + Wt Cache Reorder: Unknown (SCT Feature Control command failed) + + === START OF READ SMART DATA SECTION === + SMART overall-health self-assessment test result: PASSED + + General SMART Values: + Offline data collection status: (0x82) Offline data collection activity + was completed without error. + Auto Offline Data Collection: Enabled. + Self-test execution status: ( 0) The previous self-test routine completed + without error or no self-test has ever + been run. + Total time to complete Offline + data collection: ( 567) seconds. + Offline data collection + capabilities: (0x7b) SMART execute Offline immediate. + Auto Offline data collection on/off support. + Suspend Offline collection upon new + command. + Offline surface scan supported. + Self-test supported. + Conveyance Self-test supported. + Selective Self-test supported. + SMART capabilities: (0x0003) Saves SMART data before entering + power-saving mode. + Supports SMART auto save timer. + Error logging capability: (0x01) Error logging supported. + General Purpose Logging supported. + Short self-test routine + recommended polling time: ( 1) minutes. + Extended self-test routine + recommended polling time: ( 372) minutes. + Conveyance self-test routine + recommended polling time: ( 2) minutes. + SCT capabilities: (0x50bd) SCT Status supported. + SCT Error Recovery Control supported. + SCT Feature Control supported. + SCT Data Table supported. 
+ + SMART Attributes Data Structure revision number: 10 + Vendor Specific SMART Attributes with Thresholds: + ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE + 1 Raw_Read_Error_Rate POSR-- 082 064 044 - 160614704 + 3 Spin_Up_Time PO---- 097 093 000 - 0 + 4 Start_Stop_Count -O--CK 100 100 020 - 237 + 5 Reallocated_Sector_Ct PO--CK 100 100 010 - 0 + 7 Seek_Error_Rate POSR-- 078 060 045 - 63053692 + 9 Power_On_Hours -O--CK 099 099 000 - 1564 + 10 Spin_Retry_Count PO--C- 100 100 097 - 0 + 12 Power_Cycle_Count -O--CK 100 100 020 - 237 + 18 Unknown_Attribute PO-R-- 100 100 050 - 0 + 187 Reported_Uncorrect -O--CK 100 100 000 - 0 + 188 Command_Timeout -O--CK 100 100 000 - 0 + 190 Airflow_Temperature_Cel -O---K 063 054 000 - 37 (Min/Max 26/45) + 192 Power-Off_Retract_Count -O--CK 100 100 000 - 229 + 193 Load_Cycle_Count -O--CK 100 100 000 - 1965 + 194 Temperature_Celsius -O---K 037 046 000 - 37 (0 23 0 0 0) + 197 Current_Pending_Sector -O--C- 100 100 000 - 0 + 198 Offline_Uncorrectable ----C- 100 100 000 - 0 + 199 UDMA_CRC_Error_Count -OSRCK 200 200 000 - 0 + 240 Head_Flying_Hours ------ 100 100 000 - 529 (206 38 0) + 241 Total_LBAs_Written ------ 100 253 000 - 19648189091 + 242 Total_LBAs_Read ------ 100 253 000 - 9322473897 + ||||||_ K auto-keep + |||||__ C event count + ||||___ R error rate + |||____ S speed/performance + ||_____ O updated online + |______ P prefailure warning + + General Purpose Log Directory Version 1 + SMART Log Directory Version 1 [multi-sector log support] + Address Access R/W Size Description + 0x00 GPL,SL R/O 1 Log Directory + 0x01 SL R/O 1 Summary SMART error log + 0x02 SL R/O 5 Comprehensive SMART error log + 0x03 GPL R/O 5 Ext. 
Comprehensive SMART error log + 0x04 GPL R/O 256 Device Statistics log + 0x04 SL R/O 8 Device Statistics log + 0x06 SL R/O 1 SMART self-test log + 0x07 GPL R/O 1 Extended self-test log + 0x08 GPL R/O 2 Power Conditions log + 0x09 SL R/W 1 Selective self-test log + 0x0a GPL R/W 8 Device Statistics Notification + 0x0c GPL R/O 2048 Pending Defects log + 0x10 GPL R/O 1 NCQ Command Error log + 0x11 GPL R/O 1 SATA Phy Event Counters log + 0x13 GPL R/O 1 SATA NCQ Send and Receive log + 0x21 GPL R/O 1 Write stream error log + 0x22 GPL R/O 1 Read stream error log + 0x24 GPL R/O 768 Current Device Internal Status Data log + 0x2f GPL R/O 1 Set Sector Configuration + 0x30 GPL,SL R/O 9 IDENTIFY DEVICE data log + 0x80-0x9f GPL,SL R/W 16 Host vendor specific log + 0xa1 GPL,SL VS 160 Device vendor specific log + 0xa2 GPL VS 16320 Device vendor specific log + 0xa4 GPL,SL VS 160 Device vendor specific log + 0xa6 GPL VS 192 Device vendor specific log + 0xa8-0xa9 GPL,SL VS 136 Device vendor specific log + 0xab GPL VS 1 Device vendor specific log + 0xad GPL VS 16 Device vendor specific log + 0xb1 GPL,SL VS 160 Device vendor specific log + 0xb6 GPL VS 1920 Device vendor specific log + 0xbe-0xbf GPL VS 65535 Device vendor specific log + 0xc1 GPL,SL VS 8 Device vendor specific log + 0xc3 GPL,SL VS 24 Device vendor specific log + 0xc6 GPL VS 5184 Device vendor specific log + 0xc7 GPL,SL VS 8 Device vendor specific log + 0xc9 GPL,SL VS 8 Device vendor specific log + 0xca GPL,SL VS 16 Device vendor specific log + 0xcd GPL,SL VS 1 Device vendor specific log + 0xce GPL VS 1 Device vendor specific log + 0xcf GPL VS 512 Device vendor specific log + 0xd1 GPL VS 656 Device vendor specific log + 0xd2 GPL VS 10256 Device vendor specific log + 0xd4 GPL VS 2048 Device vendor specific log + 0xda GPL,SL VS 1 Device vendor specific log + 0xe0 GPL,SL R/W 1 SCT Command/Status + 0xe1 GPL,SL R/W 1 SCT Data Transfer + + SMART Extended Comprehensive Error Log Version: 1 (5 sectors) + No Errors Logged + + SMART 
Extended Self-test Log Version: 1 (1 sectors) + Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error + # 1 Extended offline Completed without error 00% 1563 - + # 2 Short offline Completed without error 00% 1551 - + # 3 Short offline Completed without error 00% 1462 - + + SMART Selective self-test log data structure revision number 1 + SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS + 1 0 0 Not_testing + 2 0 0 Not_testing + 3 0 0 Not_testing + 4 0 0 Not_testing + 5 0 0 Not_testing + Selective self-test flags (0x0): + After scanning selected spans, do NOT read-scan remainder of disk. + If Selective self-test is pending on power-up, resume after 0 minute delay. + + SCT Status Version: 3 + SCT Version (vendor specific): 522 (0x020a) + Device State: Active (0) + Current Temperature: 37 Celsius + Power Cycle Min/Max Temperature: 26/45 Celsius + Lifetime Min/Max Temperature: 23/46 Celsius + Under/Over Temperature Limit Count: 0/14 + SMART Status: 0xc24f (PASSED) + Vendor specific: + 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 + 00 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 + + SCT Temperature History Version: 2 + Temperature Sampling Period: 4 minutes + Temperature Logging Interval: 59 minutes + Min/Max recommended Temperature: 10/40 Celsius + Min/Max Temperature Limit: 5/60 Celsius + Temperature History Size (Index): 128 (8) + + Index Estimated Time Temperature Celsius + 9 2025-12-29 18:39 36 ***************** + 10 2025-12-29 19:38 ? - + 11 2025-12-29 20:37 36 ***************** + 12 2025-12-29 21:36 36 ***************** + 13 2025-12-29 22:35 35 **************** + ... ..( 4 skipped). .. **************** + 18 2025-12-30 03:30 35 **************** + 19 2025-12-30 04:29 34 *************** + ... ..( 2 skipped). .. *************** + 22 2025-12-30 07:26 34 *************** + 23 2025-12-30 08:25 33 ************** + 24 2025-12-30 09:24 33 ************** + 25 2025-12-30 10:23 33 ************** + 26 2025-12-30 11:22 32 ************* + ... ..( 3 skipped). .. 
************* + 30 2025-12-30 15:18 32 ************* + 31 2025-12-30 16:17 33 ************** + 32 2025-12-30 17:16 33 ************** + 33 2025-12-30 18:15 32 ************* + ... ..( 10 skipped). .. ************* + 44 2025-12-31 05:04 32 ************* + 45 2025-12-31 06:03 31 ************ + ... ..( 5 skipped). .. ************ + 51 2025-12-31 11:57 31 ************ + 52 2025-12-31 12:56 30 *********** + ... ..( 8 skipped). .. *********** + 61 2025-12-31 21:47 30 *********** + 62 2025-12-31 22:46 31 ************ + ... ..( 3 skipped). .. ************ + 66 2026-01-01 02:42 31 ************ + 67 2026-01-01 03:41 30 *********** + ... ..( 10 skipped). .. *********** + 78 2026-01-01 14:30 30 *********** + 79 2026-01-01 15:29 29 ********** + 80 2026-01-01 16:28 29 ********** + 81 2026-01-01 17:27 29 ********** + 82 2026-01-01 18:26 30 *********** + 83 2026-01-01 19:25 29 ********** + 84 2026-01-01 20:24 29 ********** + 85 2026-01-01 21:23 29 ********** + 86 2026-01-01 22:22 30 *********** + 87 2026-01-01 23:21 30 *********** + 88 2026-01-02 00:20 32 ************* + 89 2026-01-02 01:19 33 ************** + 90 2026-01-02 02:18 33 ************** + 91 2026-01-02 03:17 33 ************** + 92 2026-01-02 04:16 32 ************* + 93 2026-01-02 05:15 31 ************ + 94 2026-01-02 06:14 ? - + 95 2026-01-02 07:13 30 *********** + 96 2026-01-02 08:12 ? - + 97 2026-01-02 09:11 30 *********** + 98 2026-01-02 10:10 ? - + 99 2026-01-02 11:09 30 *********** + 100 2026-01-02 12:08 ? - + 101 2026-01-02 13:07 30 *********** + 102 2026-01-02 14:06 ? - + 103 2026-01-02 15:05 30 *********** + 104 2026-01-02 16:04 ? - + 105 2026-01-02 17:03 30 *********** + 106 2026-01-02 18:02 ? - + 107 2026-01-02 19:01 31 ************ + 108 2026-01-02 20:00 ? - + 109 2026-01-02 20:59 31 ************ + 110 2026-01-02 21:58 ? 
- + 111 2026-01-02 22:57 26 ******* + 112 2026-01-02 23:56 38 ******************* + 113 2026-01-03 00:55 36 ***************** + 114 2026-01-03 01:54 34 *************** + 115 2026-01-03 02:53 33 ************** + ... ..( 4 skipped). .. ************** + 120 2026-01-03 07:48 33 ************** + 121 2026-01-03 08:47 37 ****************** + 122 2026-01-03 09:46 42 *********************** + 123 2026-01-03 10:45 43 ************************ + 124 2026-01-03 11:44 42 *********************** + 125 2026-01-03 12:43 43 ************************ + 126 2026-01-03 13:42 44 ************************* + 127 2026-01-03 14:41 45 ************************** + 0 2026-01-03 15:40 43 ************************ + 1 2026-01-03 16:39 43 ************************ + 2 2026-01-03 17:38 42 *********************** + ... ..( 2 skipped). .. *********************** + 5 2026-01-03 20:35 42 *********************** + 6 2026-01-03 21:34 41 ********************** + 7 2026-01-03 22:33 41 ********************** + 8 2026-01-03 23:32 38 ******************* + + SCT Error Recovery Control: + Read: 70 (7.0 seconds) + Write: 70 (7.0 seconds) + + Device Statistics (GP Log 0x04) + Page Offset Size Value Flags Description + 0x01 ===== = = === == General Statistics (rev 1) == + 0x01 0x008 4 237 --- Lifetime Power-On Resets + 0x01 0x010 4 1564 --- Power-on Hours + 0x01 0x018 6 19393406971 --- Logical Sectors Written + 0x01 0x020 6 40248649 --- Number of Write Commands + 0x01 0x028 6 9269776944 --- Logical Sectors Read + 0x01 0x030 6 154066717 --- Number of Read Commands + 0x01 0x038 6 - --- Date and Time TimeStamp + 0x03 ===== = = === == Rotating Media Statistics (rev 1) == + 0x03 0x008 4 1215 --- Spindle Motor Power-on Hours + 0x03 0x010 4 528 --- Head Flying Hours + 0x03 0x018 4 1965 --- Head Load Events + 0x03 0x020 4 0 --- Number of Reallocated Logical Sectors + 0x03 0x028 4 0 --- Read Recovery Attempts + 0x03 0x030 4 0 --- Number of Mechanical Start Failures + 0x03 0x038 4 0 --- Number of Realloc. 
Candidate Logical Sectors + 0x03 0x040 4 229 --- Number of High Priority Unload Events + 0x04 ===== = = === == General Errors Statistics (rev 1) == + 0x04 0x008 4 0 --- Number of Reported Uncorrectable Errors + 0x04 0x010 4 0 --- Resets Between Cmd Acceptance and Completion + 0x04 0x018 4 0 -D- Physical Element Status Changed + 0x05 ===== = = === == Temperature Statistics (rev 1) == + 0x05 0x008 1 37 --- Current Temperature + 0x05 0x010 1 35 --- Average Short Term Temperature + 0x05 0x018 1 34 --- Average Long Term Temperature + 0x05 0x020 1 46 --- Highest Temperature + 0x05 0x028 1 27 --- Lowest Temperature + 0x05 0x030 1 43 --- Highest Average Short Term Temperature + 0x05 0x038 1 30 --- Lowest Average Short Term Temperature + 0x05 0x040 1 34 --- Highest Average Long Term Temperature + 0x05 0x048 1 34 --- Lowest Average Long Term Temperature + 0x05 0x050 4 0 --- Time in Over-Temperature + 0x05 0x058 1 60 --- Specified Maximum Operating Temperature + 0x05 0x060 4 0 --- Time in Under-Temperature + 0x05 0x068 1 5 --- Specified Minimum Operating Temperature + 0x06 ===== = = === == Transport Statistics (rev 1) == + 0x06 0x008 4 41 --- Number of Hardware Resets + 0x06 0x010 4 8 --- Number of ASR Events + 0x06 0x018 4 0 --- Number of Interface CRC Errors + 0xff ===== = = === == Vendor Specific Statistics (rev 1) == + 0xff 0x010 7 0 --- Vendor Specific + 0xff 0x018 7 0 --- Vendor Specific + |||_ C monitored condition met + ||__ D supports DSN + |___ N normalized value + + Pending Defects log (GP Log 0x0c) + No Defects Logged + + SATA Phy Event Counters (GP Log 0x11) + ID Size Value Description + 0x000a 2 2 Device-to-host register FISes sent due to a COMRESET + 0x0001 2 0 Command failed due to ICRC error + 0x0003 2 0 R_ERR response for device-to-host data FIS + 0x0004 2 0 R_ERR response for host-to-device data FIS + 0x0006 2 0 R_ERR response for device-to-host non-data FIS + 0x0007 2 0 R_ERR response for host-to-device non-data FIS + + Seagate FARM log (GP Log 0xa6) 
supported [try: -l farm] + ``` +- Execution finished, nothing else to do. The disk remains in the mirror. - Other notes - I labeled the two disks by hand as AGAPITO1 and AGAPITO2, but I never noted their serial numbers. Silly me. This is the relation: - AGAPITO1 is ata-ST4000NT001-3M2101_WX11TN0Z. - AGAPITO2 is ata-ST4000NT001-3M2101_WX11TN2P. - - + ## Side quests