The resync went through. Here is the current state:
[root@real-saturn ~]# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdb2[0] sda2[1]
488279488 blocks [2/2] [UU]
md1 : active raid1 sdb1[0] sda1[1]
104320 blocks [2/2] [UU]
unused devices: <none>
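For anyone who wants to check this automatically rather than by eye, here is a minimal sketch that scans /proc/mdstat text for arrays whose member map contains an underscore (a missing member). The function name degraded_arrays and the sample string are illustrative only and not part of any tool; the sample is the output pasted above.

```python
import re

def degraded_arrays(mdstat_text):
    """Return names of md arrays whose member map (e.g. [UU]) contains '_'."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        # An array stanza starts with a line like "md2 : active raid1 ..."
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line carries "[total/active] [UU]"
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current and "_" in status.group(3):
            degraded.append(current)
    return degraded

sample = """Personalities : [raid1]
md2 : active raid1 sdb2[0] sda2[1]
488279488 blocks [2/2] [UU]
md1 : active raid1 sdb1[0] sda1[1]
104320 blocks [2/2] [UU]
unused devices: <none>"""

print(degraded_arrays(sample))  # prints []
```

An empty list here matches the [UU] state shown above: both arrays currently have both members active.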
and:
This email was generated by the smartd daemon running on:
host name: real-saturn
DNS domain: real.ivb.org
NIS domain: (none)
The following warning/error was logged by the smartd daemon:
Device: /dev/sda, ATA error count increased from 3814 to 3815
For details see host's SYSLOG (default: /var/log/messages).
You can also use the smartctl utility for further investigation.
Another email message will be sent in 1 days if the problem persists
and:
[root@real-saturn ~]# fdisk -lu /dev/sda; fdisk -lu /dev/sdb
Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 208769 104384+ fd Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sda2 208770 976768063 488279647 fd Linux raid autodetect
Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 * 1 208769 104384+ fd Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2 208770 976768063 488279647 fd Linux raid autodetect
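As a sanity check on the partition tables, the Blocks column can be recomputed from the Start and End sectors. This is a small sketch, assuming 512-byte sectors and 1024-byte blocks as fdisk reports them; the helper name fdisk_blocks is made up for illustration.

```python
def fdisk_blocks(start, end):
    """Convert an inclusive sector range into 1 KiB blocks.

    Returns (blocks, odd), where odd is True when one 512-byte sector
    is left over; fdisk marks that case with a trailing '+'.
    """
    sectors = end - start + 1
    blocks, leftover = divmod(sectors, 2)  # two 512-byte sectors per block
    return blocks, bool(leftover)

print(fdisk_blocks(1, 208769))          # prints (104384, True)
print(fdisk_blocks(208770, 976768063))  # prints (488279647, False)
```

Both results agree with the 104384+ and 488279647 shown by fdisk, and the two disks are partitioned identically.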
and:
[root@real-saturn ~]# smartctl -a /dev/sda
smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: ST3500630AS
Serial Number: 6QG3A7B0
Firmware Version: 3.AAK
User Capacity: 500,107,862,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Fri Nov 19 22:46:50 2010 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 24) The self-test routine was aborted by
the host.
Total time to complete Offline
data collection: ( 430) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 163) minutes.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 115 100 006 Pre-fail Always - 90420120
3 Spin_Up_Time 0x0003 093 093 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 86
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 082 060 030 Pre-fail Always - 185476578
9 Power_On_Hours 0x0032 078 078 000 Old_age Always - 19876
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 86
187 Unknown_Attribute 0x0032 001 001 000 Old_age Always - 3488
189 Unknown_Attribute 0x003a 100 100 000 Old_age Always - 0
190 Unknown_Attribute 0x0022 062 055 045 Old_age Always - 707133478
194 Temperature_Celsius 0x0022 038 045 000 Old_age Always - 38 (Lifetime Min/Max 0/24)
195 Hardware_ECC_Recovered 0x001a 068 057 000 Old_age Always - 107020588
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 1
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 TA_Increase_Count 0x0032 100 253 000 Old_age Always - 0
SMART Error Log Version: 1
ATA Error Count: 3815 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 3815 occurred at disk power-on lifetime: 19875 hours (828 days + 3 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 af 53 bb 0c e0 Error: ICRC, ABRT 175 sectors at LBA = 0x000cbb53 = 834387
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 00 02 b8 0c e0 00 01:31:58.989 READ DMA EXT
25 00 00 02 b4 0c e0 00 01:31:58.980 READ DMA EXT
25 00 00 02 b0 0c e0 00 01:31:58.972 READ DMA EXT
25 00 80 82 ac 0c e0 00 01:31:58.963 READ DMA EXT
25 00 00 82 a8 0c e0 00 01:31:59.076 READ DMA EXT
Error 3814 occurred at disk power-on lifetime: 19872 hours (828 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 01 98 b9 94 e0 Error: UNC at LBA = 0x0094b998 = 9746840
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 00 00 00 b8 94 e0 00 00:14:33.231 READ VERIFY SECTOR(S) EXT
42 00 00 00 b0 94 e0 00 00:14:33.220 READ VERIFY SECTOR(S) EXT
42 00 00 00 a8 94 e0 00 00:14:33.203 READ VERIFY SECTOR(S) EXT
42 00 00 00 a0 94 e0 00 00:14:33.191 READ VERIFY SECTOR(S) EXT
42 00 00 00 98 94 e0 00 00:14:33.174 READ VERIFY SECTOR(S) EXT
Error 3813 occurred at disk power-on lifetime: 19867 hours (827 days + 19 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 98 b9 94 e7 Error: UNC at LBA = 0x0794b998 = 127187352
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 70 92 b9 94 e7 00 01:30:05.464 READ DMA
ec 03 46 00 00 00 a0 00 01:30:05.461 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 01:30:05.461 SET FEATURES [Set transfer mode]
ec 00 00 98 b9 94 a0 00 01:30:05.459 IDENTIFY DEVICE
c8 00 78 8a b9 94 e7 00 01:30:03.419 READ DMA
Error 3812 occurred at disk power-on lifetime: 19867 hours (827 days + 19 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 98 b9 94 e7 Error: UNC at LBA = 0x0794b998 = 127187352
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 78 8a b9 94 e7 00 01:30:05.464 READ DMA
ec 03 46 00 00 00 a0 00 01:30:05.461 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 01:30:05.461 SET FEATURES [Set transfer mode]
ec 00 00 98 b9 94 a0 00 01:30:05.459 IDENTIFY DEVICE
c8 00 80 82 b9 94 e7 00 01:30:03.419 READ DMA
Error 3811 occurred at disk power-on lifetime: 19867 hours (827 days + 19 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 98 b9 94 e7 Error: UNC at LBA = 0x0794b998 = 127187352
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 80 82 b9 94 e7 00 01:29:59.327 READ DMA
ec 03 46 00 00 00 a0 00 01:29:57.288 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 01:29:57.285 SET FEATURES [Set transfer mode]
ec 00 00 98 b9 94 a0 00 01:29:57.285 IDENTIFY DEVICE
c8 00 88 7a b9 94 e7 00 01:30:03.419 READ DMA
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Aborted by host 80% 19873 -
# 2 Short offline Completed without error 00% 19873 -
# 3 Short offline Aborted by host 90% 19873 -
# 4 Short offline Completed: read failure 90% 19873 127187352
# 5 Short offline Completed: read failure 90% 19871 127187352
# 6 Short offline Interrupted (host reset) 90% 19846 -
# 7 Short offline Aborted by host 80% 19873 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
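To see where the failing sectors sit, the LBA in the error log can be rebuilt from the register dump and compared with the partition table from fdisk. The sketch below assumes 28-bit addressing (low nibble of DH, then CH, CL, SN), which reproduces the LBA values smartctl prints above; for the READ DMA EXT commands the log may only hold the low 28 bits, so treat that case with caution. The names lba28, locate and PARTITIONS are illustrative.

```python
# Start/end sectors of the sda partitions, copied from the fdisk output
PARTITIONS = {"sda1": (1, 208769), "sda2": (208770, 976768063)}

def lba28(sn, cl, ch, dh):
    """Assemble a 28-bit LBA from the ATA task-file registers."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn

def locate(lba):
    """Return (partition, sector offset inside it), or (None, None)."""
    for name, (start, end) in PARTITIONS.items():
        if start <= lba <= end:
            return name, lba - start
    return None, None

# Registers from errors 3811 to 3813: SN=98 CL=b9 CH=94 DH=e7
lba = lba28(0x98, 0xB9, 0x94, 0xE7)
print(lba)          # prints 127187352
print(locate(lba))  # prints ('sda2', 126978582)
```

So the repeated UNC error at LBA 127187352, which is also the LBA reported by the failed short self-tests, falls inside sda2, the member of md2.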
and:
[root@real-saturn ~]# smartctl -a /dev/sdb
smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: ST3500630AS
Serial Number: 9QG8S58F
Firmware Version: 3.AAK
User Capacity: 500,107,862,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Fri Nov 19 22:48:56 2010 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 430) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 163) minutes.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 113 094 006 Pre-fail Always - 57276716
3 Spin_Up_Time 0x0003 093 093 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 90
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 14
7 Seek_Error_Rate 0x000f 082 060 030 Pre-fail Always - 181467908
9 Power_On_Hours 0x0032 078 078 000 Old_age Always - 19901
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 90
187 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
189 Unknown_Attribute 0x003a 100 100 000 Old_age Always - 0
190 Unknown_Attribute 0x0022 066 056 045 Old_age Always - 656539682
194 Temperature_Celsius 0x0022 034 044 000 Old_age Always - 34 (Lifetime Min/Max 0/22)
195 Hardware_ECC_Recovered 0x001a 063 059 000 Old_age Always - 18356524
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 TA_Increase_Count 0x0032 100 253 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 19898 -
# 2 Short offline Completed without error 00% 19897 -
# 3 Short offline Interrupted (host reset) 90% 19871 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
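To compare the two disks side by side, the attribute rows can be split into fields and the raw values lined up. This is only a sketch: parse_attr is a hypothetical helper, the rows are copied from the two smartctl outputs above, and I am assuming attribute 187 (shown as Unknown_Attribute by this smartctl version) is the count of reported uncorrectable errors.

```python
def parse_attr(line):
    """Split one smartctl attribute row into the fields used below."""
    parts = line.split()
    return {
        "id": int(parts[0]),
        "name": parts[1],
        "value": int(parts[3]),
        "worst": int(parts[4]),
        "thresh": int(parts[5]),
        "raw": int(parts[9]),  # first token of RAW_VALUE
    }

sda_rows = [
    "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0",
    "187 Unknown_Attribute 0x0032 001 001 000 Old_age Always - 3488",
    "199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 1",
]
sdb_rows = [
    "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 14",
    "187 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0",
    "199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0",
]

sda = {a["id"]: a["raw"] for a in map(parse_attr, sda_rows)}
sdb = {a["id"]: a["raw"] for a in map(parse_attr, sdb_rows)}
for attr_id in sorted(sda):
    print(attr_id, sda[attr_id], sdb[attr_id])
# prints:
# 5 0 14
# 187 3488 0
# 199 1 0
```

In short: sda has the 3815 logged ATA errors and a raw value of 3488 on attribute 187 but no reallocated sectors, while sdb has 14 reallocated sectors and an empty error log.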
Am I right to replace the sda disk? Can I just unplug it from the machine?
stefan