Question to RAID 1 replacing disks to bigger disks

SchulzStefan

620
+0/-0

Re: Question to RAID 1 replacing disks to bigger disks

« Reply #15 on: March 22, 2013, 08:20:09 AM »

Yes.

And then one day you find ten years have got behind you.

Time, 1973
(Mason, Waters, Wright, Gilmour)

purvis

« Reply #16 on: March 22, 2013, 01:50:48 PM »

I do not like the idea of anything automatically increasing the size of my drives.
I have installed SME on smaller drives and then installed larger drives for reasons of my own.
If somebody wants to download and run a bash script to increase a volume, I have no issues with that.

janet

« Reply #17 on: March 22, 2013, 07:41:54 PM »

SchulzStefan

Quote

It is not enough to dd if=/dev/sda1 of=/dev/sdb1 the bootsector?

Where did you get that command from?
man dd implies it would copy sda1 to sdb1; I am not sure what that achieves.

According to
http://wiki.contribs.org/Raid#Reusing_Hard_Drives
the command to "zero" a drive and prepare it for reuse is
dd if=/dev/zero of=/dev/sdx bs=512 count=1
(replace sdx with sda, sdb, sdc etc.)
This overwrites only the first 512-byte sector of the drive, which is where the partition table lives.
After doing the above you MUST reboot so that the empty partition table gets read correctly.

If a drive is brand new and has NEVER been used, then you should be able to add it to a RAID array in SME Server, and the system will add and sync it once you manually select that option from the console menu. You should NOT need to issue commands of any sort to prepare the drive, format it, add a filesystem, add partitions etc.; the system does all of that after you select to add the drive from the console menu (log in as admin to access this, or type console at the root command line prompt).

If the drive has been used before, in Windows, Linux or wherever (whether you bought it new or second hand), then you MUST clear the drive and prepare it for use BEFORE adding it to an array.
The command to do this is
dd if=/dev/zero of=/dev/sdx bs=512 count=1
where /dev/sdx is the drive to be cleared, e.g. /dev/sdc.
After doing the above you MUST reboot so that the empty partition table gets read correctly.

Then you can proceed to rebuild your array using the larger drives:
add both larger disks
check array is in sync
run cat /proc/mdstat to check this before proceeding
then follow the procedure to enlarge the array.
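
The "check array is in sync" step can also be scripted. The following is only a minimal sketch: it assumes the usual /proc/mdstat layout, in which a healthy two-disk mirror shows [UU], a missing member shows up as "_", and a rebuild shows a "recovery" or "resync" progress line. The function name md_synced is made up for illustration.

```shell
# md_synced [FILE]: succeed only if no md array listed in FILE (mdstat
# format) is degraded or still rebuilding. FILE defaults to /proc/mdstat.
md_synced() {
    mdstat_file="${1:-/proc/mdstat}"
    # A "_" inside the status field (e.g. [U_]) marks a missing member;
    # a "recovery =" or "resync =" line marks a rebuild still in progress.
    if grep -Eq '\[[U_]*_[U_]*\]|recovery *=|resync *=' "$mdstat_file"; then
        return 1
    fi
    return 0
}
```

Running md_synced && echo "all arrays in sync" before starting the enlarge procedure gives a yes/no answer; looking at cat /proc/mdstat yourself is still the authoritative check.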

Please search before asking, an answer may already exist.
The Search & other links to useful information are at top of Forum.

slords

« Reply #18 on: March 22, 2013, 07:42:06 PM »

Quote from: SchulzStefan on March 21, 2013, 11:20:04 AM

pvs:
  PV       VG   Fmt  Attr PSize   PFree
  /dev/md2 main lvm2 a--  931,41G     0

vgs:
  VG   #PV #LV #SN Attr   VSize   VFree
  main   1   2   0 wz--n- 931,41G     0

lvs:
  LV   VG   Attr   LSize   Origin Snap% Move Log Copy% Convert
  root main -wi-ao 929,47G
  swap main -wi-ao   1,94G

All of this shows that things have expanded as they should have. The only step left is to actually expand the root filesystem.

What happens when you run the following command:

Code:

resize2fs -p /dev/mapper/main-root

I don't know why the & is on the final command on the wiki. The actual expansion of the filesystem shouldn't take that long, and there is no reason to put it in the background. If this is an SME 8 system, then the only reason that command should fail is that your root filesystem is ext4 instead of ext3. If that is the case, replace the resize2fs command above with resize4fs.
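
The ext3/ext4 choice can be written down as a small helper. This is a sketch only: the function name pick_resize_cmd is hypothetical, and it maps nothing beyond the two filesystem types mentioned above to the two commands mentioned above.

```shell
# pick_resize_cmd FSTYPE: print the resize tool that matches the root
# filesystem type, following the ext3/ext4 rule described above.
pick_resize_cmd() {
    case "$1" in
        ext3) echo resize2fs ;;
        ext4) echo resize4fs ;;
        *)    echo "unsupported filesystem type: $1" >&2
              return 1 ;;
    esac
}
```

For example, pick_resize_cmd ext3 prints resize2fs. The filesystem type of the mounted root can be read from the third column of the / entry in /proc/mounts.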

Sorry it has taken me so long to get back to you.

"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs,
and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." -- Rich Cook

SchulzStefan

« Reply #19 on: March 24, 2013, 09:22:31 AM »

@mary

Quote

If a drive is brand new and has NEVER been used, then you should be able to add it to a RAID array in SME Server, and the system will add and sync it once you manually select that option from the console menu. You should NOT need to issue commands of any sort to prepare the drive, format it, add a filesystem, add partitions etc.; the system does all of that after you select to add the drive from the console menu (log in as admin to access this, or type console at the root command line prompt).

That's what I did first. Both disks were in sync, but the one I added last was not able to boot after sda was removed. To test this, I unplugged sda from the motherboard and connected sdb in its place. The synced drive didn't boot.

Therefore I followed this: http://wiki.contribs.org/Raid#Adding_another_Hard_Drive_Later_.28Raid1_array_only.29. Before that, I had followed http://wiki.contribs.org/Hard_Disk_Partitioning. Then I copied the boot partition to the new drive with dd if=/dev/sda1 of=/dev/sdb1. After syncing and rebooting, both drives are now able to boot. In my understanding that's the way it should be: if one disk fails, the other should be bootable. This is the case right now.

@slords

Thanks for coming back. The system is up and everybody is able to work, so time doesn't really matter. Here's the result of the command:

resize2fs -p /dev/mapper/main-root
resize2fs 1.39 (29-May-2006)
Filesystem at /dev/mapper/main-root is mounted on /; on-line resizing required
Performing an on-line resize of /dev/mapper/main-root to 243654656 (4k) blocks.
The filesystem on /dev/mapper/main-root is now 243654656 blocks long.

I'll reboot the system and check the df -h output again.
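
As a quick sanity check on the block count in that output, it can be converted to a size. This is a rough sketch; the helper name blocks_to_gib is made up, and it assumes the shell does 64-bit integer arithmetic.

```shell
# blocks_to_gib BLOCKS [BLOCK_SIZE]: whole GiB covered by BLOCKS blocks of
# BLOCK_SIZE bytes (default 4096, the "4k" in the resize2fs output).
blocks_to_gib() {
    echo $(( $1 * ${2:-4096} / 1073741824 ))
}
```

blocks_to_gib 243654656 prints 929, which is in line with the 929,47G LSize that lvs reported for the root volume.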

« Last Edit: March 24, 2013, 09:26:07 AM by SchulzStefan »

SchulzStefan

« Reply #20 on: March 24, 2013, 09:54:37 AM »

@slords

Here's the output after the reboot:

df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/main-root
                      915G  141G  729G  17% /
/dev/md1               99M   46M   48M  50% /boot
none                  2,0G     0  2,0G   0% /dev/shm

mdadm --detail /dev/md1
/dev/md1:
Version : 0.90
Creation Time : Fri Aug 8 17:01:14 2008
Raid Level : raid1
Array Size : 104320 (101.89 MiB 106.82 MB)
Used Dev Size : 104320 (101.89 MiB 106.82 MB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1
Persistence : Superblock is persistent

Update Time : Sun Mar 24 09:30:10 2013
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

UUID : b5f1b131:fe27265a:85dfe98f:3fb577a2
Events : 0.19506

Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 1 1 active sync /dev/sda1

mdadm --detail /dev/md2
/dev/md2:
Version : 0.90
Creation Time : Fri Aug 8 17:01:14 2008
Raid Level : raid1
Array Size : 976655552 (931.41 GiB 1000.10 GB)
Used Dev Size : 976655552 (931.41 GiB 1000.10 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 2
Persistence : Superblock is persistent

Update Time : Sun Mar 24 09:50:33 2013
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

UUID : 7be080c3:58e3a9c4:55bdf7e0:ca9607bf
Events : 0.46794156

Number Major Minor RaidDevice State
0 8 18 0 active sync /dev/sdb2
1 8 2 1 active sync /dev/sda2

fdisk -l

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104384+  fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sda2              13      121601   976655647   fd  Linux raid autodetect

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          13      104384+  fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2              13      121601   976655647   fd  Linux raid autodetect

Disk /dev/md2: 1000.0 GB, 1000095285248 bytes
2 heads, 4 sectors/track, 244163888 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md2 doesn't contain a valid partition table

Disk /dev/md1: 106 MB, 106823680 bytes
2 heads, 4 sectors/track, 26080 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md1 doesn't contain a valid partition table

Seems the size is now correct. Thank you so far. There are still errors reported with the fdisk -l command. Is further investigation necessary?

slords

« Reply #21 on: March 26, 2013, 05:00:08 PM »

Quote from: SchulzStefan on March 24, 2013, 09:54:37 AM

df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/main-root
                      915G  141G  729G  17% /
/dev/md1               99M   46M   48M  50% /boot
none                  2,0G     0  2,0G   0% /dev/shm

This looks good now. You should have access to your extra space now.

Quote from: SchulzStefan on March 24, 2013, 09:54:37 AM

Seems the size is now correct. Thank you so far. There are still errors reported with the fdisk -l command. Is further investigation necessary?

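fdisk -l tries to find partitions on all devices, including /dev/md1 and /dev/md2. Those two are used whole (md1 holds /boot and md2 is the LVM physical volume), so the messages about a missing partition table on them are expected. We only care about the two hard drives, and they look good.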

SchulzStefan

« Reply #22 on: March 26, 2013, 06:54:55 PM »

Thanks to all for staying with me.

stefan

SchulzStefan

« Reply #23 on: March 28, 2013, 11:24:09 AM »

Well, I've had a few problems since resizing the system.

Two days ago the server stopped working. The error on the console was "i/o error ext3 journal". The server was bootable and resynced automatically. In the early hours this morning, during an affa backup, the server died again. On restart the machine dropped to the console prompt (Ctrl-D to continue booting, or the root password for maintenance) because of an inconsistent filesystem that needed an fsck. Fsck found a lot of errors, which have been corrected.

After booting again, sdb2 had been removed from md2. I re-added it with mdadm --add /dev/md2 /dev/sdb2. The server is up and syncing now.
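
To spot an array that has dropped a member like this, the status field in /proc/mdstat can be scanned. What follows is a sketch under assumptions, not a tested tool: it assumes each array starts with a line such as "md2 : active raid1 ..." and that a missing member appears as "_" in a field like [U_]. The function name degraded_arrays is made up.

```shell
# degraded_arrays [FILE]: print the name of every md array in FILE (mdstat
# format) whose status field shows a missing member. Defaults to /proc/mdstat.
degraded_arrays() {
    awk '
        # Remember which array the following lines belong to.
        /^md[0-9]+ *:/ { name = $1 }
        # A status field such as [U_] or [_U] means one member is missing.
        /\[[U_]*_[U_]*\]/ { if (name != "") print name }
    ' "${1:-/proc/mdstat}"
}
```

An empty result means no array is reported as degraded; it says nothing about why a member was dropped in the first place.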

I think there must be something wrong. In my experience with SME Server since 5.x, always running RAID1, I don't remember having problems like this. Is it possible to find the reason why the server is running so unstably, or should I use my backup to rebuild the server?

Some more information about the disks:

smartctl -a /dev/sda
smartctl 5.42 2011-10-20 r3458 [i686-linux-2.6.18-348.3.1.el5PAE] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model: WDC WD10EFRX-68JCSN0
Serial Number: WD-WCC1U0647211
LU WWN Device Id: 5 0014ee 207ef4bfd
Firmware Version: 01.01A01
User Capacity: 1.000.204.886.016 bytes [1,00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Thu Mar 28 12:19:46 2013 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (13680) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 156) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x30bd) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 200   200   051    Pre-fail Always  -           0
  3 Spin_Up_Time            0x0027 140   137   021    Pre-fail Always  -           3983
  4 Start_Stop_Count        0x0032 100   100   000    Old_age  Always  -           30
  5 Reallocated_Sector_Ct   0x0033 200   200   140    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x002e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 100   100   000    Old_age  Always  -           310
 10 Spin_Retry_Count        0x0032 100   253   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 100   253   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           30
192 Power-Off_Retract_Count 0x0032 200   200   000    Old_age  Always  -           15
193 Load_Cycle_Count        0x0032 200   200   000    Old_age  Always  -           14
194 Temperature_Celsius     0x0022 111   107   000    Old_age  Always  -           32
196 Reallocated_Event_Count 0x0032 200   200   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 200   200   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0030 100   253   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x0032 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x0008 100   253   000    Old_age  Offline -           0

SMART Error Log Version: 1
ATA Error Count: 1
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 278 hours (11 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
10 51 08 8a 96 03 e0 Error: IDNF at LBA = 0x0003968a = 235146

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ca 00 08 8a 96 03 e0 08 8d+14:10:43.909 WRITE DMA

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

And the other one:

smartctl -a /dev/sdb
smartctl 5.42 2011-10-20 r3458 [i686-linux-2.6.18-348.3.1.el5PAE] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model: WDC WD10EFRX-68JCSN0
Serial Number: WD-WCC1U0641728
LU WWN Device Id: 5 0014ee 2b29a02e9
Firmware Version: 01.01A01
User Capacity: 1.000.204.886.016 bytes [1,00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Thu Mar 28 12:20:48 2013 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (13320) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 152) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x30bd) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 200   200   051    Pre-fail Always  -           0
  3 Spin_Up_Time            0x0027 137   135   021    Pre-fail Always  -           4141
  4 Start_Stop_Count        0x0032 100   100   000    Old_age  Always  -           15
  5 Reallocated_Sector_Ct   0x0033 200   200   140    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x002e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 100   100   000    Old_age  Always  -           237
 10 Spin_Retry_Count        0x0032 100   253   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 100   253   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           15
192 Power-Off_Retract_Count 0x0032 200   200   000    Old_age  Always  -           9
193 Load_Cycle_Count        0x0032 200   200   000    Old_age  Always  -           5
194 Temperature_Celsius     0x0022 112   110   000    Old_age  Always  -           31
196 Reallocated_Event_Count 0x0032 200   200   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 200   200   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0030 100   253   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x0032 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x0008 100   253   000    Old_age  Offline -           0

SMART Error Log Version: 1
ATA Error Count: 1
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 219 hours (9 days + 3 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
10 51 50 02 57 04 e0 Error: IDNF at LBA = 0x00045702 = 284418

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ca 00 50 02 57 04 e0 08 07:00:40.145 WRITE DMA

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Both drives are brand new, in their original sealed packaging, bought from my local dealer.

« Last Edit: March 28, 2013, 12:22:58 PM by SchulzStefan »

SchulzStefan

« Reply #24 on: March 29, 2013, 09:17:47 PM »

After replacing the disks and having the server die twice, I also checked the BIOS settings of the machine. I found that the hardware RAID was enabled. I'm quite sure that I had it disabled before changing the disks. I disabled the RAID again. The server is up and running, with no problems so far. I'll report in a few days whether the server is running stably.

SchulzStefan

« Reply #25 on: March 31, 2013, 12:01:45 PM »

Server is still up and running. Some more information:

LOG before I replaced the disks:

Mar 13 13:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 66 to 64
Mar 13 13:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 60 to 63
Mar 13 14:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 64 to 63
Mar 13 14:35:54 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 63 to 62
Mar 13 16:05:54 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 63 to 62
Mar 13 18:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 61
Mar 13 18:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 61
Mar 13 19:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 62
Mar 13 19:05:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 63
Mar 13 19:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 63
Mar 13 19:35:54 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 63 to 62
Mar 13 20:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 63 to 62
Mar 13 21:05:54 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 61
Mar 13 21:05:54 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 61
Mar 13 21:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 62
Mar 13 21:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 62
Mar 14 00:35:54 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 58
Mar 14 00:35:54 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 60
Mar 14 01:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 58 to 57
Mar 14 01:05:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 60 to 59
Mar 14 01:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 59 to 58
Mar 14 02:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 57 to 58
Mar 14 02:05:54 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 58 to 59
Mar 14 03:05:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 59 to 60
Mar 14 04:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 58 to 59
Mar 14 04:35:56 smartd[2842]: Device: /dev/sdc [SAT], SMART Prefailure Attribute: 10 Spin_Retry_Count changed from 195 to 196
Mar 14 05:05:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 60 to 61
Mar 14 05:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 59 to 60
Mar 14 05:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 62
Mar 14 05:35:55 smartd[2842]: Device: /dev/sdc [SAT], SMART Usage Attribute: 9 Power_On_Hours changed from 89 to 88
Mar 14 07:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 61
Mar 14 12:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 60
Mar 14 13:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 60 to 59
Mar 14 14:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 59 to 60
Mar 14 14:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 60 to 61
Mar 14 21:05:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 62
Mar 14 22:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 60 to 66
Mar 14 22:05:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 66
Mar 14 23:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 66 to 65

LOG after changing the disks:

Mar 28 18:18:26 smartd[2672]: Device: /dev/sda [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 200 to 100
Mar 28 19:48:25 smartd[2672]: Device: /dev/sdb [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 200 to 100
Mar 28 20:18:25 smartd[2672]: Device: /dev/sdb [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 100 to 200
Mar 28 23:48:25 smartd[2672]: Device: /dev/sda [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 100 to 200
---
Mar 31 00:48:26 smartd[2672]: Device: /dev/sda [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 200 to 100
Mar 31 01:18:25 smartd[2672]: Device: /dev/sda [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 100 to 200
Mar 31 01:48:25 smartd[2672]: Device: /dev/sda [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 200 to 100
Mar 31 01:48:25 smartd[2672]: Device: /dev/sdb [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 200 to 100
Mar 31 03:48:25 smartd[2672]: Device: /dev/sdb [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 100 to 200
Mar 31 04:18:26 smartd[2672]: Device: /dev/sdb [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 200 to 100
Mar 31 05:18:25 smartd[2672]: Device: /dev/sdb [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 100 to 200
Mar 31 06:48:25 smartd[2672]: Device: /dev/sda [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 100 to 200

I'm not familiar with smartd. Do these messages say anything about the stability of the server? Or is further investigation needed?
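
To get an overview of log excerpts like the two above, the messages can be counted per device and attribute. This is just a sketch: it assumes the smartd line format shown above, the function name smartd_changes is made up, and the log file to pass in (for example the system log that smartd writes to) is an assumption about the local setup.

```shell
# smartd_changes FILE: count smartd "changed from" messages in FILE,
# grouped by device and SMART attribute (attribute ID plus name).
smartd_changes() {
    awk '
        /smartd\[[0-9]+\]: Device:/ && /changed from/ {
            dev = ""; attr = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "Device:")    dev  = $(i + 1)
                if ($i == "Attribute:") attr = $(i + 1) " " $(i + 2)
            }
            count[dev " " attr]++
        }
        END { for (k in count) print count[k], k }
    ' "$1" | sort -k2
}
```

Each output line has the form "COUNT DEVICE ID NAME". The counts only show how often smartd logged a change; they do not say whether a change is harmful.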

SchulzStefan

« Reply #26 on: April 04, 2013, 11:16:28 AM »

I have no explanation, but since April 01 there have been no more messages from smartd. The server is up and running stably.

Jáder

« Reply #27 on: April 04, 2013, 03:32:41 PM »

Quote from: SchulzStefan on March 29, 2013, 09:17:47 PM

After replacing the disks and having the server die twice, I also checked the BIOS settings of the machine. I found that the hardware RAID was enabled. I'm quite sure that I had it disabled before changing the disks. I disabled the RAID again. The server is up and running, with no problems so far. I'll report in a few days whether the server is running stably.

A long time ago I was arguing for using only SW RAID rather than HW RAID when both are available (in another thread that I really cannot remember now), and Charlie Brady asked me "why not enable both of them?".
I said I had heard about problems with this... but I could not find any examples to show him.
Now I have one. Thank you!

I hope you read this Charlie!

...