Koozali.org: home of the SME Server

Question to RAID 1 replacing disks to bigger disks

Offline SchulzStefan

« Reply #15 on: March 22, 2013, 08:20:09 AM »
Yes.
And then one day you find ten years have got behind you.

Time, 1973
(Mason, Waters, Wright, Gilmour)

Offline purvis

« Reply #16 on: March 22, 2013, 01:50:48 PM »
I do not like the idea of anything automatically increasing the size of my drives.
I have installed SME on smaller drives and then installed larger drives for reasons of my own.
If somebody wants to run a downloadable bash script to increase a volume, I have no issues with that.


Offline janet

« Reply #17 on: March 22, 2013, 07:41:54 PM »
SchulzStefan

Quote
It is not enough to dd if=/dev/sda1 of=/dev/sdb1 the bootsector?

Where did you get that command from?
According to man dd, it copies the contents of sda1 to sdb1; I am not sure what that achieves here.

According to
http://wiki.contribs.org/Raid#Reusing_Hard_Drives
the command to "zero" a drive and prepare it for reuse is
dd if=/dev/zero of=/dev/sdx bs=512 count=1
(replace sdx with sda, sdb, sdc etc.)
This overwrites only the first 512-byte sector, which holds the partition table.
After doing the above you MUST reboot so that the empty partition table gets read correctly.


If a drive is brand new and has NEVER been used, you should be able to add it to a RAID array in SME Server from the console menu (log in as admin to access this, or type console at the root command line prompt). Once you manually select the option to add the drive, the system adds and syncs it. You should NOT need to issue commands of any sort to prepare the drive, format it, add filesystems or add partitions; the system does all of that.

If the drive has been used before (whether bought new or second hand) in Windows, Linux or anywhere else, then you MUST clear the drive and prepare it for use BEFORE adding it to an array.
The command to do this is
dd if=/dev/zero of=/dev/sdx bs=512 count=1
where /dev/sdx is the drive to be cleared, e.g. /dev/sdc.
After doing the above you MUST reboot so that the empty partition table gets read correctly.

Then you can proceed to rebuild your array using the larger drives:
add both larger disks,
check that the array is in sync
(run cat /proc/mdstat to confirm this before proceeding),
then follow the procedure to enlarge the array.
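
The sync check can also be scripted. What follows is only a minimal sketch and not part of SME Server: the function name mdstat_in_sync is made up for illustration, and the sample text is merely modelled on the two-disk RAID1 layout discussed in this thread. It treats an array as not ready when its status brackets contain an underscore (a missing member) or when a recovery or resync progress line is present.

```shell
# Hypothetical helper: succeed only if no array in an mdstat-format file
# is degraded or still rebuilding. Defaults to /proc/mdstat.
mdstat_in_sync() {
    file=${1:-/proc/mdstat}
    # Refuse to guess when the file cannot be read.
    [ -r "$file" ] || return 2
    # "_" inside the [UU] brackets marks a missing member;
    # "recovery =" / "resync =" marks a rebuild in progress.
    if grep -Eq '\[[U_]*_[U_]*\]|(recovery|resync) *=' "$file"; then
        return 1
    fi
    return 0
}

# Illustrative sample only (assumed layout of a healthy two-disk RAID1).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Personalities : [raid1]
md2 : active raid1 sdb2[0] sda2[1]
      976655552 blocks [2/2] [UU]

unused devices: <none>
EOF

if mdstat_in_sync "$sample"; then
    echo "all arrays in sync"
else
    echo "array degraded or rebuilding, wait before enlarging"
fi
rm -f "$sample"
```

On the server itself, calling mdstat_in_sync with no argument reads /proc/mdstat directly.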
Please search before asking, an answer may already exist.
The Search & other links to useful information are at top of Forum.

Offline slords

« Reply #18 on: March 22, 2013, 07:42:06 PM »
pvs:
PV             VG   Fmt  Attr PSize   PFree
/dev/md2   main lvm2 a--  931,41G    0

vgs:
VG   #PV #LV #SN Attr   VSize   VFree
main   1   2   0 wz--n- 931,41G    0

lvs:
LV   VG   Attr   LSize   Origin Snap%  Move Log Copy%  Convert
root main -wi-ao 929,47G
swap main -wi-ao   1,94G

All of this shows that things have expanded as they should have.  The only step left is to actually expand the root filesystem.

What happens when you run the following command:

Code:
resize2fs -p /dev/mapper/main-root
I don't know why the & is on the final command on the wiki.  The actual expansion of the filesystem shouldn't take that long and there is no reason to put it in the background.  If this is an sme8 system then the only reason that command should fail would be if your root filesystem is ext4 instead of ext3.  If this is the case then you should replace the resize2fs command above with resize4fs.
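
The ext3/ext4 choice can be sketched as a small script. This is only an illustration under stated assumptions: the function names resize_tool_for and root_fs_type are hypothetical, the mapping covers only the two cases named above, and the filesystem type is taken from the third field of /proc/mounts.

```shell
# Hypothetical helper: map a filesystem type to the resize tool named above.
resize_tool_for() {
    case "$1" in
        ext3) echo resize2fs ;;
        ext4) echo resize4fs ;;
        *)    echo "unsupported filesystem type: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print the type of the filesystem mounted on /.
# Reads a /proc/mounts-format file; skips the kernel's "rootfs" placeholder.
root_fs_type() {
    awk '$2 == "/" && $1 != "rootfs" { print $3; exit }' "${1:-/proc/mounts}"
}

# Example with a fixed type, so nothing is touched on the running system.
echo "$(resize_tool_for ext3) -p /dev/mapper/main-root"
```

On the server, the type could be looked up first with root_fs_type and passed to resize_tool_for. The sketch only prints the command so it can be reviewed before running it.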

Sorry it has taken me so long to get back to you.
"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs,
and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." -- Rich Cook

Offline SchulzStefan

« Reply #19 on: March 24, 2013, 09:22:31 AM »
@mary

Quote
If a drive is brand new & NEVER used then you should be able to add it to a RAID array in sme server & it will be added & sync'd by the system after manually selecting to do so from the console menu. You should NOT need to issue commands of any sort to either prepare the drive, or format it, or add filesystems or add partitions etc, that will all be done by the system, after manually selecting to add the drive from the console menu (log in as admin to access this, or type console from the root command line prompt).

That's what I did first. Both disks were in sync, but the last one I added was not able to boot after sda was removed. To test this, I unplugged sda from the motherboard and connected sdb in its place. The synced drive didn't boot.

Therefore I followed this: http://wiki.contribs.org/Raid#Adding_another_Hard_Drive_Later_.28Raid1_array_only.29. I copied the boot partition to the new drive with dd if=/dev/sda1 of=/dev/sdb1. Before doing this I followed http://wiki.contribs.org/Hard_Disk_Partitioning. After syncing and rebooting, both drives are now able to boot. In my understanding that's the way it should be: if one disk fails, the other should be bootable. This is the case right now.

@slords

Thanks for getting back to me. The system is up and everybody is able to work, so time doesn't really matter. Here's the result of the command:

resize2fs -p /dev/mapper/main-root
resize2fs 1.39 (29-May-2006)
Filesystem at /dev/mapper/main-root is mounted on /; on-line resizing required
Performing an on-line resize of /dev/mapper/main-root to 243654656 (4k) blocks.
The filesystem on /dev/mapper/main-root is now 243654656 blocks long.
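
As a quick sanity check, the block count can be converted to a size by hand. The snippet below is just a worked calculation, with a made-up helper name blocks_to_size; it multiplies the reported block count by the 4k block size and rounds to two decimals of GiB, which lines up with the 929,47G LSize that lvs reported for the root volume.

```shell
# Hypothetical helper: convert a block count and block size (bytes)
# into total bytes and GiB, rounded to two decimals with integer maths.
blocks_to_size() {
    bytes=$(( $1 * $2 ))
    # 1073741824 = 1 GiB; adding half of it rounds to the nearest hundredth.
    gib_x100=$(( (bytes * 100 + 536870912) / 1073741824 ))
    printf '%d bytes = %d.%02d GiB\n' "$bytes" $((gib_x100 / 100)) $((gib_x100 % 100))
}

# Values taken from the resize2fs output above.
blocks_to_size 243654656 4096
# prints: 998009470976 bytes = 929.47 GiB
```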

I'll reboot the system and check the output of df -h again.
« Last Edit: March 24, 2013, 09:26:07 AM by SchulzStefan »

Offline SchulzStefan

« Reply #20 on: March 24, 2013, 09:54:37 AM »
@slords

Here's the output after the reboot:

df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/main-root
                       915G  141G  729G  17% /
/dev/md1             99M   46M   48M  50% /boot
none                  2,0G     0     2,0G   0% /dev/shm

mdadm --detail /dev/md1
/dev/md1:
        Version : 0.90
  Creation Time : Fri Aug  8 17:01:14 2008
     Raid Level : raid1
     Array Size : 104320 (101.89 MiB 106.82 MB)
  Used Dev Size : 104320 (101.89 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Sun Mar 24 09:30:10 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : b5f1b131:fe27265a:85dfe98f:3fb577a2
         Events : 0.19506

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8        1        1      active sync   /dev/sda1

mdadm --detail /dev/md2
/dev/md2:
        Version : 0.90
  Creation Time : Fri Aug  8 17:01:14 2008
     Raid Level : raid1
     Array Size : 976655552 (931.41 GiB 1000.10 GB)
  Used Dev Size : 976655552 (931.41 GiB 1000.10 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Sun Mar 24 09:50:33 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 7be080c3:58e3a9c4:55bdf7e0:ca9607bf
         Events : 0.46794156

    Number   Major   Minor   RaidDevice State
       0       8       18        0      active sync   /dev/sdb2
       1       8        2        1      active sync   /dev/sda2


fdisk -l               

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104384+  fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sda2              13      121601   976655647   fd  Linux raid autodetect

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          13      104384+  fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2              13      121601   976655647   fd  Linux raid autodetect

Disk /dev/md2: 1000.0 GB, 1000095285248 bytes
2 heads, 4 sectors/track, 244163888 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md2 doesn't contain a valid partition table

Disk /dev/md1: 106 MB, 106823680 bytes
2 heads, 4 sectors/track, 26080 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md1 doesn't contain a valid partition table

Seems the size is now correct. Thank you so far. There are still errors reported with the fdisk -l command. Is further investigation necessary?

Offline slords

« Reply #21 on: March 26, 2013, 05:00:08 PM »
df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/main-root
                       915G  141G  729G  17% /
/dev/md1             99M   46M   48M  50% /boot
none                  2,0G     0     2,0G   0% /dev/shm

This looks good now.  You should have access to your extra space now.

Seems the size is now correct. Thank you so far. There are still errors reported with the fdisk -l command. Is further investigation necessary?

fdisk -l tries to find partitions on all devices, including the md devices. Those have no partition table of their own (md1 holds the /boot filesystem and md2 the LVM physical volume), so the message about a missing partition table is expected for them. We only care about the two hard drives, and they look good.

Offline SchulzStefan

« Reply #22 on: March 26, 2013, 06:54:55 PM »
Thanks to all for staying with me.

stefan

Offline SchulzStefan

« Reply #23 on: March 28, 2013, 11:24:09 AM »
Well, I have had a few problems since resizing the system.

Two days ago the server stopped working. The error on the console was "i/o error ext3 journal". The server was bootable and resynced automatically. In the early hours of this morning, during an affa backup, the server died again. On restart the machine dropped to the maintenance prompt (Ctrl-D to continue booting, or the root password for maintenance) because of an inconsistent filesystem and asked for an fsck. Fsck found a lot of errors, which have been corrected.

After booting again, sdb2 had been removed from md2. I re-added the disk with mdadm --add /dev/md2 /dev/sdb2. The server is up and syncing now.

I think there must be something wrong. In my experience with SME Server, always running RAID1 since 5.x, I don't remember having problems like this. Is it possible to find the reason why the server is so unstable, or should I use my backup to rebuild the server?

Some more information about the disks:

smartctl -a /dev/sda
smartctl 5.42 2011-10-20 r3458 [i686-linux-2.6.18-348.3.1.el5PAE] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD10EFRX-68JCSN0
Serial Number:    WD-WCC1U0647211
LU WWN Device Id: 5 0014ee 207ef4bfd
Firmware Version: 01.01A01
User Capacity:    1.000.204.886.016 bytes [1,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Mar 28 12:19:46 2013 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (13680) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 156) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x30bd) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   140   137   021    Pre-fail  Always       -       3983
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       30
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       310
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       30
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       15
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       14
194 Temperature_Celsius     0x0022   111   107   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
ATA Error Count: 1
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 278 hours (11 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  10 51 08 8a 96 03 e0  Error: IDNF at LBA = 0x0003968a = 235146

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ca 00 08 8a 96 03 e0 08   8d+14:10:43.909  WRITE DMA

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

And the other one:

smartctl -a /dev/sdb
smartctl 5.42 2011-10-20 r3458 [i686-linux-2.6.18-348.3.1.el5PAE] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD10EFRX-68JCSN0
Serial Number:    WD-WCC1U0641728
LU WWN Device Id: 5 0014ee 2b29a02e9
Firmware Version: 01.01A01
User Capacity:    1.000.204.886.016 bytes [1,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Mar 28 12:20:48 2013 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (13320) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 152) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x30bd) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   137   135   021    Pre-fail  Always       -       4141
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       15
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       237
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       15
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       9
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       5
194 Temperature_Celsius     0x0022   112   110   000    Old_age   Always       -       31
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
ATA Error Count: 1
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 219 hours (9 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  10 51 50 02 57 04 e0  Error: IDNF at LBA = 0x00045702 = 284418

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ca 00 50 02 57 04 e0 08      07:00:40.145  WRITE DMA

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Both are brand new, in original sealed packaging, bought from my local dealer.
« Last Edit: March 28, 2013, 12:22:58 PM by SchulzStefan »

Offline SchulzStefan

« Reply #24 on: March 29, 2013, 09:17:47 PM »
After replacing the disks and having the server die twice, I also checked the BIOS settings of the machine. I found that the hardware RAID was enabled. I'm quite sure that I had it disabled before changing the disks. I disabled the RAID again. The server is up and running, with no problems so far. I'll report in a few days whether the server is running stably.

Offline SchulzStefan

« Reply #25 on: March 31, 2013, 12:01:45 PM »
Server is still up and running. Some more information:

LOG before I replaced the disks:

Mar 13 13:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 66 to 64
Mar 13 13:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 60 to 63
Mar 13 14:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 64 to 63
Mar 13 14:35:54 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 63 to 62
Mar 13 16:05:54 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 63 to 62
Mar 13 18:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 61
Mar 13 18:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 61
Mar 13 19:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 62
Mar 13 19:05:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 63
Mar 13 19:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 63
Mar 13 19:35:54 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 63 to 62
Mar 13 20:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 63 to 62
Mar 13 21:05:54 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 61
Mar 13 21:05:54 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 61
Mar 13 21:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 62
Mar 13 21:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 62
Mar 14 00:35:54 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 58
Mar 14 00:35:54 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 60
Mar 14 01:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 58 to 57
Mar 14 01:05:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 60 to 59
Mar 14 01:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 59 to 58
Mar 14 02:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 57 to 58
Mar 14 02:05:54 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 58 to 59
Mar 14 03:05:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 59 to 60
Mar 14 04:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 58 to 59
Mar 14 04:35:56 smartd[2842]: Device: /dev/sdc [SAT], SMART Prefailure Attribute: 10 Spin_Retry_Count changed from 195 to 196
Mar 14 05:05:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 60 to 61
Mar 14 05:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 59 to 60
Mar 14 05:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 62
Mar 14 05:35:55 smartd[2842]: Device: /dev/sdc [SAT], SMART Usage Attribute: 9 Power_On_Hours changed from 89 to 88
Mar 14 07:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 61
Mar 14 12:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 60
Mar 14 13:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 60 to 59
Mar 14 14:35:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 59 to 60
Mar 14 14:35:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 60 to 61
Mar 14 21:05:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 61 to 62
Mar 14 22:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 60 to 66
Mar 14 22:05:53 smartd[2842]: Device: /dev/sdb [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 62 to 66
Mar 14 23:05:53 smartd[2842]: Device: /dev/sda [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 66 to 65

LOG after changing the disks:

Mar 28 18:18:26 smartd[2672]: Device: /dev/sda [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 200 to 100
Mar 28 19:48:25 smartd[2672]: Device: /dev/sdb [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 200 to 100
Mar 28 20:18:25 smartd[2672]: Device: /dev/sdb [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 100 to 200
Mar 28 23:48:25 smartd[2672]: Device: /dev/sda [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 100 to 200
---
Mar 31 00:48:26 smartd[2672]: Device: /dev/sda [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 200 to 100
Mar 31 01:18:25 smartd[2672]: Device: /dev/sda [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 100 to 200
Mar 31 01:48:25 smartd[2672]: Device: /dev/sda [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 200 to 100
Mar 31 01:48:25 smartd[2672]: Device: /dev/sdb [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 200 to 100
Mar 31 03:48:25 smartd[2672]: Device: /dev/sdb [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 100 to 200
Mar 31 04:18:26 smartd[2672]: Device: /dev/sdb [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 200 to 100
Mar 31 05:18:25 smartd[2672]: Device: /dev/sdb [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 100 to 200
Mar 31 06:48:25 smartd[2672]: Device: /dev/sda [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 100 to 200

I'm not familiar with smartd. Do these messages say anything about the stability of the server, or is further investigation needed?
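
One way to put such messages in context is to compare each attribute's normalized VALUE with its THRESH, since smartctl treats an attribute as failed only when VALUE is at or below THRESH. The following is a rough sketch, not a diagnosis: the function name failing_attributes is made up, and it assumes the column layout of the smartctl attribute tables posted earlier in this thread.

```shell
# Hypothetical helper: print the name of every SMART attribute whose
# normalized VALUE is at or below its THRESH. Reads attribute table rows
# on stdin. Columns assumed: ID# NAME FLAG VALUE WORST THRESH TYPE ...
failing_attributes() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^0x/ && ($4 + 0) <= ($6 + 0) { print $2 }'
}

# Two rows copied from the smartctl -a /dev/sda output in this thread.
sample='  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0'

# Prints nothing for these rows: both values are above their thresholds.
printf '%s\n' "$sample" | failing_attributes
```

On the server the same filter could be fed from smartctl -A /dev/sda. With a THRESH of 000, a Seek_Error_Rate that moves between 200 and 100 stays above its threshold in this check.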

Offline SchulzStefan

« Reply #26 on: April 04, 2013, 11:16:28 AM »
I have no explanation, but since April 1 there have been no more messages from smartd. The server is up and running stably.

Offline Jáder

« Reply #27 on: April 04, 2013, 03:32:41 PM »
After replacing the disks and having the server die twice, I also checked the BIOS settings of the machine. I found that the hardware RAID was enabled. I'm quite sure that I had it disabled before changing the disks. I disabled the RAID again. The server is up and running, with no problems so far. I'll report in a few days whether the server is running stably.

A long time ago, in another thread I can no longer recall, I was arguing for using only software RAID rather than hardware RAID where both are available, and Charlie Brady asked me, "Why not enable both of them?"
I said I had heard about problems with this, but I could not find any examples to show him.
Now I have one. Thank you!

I hope you read this Charlie! :D

 
...