Koozali.org: home of the SME Server

RAID Monitor Question

keyman0202

RAID Monitor Question
« on: December 08, 2006, 01:34:15 PM »
I'm fairly new to SME and Linux, but I've noticed a slowdown in the server recently. I received messages in the admin account mailbox that I've not seen before: REBUILD20, REBUILD40, and REBUILD60.

Can someone review the log below and let me know if it is evident that I have a problem with my RAID configuration?

Thanks in advance!

2006-12-06 15:09:39.669615500 mdadm: only specify super-minor once, super-minor=2 ignored.
2006-12-06 15:09:39.669623500 mdadm: only specify super-minor once, super-minor=1 ignored.
2006-12-06 15:09:41.009164500 Event: SparesMissing, Device: /dev/md1, Member:
2006-12-06 15:09:41.176215500 Event: SparesMissing, Device: /dev/md2, Member:
2006-12-06 15:09:41.362868500 Event: NewArray, Device: /dev/md3, Member:
2006-12-06 15:25:35.774626500 mdadm: only specify super-minor once, super-minor=2 ignored.
2006-12-06 15:25:35.774635500 mdadm: only specify super-minor once, super-minor=1 ignored.
2006-12-06 15:25:38.814959500 Event: SparesMissing, Device: /dev/md1, Member:
2006-12-06 15:25:39.077569500 Event: SparesMissing, Device: /dev/md2, Member:
2006-12-06 15:25:39.310961500 Event: NewArray, Device: /dev/md3, Member:
2006-12-07 03:11:49.610587500 Event: Rebuild20, Device: /dev/md2, Member:
2006-12-07 16:03:27.694012500 Event: Rebuild40, Device: /dev/md2, Member:
2006-12-08 03:29:33.333764500 Event: Rebuild60, Device: /dev/md2, Member:

keyman0202

More info
« Reply #1 on: December 08, 2006, 01:45:57 PM »
Here's some output from mdadm... I have 4 250GB drives... I would really appreciate the help.

[root@prserver ~]# mdadm -D /dev/md1
/dev/md1:
        Version : 00.90.01
  Creation Time : Sun Apr 16 17:20:09 2006
     Raid Level : raid1
     Array Size : 104320 (101.88 MiB 106.82 MB)
    Device Size : 104320 (101.88 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Fri Dec  8 01:12:48 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0


    Number   Major   Minor   RaidDevice State
       0       3       65        0      active sync   /dev/hdb1
       1       3        1        1      active sync   /dev/hda1
           UUID : a5060977:bd629c6c:b2654567:ca001d89
         Events : 0.6440

[root@prserver ~]# mdadm -D /dev/md2
/dev/md2:
        Version : 00.90.01
  Creation Time : Sun Apr 16 17:20:09 2006
     Raid Level : raid1
     Array Size : 244091520 (232.78 GiB 249.95 GB)
    Device Size : 244091520 (232.78 GiB 249.95 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Wed Dec  6 15:25:17 2006
          State : dirty, recovering
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0


 Rebuild Status : 70% complete

    Number   Major   Minor   RaidDevice State
       0       3       66        0      active sync   /dev/hdb2
       1       3        2        1      active sync   /dev/hda2
           UUID : 37c9ce7f:10eb5124:13451591:18224c5e
         Events : 0.11103518

[root@prserver ~]# mdadm -D /dev/md3
/dev/md3:
        Version : 00.90.01
  Creation Time : Tue Jun 13 15:41:47 2006
     Raid Level : raid1
     Array Size : 244195904 (232.88 GiB 250.06 GB)
    Device Size : 244195904 (232.88 GiB 250.06 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 3
    Persistence : Superblock is persistent

    Update Time : Fri Dec  8 07:44:59 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0


    Number   Major   Minor   RaidDevice State
       0      22        1        0      active sync   /dev/hdc1
       1      22       65        1      active sync   /dev/hdd1
           UUID : fe4427c1:5b926133:6cc88440:cca83c9b
         Events : 0.4612018

mbachmann

RAID Monitor Question
« Reply #2 on: December 08, 2006, 02:29:34 PM »
md2 is being rebuilt, as you can read yourself. Rebuilding degrades performance. If the array keeps failing after the rebuild finishes, you should swap the failing drive (hda or hdb). In my experience it won't be long before the rebuild starts again. Buy a spare disk now so you're prepared.
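
If it comes to that, the usual swap goes roughly like this (a sketch only; I'm assuming hdb turns out to be the bad disk, so double-check the device names on your own box first):

# mdadm /dev/md1 --fail /dev/hdb1 --remove /dev/hdb1
# mdadm /dev/md2 --fail /dev/hdb2 --remove /dev/hdb2

Then power down, fit the replacement, copy the partition layout over from the good disk, and re-add the members:

# sfdisk -d /dev/hda | sfdisk /dev/hdb
# mdadm /dev/md1 --add /dev/hdb1
# mdadm /dev/md2 --add /dev/hdb2

The arrays resync on their own once the partitions are added back.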

keyman0202

Thanks....one more question
« Reply #3 on: December 08, 2006, 02:35:13 PM »
I am new to this, so please ignore the stupid question.  How can I tell which one is failing?

Thanks again...

bpivk

RAID Monitor Question
« Reply #4 on: December 08, 2006, 06:51:34 PM »
Quote
[root@prserver ~]# mdadm -D /dev/md2
/dev/md2:
Version : 00.90.01
Creation Time : Sun Apr 16 17:20:09 2006
Raid Level : raid1
Array Size : 244091520 (232.78 GiB 249.95 GB)
Device Size : 244091520 (232.78 GiB 249.95 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 2
Persistence : Superblock is persistent

Update Time : Wed Dec 6 15:25:17 2006
State : dirty, recovering
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0


This tells you that disk 2 is about to go on a long vacation.
"It should just work" if it doesn't report it. Thanks!

pfloor

RAID Monitor Question
« Reply #5 on: December 09, 2006, 06:45:31 PM »
What do you get with:

# cat /proc/mdstat
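
While a rebuild is running you'll see a progress line under the affected array. With a plain resync (both members still attached) it looks something like this - made-up numbers, just for illustration:

md2 : active raid1 hdb2[0] hda2[1]
      244091520 blocks [2/2] [UU]
      [======>..............]  resync = 34.1% (83245120/244091520) finish=95.3min speed=28050K/sec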
In life, you must either "Push, Pull or Get out of the way!"

bpivk

RAID Monitor Question
« Reply #6 on: December 09, 2006, 10:51:37 PM »
cat /proc/mdstat announces a properly running RAID-5
"It should just work" if it doesn't report it. Thanks!

mbachmann

RAID Monitor Question
« Reply #7 on: December 09, 2006, 11:00:29 PM »
From the looks of it I'd say two RAID 1 arrays are running. I never expected a RAID 5. Are the four drives hooked to a dedicated RAID controller, an onboard RAID controller, or something else? Can you tell us some server specs?

bpivk

RAID Monitor Question
« Reply #8 on: December 09, 2006, 11:11:23 PM »
Quote from: "mbachmann"
I never expected a RAID 5.


If you're referring to my answer, it was meant for pfloor.


keyman0202 has a RAID 1 array and the second disk is failing. At least that's what I think.
"It should just work" if it doesn't report it. Thanks!

mbachmann

RAID Monitor Question
« Reply #9 on: December 09, 2006, 11:41:50 PM »
Exactly, I mixed up question and answer - hdb seems to be failing in keyman0202's setup.

keyman0202

More info on RAID errors
« Reply #11 on: December 10, 2006, 05:40:29 PM »
I'm running from the onboard IDE controller. You are correct, there are two RAID1 devices... 4 250GB drives total, each pair in RAID1.

Here's my /proc/mdstat:


Personalities : [raid1]
md1 : active raid1 hdb1[0] hda1[1]
      104320 blocks [2/2] [UU]

md2 : active raid1 hdb2[0] hda2[1]
      244091520 blocks [2/2] [UU]

md3 : active raid1 hdc1[0] hdd1[1]
      244195904 blocks [2/2] [UU]

unused devices: <none>

It finished rebuilding, but it is still showing a "dirty" state. See the output below...

[root@prserver ~]# mdadm -D /dev/md2
/dev/md2:
        Version : 00.90.01
  Creation Time : Sun Apr 16 17:20:09 2006
     Raid Level : raid1
     Array Size : 244091520 (232.78 GiB 249.95 GB)
    Device Size : 244091520 (232.78 GiB 249.95 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Sun Dec 10 11:37:47 2006
          State : dirty
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0


    Number   Major   Minor   RaidDevice State
       0       3       66        0      active sync   /dev/hdb2
       1       3        2        1      active sync   /dev/hda2
           UUID : 37c9ce7f:10eb5124:13451591:18224c5e
         Events : 0.11287591

I'm not seeing any errors in the log file, so I'm still confused as to which of the drives in this array is having an issue...

Performance is still slow, but it has not started to rebuild again yet.

Any advice?

bpivk

RAID Monitor Question
« Reply #12 on: December 10, 2006, 06:18:52 PM »
Change the disk that's causing problems.
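
If it isn't obvious which one that is, the SMART data will usually tell on the culprit (smartctl is in the smartmontools package - install it first if it's missing):

# smartctl -H /dev/hda
# smartctl -H /dev/hdb

Anything other than PASSED, or a climbing Reallocated_Sector_Ct in the full # smartctl -a output, points at the drive to replace.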
"It should just work" if it doesn't report it. Thanks!

pfloor

Re: More info on RAID errors
« Reply #13 on: December 10, 2006, 07:08:16 PM »
Quote from: "keyman0202"
Any advice?

1-Don't place your RAIDed devices on the same IDE channel.  It reduces performance and provides NO redundancy if the IDE channel fails and takes both drives with it.  It may also be causing the issues you are having (see the channel check at the end of this post).  I suggest you:

a-Remove the extra drives (md3 array).
b-Place your primary drives on hda and hdc as "Master drives".
c-If you still need the extra storage, get an additional IDE card with 2 more IDE channels.  Put the extra drives on that as hde and hdg (Master) and add the additional array back.

2-Don't use IBM (and possibly Hitachi) "Deskstar" drives.  My experience shows they have a hard time staying in sync for some unknown reason.  I personally have experienced this problem on 3 separate sets of "Deathstar" drives placed in a RAID 1 configuration.

3-If you experience drive performance problems (AFTER separating the drives onto different IDE channels) then read this thread about hdparm:

http://forums.contribs.org/index.php?topic=28038.0
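
To see what's where right now, the kernel's IDE entries come in handy (assuming a stock 2.6 kernel with /proc/ide, which SME 7 has):

# for d in hda hdb hdc hdd; do echo -n "$d: "; cat /proc/ide/$d/model; done

hda/hdb share the primary channel and hdc/hdd the secondary, so at the moment each of your mirrors has both halves hanging off a single cable.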
In life, you must either "Push, Pull or Get out of the way!"

pfloor

RAID Monitor Question
« Reply #14 on: December 10, 2006, 07:20:37 PM »
Quote from: "bpivk"
Quote
[root@prserver ~]# mdadm -D /dev/md2
/dev/md2:
Version : 00.90.01
Creation Time : Sun Apr 16 17:20:09 2006
Raid Level : raid1
Array Size : 244091520 (232.78 GiB 249.95 GB)
Device Size : 244091520 (232.78 GiB 249.95 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 2
Persistence : Superblock is persistent

Update Time : Wed Dec 6 15:25:17 2006
State : dirty, recovering
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0


This tells you that disk 2 is about to go on a long vacation.


This just states that md2 is dirty and makes no mention of which disk is causing the problem.

Running # cat /proc/mdstat during the rebuild process will tell you which drive is out of sync.
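
For example, while a dropped member is rebuilding, the output looks roughly like this (illustrative only, not from this server):

md2 : active raid1 hdb2[2] hda2[1]
      244091520 blocks [2/1] [_U]
      [===>.................]  recovery = 18.7% (45645120/244091520) finish=112.4min speed=29400K/sec

The member carrying the spare slot number (hdb2[2] here) and the _ in [_U] mark the drive being rewritten.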
In life, you must either "Push, Pull or Get out of the way!"