Koozali.org: home of the SME Server

SME RAID1 - Errors Help Please

RMUCB

« on: October 04, 2004, 11:02:18 PM »
Hey All,

I am looking for some help here.  Normally I have no problem searching the forum to research things, but at the moment I am a bit pressed for time and the error has me a bit worried.  I am hoping I might be able to get some help or suggestions here.  Sorry for the length of this post.

I am currently running the SME v6.01-01 distro, and things have been chugging along great.

Under the current setup, I have two 160GB IDE hard drives running in the SME RAID1 configuration.  Both drives are set as master on their own IDE channel.  I have been running the disks in this manner for a while now with no problems and no errors.  Now, however, I am seeing this:

cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 hda3[0] hdc3[1]
      264960 blocks [2/2] [UU]

md1 : active raid1 hda2[0]
      155918784 blocks [2/1] [U_]

md0 : active raid1 hda1[0] hdc1[1]
      104320 blocks [2/2] [UU]

unused devices: <none>
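
For anyone who wants to check this automatically rather than by eye: in the output above, md1 shows [2/1] [U_], meaning only one of its two mirrors is active. The following is a minimal Python sketch, not part of the original post, that scans /proc/mdstat-style text for arrays with a missing member. The function name and the embedded sample are illustrative assumptions; on a live system the text would come from reading /proc/mdstat.

```python
import re

def degraded_arrays(mdstat_text):
    """Return {array_name: status} for arrays whose status string has a '_' slot."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        # Array header line, e.g. "md1 : active raid1 hda2[0]"
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status line, e.g. "155918784 blocks [2/1] [U_]"
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            if "_" in status.group(3):
                result[current] = status.group(3)
            current = None
    return result

# Sample copied from the output in this post.
SAMPLE = """\
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 hda3[0] hdc3[1]
      264960 blocks [2/2] [UU]

md1 : active raid1 hda2[0]
      155918784 blocks [2/1] [U_]

md0 : active raid1 hda1[0] hdc1[1]
      104320 blocks [2/2] [UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

A check like this could be run from cron, which is roughly the kind of job a RAID monitoring package does.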

mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.00
  Creation Time : Mon Aug  2 00:54:14 2004
     Raid Level : raid1
     Array Size : 104320 (101.87 MiB 106.82 MB)
    Device Size : 104320 (101.87 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Oct  4 03:50:13 2004
          State : dirty, no-errors
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0


    Number   Major   Minor   RaidDevice State
       0       3        1        0      active sync   /dev/hda1
       1      22        1        1      active sync   /dev/hdc1
           UUID : bc6e4515:e2eaf0ef:7df7a6cb:d862d07c

mdadm --detail /dev/md1
/dev/md1:
        Version : 00.90.00
  Creation Time : Mon Aug  2 00:49:47 2004
     Raid Level : raid1
     Array Size : 155918784 (148.69 GiB 159.66 GB)
    Device Size : 155918784 (148.69 GiB 159.66 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Mon Oct  4 03:50:13 2004
          State : dirty, no-errors
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0


    Number   Major   Minor   RaidDevice State
       0       3        2        0      active sync   /dev/hda2
       1       0        0        1      faulty removed
           UUID : a1c8be30:e4213692:0b1d1dbd:33ba577f

mdadm --detail /dev/md2
/dev/md2:
        Version : 00.90.00
  Creation Time : Mon Aug  2 00:49:42 2004
     Raid Level : raid1
     Array Size : 264960 (258.75 MiB 271.31 MB)
    Device Size : 264960 (258.75 MiB 271.31 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Mon Oct  4 03:50:13 2004
          State : dirty, no-errors
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0


    Number   Major   Minor   RaidDevice State
       0       3        3        0      active sync   /dev/hda3
       1      22        3        1      active sync   /dev/hdc3
           UUID : 027a5f80:3efbbee5:431f3c7e:8b620aa7

lsraid -a /dev/md0
[dev   9,   0] /dev/md0         BC6E4515.E2EAF0EF.7DF7A6CB.D862D07C online
[dev   3,   1] /dev/hda1        BC6E4515.E2EAF0EF.7DF7A6CB.D862D07C good
[dev  22,   1] /dev/hdc1        BC6E4515.E2EAF0EF.7DF7A6CB.D862D07C good

lsraid -a /dev/md1
[dev   9,   1] /dev/md1         A1C8BE30.E4213692.0B1D1DBD.33BA577F online
[dev   3,   2] /dev/hda2        A1C8BE30.E4213692.0B1D1DBD.33BA577F good
[dev   ?,   ?] (unknown)        00000000.00000000.00000000.00000000 missing

lsraid -a /dev/md2
[dev   9,   2] /dev/md2         027A5F80.3EFBBEE5.431F3C7E.8B620AA7 online
[dev   3,   3] /dev/hda3        027A5F80.3EFBBEE5.431F3C7E.8B620AA7 good
[dev  22,   3] /dev/hdc3        027A5F80.3EFBBEE5.431F3C7E.8B620AA7 good

And in the log (most irrelevant entries removed):

hda: WDC WD1600JB-00HUA0, ATA DISK drive
blk: queue c0378440, I/O limit 4095Mb (mask 0xffffffff)
hdc: WDC WD1600JB-00HUA0, ATA DISK drive
blk: queue c03788a4, I/O limit 4095Mb (mask 0xffffffff)
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: attached ide-disk driver.
hda: host protected area => 1
hda: 312581808 sectors (160042 MB) w/8192KiB Cache, CHS=19457/255/63, UDMA(33)
hdc: attached ide-disk driver.
hdc: host protected area => 1
hdc: 312581808 sectors (160042 MB) w/8192KiB Cache, CHS=19457/255/63, UDMA(33)
ide-floppy driver 0.99.newide
Partition check:
 hda: hda1 hda2 hda3
 hdc: hdc1 hdc2 hdc3
ide-floppy driver 0.99.newide
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
 [events: 0000001a]
 [events: 0000001b]
 [events: 0000001a]
 [events: 0000001a]
 [events: 00000017]
 [events: 0000001a]
md: autorun ...
md: considering hdc3 ...
md:  adding hdc3 ...
md:  adding hda3 ...
md: created md2
md: bind<hda3,1>
md: bind<hdc3,2>
md: running: <hdc3><hda3>
md: hdc3's event counter: 0000001a
md: hda3's event counter: 0000001a
md: RAID level 1 does not need chunksize! Continuing anyway.
kmod: failed to exec /sbin/modprobe -s -k md-personality-3, errno = 2
md: personality 3 is not loaded!
md :do_md_run() returned -22
md: md2 stopped.
md: unbind<hdc3,1>
md: export_rdev(hdc3)
md: unbind<hda3,0>
md: export_rdev(hda3)
md: considering hdc2 ...
md:  adding hdc2 ...
md:  adding hda2 ...
md: created md1
md: bind<hda2,1>
md: bind<hdc2,2>
md: running: <hdc2><hda2>
md: hdc2's event counter: 00000017
md: hda2's event counter: 0000001b
md: superblock update time inconsistency -- using the most recent one
md: freshest: hda2
md: kicking non-fresh hdc2 from array!
md: unbind<hdc2,1>
md: export_rdev(hdc2)
md: RAID level 1 does not need chunksize! Continuing anyway.
kmod: failed to exec /sbin/modprobe -s -k md-personality-3, errno = 2
md: personality 3 is not loaded!
md :do_md_run() returned -22
md: md1 stopped.
md: unbind<hda2,0>
md: export_rdev(hda2)
md: considering hdc1 ...
md:  adding hdc1 ...
md:  adding hda1 ...
md: created md0
md: bind<hda1,1>
md: bind<hdc1,2>
md: running: <hdc1><hda1>
md: hdc1's event counter: 0000001a
md: hda1's event counter: 0000001a
md: RAID level 1 does not need chunksize! Continuing anyway.
kmod: failed to exec /sbin/modprobe -s -k md-personality-3, errno = 2
md: personality 3 is not loaded!
md :do_md_run() returned -22
md: md0 stopped.
md: unbind<hdc1,1>
md: export_rdev(hdc1)
md: unbind<hda1,0>
md: export_rdev(hda1)
md: ... autorun DONE.

md: raid1 personality registered as nr 3
Journalled Block Device driver loaded
md: Autodetecting RAID arrays.
 [events: 0000001a]
 [events: 0000001a]
 [events: 00000017]
 [events: 0000001b]
 [events: 0000001a]
 [events: 0000001a]
md: autorun ...
md: considering hda1 ...
md:  adding hda1 ...
md:  adding hdc1 ...
md: created md0
md: bind<hdc1,1>
md: bind<hda1,2>
md: running: <hda1><hdc1>
md: hda1's event counter: 0000001a
md: hdc1's event counter: 0000001a
md: RAID level 1 does not need chunksize! Continuing anyway.
md0: max total readahead window set to 124k
md0: 1 data-disks, max readahead per data-disk: 124k
raid1: device hda1 operational as mirror 0
raid1: device hdc1 operational as mirror 1
raid1: raid set md0 active with 2 out of 2 mirrors
md: updating md0 RAID superblock on device
md: hda1 [events: 0000001b]<6>(write) hda1's sb offset: 104320
md: hdc1 [events: 0000001b]<6>(write) hdc1's sb offset: 104320
md: considering hda2 ...
md:  adding hda2 ...
md:  adding hdc2 ...
md: created md1
md: bind<hdc2,1>
md: bind<hda2,2>
md: running: <hda2><hdc2>
md: hda2's event counter: 0000001b
md: hdc2's event counter: 00000017
md: superblock update time inconsistency -- using the most recent one
md: freshest: hda2
md: kicking non-fresh hdc2 from array!
md: unbind<hdc2,1>
md: export_rdev(hdc2)
md: RAID level 1 does not need chunksize! Continuing anyway.
md1: max total readahead window set to 124k
md1: 1 data-disks, max readahead per data-disk: 124k
raid1: device hda2 operational as mirror 0
raid1: md1, not all disks are operational -- trying to recover array
raid1: raid set md1 active with 1 out of 2 mirrors
md: updating md1 RAID superblock on device
md: hda2 [events: 0000001c]<6>(write) hda2's sb offset: 155918784
md: recovery thread got woken up ...
md1: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: considering hda3 ...
md:  adding hda3 ...
md:  adding hdc3 ...
md: created md2
md: bind<hdc3,1>
md: bind<hda3,2>
md: running: <hda3><hdc3>
md: hda3's event counter: 0000001a
md: hdc3's event counter: 0000001a
md: RAID level 1 does not need chunksize! Continuing anyway.
md2: max total readahead window set to 124k
md2: 1 data-disks, max readahead per data-disk: 124k
raid1: device hda3 operational as mirror 0
raid1: device hdc3 operational as mirror 1
raid1: raid set md2 active with 2 out of 2 mirrors
md: updating md2 RAID superblock on device
md: hda3 [events: 0000001b]<6>(write) hda3's sb offset: 264960
md: hdc3 [events: 0000001b]<6>(write) hdc3's sb offset: 264960
md: ... autorun DONE.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
Freeing unused kernel memory: 120k freed
Adding Swap: 264952k swap-space (priority -1)
usb.c: registered new driver usbdevfs
usb.c: registered new driver hub
usb-uhci.c: $Revision: 1.275 $ time 08:52:36 May 29 2003
usb-uhci.c: High bandwidth mode enabled
PCI: Found IRQ 9 for device 00:07.2
usb-uhci.c: USB UHCI at I/O 0x1400, IRQ 9
usb-uhci.c: Detected 2 ports
usb.c: new USB bus registered, assigned bus number 1
hub.c: USB hub found
hub.c: 2 ports detected
usb-uhci.c: v1.275:USB Universal Host Controller Interface driver
EXT3 FS 2.4-0.9.19, 19 August 2002 on md(9,1), internal journal
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.19, 19 August 2002 on md(9,0), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SCSI subsystem driver Revision: 1.00
scsi0 : SCSI host adapter emulation for IDE ATAPI devices

Again, sorry for the length of this post.  I am unsure what steps I should take from here, as I am still not all that familiar with software RAID under Linux.  I will continue to look around, but any thoughts, suggestions, or ideas would be greatly appreciated.  Thanks in advance!

R.

BobWilliams

« Reply #1 on: October 05, 2004, 12:24:27 AM »
You have a problem with the hdc2 partition of meta device md1. I suggest you download Darrell May's "Raid1 Monitor How To", which will explain what to do: http://mirror.contribs.org/smeserver/contribs/dmay/mitel/contrib/raidmonitor/raid-recovery-howto.html

Bob...
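
The diagnosis can be read straight from the boot log in the first post: hdc2's event counter (00000017) is behind hda2's (0000001b), so the md driver treats hdc2 as stale and kicks it from md1. The following is a minimal Python sketch, not from the thread, that pulls those two facts out of a log excerpt. The function name and regular expressions are illustrative assumptions based only on the message wording shown in this thread.

```python
import re

# Patterns match the md driver messages as they appear in the posted log.
COUNTER = re.compile(r"md: (\w+)'s event counter: ([0-9a-fA-F]+)")
KICKED = re.compile(r"md: kicking non-fresh (\w+) from array!")

def stale_members(log_text):
    """Return (event counters per partition, list of partitions kicked as non-fresh)."""
    counters = {}
    kicked = []
    for line in log_text.splitlines():
        m = COUNTER.search(line)
        if m:
            # Event counters are printed in hexadecimal.
            counters[m.group(1)] = int(m.group(2), 16)
            continue
        k = KICKED.search(line)
        if k and k.group(1) not in kicked:
            kicked.append(k.group(1))
    return counters, kicked

# Excerpt copied from the log in the first post.
SAMPLE_LOG = """\
md: hda2's event counter: 0000001b
md: hdc2's event counter: 00000017
md: superblock update time inconsistency -- using the most recent one
md: freshest: hda2
md: kicking non-fresh hdc2 from array!
"""

if __name__ == "__main__":
    counters, kicked = stale_members(SAMPLE_LOG)
    print(counters, kicked)
```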

RMUCB

« Reply #2 on: October 05, 2004, 07:42:33 AM »
BW,

Thanks so much for the reply and the link.  That was exactly the info I needed.  I had actually been looking at implementing that monitoring package, but did not realize a RAID recovery doc was posted there as well.

After reviewing the documentation and a few other items I had found, I pulled the disk to test.  It actually tested okay with the diags I ran.  I wiped the disk and had the RAID resync.  It is showing up valid now.

I am still not sure what exactly happened, but I will just have to keep an eye on things for a while.

Thanks again for the help, it was greatly appreciated.

R.
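
The post above does not say which command was used to start the resync. One common way, assuming the hdc2 partition has been recreated at the right size after the wipe, is mdadm's manage mode with --add. The following is a small Python sketch under that assumption. It only builds the command and does not run it, and the helper name is hypothetical.

```python
def readd_command(array, partition):
    """Build (but do not execute) an mdadm manage-mode command that adds
    a partition back into an array. Assumes the partition already exists."""
    return ["mdadm", f"/dev/{array}", "--add", f"/dev/{partition}"]

if __name__ == "__main__":
    # For the degraded array in this thread: md1 is missing hdc2.
    print(" ".join(readd_command("md1", "hdc2")))
```

After the add, progress of the rebuild can be followed with cat /proc/mdstat until md1 shows [2/2] [UU] again.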

BobWilliams

« Reply #3 on: October 05, 2004, 02:48:26 PM »
Glad that worked for you. I highly recommend you install the raidMonitor rpm. I use it on SME 6.0 and it works great. It's nice to get an email from the service telling you that you need to schedule time to find out what's wrong with your RAID array before it crashes altogether.

Bob...