Koozali.org: home of the SME Server

Legacy Forums => Experienced User Forum => Topic started by: Sean on January 26, 2003, 12:26:50 PM

Title: Raid1 gone weird
Post by: Sean on January 26, 2003, 12:26:50 PM: My SME5.5 seems to have had a failure of some sort on the s/w RAID array. Raidmonitor is now showing:
ALARM! RAID configuration problem

Current configuration is:

Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 hdb1[1] hda1[0] 264960 blocks [2/2] [UU]
md0 : active raid1 hdb5[1] hda5[0] 15936 blocks [2/2] [UU]
md1 : active raid1 hdb6[2](F) hda6[0] 38796864 blocks [2/1] [U_]
unused devices: <none>

Last known good configuration was:

Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 hdb1[1] hda1[0] 264960 blocks [2/2] [UU]
md0 : active raid1 hdb5[1] hda5[0] 15936 blocks [2/2] [UU]
md1 : active raid1 hda6[0] 38796864 blocks [2/1] [U_]
unused devices: <none>

I think the md1 lines indicate that hdb6[2] is now back in md1 (after the array dropped it for no known reason), but it is in failed mode (F) and the array is still degraded [U_].
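For anyone reading along, the way those status lines are read can be sketched in a few lines of Python. This is only an illustration working on the text pasted above, not part of raidmonitor; the function name `degraded_arrays` and the regular expressions are my own assumptions about the 2.4-era /proc/mdstat layout shown in this post.

```python
import re

# Sample text copied from the raidmonitor output quoted in this post.
MDSTAT = """\
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 hdb1[1] hda1[0] 264960 blocks [2/2] [UU]
md0 : active raid1 hdb5[1] hda5[0] 15936 blocks [2/2] [UU]
md1 : active raid1 hdb6[2](F) hda6[0] 38796864 blocks [2/1] [U_]
"""

# Assumed layout: "<name> : active <level> <members> <n> blocks [want/have] [UU]"
ARRAY_RE = re.compile(
    r"^(md\d+) : active \S+ (.*?) \d+ blocks \[(\d+)/(\d+)\] \[([U_]+)\]$"
)
# A member looks like "hdb6[2]" with an optional "(F)" failed flag.
MEMBER_RE = re.compile(r"(\w+)\[\d+\](\(F\))?")


def degraded_arrays(text):
    """Map each degraded array name to the list of members flagged (F)."""
    result = {}
    for line in text.splitlines():
        match = ARRAY_RE.match(line.strip())
        if not match:
            continue  # header lines such as "Personalities" are skipped
        name, members, want, have, status = match.groups()
        # Degraded means fewer working members than configured, or a "_" slot.
        if int(have) < int(want) or "_" in status:
            failed = [dev for dev, flag in MEMBER_RE.findall(members) if flag]
            result[name] = failed
    return result


print(degraded_arrays(MDSTAT))
```

Run against the current configuration it reports md1 as degraded with hdb6 flagged as failed, which matches the reading above: [2/1] means two members configured and one working, and the "_" marks the missing slot.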

Dan et al, how can I get it back so md1 shows [UU]?

No, I have not yet gone through the whole "break the mirror and rebuild it from scratch" routine as outlined on Dan's site. I am trying to avoid having to down the server if at all possible.

Sean
Title: Re: Raid1 gone weird
Post by: John Crisp on January 26, 2003, 05:25:56 PM: From Darrel May's FAQ:
http://myezserver.com/downloads/mitel/contrib/raidmonitor/raid-recovery-howto.html

I think that you can just do :

/sbin/raidhotadd /dev/md1 /dev/hdb6

which will add the partition back to the array. I seem to remember doing this once or twice myself.

I guess that you ought to find out why it dropped in the first place.

B. Rgds
John
Title: Re: Raid1 gone weird
Post by: Sean on January 27, 2003, 02:45:06 AM: John,

Tried that a few times. The first time it went off and said it was rebuilding the partition. Then when I looked at the array after it was finished it was still [U_]. Restarted the machine and still [U_]...

The second go just said the disk was busy. Hang on, I'll run that command again and tell you exactly what the error is:
[root@gateway log]# /sbin/raidhotadd /dev/md1 /dev/hdb6
/dev/md1: can not hot-add disk: disk busy!

Having just re-trawled the logs, I think the secondary disk has "had issues". It seems the BIOS has gone silly and detected it as an 8GB volume:
Jan 26 12:56:15 gateway kernel: hda: ST340016A, 38166MB w/2048kB Cache, CHS=4865/255/63
Jan 26 12:56:15 gateway kernel: hdb: ST340016A, 8063MB w/2048kB Cache, CHS=1027/255/63

Note that both drives are identical model numbers.
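As a rough sanity check on those two kernel lines, the sizes can be approximated from the CHS geometry. The sketch below assumes 512-byte sectors and uses a helper name (`chs_capacity_mib`) of my own; the kernel's MB figures come from the drive's reported sector count, so they will not match these numbers exactly.

```python
SECTOR_BYTES = 512  # assumed sector size for these IDE drives


def chs_capacity_mib(cylinders, heads, sectors):
    """Approximate capacity in MiB from a cylinders/heads/sectors geometry."""
    return cylinders * heads * sectors * SECTOR_BYTES / (1024 * 1024)


hda = chs_capacity_mib(4865, 255, 63)    # geometry logged for hda
hdb = chs_capacity_mib(1027, 255, 63)    # geometry logged for hdb
limit = chs_capacity_mib(1024, 255, 63)  # classic 1024-cylinder ceiling

print(round(hda), round(hdb), limit)
```

The hdb geometry lands only a few cylinders above the old 1024-cylinder limit (roughly 8.4 GB), so one possible explanation, and it is only a guess from here, is that something has clipped the drive to that limit rather than the disk itself having shrunk.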

I will have to down the box and determine exactly _what_ the BIOS thinks it is doing.

Hmmm. The plot thickens.

Sean
Title: Re: Raid1 gone weird
Post by: Sean on January 31, 2003, 11:03:11 AM: You wouldn't believe it - the site had a power failure that their UPS couldn't live through and the box rebooted (hooray), raidmonitor started resyncing the drives and voila! it is now [UU].

Kicking it in the guts may have been what it wanted all along...

Sean