Koozali.org: home of the SME Server
Legacy Forums => Experienced User Forum => Topic started by: Sean on January 26, 2003, 12:26:50 PM
-
My SME5.5 seems to have had a failure of some sort on the s/w RAID array. Raidmonitor is now showing:
ALARM! RAID configuration problem
Current configuration is:
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 hdb1[1] hda1[0] 264960 blocks [2/2] [UU]
md0 : active raid1 hdb5[1] hda5[0] 15936 blocks [2/2] [UU]
md1 : active raid1 hdb6[2](F) hda6[0] 38796864 blocks [2/1] [U_]
unused devices:
Last known good configuration was:
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 hdb1[1] hda1[0] 264960 blocks [2/2] [UU]
md0 : active raid1 hdb5[1] hda5[0] 15936 blocks [2/2] [UU]
md1 : active raid1 hda6[0] 38796864 blocks [2/1] [U_]
unused devices:
I think the md1 lines indicate that hdb6[2] is now back in md1 (after the array dropped it for no known reason), but it is in failed mode (F) and the array is still degraded [U_].
Dan et al, how can I get it back so md1 shows [UU]?
No, I have not yet gone through the whole "break the mirror and rebuild it from scratch" routine as outlined on Dan's site. I am trying to avoid having to down the server if at all possible.
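For anyone who wants to read those status fields mechanically, here is a minimal Python sketch. It assumes the one-line-per-array layout exactly as pasted in this thread (a real /proc/mdstat normally splits each array over two lines), and the names parse_md_line, MD_LINE and MEMBER are made up for illustration; they are not part of raidmonitor.

```python
import re

# Matches one array line in the flattened form quoted in this thread, e.g.
#   md1 : active raid1 hdb6[2](F) hda6[0] 38796864 blocks [2/1] [U_]
MD_LINE = re.compile(
    r"^(?P<name>md\d+) : active (?P<level>\S+) (?P<members>.*?) "
    r"(?P<blocks>\d+) blocks \[(?P<want>\d+)/(?P<have>\d+)\] "
    r"\[(?P<state>[U_]+)\]$"
)

# Matches one member such as "hdb6[2](F)"; the "(F)" suffix marks a failed member.
MEMBER = re.compile(r"(?P<dev>\w+)\[(?P<slot>\d+)\](?P<failed>\(F\))?")


def parse_md_line(line):
    """Return a dict describing one array line, or None if it does not match."""
    m = MD_LINE.match(line.strip())
    if m is None:
        return None
    members = [
        {
            "dev": mm.group("dev"),
            "slot": int(mm.group("slot")),
            "failed": mm.group("failed") is not None,
        }
        for mm in MEMBER.finditer(m.group("members"))
    ]
    want, have = int(m.group("want")), int(m.group("have"))
    state = m.group("state")
    return {
        "name": m.group("name"),
        "level": m.group("level"),
        "blocks": int(m.group("blocks")),
        "members": members,
        # Degraded if fewer devices are active than configured,
        # or if any slot in the state map shows "_" instead of "U".
        "degraded": have < want or "_" in state,
        "failed": [mb["dev"] for mb in members if mb["failed"]],
    }


if __name__ == "__main__":
    info = parse_md_line(
        "md1 : active raid1 hdb6[2](F) hda6[0] 38796864 blocks [2/1] [U_]"
    )
    print(info["name"], "degraded:", info["degraded"], "failed:", info["failed"])
```

Run against the md1 line above, it reports the array as degraded with hdb6 as the failed member; the md0 and md2 lines come back as not degraded.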
Sean
-
From Darrel Mays FAQ :
http://myezserver.com/downloads/mitel/contrib/raidmonitor/raid-recovery-howto.html
I think that you can just do :
/sbin/raidhotadd /dev/md1 /dev/hdb6
which will add the partition back to the array. I seem to remember doing this once or twice myself.
I guess that you ought to find out why it dropped in the first place.
B. Rgds
John
-
John,
Tried that a few times. The first time it went off and said it was rebuilding the partition. Then when I looked at the array after it was finished it was still [U_]. Restarted the machine and still [U_]...
The second go just said the disk was busy. Hang on, I'll run that command again and tell you exactly what the error is:
[root@gateway log]# /sbin/raidhotadd /dev/md1 /dev/hdb6
/dev/md1: can not hot-add disk: disk busy!
Having just re-trawled the logs, I think the secondary disk has "had issues". It seems the BIOS has gone silly and detected it as an 8GB volume:
Jan 26 12:56:15 gateway kernel: hda: ST340016A, 38166MB w/2048kB Cache, CHS=4865/255/63
Jan 26 12:56:15 gateway kernel: hdb: ST340016A, 8063MB w/2048kB Cache, CHS=1027/255/63
Note that both drives are identical model numbers.
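As a rough cross-check of those two kernel lines, the CHS figures can be turned back into a capacity. This is only a sketch: it assumes 512-byte sectors and counts a megabyte as 2^20 bytes, and chs_to_mib is a made-up helper name. The results land close to, but not exactly on, the sizes the kernel printed, presumably because the kernel works from the drive's reported sector count rather than from the rounded geometry.

```python
SECTOR_BYTES = 512  # assumption: classic 512-byte sectors on these IDE drives


def chs_to_sectors(cylinders, heads, sectors_per_track):
    """Total addressable sectors for a cylinders/heads/sectors geometry."""
    return cylinders * heads * sectors_per_track


def chs_to_mib(cylinders, heads, sectors_per_track):
    """Capacity in whole MiB (2**20 bytes) implied by a CHS geometry."""
    total_bytes = chs_to_sectors(cylinders, heads, sectors_per_track) * SECTOR_BYTES
    return total_bytes // 2**20


if __name__ == "__main__":
    # Geometries copied from the log lines above.
    for name, chs in (("hda", (4865, 255, 63)), ("hdb", (1027, 255, 63))):
        print(name, chs_to_sectors(*chs), "sectors,", chs_to_mib(*chs), "MiB")
```

This gives 38162 MiB for hda and 8056 MiB for hdb, so the geometry and the size the kernel reports agree with each other: hdb really is being seen as roughly a fifth of its true size, not just mislabelled. The figure is also near the old 8.4GB CHS addressing ceiling, which may or may not be related.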
I will have to down the box and determine exactly _what_ the BIOS thinks it is doing.
Hmmm. The plot thickens.
Sean
-
You wouldn't believe it - the site had a power failure that their UPS couldn't live through and the box rebooted (hooray), raidmonitor started resyncing the drives and voila! it is now [UU].
Kicking it in the guts may have been what it wanted all along...
Sean