I have a 7.5.1 system that sends me an email when I reboot, saying...
From: mdadm monitoring [root@abc.mydomain.com]
To: admin_raidreport@abc.mydomain.com
Subject: DegradedArray event on /dev/md2:smclinux.abc.mydomain.com
This is an automatically generated mail message from mdadm running on smclinux.abc.mydomain.com.
A DegradedArray event has been detected on md device /dev/md2.
The GUI says...
------ Disk Redundancy status as of Thursday Dec 22 etc. ------
Current RAID status:
Personalities : [raid1]
md2 : active raid1 hda2[0]
      38973568 blocks [2/1] [U_]

md1 : active raid1 hda1[0] hdb1[1]
      104320 blocks [2/2] [UU]

unused devices: <none>
Only some of the RAID devices are unclean.
Manual intervention may be required.
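Before touching anything, I figured I should look for clues about why the second half of md2 dropped out. These are the checks I pieced together from the wikis (hdb is only my guess at the affected drive; see my assumptions below):

#> dmesg | grep -i hdb
#> grep -iE 'hdb|md2' /var/log/messages | tail -50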
I have also run the following commands.
[root@smclinux ~]# mdadm --query --detail /dev/md2
/dev/md2:
Version : 00.90.01
Creation Time : Thu Apr 24 13:58:39 2008
Raid Level : raid1
Array Size : 38973568 (37.17 GiB 39.91 GB)
Device Size : 38973568 (37.17 GiB 39.91 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 2
Persistence : Superblock is persistent
Update Time : Thu Dec 22 12:01:58 2011
State : clean, degraded <--------- NOTE THIS
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : 48710ec5:b53317cb:05eafe7f:5ae8df80
Events : 0.30714056
Number Major Minor RaidDevice State
0 3 2 0 active sync /dev/hda2
1 0 0 - removed <---------- NOTE THIS
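Given that slot 1 shows as removed, I was planning to look at the RAID superblock on the partition itself, and at the drive's SMART data, before re-adding anything. smartctl is part of smartmontools, which I'm not even sure is installed on this box:

#> mdadm --examine /dev/hdb2    # what the superblock on the partition itself says
#> smartctl -a /dev/hdb         # assuming smartmontools is installed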
--------------------------------------------------------------------
Here is the same command for md1:
[root@smclinux ~]# mdadm --query --detail /dev/md1
/dev/md1:
Version : 00.90.01
Creation Time : Thu Apr 24 13:58:39 2008
Raid Level : raid1
Array Size : 104320 (101.89 MiB 106.82 MB)
Device Size : 104320 (101.89 MiB 106.82 MB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Thu Dec 22 04:05:00 2011
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
UUID : da6572c9:181c2142:f8f7fecf:df36a8e2
Events : 0.14725
Number Major Minor RaidDevice State
0 3 1 0 active sync /dev/hda1
1 3 65 1 active sync /dev/hdb1
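One thing I notice is that md1 still shows hdb1 as active sync, so the second drive can't be completely dead. If I understand --examine correctly, this should confirm hdb1 still carries a valid md1 superblock:

#> mdadm --examine /dev/hdb1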
I'm not sure what to do. I have read several wikis and am trying to understand this data.
Since this computer is at a remote location, I can't just go there and open it up. Frankly, I installed this computer several years ago and can't remember much about the hardware.
Can someone help me understand the data I have retrieved? First I want to understand the hardware I have and the problem I have. Can someone confirm (or correct) my assumptions 1-5 below?
1. I'm guessing I have two (roughly) 40 GB IDE drives, i.e. /dev/hda and /dev/hdb (I list a couple of commands to verify this right after assumption 5).
2. Is the system mirrored, with boot on hda1 (mirrored to hdb1 on the second drive) and a second partition hda2 (mirrored to hdb2 on the second drive)?
3. md1 is the mirrored boot device and md2 is the mirrored device holding the user files.
4. Technically the failure is on the second drive, second partition, i.e. hdb2.
5. I understand the drive may be going bad and should be replaced, but initially I would like to rebuild the mirror if that's an option. Then I will buy a new drive and travel out to replace it; until I can do that, I would like to get the array rebuilt.
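To verify assumptions 1 and 2 before doing anything else, these are the commands I was planning to run (the device names are still my guesses from the output above):

#> cat /proc/partitions          # should list hda, hdb and their partitions
#> fdisk -l /dev/hda /dev/hdb    # both drives should be roughly 40 GB if assumption 1 holds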
To rebuild the array, or at least try to, do I run the following?
#> mdadm --add /dev/md2 /dev/hdb2
I have also read on the wiki about removing disks. Should I do that first, i.e. run the following command:
#> mdadm --remove /dev/md2 /dev/hdb2
before running the #> mdadm --add /dev/md2 /dev/hdb2 above?
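Putting it together, if I have understood the wikis correctly, the whole sequence would look something like this (still assuming hdb2 is the failed member):

#> mdadm --remove /dev/md2 /dev/hdb2   # may be unnecessary here, since slot 1 already shows "removed"
#> mdadm --add /dev/md2 /dev/hdb2      # this should kick off the resync
#> watch -n 5 cat /proc/mdstat         # follow the rebuild progress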
Obviously this is important and I want to make sure I'm doing it right.
Am I on the right track here?
By the way, I saw a bug report regarding emailing you when a drive fails: http://bugs.contribs.org/show_bug.cgi?id=2390
I don't really understand it. I'm confused about whether I should only get this error when I reboot, or whether I should get it on some sort of schedule.
It seems that if it's important (and it is), it should be sent out regularly? Not sure.
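Related to that, is there a way to confirm the monitoring/mail side works at all? The mdadm man page mentions a --test option for monitor mode that sends a test alert for every array it finds, so I was going to try the following (assuming the destination address lives in /etc/mdadm.conf on this system):

#> grep MAILADDR /etc/mdadm.conf
#> mdadm --monitor --scan --oneshot --test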
Thanks.