Koozali.org: home of the SME Server

mdadm 2 drives with 2 drives : A Fail event has been detected on md device

painkiller

I have an SME Server with RAID configured, and I received this warning message:

A Fail event has been detected on md device /dev/md1.
A Fail event has been detected on md device /dev/md2.

The admin console shows:

md2 : active raid1 sda2[2] (f) sdb2(1)
 488279488 blocks [2/1] [_U]
md1 : active raid1 sda1[2] (f) sdb1(1)
 104320 blocks [2/1] [_U]

The server is still up, and I did not get another message the following day.

I believe a healthy array should show [UU].

Does anyone have an idea how to handle this problem?


Stefano

Buy a new disk (its size must be the same as or bigger than the failed one),
plug it into the server,
log in as admin at the console,
and choose "Manage disk redundancy".

Read the manual ;-)

painkiller

It looks like 2 failed disks, am I right?

Stefano

no, it's only one, sda
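
The reasoning is that both failed members, sda1 and sda2, are partitions of the same physical disk. As a rough sketch, the failed members can be pulled out of /proc/mdstat like this. It assumes the kernel's usual layout, where a faulty member is tagged `(F)` (for example `sda2[2](F)`), which differs slightly from the hand-typed output above; `failed_members` is just an illustrative name.

```shell
# Sketch: list array members the kernel has flagged faulty, i.e. "(F)".
# Assumes the standard /proc/mdstat layout, for example:
#   md2 : active raid1 sda2[2](F) sdb2[1]
failed_members() {
    awk '/^md[0-9]+ :/ {
        for (i = 3; i <= NF; i++)
            if ($i ~ /\(F\)$/) {
                dev = $i
                sub(/\[[0-9]+\]\(F\)$/, "", dev)   # strip "[2](F)"
                print $1, dev                      # array, member
            }
    }'
}

# Usage on a live system (read-only):
# failed_members < /proc/mdstat
```

If every line printed names a partition of the same disk, only that one disk has dropped out.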

Jeppe Fugl

As stated in the manual, this sometimes happens without a broken disk.

You could run different checks before buying a new one.

This has happened to me twice in the past few years.

Hope it helps...
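
The post does not say which checks are meant. One common option is the disk's own SMART data, sketched below. It assumes smartmontools is installed and that the suspect disk is /dev/sda as in this thread; `smart_check` is a hypothetical helper name.

```shell
# Sketch: read-only SMART checks on the suspect disk (run as root).
# Assumes smartmontools is installed; /dev/sda is taken from the thread.
smart_check() {
    disk=$1
    if [ ! -b "$disk" ]; then
        echo "not a block device: $disk" >&2
        return 1
    fi
    smartctl -H "$disk"            # overall health self-assessment
    smartctl -A "$disk"            # attributes, e.g. reallocated/pending sectors
    smartctl -l selftest "$disk"   # results of earlier self-tests
}

# smart_check /dev/sda
# smartctl -t long /dev/sda   # optional: start a long self-test in the background
```

A clean SMART report does not prove the disk is fine, but a failing one is a good reason to replace it rather than re-add it.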

Stefano

I would agree if only one of md1 or md2 had lost its member.
Since both did, I suspect it's something serious.

Anyway, running

grep sda /var/log/messages*

could show the OP why sda was kicked out of the array.
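
Since a plain grep for sda also returns routine lines, a narrower variant might look like the sketch below. The keyword list is an assumption about what the kernel typically logs when it drops a member, so adjust it to your own logs; `disk_trouble` is an illustrative name.

```shell
# Sketch: keep only log lines that mention the disk AND look like trouble.
# The keyword list is an assumption, not an exhaustive set of kernel messages.
disk_trouble() {
    disk=$1
    shift
    grep -h -- "$disk" "$@" | grep -i -E 'error|fail|timeout|reset|disabling'
}

# disk_trouble sda /var/log/messages*
```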

mike_mattos

I had something similar happen 3 or 4 times. I finally figured out it was a NETWORK issue: if a station locked up (virus or bad adapter, I was never sure, so we dumped the machine), I'd get the alarm. Since we dumped the station, no alarms!

...

pc_doc

What the report is telling you is that in the first RAID 1 array the first member has failed (note the [_U] rather than [UU]). In addition, the first member of the second array has also failed (once again, [_U] rather than [UU]).

Please see http://wiki.contribs.org/Raid#Resynchronising_a_Failed_RAID for more details.
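
The wiki is the authority here. As a rough illustration of what a manual re-add with mdadm usually involves, the sketch below only prints the commands so they can be reviewed before anything is run. It assumes the layout from this thread (partition N of the disk belongs to /dev/mdN, for md1 and md2) and that the disk has checked out as healthy; `readd_plan` is a hypothetical helper.

```shell
# Sketch: print (do not run) the mdadm commands to re-add a dropped disk.
# Assumes partition N of the disk belongs to /dev/mdN, as in this thread.
readd_plan() {
    disk=$1   # bare device name, e.g. sda
    for n in 1 2; do
        echo "mdadm /dev/md$n --remove /dev/$disk$n"   # drop the faulty member
        echo "mdadm /dev/md$n --add /dev/$disk$n"      # re-add; a resync follows
    done
}

# readd_plan sda       # review the output, then run the lines as root
# cat /proc/mdstat     # afterwards, shows rebuild progress
```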

If you read a little higher in the wiki, it also explains how to replace a drive with a larger one, reboot, remove the smaller surviving one, replace it with another larger one, and end up with a larger RAID array without losing data. I have used this method on a number of occasions with success.

For more information, check out http://www.linuxmanpages.com/man8/mdadm.8.php