Koozali.org: home of the SME Server

Failed Hard Drive? Advice required please

Offline ntblade

« on: June 07, 2008, 11:31:52 PM »
Hi all,
I just had this emailed to me:
"A Fail event has been detected on md device /dev/md2"
It came from a server that I built just over a week ago and installed 4 days ago. This is a brand new machine with 2x160G drives for the system and 2x750G drives for backuppc.

What would the folks here advise checking to see whether a drive has really died? Is it possible that adding the two extra drives in RAID1 could have screwed things up?

Many thanks,
NTB

Offline christian

Re: Failed Hard Drive? Advice required please
« Reply #1 on: June 09, 2008, 03:54:49 AM »
Not likely due to the other drives other than perhaps heat (each drive adds about 10W of heat).

I would try and rebuild the array first as a hiccup may have caused this. This is well documented in the forums and wiki (http://wiki.contribs.org/Raid); a search would have shown you how to rebuild and determine which disk failed.
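
To make the "determine which disk failed" step concrete, here is a minimal Python sketch that reads the text of /proc/mdstat and reports any array with a missing member. The sample text, the device names in it, and the function name find_degraded are all made up for illustration and are not taken from the poster's server. It only handles the common "mdN : active raidX ..." layout, so treat it as a starting point rather than a complete parser.

```python
import re

# Illustrative /proc/mdstat contents (invented for this sketch).
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md1 : active raid1 sda1[0] sdb1[1]
      104320 blocks [2/2] [UU]

md2 : active raid1 sda2[0] sdb2[1](F)
      156183808 blocks [2/1] [U_]

unused devices: <none>
"""


def find_degraded(mdstat_text):
    """Return {array: [members flagged (F)]} for arrays showing a missing member."""
    degraded = {}
    current = None
    members = []
    for line in mdstat_text.splitlines():
        # Array header, e.g. "md2 : active raid1 sda2[0] sdb2[1](F)"
        header = re.match(r"^(md\d+)\s*:\s*\S+\s+\S+\s+(.*)$", line)
        if header:
            current = header.group(1)
            members = header.group(2).split()
            continue
        # Status line, e.g. "156183808 blocks [2/1] [U_]"; "_" marks a missing member
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if current and status:
            if "_" in status.group(3):
                degraded[current] = [
                    m.split("[")[0] for m in members if m.endswith("(F)")
                ]
            current = None
    return degraded


if __name__ == "__main__":
    # On the server itself you would pass open("/proc/mdstat").read() instead.
    print(find_degraded(SAMPLE_MDSTAT))
```

An empty list for an array means it is degraded but no member is still listed with the (F) flag, for example because the disk has already been removed from the array. Running mdadm --detail on the array gives a fuller picture, and the wiki page linked above covers the rebuild itself.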

I would also ensure the box is well ventilated, as heat can cause a temporary glitch that throws a disk out of the array. Consider the ambient temperature of the room as well.

If the same disk fails later, or the array won't rebuild, then either you have a real heat or connection issue (to be rectified) or a bad disk, which should be replaced, ideally with the same make/model.

In my experience I have had a failed disk once. I have also had a disk thrown out of the array for no apparent reason; after rebuilding the array, all has been fine for almost two years.
SME since 2003

Offline ntblade

Re: Failed Hard Drive? Advice required please
« Reply #2 on: June 09, 2008, 10:47:05 AM »
Thanks for the reply,
I ordered a couple of new drives just in case one of the existing ones really is faulty.  I don't have access to the server right now but I'll report back.

It's actually quite nice to be able to contact a customer and tell them that their machine has a problem and it's being dealt with before they even know it themselves!  :-)

NTB