I am seeking advice.
On my daughter's machine I have been getting intermittent RAID rebuild sequences, each of which generates a series of email messages during the re-sync (e.g., "A Rebuild20 event has been detected on md device /dev/md2.", ending with "A RebuildFinished event has been detected on md device /dev/md2.").
There was one sequence in April, another in June and one in July. All referenced /dev/md2.
Last week, however, the "RebuildFinished" event was followed almost immediately by the message "A Fail event has been detected on md device /dev/md2." A day later I initiated a manual re-sync. It went through all of the rebuild messages successfully, ending with "RebuildFinished" followed by a "SpareActive" event. But 4 hours later I got another "A Fail event has been detected on md device /dev/md2." message.
It's a fairly old box, but the drives are only a couple of years old.
Is this just a bad block somewhere on the second drive that I could get around by running fsck (or something else)? Or is it possibly a controller problem? Or do I need to bite the bullet and replace the drive?
Thanks for your input.
John