Well, I had a drive (sdb2) kicked out of my RAID array this evening.
To recover it I used:
mdadm --add /dev/md2 /dev/sdb2
After checking /proc/mdstat, I noticed that the recovery keeps restarting.
Checking dmesg, I find:
SCSI device sda: 1953525168 512-byte hdwr sectors (1000205 MB)
SCSI device sda: drive cache: write back
ata1.00: exception Emask 0x0 SAct 0x3ffffff SErr 0x0 action 0x0
ata1.00: (irq_stat 0x40000008)
ata1.00: cmd 60/c8:78:05:42:03/00:00:00:00:00/40 tag 15 cdb 0x0 data 102400 in
res 41/40:00:05:42:03/da:00:00:00:00/40 Emask 0x9 (media error)
ata1.00: configured for UDMA/133
SCSI error : <0 0 0 0> return code = 0x8000002
Info fld=0x4000000 (nonstd), Invalid sda: sense = 72 11
end_request: I/O error, dev sda, sector 213509
ata1: EH complete
SCSI device sda: 1953525168 512-byte hdwr sectors (1000205 MB)
SCSI device sda: drive cache: write back
[b]raid1: sda: unrecoverable I/O read error for block 4608[/b]
md: md2: sync done.
I think block 4608 is where it is restarting.
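In case it helps anyone look at this, here is a rough sketch of how the failing sectors could be pulled out of the kernel log. The function name failed_sectors is made up for illustration, and it assumes the error lines have the same shape as the end_request line above; other kernels may word them differently.

```shell
#!/bin/sh
# Sketch only: list the distinct "device sector" pairs for which the kernel
# reported an I/O error. Reads kernel log text on stdin and assumes lines like
#   end_request: I/O error, dev sda, sector 213509
# failed_sectors is a hypothetical helper name, not part of any tool.
failed_sectors() {
    sed -n 's/.*I\/O error, dev \([a-z0-9]*\), sector \([0-9]*\).*/\1 \2/p' | sort -u
}

# Usage (not executed here): dmesg | failed_sectors
```

If the same sector shows up on every pass, that would fit the idea that the resync keeps dying at the same spot on sda.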
So this looks like a conundrum: sdb2 failed and was kicked out, and the array cannot resync from sda because of the read error. Though I should note the system does boot! It is just very, very busy trying to re-add sdb2.
Any ideas on how I might be able to recover this?
Should I run fsck (or even e2fsck) on sda? If so, what would be the proper command line to perform a safe recovery?
Is there a better way?
Needless to say, I will be replacing the drives with new ones.
Christian