Koozali.org: home of the SME Server

RAID 1 recovery

daniel
RAID 1 recovery
« on: February 15, 2010, 03:00:59 PM »
I must say, RAID 1 is one more reason to use SME Server. I have two 1 TB SATA drives in a RAID 1 array, and /dev/sdb stopped staying in sync with the array. Following the Wiki notes, I replaced it last night with a drive of the same size. After 7 hours it is in sync and running just fine.

Now I have a few more questions the Wiki isn't answering. 

1. In mdstat the RAID 1 shows sdb2[1] sda2[0]. I replaced /dev/sdb, so I assumed sdb would be listed second here instead of first. Does sda2[0] mean that drive is R/W and sdb2[1] mean it is RO?
2. If I replace /dev/sda now, will it rebuild from the synced sdb the same way sdb synced with sda?
3. If I add a third 1 TB drive as a hot spare, will the RAID add it automatically, or will I have to use the command line to add it as a hot spare? Is there a real advantage to having RAID 1 plus a hot spare?

Thanks to the community for making a great stable product.

FreakWent
Re: RAID 1 recovery
« Reply #1 on: February 19, 2010, 10:59:03 AM »
Just to scare you and hijack the thread:

-bash-3.00$ md5sum /home/e-smith/files/ibays/install/files/OS/XP/WindowsXP-KB835935-SP2-ENU.exe;

a3698937beffb3e6fda0ff4f9eac7947  /home/e-smith/files/ibays/install/files/OS/XP/WindowsXP-KB835935-SP2-ENU.exe

-bash-3.00$ md5sum /home/e-smith/files/ibays/install/files/OS/XP/WindowsXP-KB835935-SP2-ENU.exe;

702ac77029cc728cdb66f56566ee4a1c  /home/e-smith/files/ibays/install/files/OS/XP/WindowsXP-KB835935-SP2-ENU.exe

Any file over about 200 MB on my system gets corrupted: as shown above, two md5sum runs on the same file give different results. It's not RAM. I suppose it could be other hardware, but if I drop the RAID and use either drive in single mode (same hardware), it's perfect.
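
One way to make this check repeatable is to hash the same file several times and count the distinct results. This is only a minimal sketch, not something from the post above: the function name distinct_sums and the default of five runs are made up for illustration. A result other than 1 means the reads are not consistent. Repeated reads of a small file may be served from the page cache rather than from the disks, so a file larger than RAM is probably a more honest test.

```shell
#!/bin/sh
# Hypothetical helper: hash one file several times and print how many
# distinct MD5 sums were seen. 1 means every read returned the same data.
distinct_sums() {
    file=$1
    runs=${2:-5}        # number of md5sum passes (arbitrary default)
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Keep only the hash, drop the file name column
        md5sum "$file" | cut -d' ' -f1
        i=$((i + 1))
    done | sort -u | wc -l | tr -d ' '
}

# Usage (path is a placeholder):
#   distinct_sums /path/to/large/file 5
```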

Weird eh?  I can't find any known bugs on the forum so I guess it's just me!


To answer your questions:



1. In mdstat the RAID 1 shows sdb2[1] sda2[0]. I replaced /dev/sdb, so I assumed sdb would be listed second here instead of first. Does sda2[0] mean that drive is R/W and sdb2[1] mean it is RO?

I don't know for certain, but I don't think the order is relevant. What gives you the idea that mirrored disks have one R/W and one R/O? That makes no sense to me: both halves of a mirror have to be written for them to stay in sync.
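
As far as I understand it, the number in square brackets in /proc/mdstat is the member's index within the array, not an access mode, and the order in which members are printed carries no meaning. The sketch below pulls the members and their indexes out of an mdstat-style line. The sample text is illustrative only: the member list comes from the question, the block count is invented, and list_members is a made-up helper name.

```shell
#!/bin/sh
# Illustrative sample, not real output from the poster's machine.
sample_mdstat='md2 : active raid1 sdb2[1] sda2[0]
      976655552 blocks [2/2] [UU]'

# Print "device index N" for each member, sorted by index.
list_members() {
    printf '%s\n' "$1" |
        grep -Eo '[a-z0-9]+\[[0-9]+]' |
        sed 's/\[\([0-9]*\)]/ index \1/' |
        sort -k3,3n
}

list_members "$sample_mdstat"
# On a live system: list_members "$(cat /proc/mdstat)"
```

For the live view, mdadm --detail /dev/md2 lists each member with its state, which is a more direct way to see whether both halves are active.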

2. If I replace /dev/sda now, will it rebuild from the synced sdb the same way sdb synced with sda?

It can, yes. When I replaced sda I used
mdadm --add /dev/md1 /dev/sda1

then

mdadm --add /dev/md2 /dev/sda2
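
Those two commands assume the new disk already has partitions matching the old layout. A rough sketch of the whole sequence follows, written as a dry run that only prints the commands so they can be reviewed first. Assumptions not stated in the post: the partition table is an MBR one copied from the surviving disk with sfdisk -d, md1 sits on partition 1 and md2 on partition 2, and plan_rebuild is a made-up helper name. Reinstalling the boot loader on the new disk is not covered here.

```shell
#!/bin/sh
# Dry run: prints the commands, does not touch any disk.
good=/dev/sdb   # surviving, fully synced member (assumption)
new=/dev/sda    # freshly installed replacement (assumption)

plan_rebuild() {
    # $1 = healthy disk, $2 = replacement disk
    echo "sfdisk -d $1 | sfdisk $2"        # copy the partition table
    echo "mdadm --add /dev/md1 ${2}1"      # re-add first partition
    echo "mdadm --add /dev/md2 ${2}2"      # re-add second partition
    echo "cat /proc/mdstat"                # watch the resync progress
}

plan_rebuild "$good" "$new"
```

Double-check which disk is which before running anything for real; copying the partition table in the wrong direction would wipe the good disk's layout.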

3. If I add a third 1 TB drive as a hot spare, will the RAID add it automatically, or will I have to use the command line to add it as a hot spare? Is there a real advantage to having RAID 1 plus a hot spare?

The advantage is that you're protected if the second drive dies while you're still out buying a replacement for the first. That's not really very likely.

You would have to configure it as a hot spare yourself; try

man mdadm

for details.

Whether or not it's worth having a hot spare depends on what the server is used for.
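
For what it's worth, with mdadm a device added with --add to an array that already has all of its active members is kept as a spare, and /proc/mdstat marks spares with (S). The sketch below shows the idea. The third disk /dev/sdc, the sample mdstat text, and the helper name list_spares are all hypothetical, and the new disk would need matching partitions first.

```shell
#!/bin/sh
# Commands that would add the spare (hypothetical device names, not run here):
#   mdadm --add /dev/md1 /dev/sdc1
#   mdadm --add /dev/md2 /dev/sdc2
#   mdadm --detail /dev/md2      # should then report one spare device

# Invented sample of what mdstat might look like afterwards.
sample_mdstat='md2 : active raid1 sdc2[2](S) sdb2[1] sda2[0]
      976655552 blocks [2/2] [UU]'

# Print the names of members flagged as spares.
list_spares() {
    printf '%s\n' "$1" |
        grep -Eo '[a-z0-9]+\[[0-9]+]\(S\)' |
        sed 's/\[.*//'
}

list_spares "$sample_mdstat"
# On a live system: list_spares "$(cat /proc/mdstat)"
```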

CharlieBrady
Re: RAID 1 recovery
« Reply #2 on: February 19, 2010, 04:14:23 PM »
Weird eh?  I can't find any known bugs on the forum so I guess it's just me!

The forum is not the place to search for bugs - try the bug tracker.

But do not wait for others to report bugs, and please do not gossip about problems in the forum. You have found a problem - please report it, in detail, via the bug tracker.

FreakWent
Re: RAID 1 recovery
« Reply #3 on: February 20, 2010, 07:54:57 AM »
Thank you very much for your reply. You're right of course, but it turns out it's not an SME Server-specific problem anyway.

I seem to have the problem detailed here:
http://www.phwinfo.com/forum/linux-debian-user/336830-corrupt-data-raid-sata_sil-3114-chip.html

This has been annoying me for over six months, to be honest.

Today I've added

"options sata_sil slow_down=1"

to /etc/modprobe.conf, and this means that now in the syslog I get

 sata_sil 0000:00:0b.0: Applying R_ERR on DMA activate FIS errata fix

But I still have strangely varying MD5sums on files.
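
A quick sanity check on the config side: modprobe.conf expects the directive options (plural), so a line beginning with option is not a valid directive. The sketch below prints the options lines that exist for a given module. The helper name module_options is made up. Separately, if the driver is loaded from the initrd, a change to /etc/modprobe.conf may not take effect until the initrd is rebuilt; whether that applies to this system is an assumption, not something established in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the "options <module> ..." lines in a
# modprobe config file. Exit status is non-zero if there are none.
module_options() {
    conf=$1
    module=$2
    grep -E "^options[[:space:]]+$module([[:space:]]|\$)" "$conf"
}

# Usage:
#   module_options /etc/modprobe.conf sata_sil
# Where the kernel exposes module parameters in sysfs, the live value
# may also be readable (assumption for this kernel version):
#   cat /sys/module/sata_sil/parameters/slow_down
```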

Given this new info, does this still belong in the tracker?

CharlieBrady
Re: RAID 1 recovery
« Reply #4 on: February 20, 2010, 04:16:14 PM »
Given this new info, does this still belong in the tracker?

Anything which doesn't 'just work' belongs in the bug tracker.