Koozali.org: home of the SME Server

Does an install with 'sme nospare' lead to issues later on when adding disks?

Offline piran

  • *****
  • 502
  • +0/-0
During the initial install the optional use of 'sme nospare' builds a RAID5 system with all four of the disks present.

Later, when adding a fifth disk, how does sme know whether to use/configure the newly fitted drive as a new hot spare or to carry on in 'no spare' mode with a resulting increase in overall capacity?

What happens when two more drives (beyond the original set) are added?

The wiki's RAID area suggests (1st paragraph) that 4-6 drives automatically build a RAID5 + hot spare. Later in that same area (RAID notes ~ nospare) it appears to suggest instead that a RAID6 system is built (which necessarily results in a lower overall capacity, though one with increased reliability).

I'm confused;~/ It hasn't helped me sleuth out why a fifth, perfectly operational, identical drive is not being put to any use at all by SME 7.4 when added to an existing, working four-drive RAID5 system.
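
For reference, these are the sort of checks I've been using to see what mdadm actually did with the new drive (whether it joined the array at all, and if so only as a spare). The device names below are examples only; /proc/mdstat shows the real ones on any given box.

Code:
  # Show all md arrays with their member and spare counts
  cat /proc/mdstat
  # Detailed view of the main data array (SME 7 normally puts the LVM
  # physical volume on /dev/md2, but confirm against /proc/mdstat)
  mdadm --detail /dev/md2
  # Inspect the RAID superblock on the newly fitted drive's partition
  mdadm --examine /dev/sde2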

Offline kevinb

  • *
  • 237
  • +0/-0
I am not an expert in this area but I have found that adding a drive to an existing array is not a trivial task and must be done manually. Adding another drive as a separate mount point is not difficult.

I have found that installing with "sme" on six disks gives you a RAID 5 array with a hot spare. Installing with "sme nospare" on six disks gives you a RAID 6 array.
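
Assuming six identical drives of size S (drive sizes are my assumption, not something the installer reports), the usable space works out the same either way; the difference is fault tolerance rather than capacity:

  RAID5 over 5 members + 1 hot spare: (5 - 1) x S = 4 x S usable
  RAID6 over all 6 members:           (6 - 2) x S = 4 x S usable

With 1 TB drives that is roughly 4 TB in both cases. RAID6 tolerates two simultaneous drive failures, while RAID5 + spare only tolerates one until the spare has finished rebuilding.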

Offline piran

  • *****
  • 502
  • +0/-0
It's capacity that's needed... the RAID level is largely irrelevant.
Looks like it'll have to be a complete re-build (to get capacity).

Offline kevinb

  • *
  • 237
  • +0/-0
You can always install with "sme partition" and create your own partitions and arrays. You apparently can't set up a logical volume this way (if I am wrong, would someone please post the instructions), so you will be in "unsupported" territory with SME.

I have one server set up this way to maximise disk space (6-disk RAID 5, no spare, no LVM, no swap [4 GB RAM]). It has been running for two years now with no issues. Disk replacements do get rebuilt via the admin panel.
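
For anyone going down the "sme partition" route, a rough sketch of rolling the data array by hand might look like the following. The device names, md number and ext3 choice are all assumptions to adapt, and none of this is done for you by the admin panel:

Code:
  # Create a 6-disk RAID5 array with no spare (partition names are examples)
  mdadm --create /dev/md3 --level=5 --raid-devices=6 /dev/sd[b-g]1
  # Put a filesystem on it and record the array so it assembles at boot
  mkfs.ext3 /dev/md3
  mdadm --detail --scan >> /etc/mdadm.conf
  # Mount it somewhere permanent
  mkdir -p /mnt/data
  echo "/dev/md3  /mnt/data  ext3  defaults  0 2" >> /etc/fstab
  mount /mnt/data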

Offline piran

  • *****
  • 502
  • +0/-0
Previously I ran SME with just-about-big-enough drives and then mounted a favourite hardware RAID5 card, i.e. with the extra capacity. That h/w card has now expired... leaving a bit of a capacity gap, so to speak. The avenue of adding further drives isn't working at all well. Bit of a mess.

Offline chris burnat

  • *****
  • 1,135
  • +2/-0
    • http://www.burnat.com
Quote from: piran
Previously I ran SME with just-about-big-enough drives and then mounted a favourite hardware RAID5 card, i.e. with the extra capacity. That h/w card has now expired... leaving a bit of a capacity gap, so to speak. The avenue of adding further drives isn't working at all well. Bit of a mess.

This issue has been covered at some length in the Bugtracker, refer:
http://bugs.contribs.org/show_bug.cgi?id=5330

 
Quote
------- Comment #4 from Shad L. Lords, 2009-06-06 10:01:03 -------
Adding any number of drives to the array will only add spares to the array.
mdadm on RHEL4 base doesn't support capacity expansion.  RHEL5 base is supposed
to support it but from the tests I've done it will corrupt data about 30% of
the time.

The only way you are going to get more space is to create another raid array of
new drives and expand the LVM to include that space or rebuild the system with
all the drives attached.
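
In practice the first option in that comment boils down to something like this sketch. It assumes SME's default "main" volume group and "root" logical volume (check with vgs and lvs), example device names and sizes, and an ext3 filesystem; back up first, as this is unsupported territory:

Code:
  # Build a second array from the new drives (names are examples)
  mdadm --create /dev/md3 --level=5 --raid-devices=3 /dev/sd[e-g]1
  # Turn it into an LVM physical volume and add it to the existing VG
  pvcreate /dev/md3
  vgextend main /dev/md3
  # Grow the root logical volume into the new space (size is an example;
  # use vgdisplay to see how many extents are actually free)
  lvextend -L +900G /dev/main/root
  # Grow the filesystem: on the older RHEL4 base ext2online is the
  # online-resize tool for ext3; later releases use resize2fs
  resize2fs /dev/main/root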

- chris
If it does not work out of the box, please fill in a Bug Report @ Bugzilla (http://bugs.contribs.org)  - check: http://wiki.contribs.org/Bugzilla_Help .  Thanks.

Offline piran

  • *****
  • 502
  • +0/-0
Indeed Chris.

I filed what I thought was a bug in the bugzilla. I asked for support here in the support forum. The two aren't necessarily the same. If you wish to join them up, that's your choice.

I also misunderstood the wiki notes by reading 'add' as meaning 'add new'. It seems clear to me now that I should have read 'add' as meaning 'add a replacement'. I've been trying to add to the array ~ literally ~ which sme does not do 'automatically', despite my unfortunate literal interpretation of the word 'add'. Semantics leading me up the wrong path. The choice now is to rebuild from scratch with all six or seven drives resident, or to learn how to roll my own LVM and add my own mdadm RAID. Neither fills me with any great enthusiasm.

Offline chris burnat

  • *****
  • 1,135
  • +2/-0
    • http://www.burnat.com
Quote from: piran
The choice now is to rebuild from scratch with all six or seven drives resident, or to learn how to roll my own LVM and add my own mdadm RAID. Neither fills me with any great enthusiasm.

I share your frustration, but then again, you would probably face the same issue (if not worse) with other server products... I also suspect that you would not receive a reply on "their" bugtracker from a Senior Developer within two days, and you might have to pay for support. Here, it is free. This is one of the strengths of SME. Enjoy.
« Last Edit: June 07, 2009, 12:56:02 AM by chris burnat »
- chris
If it does not work out of the box, please fill in a Bug Report @ Bugzilla (http://bugs.contribs.org)  - check: http://wiki.contribs.org/Bugzilla_Help .  Thanks.

Offline piran

  • *****
  • 502
  • +0/-0
I am **NOT** frustrated with SME, the Devs, or what you perceive as their response. All of the above I hold in great esteem. I am somewhat miffed at where my literal nature has got me... a severe lack of (storage) capacity and some tracts of data that are both unique and vulnerable (i.e. the single copy in existence), through this mistake of mine and failed electronics.

Offline piran

  • *****
  • 502
  • +0/-0
My intention is to recover from this scenario by bringing forward the build of a half-built machine on which to duplicate the vulnerable tracts of data. The resident SME box will then be re-assembled with six identical WD GP 1TB drives and re-installed using 'sme nospare' from the outset. The wiki does not make it entirely clear whether this will result in an unused 'hot spare' or in the extra redundancy of RAID6. Any idea? I've read things inappropriately before... (keep forgetting I've lost my h/w card, so the projected seven now has to be just six).

[postedit: not seven but six now (as no h/w card)]
« Last Edit: June 07, 2009, 01:13:09 AM by piran »

Offline kevinb

  • *
  • 237
  • +0/-0
I have not actually tried seven drives but I am fairly sure "sme nospare" will get you a 5 TB RAID 6.

I do have servers running 6 drives installed as "no spare" and they are 4 TB RAID 6.
« Last Edit: June 07, 2009, 01:22:09 AM by kevinb »

Offline piran

  • *****
  • 502
  • +0/-0
Cheers, 4 or 5TB should hold me for now;~)

Offline slords

  • *****
  • 235
  • +3/-0
What you are looking for is called "online capacity expansion", which is a feature that high-end RAID cards and storage cabinets support. Along with this feature, most of them also support "online RAID level migration". Linux is trying to support these but isn't quite there yet. It almost has the capacity expansion part but is a long way off on the RAID level migration.

I've been playing with both hardware and software OCE and ORLM for years now.  Just having a controller/box that supports expansion doesn't mean that you are done.  You still need to figure out how to repartition and expand the file systems.  I've done this many times with SME (and other distros) and most of the time things go very well.

There are some tips you can follow to make things easier on yourself if you have one of these types of unit. Install SME on a set of smaller drives (I usually use three 36GB 15k drives) as RAID-1 + spare. Then create the LVM for the rest of your data directly on the device that supports expansion. Then, when you expand, all you have to do is pvresize and the extra space is available to you.
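
A minimal sketch of that last step, assuming the data volume group sits directly on the expandable device and that the controller has just grown it. The device, VG and LV names and the size below are only examples:

Code:
  # Grow the LVM physical volume to the device's new size
  pvresize /dev/sdb
  # Hand the new extents to the data logical volume and its filesystem
  lvextend -L +1T /dev/datavg/data
  # ext2online on the older RHEL4 base, resize2fs on later releases
  resize2fs /dev/datavg/data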

Also be aware of how long it will take to sync/migrate if you ever have an issue. Some cards do better than others in this respect. I've got a really nice Areca card that would sync a 12 x 320GB RAID-6 array in about 13 hours. I've got a 16 x 1TB iSCSI unit that syncs its multiple RAID-5/6 stripes in about 22 hours. Using software RAID-6 over the number of drives/sizes you are talking about is going to take days/weeks.
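
As a rough back-of-envelope (the 30 MB/s figure is only an assumed sustained rate, not a measurement): a rebuild has to re-write an entire member drive, so 1 TB per member at 30 MB/s is about 1,000,000 MB / 30 MB/s = ~33,000 s, or a little over nine hours, before any contention or throttling. md also throttles itself when the array is busy; the limits can be checked and, with care, raised:

Code:
  # Watch the current resync progress and speed
  cat /proc/mdstat
  # md's throttle values (KB/s per device)
  cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
  # Example: guarantee the resync at least ~50 MB/s
  echo 50000 > /proc/sys/dev/raid/speed_limit_min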

The issue you run into is what happens if a second (third with raid-6) drive fails while you are syncing?  I've seen it all too often that the stress on the drives during a resync causes even more to fail.  This is especially true if all the drives were bought around the same time.

Good luck with your project; I wish you the best. If it were me in this situation, I'd seriously invest in a decent RAID card (e.g. Areca) that has a battery backup and supports the expansion capabilities and speed that you really need with an array that size.
"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs,
and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." -- Rich Cook

Offline piran

  • *****
  • 502
  • +0/-0
Agreed. Sadly, I *did* have both expansion and RAID level change capabilities on my (probably expired) h/w card. Yes, sync migration took a while, but being a true RAID card it was largely irrelevant to both my own and the site's needs. Whereas SME's software RAID sync/migration latency is another matter altogether... particularly for several TBs. Multiple CPU cores don't seem as helpful as the pundits would have everyone believe. Yes, I still have my three 36GB 15k Raptors (for the SME OS) as a hangover from those happy times when 'the data' was stored separately on the h/w array card. And you're right about the time assessment ~ over a week IIRC;~)

The bulk data stored by SME is the backup, not the source, so the RAID element isn't mandatory per se but a desirable flavour if available. I don't have the finances to readily support what you suggest, despite its temptation. A while back I was centralising backup resources, but I can't support the sheer numbers or logistics, so I will have to re-orchestrate the flow of data on a more distributed and sustainable basis. I try NOT to entrust anything to RAIDx; it's actual redundant copies that count in my book. And all transfers are a function of what can be pushed across the gigabit SMB/intranet bottleneck.

Right now I have to get off what I fear might be the start of 'the slippery slope' caused by a succession of failures. I've much too much data flying about when it should be archived offline ~ long story ~ I was expecting BD to arrive to cope with this sort of eventuality somewhat earlier.

Thank you again for your timely intervention, thoughts and good wishes.

Offline piran

  • *****
  • 502
  • +0/-0
Quote from: slords
Using software RAID-6 over the number of drives/sizes you are talking about is going to take days/weeks.
...now awaiting delivery of (another) Adaptec:
http://www.adaptec.com/en-US/products/Controllers/Hardware/sata/value/SAS-31605
That was out of stock... so I altered the plans somewhat and specified
http://www.adaptec.com/en-US/products/Controllers/Hardware/sas/performance/SAS-5805/
Finished that half-built box and installed the h/w RAID at source in the workstation rather than across the intranet in the SME box. Logistics. The card is quick... three WD GP 1TB drives in a minimal RAID5 sync'd up in approx 7 to 9 hrs (wasn't specifically watching the clock) at no load to the rest of the box. It comes with an internal diagnostic LED array [no pun intended], the display pattern of which, in idle mode, oscillates left and right like the AI car ("KITT") in the TV programme Knight Rider, but without the whooshing noise.
Now restoring just as fast as I can stack up the sessions.

[PostEdit: to change new item's URL]
« Last Edit: June 12, 2009, 02:28:46 AM by piran »