Koozali.org: home of the SME Server

Raid has crashed. Can I recover

fitzrik

« on: October 10, 2006, 08:36:25 PM »
Hello there.

I have an installation of SMEServer 7.0. It is on a PC with one standard IDE drive.

The kernel panics when trying to mount md1. md0, which seems to be the boot partition at 100 MB or so, seems fine.

If I boot using a live distro such as Mepis, I can mount and fsck md0 but not md1.

Running the SME Server installer does not offer to upgrade the existing installation, only to wipe the disks.

I tried running fsck -b 8139 /dev/md1 and got the same error as if I had run fsck /dev/md0. It complains about a bad superblock.

Any ideas? I did a search, but while people seem to have similar problems, no one seemed to have a solution. If I can't recover the RAID, is there a way I can recover the data? (No backup of course :( )

Thanks

R

RvLardin
« Reply #1 on: October 11, 2006, 03:26:50 PM »
No ideas; we have also failed to recover data lost this way.
:(

I am using this post to let others know that we have had a *lot* of problems with RAID (1 or 5) in SME 7 over the past 4 months. *All* our installations have died, sometimes without too much damage (still running on one disk), but other times with complete loss of data.
We have stopped using the SME RAID features for this reason, because we can't work out where the problem comes from. Until your post we thought it was related to SATA disks ...

Kernel panic often seen: "commit I/O error" ... without any warnings in the logs beforehand.
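
One way to notice an array that is already running on one disk, before it dies completely, is to scan /proc/mdstat for a member slot shown as `_`. This is only a minimal sketch, assuming the usual mdstat layout (an `mdN :` line followed by a status line such as `[2/1] [U_]`); the function name `degraded_arrays` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: reads /proc/mdstat-style text on stdin and prints
# the name of every array whose status field (e.g. [U_]) has a missing member.
degraded_arrays() {
    awk '
        # Remember the array name from lines like "md1 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # A status field made of U and _ with at least one _ means degraded
        /\[[U_]*_[U_]*\]/ { if (name != "") print name }
    '
}
```

Typical use would be `degraded_arrays < /proc/mdstat`, for example from a cron job that mails the result. A RAID 1 array built on a single drive, as in the original post, would presumably always be listed.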

A+,
RV.
----
"Those who are willing to lose some of their essential liberties in favour of security deserve neither and will lose both."
- Thomas Jefferson .

raem
« Reply #2 on: October 11, 2006, 03:34:57 PM »
RvLardin

Have you reported your problems to the bug tracker so they can be fixed, if found to be valid ?

How do you expect problems to be fixed if they are not reported ?
...

jfarschman
« Reply #3 on: October 11, 2006, 03:48:32 PM »
Ray,

Is there a particular method he should be using? Is running fsck all he needs to do? It seems like there should be something more to this before reporting it to the bug tracker.

fsck -b 8139 /dev/md1
Jay Farschman
ICQ - 60448985
jay@hitechsavvy.com

raem
« Reply #4 on: October 11, 2006, 04:05:33 PM »
jfarschman

I was answering RvLardin who said:

I am using this post to let others know that we have had a *lot* of problems with RAID (1 or 5) in SME 7 over the past 4 months. *All* our installations have died, sometimes without too much damage (still running on one disk), but other times with complete loss of data.
...

RvLardin
« Reply #5 on: October 11, 2006, 04:19:33 PM »
We did !
bug #1440
:)

If you search the bug tracker for "not syncing", you will see a lot of posts with the same description.
In one post on the bug tracker, Charlie suggested reporting it to the Red Hat kernel team, but that is difficult because there is no way to reproduce the bug reliably and there are no common factors between those installs.

Since then (May 2006) we have done other installations, even with 7.0 final, and we have hit the same problem many times, with 2 disks or more and with different hardware, most of the time after a couple of days (2 months in one case) of working well.
"Not syncing" are the exact words we found each time on the dead machine, and others have reported the same.

For us, the count of lost servers is about 7 (100% of our software RAID installs).
:(

RV.

raem
« Reply #6 on: October 11, 2006, 04:26:01 PM »
RvLardin

> We did !
> bug #1440

That's fine then, thanks.
I will take a look.
...

fitzrik

« Reply #7 on: October 11, 2006, 05:01:11 PM »
I don't think I'm going to get any joy.

My filesystem has 8192 blocks per group, so I have used

fsck -b 8193 /dev/md1
fsck -b 16385 /dev/md1
fsck -b 24511 /dev/md1

All of them tell me that my superblock is corrupt.

Is there a way to bypass the superblock, or create a fresh one, or anything?

tune2fs -j /dev/md1

also just tells me I have a bad magic number.

I'm not sure where to go from here :(
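
For what it's worth, the alternate superblock numbers given to fsck -b can be worked out rather than guessed. The following is a rough sketch, assuming an ext2/ext3 filesystem created with the sparse_super feature (backup superblocks only in block group 1 and in groups that are powers of 3, 5 or 7), and assuming md1 holds such a filesystem directly, which nothing in this thread confirms. The function names are invented for this example. With 1 KB blocks the first data block is 1; with 4 KB blocks it is 0 and a group normally has 32768 blocks.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if N equals BASE^k for some k >= 0.
is_power_of() {
    n=$1
    base=$2
    [ "$n" -ge 1 ] || return 1
    while [ $((n % base)) -eq 0 ]; do
        n=$((n / base))
    done
    [ "$n" -eq 1 ]
}

# Hypothetical helper: backup_superblocks BLOCKS_PER_GROUP FIRST_DATA_BLOCK GROUP_COUNT
# Prints the block numbers where sparse_super backup superblocks should sit.
backup_superblocks() {
    per_group=$1
    first=$2
    group_count=$3
    out=""
    g=1
    while [ "$g" -lt "$group_count" ]; do
        # Group 1 counts as 3^0, so it is always included
        if is_power_of "$g" 3 || is_power_of "$g" 5 || is_power_of "$g" 7; then
            out="$out $((g * per_group + first))"
        fi
        g=$((g + 1))
    done
    # Unquoted on purpose so the leading space is dropped
    echo $out
}
```

Calling `backup_superblocks 8192 1 10` prints 8193 24577 40961 57345 73729, so on such a filesystem 16385 and 24511 would not be expected to hold a backup at all. If every listed location also reports a bad magic number, the device may simply not contain an ext3 filesystem at that offset.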

william_syd
« Reply #8 on: October 12, 2006, 03:45:22 AM »
Quote from: "fitzrik"
I don't think I'm going to get any joy.

My filesystem has 8192 blocks per group, so I have used

fsck -b 8193 /dev/md1
fsck -b 16385 /dev/md1
fsck -b 24511 /dev/md1

All of them tell me that my superblock is corrupt.

Is there a way to bypass the superblock, or create a fresh one, or anything?

tune2fs -j /dev/md1

also just tells me I have a bad magic number.

I'm not sure where to go from here :(


I've messed up my SME twice now and could not recover from it. It happened under high CPU usage while running multiple VMware VMs.

I resorted to placing one of the drives in a USB caddy and attaching it to an XP machine with ext3 filesystem drivers. I recovered what I wanted (emails). Probably an ignorant way to do it, but it worked for me.
Regards,
William

IF I give advise.. It's only if it was me....