Koozali.org: home of the SME Server

Dynamic RAID scrubbed my data

brookes

« on: January 31, 2007, 11:06:24 PM »
Hey,

I did an interesting thing last night: I rebuilt my server and in the process trashed all my data. While this is very annoying, and I do mean very annoying, I do have a backup of the important stuff from a couple of weeks ago. How did I do this, you might ask? Well, I followed what would be considered an appropriate approach, but SME was not going to let me get away with that, so here is what happened:

My SME server was rebuilt a year or so ago from v6 with a pre-release version of 7 and has since been updated through YUM to v7.1. About 6 months ago I added another drive (250GB) alongside the mirrored 80GB drives, just to store data I don't mind losing. I had it as a mount point in my "data" i-bay and all was good. I decided to rebuild because I have been tinkering with the server over the last year and thought it was time to do some house cleaning. To be safe, I copied my important data to the 250GB drive and disabled it in the BIOS during the rebuild. This worked brilliantly, and after rebuilding and configuring the server I re-enabled the drive in the BIOS. I then tried to mount the drive and it kept coming up as "in use". I checked the state of the drives and discovered the server was scrubbing the 250GB drive in an attempt to prepare it for the RAID array. Bugger.

So my questions are these:

1. Is it possible to recover data from a partially scrubbed EXT3 drive?
2. Is it possible to exclude devices from the dynamic RAID?

My method here was sound, but there is a lesson for everyone here, and that is the old two-backups rule.

Stephan.

Gaston94

« Reply #1 on: January 31, 2007, 11:34:50 PM »
Hi,
I am not sure I understand exactly what happened.
It sounds to me like another thread: try a search with "dmraid" as the keyword. That should bring back some clues; dmraid should not have taken over against your expectations.
You might not have lost all your data ;)


Just my two cents

bpivk

« Reply #2 on: February 01, 2007, 12:10:12 AM »
We have been over this losing data issue. You can't recover it from SME if it was wiped. You'll have to recover it with the use of a special recovery tool (google is your friend).
"It should just work" if it doesn't report it. Thanks!

brookes

« Reply #3 on: February 01, 2007, 12:13:41 AM »
Hey,

It could be that you are right, and I am hoping that is the case. To explain a bit better: I have one 80GB drive on each SATA channel and the 250GB as a secondary on the first channel. When I examined the RAID after enabling the 250GB drive, it reported it was 20% through rebuilding what appeared to be the 250GB drive. I am going to pull the 250GB drive out of the server and examine it with the EXT driver for Windows to reduce risk. Typically I can only do that when I have some spare time, but I will keep you posted.

Stephan.

brookes

« Reply #4 on: February 01, 2007, 12:15:53 AM »
Quote from: "bpivk"
We have been over this losing data issue. You can't recover it from SME if it was wiped. You'll have to recover it with the use of a special recovery tool (google is your friend).


Yeah, I'm on that recovery thing, and I'm sorry if it's been covered before. More important is why SME is trying to RAID the drive, and how can I disable this function?

Stephan.

pfloor

« Reply #5 on: February 01, 2007, 12:42:11 AM »
Quote from: "brookes"
Hey,

It could be that you are right, and I am hoping that is the case. To explain a bit better: I have one 80GB drive on each SATA channel and the 250GB as a secondary on the first channel. When I examined the RAID after enabling the 250GB drive, it reported it was 20% through rebuilding what appeared to be the 250GB drive. I am going to pull the 250GB drive out of the server and examine it with the EXT driver for Windows to reduce risk. Typically I can only do that when I have some spare time, but I will keep you posted.

Stephan.

Are you sure it wasn't 20% through building the 2 x 80GB drive pair? This will take a little while, maybe 1 hour on a modest system.

Also, if the rebuild process is not complete at a shutdown/reboot, it will start all over again. Let the 2 x 80GB drives fully sync before you turn off or reboot the machine.
In life, you must either "Push, Pull or Get out of the way!"
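
One way to check whether a sync is still running before a reboot is to look for a `resync` or `recovery` progress line in `/proc/mdstat`. The sketch below assumes the usual md progress format (`recovery = NN.N% ...`); the function name `sync_progress` is made up for illustration, and it only reads text from stdin.

```shell
# Sketch: print "<array> <percent>" for every md array that is currently
# resyncing or recovering. Reads /proc/mdstat-style text on stdin.
sync_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the current array
        /(resync|recovery) *=/ {               # progress line for that array
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print name, $i  # the field ending in a percent sign
        }
    '
}
```

On a live system this would be run as `sync_progress < /proc/mdstat`; no output means no rebuild is in progress, so the machine should be safe to restart as far as the mirror is concerned.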

slords

« Reply #6 on: February 01, 2007, 03:24:39 AM »
Disabling stuff in the BIOS doesn't really hide it from operating systems. My guess is that when you installed 7 fresh, it detected all 3 drives, created a new RAID-5 for you, and was in the process of syncing it.

Best rule of thumb is if you have critical data on a drive and are going to re-install a system, take the critical drive completely out of the system so it isn't touched.

If you post the output of the following command, then we should be able to help determine exactly what the system was 20% of the way through doing.
Code:
cat /proc/mdstat
"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs,
and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." -- Rich Cook

brookes

« Reply #7 on: February 13, 2007, 09:21:03 PM »
Quote from: "slords"

If you post the output of the following command, then we should be able to help determine exactly what the system was 20% of the way through doing.
Code:
cat /proc/mdstat


And here it is:

Personalities : [raid1]
md1 : active raid1 sda1[0] sdb1[2]
      104320 blocks [3/2] [U_U]

md2 : active raid1 sda2[0] sdb2[2]
      78019584 blocks [3/2] [U_U]

unused devices: <none>
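
A rough way to read this output, assuming the usual md status format: `[3/2]` means the array is configured for three members and only two are active, and `[U_U]` shows the middle slot as missing. The sketch below flags arrays in that state; the function name `degraded_arrays` is made up for illustration, and it only parses text.

```shell
# Sketch: list md arrays that have fewer active members than configured slots.
# Reads /proc/mdstat-style text on stdin; it never touches real devices.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the current array
        match($0, /\[[0-9]+\/[0-9]+\]/) {      # status field such as [3/2]
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0)
                printf "%s: %d of %d members active\n", name, n[2], n[1]
        }
    '
}

# Demo on the output pasted above.
degraded_arrays <<'EOF'
Personalities : [raid1]
md1 : active raid1 sda1[0] sdb1[2]
      104320 blocks [3/2] [U_U]

md2 : active raid1 sda2[0] sdb2[2]
      78019584 blocks [3/2] [U_U]

unused devices: <none>
EOF
```

On a live system the same function could be fed with `degraded_arrays < /proc/mdstat`.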


I took the 250GB drive out and ran some recovery software over it. It contains a 100MB logical drive/partition and an unknown <rest of drive> partition, so I think we can safely say it is toast. I did go into the administrator and look at disk redundancy and it showed I had a clean RAID so I re-enabled the drive and then looked again and now it's not clean. This suggests that the dynamic RAID did trash the new drive and is still trying. Is there any way to exclude drives? Here is my RAIDMONITOR log from the rebuild:

2007-02-03 15:22:13.740226500 mdadm: only specify super-minor once, super-minor=2 ignored.
2007-02-03 15:22:13.740301500 mdadm: only specify super-minor once, super-minor=1 ignored.
2007-02-03 15:22:14.268702500 Event: SparesMissing, Device: /dev/md1, Member:
2007-02-03 15:22:14.417209500 Event: SparesMissing, Device: /dev/md2, Member:
2007-02-08 07:32:18.946810500 mdadm: only specify super-minor once, super-minor=2 ignored.
2007-02-08 07:32:18.946814500 mdadm: only specify super-minor once, super-minor=1 ignored.
2007-02-08 07:32:19.106968500 Event: SparesMissing, Device: /dev/md1, Member:
2007-02-08 07:32:19.245507500 Event: SparesMissing, Device: /dev/md2, Member:
2007-02-10 15:09:40.167599500 mdadm: only specify super-minor once, super-minor=2 ignored.
2007-02-10 15:09:40.196796500 mdadm: only specify super-minor once, super-minor=1 ignored.
2007-02-10 15:09:40.391634500 Event: SparesMissing, Device: /dev/md1, Member:
2007-02-10 15:09:40.594497500 Event: SparesMissing, Device: /dev/md2, Member:
2007-02-13 19:57:31.641636500 mdadm: only specify super-minor once, super-minor=2 ignored.
2007-02-13 19:57:31.672236500 mdadm: only specify super-minor once, super-minor=1 ignored.
2007-02-13 19:57:32.179812500 Event: SparesMissing, Device: /dev/md1, Member:
2007-02-13 19:57:32.346743500 Event: SparesMissing, Device: /dev/md2, Member:
2007-02-13 20:07:16.852962500 mdadm: only specify super-minor once, super-minor=2 ignored.
2007-02-13 20:07:16.874267500 mdadm: only specify super-minor once, super-minor=1 ignored.
2007-02-13 20:07:17.464969500 Event: DegradedArray, Device: /dev/md1, Member:
2007-02-13 20:07:17.940694500 Event: SparesMissing, Device: /dev/md1, Member:
2007-02-13 20:07:18.089771500 Event: DegradedArray, Device: /dev/md2, Member:
2007-02-13 20:07:18.281437500 Event: SparesMissing, Device: /dev/md2, Member:

So now I want to fix the current RAID then put the 250GB back in. Can someone tell me the command to see which of the drives is the "good" one and how to break out the dodgy one and re-join it?

Cheers

Stephan
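
For the break-out-and-rejoin question, the usual mdadm manage-mode sequence is `--fail`, `--remove`, then `--add`. The sketch below only prints those commands for review rather than running them; `print_rejoin_cmds` and its two arguments are placeholders made up for illustration. Adding a member back makes md overwrite it from the remaining member, so it matters that the member left in the array is the good one (the `U` slots in `/proc/mdstat`).

```shell
# Sketch: print (not run) the usual mdadm sequence for dropping one member
# from an array and adding it back, which triggers a resync onto it.
# Both arguments are placeholders supplied by the caller.
print_rejoin_cmds() {
    array=$1
    member=$2
    printf 'mdadm %s --fail %s\n'   "$array" "$member"
    printf 'mdadm %s --remove %s\n' "$array" "$member"
    printf 'mdadm %s --add %s\n'    "$array" "$member"
}

print_rejoin_cmds /dev/md2 /dev/sdb2
```

The printed lines would be run as root, one at a time, checking `cat /proc/mdstat` between steps.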

bpivk

« Reply #8 on: February 13, 2007, 11:06:35 PM »
Quote
I did go into the administrator and look at disk redundancy and it showed I had a clean RAID so I re-enabled the drive and then looked again and now it's not clean.


The unclean problem happened because you disabled one of the drives and they had to synch again.

brookes

« Reply #9 on: February 13, 2007, 11:28:34 PM »
Quote from: "bpivk"
Quote
I did go into the administrator and look at disk redundancy and it showed I had a clean RAID so I re-enabled the drive and then looked again and now it's not clean.


The unclean problem happened because you disabled one of the drives and they had to synch again.


Yeah, that's true in a way, but it's the 250GB drive that caused this, when I have two 80GB drives for the mirror. I built the server with only the two 80GB drives visible and added the 250GB drive afterwards. The problem is with the dynamic RAID. If I were to guess, it is because when I added the 250GB drive, Linux saw it as having a lower ID number (because it's on the first controller) than the 2nd 80GB drive.

This did not happen when YUM upgraded the server.

Stephan.

brookes

« Reply #10 on: March 05, 2007, 10:45:58 AM »
So can anyone help me fix the arrays? I have removed both partitions on the 2nd 80GB drive and started a resync but the 250GB drive hangs on. Here is the detail:

[root@peter ~]# mdadm --misc --detail /dev/md1
/dev/md1:
        Version : 00.90.01
  Creation Time : Wed Jan 31 20:33:02 2007
     Raid Level : raid1
     Array Size : 104320 (101.88 MiB 106.82 MB)
    Device Size : 104320 (101.88 MiB 106.82 MB)
   Raid Devices : 3
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Mon Mar  5 20:17:43 2007
          State : clean, degraded
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       0        0       -1      removed
           UUID : 6ffe1e62:c7ef50e8:7f32c622:1d21af25
         Events : 0.5323

[root@peter ~]# mdadm --misc --detail /dev/md2
/dev/md2:
        Version : 00.90.01
  Creation Time : Wed Jan 31 20:33:02 2007
     Raid Level : raid1
     Array Size : 78019584 (74.41 GiB 79.89 GB)
    Device Size : 78019584 (74.41 GiB 79.89 GB)
   Raid Devices : 3
  Total Devices : 2
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Mon Mar  5 20:32:31 2007
          State : clean, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

 Rebuild Status : 53% complete

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       0        0       -1      removed
       2       0        0       -1      removed
       3       8       18        1      spare   /dev/sdb2
           UUID : f5381167:014bc5e1:48f2a107:687669c9
         Events : 0.10649177

How do I remove the "removed" partitions? Apparently I can use mdadm.conf to specify devices I don't want RAIDed.

Cheers

Stephan.
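
The "removed" rows are empty slots: both arrays were created with three RAID devices. A possible way to drop the extra slot, assuming the installed mdadm and kernel support `--grow` for RAID1, is `mdadm --grow /dev/mdX --raid-devices=2` once any rebuild has finished. The sketch below reads `mdadm --detail` text and prints a suggested command without running it; `suggest_shrink` is a made-up name for illustration.

```shell
# Sketch: read `mdadm --detail /dev/mdX` text on stdin and, if the array has
# more slots than in-sync members, print (not run) a command to shrink it.
suggest_shrink() {
    awk '
        /^\/dev\/md[0-9]+:$/ { dev = $1; sub(/:$/, "", dev) }  # array name
        /Raid Devices :/     { slots = $NF }                   # configured slots
        /active sync/        { synced++ }                      # in-sync members
        /Rebuild Status :/   { rebuilding = 1 }                # resync running
        END {
            if (rebuilding)
                print "# " dev ": rebuild in progress, wait for it to finish"
            else if (synced >= 2 && synced < slots)
                print "mdadm --grow " dev " --raid-devices=" synced
        }
    '
}
```

It would be used as `mdadm --detail /dev/md1 | suggest_shrink`. For the md1 output above it suggests shrinking to two devices; for md2 it only reports that the rebuild is still running.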