Koozali.org: home of the SME Server

DegradedArray event Assistance Please (Resolved)

sits
« on: March 08, 2007, 12:31:49 AM »
Hi all, this is a recent install of SME 7.1 with two physical SATA drives.

I am now getting email stating that:
A DegradedArray event had been detected on md device /dev/md2.

What do I need to do to add the sda drive back into the array? It seems to have gone missing.

This is the output from /proc/mdstat:

Quote
[root@intermet1 etc]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb1[1]
      104320 blocks [2/1] [_U]

md2 : active raid1 sdb2[1]
      312464128 blocks [2/1] [_U]

unused devices: <none>


and the output from mdadm:

Quote
[root@intermet1 etc]# mdadm --query --detail /dev/md[12]
/dev/md1:
        Version : 00.90.01
  Creation Time : Wed Aug  9 19:48:36 2006
     Raid Level : raid1
     Array Size : 104320 (101.88 MiB 106.82 MB)
    Device Size : 104320 (101.88 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Thu Mar  8 08:15:01 2007
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0


    Number   Major   Minor   RaidDevice State
       0       0        0       -1      removed
       1       8       17        1      active sync   /dev/sdb1
           UUID : 129efccc:97234f6b:ba35e278:ddd70d74
         Events : 0.1133
/dev/md2:
        Version : 00.90.01
  Creation Time : Wed Aug  9 19:46:20 2006
     Raid Level : raid1
     Array Size : 312464128 (297.99 GiB 319.96 GB)
    Device Size : 312464128 (297.99 GiB 319.96 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Thu Mar  8 09:12:02 2007
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0


    Number   Major   Minor   RaidDevice State
       0       0        0       -1      removed
       1       8       18        1      active sync   /dev/sdb2
           UUID : 872ad137:2f28053f:dded8c14:6750c7ec
         Events : 0.3818383


sfdisk shows the drives in the system:

Quote
[root@intermet1 etc]# sfdisk -s
/dev/sda: 312570167
/dev/sdb: 312571224
/dev/md2: 312464128
/dev/md1:    104320
/dev/sdc: 312571224


sdc is an external USB drive added recently for backups.
...
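In the mdstat listing above, the degraded state shows up as an underscore in the member map ([_U] instead of [UU]). A minimal sketch of picking that out automatically follows. The function name degraded_arrays is made up for illustration, and it assumes the two-line-per-array layout shown in this thread.

```shell
# List md arrays whose member map (e.g. [_U]) shows a missing member.
# Reads /proc/mdstat-style text on stdin and prints one array name per line.
# degraded_arrays is a hypothetical helper name, not part of mdadm or SME.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }        # header line: remember the array name
        /blocks/ && /\[[U_]+\]/ {          # status line, e.g. "104320 blocks [2/1] [_U]"
            if ($NF ~ /_/) print name      # an underscore marks a missing member
        }
    '
}
```

On a live system this would be run as: degraded_arrays < /proc/mdstat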

rdtaylor
« Reply #1 on: March 08, 2007, 05:07:44 AM »
The disk may well be cactus...

If you need to replace the drive, have a look at my post from the past couple of days. I went through some pain because the new drive was slightly smaller than the original drive that had failed (even though both were sold as "160GB").

sits
« Reply #2 on: March 08, 2007, 04:17:46 PM »
I followed this procedure from Gaston94:

Quote
Dump the partition layout from one disk and apply it to the other one.
Code:
[root@ixus ~]# sfdisk -d /dev/sdb > sfdisk_sdb.out
[root@ixus ~]# sfdisk /dev/sda < sfdisk_sdb.out

(This will DESTROY any data you might have on sda!)

Add the new partitions to your raid device.
Code:
[root@ixus ~]# mdadm -a /dev/md1 /dev/sda1
mdadm: hot added /dev/sda1
[root@ixus ~]# mdadm -a /dev/md2 /dev/sda2
mdadm: hot added /dev/sda2

Now you can see the raid array being rebuilt.
Code:

[root@ixus ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda1[2] sdb1[0]
      104320 blocks [2/1] [U_]
      [===================>.]  recovery = 98.0% (102528/104320) finish=0.0min speed=9320K/sec
md2 : active raid1 sda2[2] sdb2[0]
      1991936 blocks [2/1] [U_]
        resync=DELAYED
unused devices: <none>
[root@ixus ~]#


The RAID is now rebuilding.
...
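The quoted procedure can be sketched as a small dry-run helper that only prints the commands, so they can be reviewed before anything is written to disk. The name plan_rebuild is made up here. The sketch assumes the layout in this thread, where partition 1 of each disk belongs to md1 and partition 2 to md2.

```shell
# Print (do not run) the commands that copy the partition table from the
# surviving disk to the replacement disk and re-add its partitions.
# Arguments: surviving disk, replacement disk (for example: sdb sda).
# plan_rebuild is a hypothetical helper; partition N -> /dev/mdN is assumed.
plan_rebuild() {
    good=$1
    new=$2
    echo "sfdisk -d /dev/${good} > sfdisk_${good}.out"   # dump the good layout
    echo "sfdisk /dev/${new} < sfdisk_${good}.out"       # DESTROYS data on the new disk
    for n in 1 2; do
        echo "mdadm -a /dev/md${n} /dev/${new}${n}"      # hot-add each partition
    done
}
```

Calling plan_rebuild sdb sda prints the four commands used above, in order. Getting the two arguments the wrong way round would overwrite the good disk, which is the reason for printing first.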

Gaston94
« Reply #3 on: March 08, 2007, 09:37:06 PM »
Happy this helped you :D
Think about reinstalling GRUB once the disks are in sync.

Gaston
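For reference, reinstalling GRUB on the re-added disk might look like the sketch below. It assumes GRUB legacy (the version whose prompt appears later in this thread) and that /boot is the first partition of the disk. The function name grub_setup_commands is made up for illustration.

```shell
# Print GRUB legacy shell commands that install the boot loader on one disk.
# Argument: disk name without /dev/ (for example: sda).
# Assumes /boot is the first partition of that disk, as in this thread.
grub_setup_commands() {
    disk=$1
    printf 'device (hd0) /dev/%s\n' "$disk"   # map the disk to hd0
    printf 'root (hd0,0)\n'                   # first partition holds /boot
    printf 'setup (hd0)\n'                    # write GRUB to that disk's MBR
    printf 'quit\n'
}
```

Once md1 has finished syncing, the commands could be fed to the GRUB shell as root with: grub_setup_commands sda | grub --batch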

slords
« Reply #4 on: March 08, 2007, 10:38:30 PM »
And what is wrong with doing this through the console?  It does all of that plus a little more.
"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs,
and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." -- Rich Cook

sits
« Reply #5 on: March 09, 2007, 12:01:08 AM »
The RAID has finished rebuilding successfully.
I couldn't do this through the console, because all I saw was:

Quote
Personalities : [raid1]
md1 : active raid1 sdb1[1]
104320 blocks [2/1] [_U]

md2 : active raid1 sdb2[1]
312464128 blocks [2/1] [_U]

unused devices: <none>


When you press Next it goes back to the admin menu, and there were no options to add to or modify the RAID.

I have also now upgraded this to SME 7.1.2

Quote
grub
grub> root (hd0,0)
 Filesystem type is ext2fs, partition type 0xfd


On my system, GRUB reports this as:
grub
grub> root (hd0,0)
 Filesystem type is ext2fs, partition type 0x83

What do I need to do to get this to 0xfd now?
...

Gaston94
« Reply #6 on: March 10, 2007, 02:35:38 PM »
Hi,
Quote from: "sits"
Grub reports this as
grub
grub> root (hd0,0)
 Filesystem type is ext2fs, partition type 0x83

What do I need to do to get this now to 0xfd

That is quite strange. 0x83 is the standard Linux partition system ID; since SME is built on RAID partitions, this should be 0xfd.
The partition table dump you did earlier (sfdisk -d) should have provided the correct one ...
Cross-check which type the partitions on both of your disks have:
Code: [Select]
# fdisk -l /dev/sd?
Fixing this can be done through the fdisk utility (toggle the partition system ID).
As usual, partition operations have to be done carefully, and only once data backups have been performed.

Quote from: "slords"
And what is wrong with doing this through the console?  It does all of that plus a little more.

I am not familiar with the console usage, apologies. Could you tell us more about this "little more"?
G.

sits
« Reply #7 on: March 11, 2007, 03:44:03 AM »
This is how the two disks are reported by fdisk:

Quote
[root@intermet1 ~]# fdisk -l /dev/sda

Disk /dev/sda: 320.0 GB, 320071851520 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1          13      104391   83  Linux
/dev/sda2              14       38913   312464250   83  Linux
[root@intermet1 ~]# fdisk -l /dev/sdb

Disk /dev/sdb: 320.0 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          13      104391   fd  Linux raid autodetect
/dev/sdb2              14       38913   312464250   fd  Linux raid autodetect
[root@intermet1 ~]#
...
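The mismatch in the listing above (Id 83 on sda, Id fd on sdb) can be picked out with a short filter. This is only a sketch: non_raid_partitions is a made-up name, and it assumes the classic fdisk column layout shown in this thread (Device, Boot, Start, End, Blocks, Id, System). Newer fdisk versions print different columns.

```shell
# Print partitions whose Id column is not fd (Linux raid autodetect).
# Reads "fdisk -l" text on stdin in the column layout shown in this thread.
# non_raid_partitions is a hypothetical helper name.
non_raid_partitions() {
    awk '
        $1 ~ /^\/dev\// {
            id = ($2 == "*") ? $6 : $5     # the boot flag "*" shifts the columns by one
            if (id != "fd") print $1, id
        }
    '
}
```

It would be used as: fdisk -l /dev/sda /dev/sdb | non_raid_partitions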

Gaston94
« Reply #8 on: March 11, 2007, 12:04:04 PM »
Not a correct state :(

Back up your data.
Use fdisk on /dev/sda to change your partition system ID to fd.

G.
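Inside fdisk, the type change is the t command, followed by the partition number and the hex code fd, with w to write the table at the end. A hedged sketch of generating those keystrokes is below. The name fd_keystrokes is made up, and it assumes the disk has more than one partition, so that fdisk asks for a partition number after t.

```shell
# Print the fdisk keystrokes that set the listed partitions to type fd
# and then write the partition table.
# Arguments: partition numbers (for example: 1 2).
# fd_keystrokes is a hypothetical helper; a multi-partition disk is assumed.
fd_keystrokes() {
    for part in "$@"; do
        printf 't\n%s\nfd\n' "$part"   # t = change type, then number, then hex code
    done
    printf 'w\n'                       # write the table and exit
}
```

After a backup, the keystrokes could be piped to fdisk as root: fd_keystrokes 1 2 | fdisk /dev/sda. If the disk is in use, fdisk may warn that the kernel keeps using the old table until the next reboot.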

sits
« Reply #9 on: March 13, 2007, 02:31:57 PM »
Quote from: "Gaston94"
Not a correct state :(

Back up your data.
Use fdisk on /dev/sda to change your partition system ID to fd.

G.


How hard would it be to remove sda from the RAID again, change the ID, then add it back to the RAID? That would keep the data safe enough.
...

Gaston94
« Reply #10 on: March 13, 2007, 04:16:46 PM »
sits,
to answer your question, here is how to remove the disk from the array:
Code: [Select]
# mdadm /dev/md1 -f /dev/sda1 -r /dev/sda1
# mdadm /dev/md2 -f /dev/sda2 -r /dev/sda2


Correct your system ID (fdisk /dev/sda), and join it back:
Code: [Select]
# mdadm -a /dev/md1 /dev/sda1
# mdadm -a /dev/md2 /dev/sda2


Somehow I wouldn't have split the array (nor do I understand how you reached that state) ... but I might be wrong, and I have NEVER tried this.

Back up your system first.
Does anyone else have any advice?

G.
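In the commands above, -f, -r and -a are the short forms of mdadm's --fail, --remove and --add. The sketch below prints the same steps with the long options spelled out, without running anything. The name plan_members is made up, and it assumes partition N of the disk belongs to /dev/mdN, as in this thread.

```shell
# Print (do not run) long-option mdadm commands that take a disk's partitions
# out of the arrays, or put them back.
# Arguments: action (remove|add), disk name without /dev/ (for example: sda).
# plan_members is a hypothetical helper; partition N -> /dev/mdN is assumed.
plan_members() {
    action=$1
    disk=$2
    for n in 1 2; do
        case $action in
            remove) echo "mdadm /dev/md${n} --fail /dev/${disk}${n} --remove /dev/${disk}${n}" ;;
            add)    echo "mdadm /dev/md${n} --add /dev/${disk}${n}" ;;
            *)      echo "unknown action: ${action}" >&2; return 1 ;;
        esac
    done
}
```

plan_members remove sda prints the two removal commands, and plan_members add sda prints the two that join the partitions back once the partition IDs have been corrected.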

ldkeen
« Reply #11 on: March 15, 2007, 01:49:34 PM »
Quote from: "Gaston94"
Does anyone else have any advice?

I think Shad has already told him how to fix the problem earlier on in the thread. Run some diagnostics on the disk to make sure that it's not faulty, then boot with both disks in the machine, go into the console (su admin), and select option 5 - manage disk redundancy.
Lloyd

sits
« Reply #12 on: March 15, 2007, 05:38:48 PM »
ldkeen,

Try reading a bit, and you would see that option 5 only gives me a report; when you press Next it just goes back to the admin menu, not to any other screen where I could do anything with the array.
...

ldkeen
« Reply #13 on: March 15, 2007, 09:36:27 PM »
Quote from: "sits"
and you would see that option 5 only gives me a report

Yes, I saw that, but if you had read the forums you would also know that the recommended procedure here is to report that to the bug tracker. You first posted this on 7/3; it is now 16/3. My guess is that if you had taken this to the bug tracker straight away, it may have been sorted out a lot earlier.
Lloyd

ldkeen
« Reply #14 on: March 15, 2007, 09:53:15 PM »
Sits,
Have you tried running the "add_drive_to_raid" script from the command line:
Code: [Select]

# /sbin/e-smith/add_drive_to_raid sda

I think there may be an issue where the script does not complete on the first run, so if it doesn't appear to work, try running the script again and report what happens.
Lloyd