Koozali.org: home of the SME Server

degraded array

Offline pablitobs

degraded array
« on: August 20, 2011, 12:27:29 AM »
I get a degraded array event mail from my SME server every day...

I ran cat /proc/mdstat and got:
Personalities : [raid1]
md1 : active raid1 sda1[0]
      104320 blocks [2/1] [U_]

md2 : active raid1 sda2[0]
      975595200 blocks [2/1] [U_]

unused devices: <none>

Meaning md2 is having trouble...

A quick search found this: http://wiki.contribs.org/Raid#Resynchronising_a_Failed_RAID
I followed the instructions there on removing the device, since [U_] means md2 has the problem:

mdadm --remove /dev/md2 /dev/sda2

but I am getting this error:

mdadm: hot remove failed for /dev/sda2: Device or resource busy

I tried mdadm --manage /dev/md2 --remove /dev/sda2

Same answer... mdadm: hot remove failed for /dev/sda2: Device or resource busy

I tried mdadm --stop /dev/md2

Same answer... Device or resource busy

Running mount gives:

/dev/mapper/main-root on / type ext3 (rw,usrquota,grpquota)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/md1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sdb1 on /home/e-smith/files/ibays/hal/files type ext4 (rw)
/dev/sdc1 on /media/usbdisk type ext3 (rw)

As you can see there is no reference to md2 or sda2, so it looks like it is not mounted...
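
Although now I wonder: /dev/mapper/main-root above suggests the root filesystem is on LVM, which (if I understand the standard SME layout) sits on top of md2, so the array could be busy without being mounted directly. I guess something like this would show it:

Code:
pvdisplay   # should list /dev/md2 as a physical volume if LVM sits on it
lvdisplay   # shows the logical volumes (e.g. main/root) carved out of it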


Running mdadm --query --detail /dev/md1 I get:

/dev/md1:
        Version : 0.90
  Creation Time : Mon May  2 08:52:48 2011
     Raid Level : raid1
     Array Size : 104320 (101.89 MiB 106.82 MB)
  Used Dev Size : 104320 (101.89 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Fri Aug 19 14:56:57 2011
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 7361e483:63142ad2:773f3338:619f9c18
         Events : 0.796

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       0        0        1      removed

Running the same for md2 I get:

/dev/md2:
        Version : 0.90
  Creation Time : Mon May  2 08:52:49 2011
     Raid Level : raid1
     Array Size : 975595200 (930.40 GiB 999.01 GB)
  Used Dev Size : 975595200 (930.40 GiB 999.01 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Fri Aug 19 16:25:38 2011
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 080785ae:0db28d57:6a2c4489:a18058a4
         Events : 0.5228248

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       0        0        1      removed

So it looks like the second device of md2 is removed... then I tried to add it back with mdadm --add /dev/md2 /dev/sda2

I got: mdadm: Cannot open /dev/sda2: Device or resource busy

Any help will be appreciated.

Thanks

Offline janet

Re: degraded array
« Reply #1 on: August 20, 2011, 03:23:37 AM »
pablitobs

The first raid error means sda is OK, but the second disk in RAID1 is missing (failed). In a typical situation this would be sdb.

I suggest you run a disk manufacturer's diagnostic test to check sdb (or even both the sda & sdb drives),
e.g. UBCD (Ultimate Boot CD).

The other errors about the drive being busy are probably caused by you trying to add sda back to the RAID array while it is still a member of the array, and apparently functioning correctly too.

So you have misinterpreted the errors and are doing the wrong thing to repair the problem.
It may be that sdb has totally failed and you need to physically replace that drive, and then add & resync the drive/array using the menu in the SME console (log in as admin).
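
If you do replace sdb and would rather work from the command line than the console menu, a rough sketch (assuming the replacement disk really is /dev/sdb and sda holds the good copy) would be:

Code:
sfdisk -d /dev/sda | sfdisk /dev/sdb   # copy sda's partition layout onto the new disk
mdadm --add /dev/md1 /dev/sdb1         # re-add the boot mirror
mdadm --add /dev/md2 /dev/sdb2         # re-add the main mirror and let it resync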

Please read the manual & the various RAID and drive Howtos, there should be enough info there to get you going again.
Please search before asking, an answer may already exist.
The Search & other links to useful information are at top of Forum.

Offline larieu

Re: degraded array
« Reply #2 on: August 20, 2011, 08:43:08 AM »
Please tell us what kind of RAID you use.
I also have a hardware RAID, which tells SME that only one HDD exists.
My hardware RAID has several HDDs in one RAID5,
but SME sees only one HDD (sda) and makes 2 partitions on it (sda1 and sda2).
Mine looks like this:

Code:
[root@mail ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda1[0]
      104320 blocks [2/1] [U_]
     
md2 : active raid1 sda2[0]
      1318221504 blocks [2/1] [U_]

even though behind it there are more HDDs (I can see them in the RAID management interface).

What I see is that you have sda1 and sda2 partitions, almost the same behaviour.
If you have a hardware RAID which presents an equivalent single HDD to SME, this is the right thing to see.

What do you see from fdisk -l?

For example, in the same environment I see only one HDD:


Code:
fdisk -l

Disk /dev/sda: 1349.9 GB, 1349967151104 bytes
255 heads, 63 sectors/track, 164124 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   fd  Linux raid autodetect
/dev/sda2              14      164124  1318221607+  fd  Linux raid autodetect

Disk /dev/md2: 1349.8 GB, 1349858820096 bytes
2 heads, 4 sectors/track, 329555376 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md2 doesn't contain a valid partition table

Disk /dev/md1: 106 MB, 106823680 bytes
2 heads, 4 sectors/track, 26080 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md1 doesn't contain a valid partition table

And as you can see, apparently I have a problem with md2 as well, but this is because SME tries to set up a software RAID (or at least is prepared to do so).
Please check whether you have a hardware RAID or not.
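
For example, something like this could hint whether a hardware RAID controller is present (just a sketch):

Code:
lspci | grep -i raid   # look for a RAID controller on the PCI bus
cat /proc/scsi/scsi    # the reported disk model often names the controller
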
if everybody's life around you is better, probably yours will be better
just try to improve their life

Offline brianr

Re: degraded array
« Reply #3 on: August 20, 2011, 10:55:31 AM »
Assuming you are using the built-in Linux software RAID, and your /proc/mdstat is as you indicated:

md1 : active raid1 sda1[0]
      104320 blocks [2/1] [U_]

md2 : active raid1 sda2[0]
      975595200 blocks [2/1] [U_]

then I would start by seeing if you can simply re-add the offending disc with:

mdadm --add /dev/md1 /dev/sdb1

and

mdadm --add /dev/md2 /dev/sdb2

If this does not work, then you need to test /dev/sdb (if indeed that is the second drive), and either replace it or at least re-format it.  You could also try the "p" command from:

fdisk /dev/sdb

this will show you details of the partition table.
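
If the re-adds do work, you can watch the arrays rebuild with something like:

Code:
watch -n 10 cat /proc/mdstat   # resync progress shows up as a percentage
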
Brian j Read
(retired, for a second time, still got 2 installations though)
The instrument I am playing is my favourite Melodeon.
.........

Offline CharlieBrady

Re: degraded array
« Reply #4 on: August 20, 2011, 04:58:57 PM »
Quote from: janet
The other errors about the drive being busy are probably caused by you trying to add sda back to the RAID array, ...

No, from trying to remove one device which is the only device in an array, and from trying to stop an array while it was in use.
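
To spell that out (a sketch, assuming the usual SME layout with LVM on top of md2):

Code:
mdadm --remove /dev/md2 /dev/sda2   # refused: sda2 is the only active member and is not marked faulty
mdadm --stop /dev/md2               # refused: md2 is in use, holding the LVM root volume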

Offline purvis

Re: degraded array
« Reply #5 on: August 24, 2011, 09:29:52 AM »
I also find smartmontools very helpful:
http://wiki.contribs.org/Monitor_Disk_Health
It is likely you (or rather the server) may find problems with the hard drive, and you may get some emails to the admin account.
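
For example, assuming the suspect disk is /dev/sdb, something like:

Code:
smartctl -H /dev/sdb         # overall SMART health verdict
smartctl -a /dev/sdb         # full attributes and error log
smartctl -t short /dev/sdb   # start a short self-test (results via smartctl -l selftest)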