Koozali.org: home of the SME Server

Kernel panic and raid down

Offline amazilia

« on: May 21, 2012, 12:54:43 PM »
Hi,

I have a server with a kernel panic error. I booted from the rescue CD.

This is what I get from running fdisk -l:

Code: [Select]
Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *          63      208844      104391   fd  Linux raid autodetect
/dev/sda2          208845  1953520064   976655610   fd  Linux raid autodetect

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1      208769      104384+  fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2          208770  1953520063   976655647   fd  Linux raid autodetect

I have ordered two new disks (same size). What do I do now?

This server worked fine for at least a year with these disks (maybe I should have been more careful about their health ...).


Thanks in advance

Philippe

Offline Stefano

« Reply #1 on: May 21, 2012, 01:01:54 PM »
1) Kernel panic? Are you sure it is caused by the hard disks?
2) Do you read the admin account's email? You should have received messages warning you if something was going wrong with the disks (SMART, RAID monitor, etc.).
3) From the rescue environment, are you able to mount the partitions? If so, please post the output of
Code: [Select]
cat /proc/mdstat

Also, please check your RAM with memtest (it should be one of the boot options on the SME CD).
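
The /proc/mdstat check in point 3 can also be scripted. The following is only a minimal sketch, assuming the usual mdstat layout in which each array has a status line ending in a member map such as [UU] or [U_]; the function name degraded_arrays is made up for illustration.

```shell
#!/bin/sh
# Print the name of every md array whose member map (e.g. [U_]) shows a
# missing device. Reads /proc/mdstat by default; a file path can be passed
# instead, which is handy for checking a saved copy of the output.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }   # remember which array we are in
        /\[[U_]+\]/ {                 # status line carrying the member map
            if (match($0, /\[[U_]+\]/) &&
                index(substr($0, RSTART, RLENGTH), "_"))
                print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Called with no argument from the rescue shell it reads the live /proc/mdstat; an empty result means no assembled array is reporting a missing member. Arrays that failed to assemble at all may not be listed, so an empty result alone does not prove the disks are healthy.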

Offline amazilia

« Reply #2 on: May 21, 2012, 01:21:25 PM »
Hi, thanks for your prompt answer.

1) I wish I knew. I get the kernel panic when I start the server, so I cannot boot. I got the fdisk output by using the rescue CD.
2) No, but I know what I will have to do once the server is back ...
3) I can mount sda1 and sdb1, which contain /boot, but I cannot mount sda2 or sdb2.

I am using mount -t ext3 /dev/sdb2 /mnt/sdb2part and I get an "invalid argument" error message. I guess it is because of the RAID.

Thanks again

Philippe


I tried to assemble the RAID arrays in order to mount the drives.

/etc/mdadm.conf
Code: [Select]
DEVICE /dev/sd[ab]1
DEVICE /dev/sd[ab]2

mdadm --examine --scan >> /etc/mdadm.conf

/etc/mdadm.conf
Code: [Select]
DEVICE /dev/sd[ab]1
DEVICE /dev/sd[ab]2
ARRAY /dev/md2 level=raid1 num-devices=2 spares=1 UUID=e99fbd4f:a57c9286:380fe2f4:dfc6a506
ARRAY /dev/md1 level=raid1 num-devices=2 spares=1 UUID=af4afe71:9b58db89:0be9d612:9efc4dda

md1 is /boot

mdadm --assemble --scan /dev/md1
Code: [Select]
mdadm: /dev/md1 has been started with 1 drive (out of 2) and 1 spare.

mdadm --detail /dev/md1
Code: [Select]
/dev/md1:
        Version : 00.90.01
  Creation Time : Thu Jul  3 11:41:18 2008
     Raid Level : raid1
     Array Size : 104320 (101.89 MiB 106.82 MB)
    Device Size : 104320 (101.89 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Mon May 21 11:49:05 2012
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : af4afe71:9b58db89:0be9d612:9efc4dda
         Events : 0.106134

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       0        0        -      removed

mdadm --assemble --scan /dev/md2

Code: [Select]
mdadm: /dev/md2 assembled from 0 drives and 1 spare. not enough to start the array.
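
One way to dig into a refusal like this is to compare the event counters stored in each member's superblock: a member with a lower count stopped being updated earlier and holds older data. This is only a sketch, assuming mdadm --examine prints an "Events :" line in the same style as the --detail output above; events_of is a hypothetical helper name.

```shell
#!/bin/sh
# Print the event counter found in `mdadm --examine` output read on stdin.
# Older mdadm versions print it like "0.106134", newer ones as a plain
# number; the value is passed through unchanged.
events_of() {
    awk -F' : ' '/^ *Events :/ { print $2; exit }'
}

# Hypothetical usage from the rescue shell (needs root and the real disks):
#   mdadm --examine /dev/sda2 | events_of
#   mdadm --examine /dev/sdb2 | events_of
```

If the two numbers differ, the partition with the higher one was the last to be written as part of the array.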

Philippe
« Last Edit: May 21, 2012, 02:02:41 PM by amazilia »

Offline amazilia

« Reply #3 on: May 23, 2012, 08:28:59 AM »
Hi again,

Regarding the request to check the RAM with memtest: memtest is OK.

Thanks again

Philippe

Offline amazilia

« Reply #4 on: May 25, 2012, 10:44:31 AM »
Hi,

Sorry to bump this, but could someone help me with this?

Thanks in advance,

Philippe

Offline Stefano

« Reply #5 on: May 25, 2012, 10:52:13 AM »
I guess there's nothing to be done.

Your md2 array is broken. It means that it had been running in degraded mode and the only good disk has now died.

Try removing sda and booting from the CD in rescue mode. If you are VERY lucky, you might still see old data on the sdb2 partition.

I think you should have learnt something:
- always read the admin/root email: that way you know right away if something is going wrong (e.g. one disk is kicked out of the RAID)
- always make a backup (I guess you don't have one)

If you have a backup, reinstall and restore.

Sad but true.

P.S. Repeat with me the mantra "backup is good, backup is my best friend, backup will save my day" :-)

Offline amazilia

« Reply #6 on: May 25, 2012, 03:29:33 PM »
Hi,

Thanks for the answer.

I knew I was a little bit careless on this one.

I disconnected one of the two drives and was able to access /mnt/sysimage.

For the time being I am able to recover my data.
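
For anyone in the same position, copying the data off while the remaining disk still responds could look roughly like this. It is a sketch only: rescue_copy is a made-up helper, and it assumes a cp that supports -a (as GNU cp does) plus a second disk already mounted somewhere to receive the copy.

```shell
#!/bin/sh
# Copy a directory tree to another location, keeping ownership, modes and
# timestamps (cp -a). The "/." suffix makes cp include hidden files too.
rescue_copy() {
    [ -d "$1" ] || { echo "source $1 is not a directory" >&2; return 1; }
    mkdir -p "$2" || return 1
    cp -a "$1"/. "$2"/
}

# Hypothetical usage; both paths are assumptions, adjust them to your setup:
#   rescue_copy /mnt/sysimage/home /mnt/usb/home
```

Copying the most important directories first is the safer order, since a failing disk may stop responding partway through.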

Thanks for your help

Philippe