Koozali.org: home of the SME Server
Obsolete Releases => SME Server 7.x => Topic started by: sits on March 08, 2007, 12:31:49 AM
-
Hi all, this is a recent install of SME 7.1 with two physical SATA drives.
I am now getting email stating that
A DegradedArray event had been detected on md device /dev/md2.
What do I need to do to add the sda drive back into the array, since it seems to have gone missing?
This is the output from /proc/mdstat:
[root@intermet1 etc]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb1[1]
104320 blocks [2/1] [_U]
md2 : active raid1 sdb2[1]
312464128 blocks [2/1] [_U]
unused devices: <none>
and the output from mdadm:
[root@intermet1 etc]# mdadm --query --detail /dev/md[12]
/dev/md1:
Version : 00.90.01
Creation Time : Wed Aug 9 19:48:36 2006
Raid Level : raid1
Array Size : 104320 (101.88 MiB 106.82 MB)
Device Size : 104320 (101.88 MiB 106.82 MB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Thu Mar 8 08:15:01 2007
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Number Major Minor RaidDevice State
0 0 0 -1 removed
1 8 17 1 active sync /dev/sdb1
UUID : 129efccc:97234f6b:ba35e278:ddd70d74
Events : 0.1133
/dev/md2:
Version : 00.90.01
Creation Time : Wed Aug 9 19:46:20 2006
Raid Level : raid1
Array Size : 312464128 (297.99 GiB 319.96 GB)
Device Size : 312464128 (297.99 GiB 319.96 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 2
Persistence : Superblock is persistent
Update Time : Thu Mar 8 09:12:02 2007
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Number Major Minor RaidDevice State
0 0 0 -1 removed
1 8 18 1 active sync /dev/sdb2
UUID : 872ad137:2f28053f:dded8c14:6750c7ec
Events : 0.3818383
sfdisk shows the drives in the system:
[root@intermet1 etc]# sfdisk -s
/dev/sda: 312570167
/dev/sdb: 312571224
/dev/md2: 312464128
/dev/md1: 104320
/dev/sdc: 312571224
sdc is an external USB drive added recently for backups.
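The degraded state shown in the mdstat output above can also be picked out by a small script. This is only a sketch: the function name degraded_arrays is made up for illustration, and it assumes the classic /proc/mdstat layout shown in this post, where an underscore in the member map (for example [_U]) marks a missing device.

```shell
# Print the name of every md array whose member map shows a missing device.
# Reads the file given as $1, defaulting to /proc/mdstat.
# (degraded_arrays is a hypothetical helper name, not part of SME Server.)
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                            # remember the array name
        /blocks/ && $NF ~ /^\[[U_]*_[U_]*\]$/ { print name }   # "_" marks a missing member
    ' "${1:-/proc/mdstat}"
}
```

Run with no argument on the server itself, it should list md1 and md2 for the state posted above.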
-
The disk may well be cactus...
If you need to replace the drive, have a look at my post from the past couple of days. I went through some pain because the new drive was slightly smaller than the original drive which had failed (even though they were both "160Gb").
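A rough way to check in advance whether a replacement disk can hold the existing partition layout is to compare the end of the last partition with the size of the new disk. The sketch below assumes you read three numbers from fdisk -l by hand: the end cylinder of the last partition, the bytes per cylinder from the "Units" line, and the total bytes of the candidate disk. The function name layout_fits is made up for illustration.

```shell
# Succeed (exit status 0) if a layout ending at cylinder $1, with $2 bytes
# per cylinder, fits on a disk of $3 bytes.
# (layout_fits is a hypothetical helper; the inputs come from "fdisk -l".)
layout_fits() {
    end_cyl=$1
    cyl_bytes=$2
    disk_bytes=$3
    [ $((end_cyl * cyl_bytes)) -le "$disk_bytes" ]
}
```

With the fdisk figures that appear later in this thread, layout_fits 38913 8225280 320071851520 succeeds, so sda can hold the same layout as sdb even though sfdisk -s reports it as slightly smaller.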
-
I followed this procedure from Gaston94.
Dump the partition layout from one disk and apply it to the other one:
[root@ixus ~]# sfdisk -d /dev/sdb > sfdisk_sdb.out
[root@ixus ~]# sfdisk /dev/sda < sfdisk_sdb.out
(This will DESTROY any data you might have on sda!)
Add the new partitions to your RAID devices:
[root@ixus ~]# mdadm -a /dev/md1 /dev/sda1
mdadm: hot added /dev/sda1
[root@ixus ~]# mdadm -a /dev/md2 /dev/sda2
mdadm: hot added /dev/sda2
Now you can see the RAID array being rebuilt:
[root@ixus ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda1[2] sdb1[0]
104320 blocks [2/1] [U_]
[===================>.] recovery = 98.0% (102528/104320) finish=0.0min speed=9320K/sec
md2 : active raid1 sda2[2] sdb2[0]
1991936 blocks [2/1] [U_]
resync=DELAYED
unused devices: <none>
[root@ixus ~]#
The RAID is now rebuilding.
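To follow the rebuild without reading the whole mdstat output each time, something like the following could be used. It is a sketch that assumes the mdstat format shown above (a "recovery = NN%" or "resync = NN%" progress line, or "resync=DELAYED" for an array waiting its turn); the function name rebuild_progress is made up for illustration.

```shell
# Print "<array> <percent>" for arrays that are rebuilding and
# "<array> delayed" for arrays whose resync is queued.
# Reads the file given as $1, defaulting to /proc/mdstat.
# (rebuild_progress is a hypothetical helper name.)
rebuild_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                           # remember the array name
        $2 == "recovery" || $2 == "resync" { print name, $4 } # progress line: field 4 is the percentage
        /=DELAYED/ { print name, "delayed" }                  # queued behind another rebuild
    ' "${1:-/proc/mdstat}"
}
```

An array that is fully in sync produces no output at all, so an empty result means there is nothing left to wait for.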
-
Happy this helped you :D
Remember to reinstall the GRUB boot loader once the disks are in sync.
Gaston
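For reference, reinstalling GRUB on the re-added disk usually comes down to a few commands in the GRUB (legacy) shell. The sketch below only prints those commands so they can be reviewed first. It assumes GRUB legacy, that /boot is the first partition of the disk (matching the root (hd0,0) used elsewhere in this thread), and the function name grub_setup_commands is made up for illustration.

```shell
# Print the GRUB legacy shell commands that install the boot loader on the
# MBR of the disk given as $1 (for example /dev/sda).
# Assumption: /boot lives on the first partition of that disk.
# (grub_setup_commands is a hypothetical helper name.)
grub_setup_commands() {
    disk=$1
    printf 'device (hd0) %s\nroot (hd0,0)\nsetup (hd0)\nquit\n' "$disk"
}

# Review the output first, then feed it to GRUB once the arrays are in sync:
#   grub_setup_commands /dev/sda | grub --batch
```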
-
And what is wrong with doing this through the console? It does all of that plus a little more.
-
The RAID has finished rebuilding successfully.
I couldn't do this through the console, because all I saw was:
Personalities : [raid1]
md1 : active raid1 sdb1[1]
104320 blocks [2/1] [_U]
md2 : active raid1 sdb2[1]
312464128 blocks [2/1] [_U]
unused devices: <none>
When you press Next it goes back to the Admin menu, and there were no options to add or modify the RAID.
I have also now upgraded this to SME 7.1.2.
I expected grub to report the boot partition like this:
grub
grub> root (hd0,0)
Filesystem type is ext2fs, partition type 0xfd
but grub actually reports this:
grub
grub> root (hd0,0)
Filesystem type is ext2fs, partition type 0x83
What do I need to do to get this to 0xfd?
-
Hi,
That is quite strange: 0x83 is the standard Linux partition system ID, whilst SME is built on RAID partitions, so this should be 0xfd.
The partition table dump you did earlier (sfdisk -d) should have provided the correct one ...
Cross-check the partition types on both of your disks: # fdisk -l /dev/sd?
Fixing this can be done with the fdisk utility (toggle the partition's system ID).
As usual, partition operations have to be done carefully, and only after a data backup has been made.
"And what is wrong with doing this through the console? It does all of that plus a little more."
I am not familiar with the console, my apologies. Could you tell us more about this "little more"?
G.
-
This is how the 2 disks are reported by fdisk
[root@intermet1 ~]# fdisk -l /dev/sda
Disk /dev/sda: 320.0 GB, 320071851520 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 1 13 104391 83 Linux
/dev/sda2 14 38913 312464250 83 Linux
[root@intermet1 ~]# fdisk -l /dev/sdb
Disk /dev/sdb: 320.0 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 * 1 13 104391 fd Linux raid autodetect
/dev/sdb2 14 38913 312464250 fd Linux raid autodetect
[root@intermet1 ~]#
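The mismatch in the listing above (83 on sda, fd on sdb) can be spotted automatically by filtering the fdisk output. This is a sketch that assumes the older fdisk -l column layout shown in this post (Device, optional Boot flag, Start, End, Blocks, Id, System); newer fdisk versions print different columns. The function name non_raid_partitions is made up for illustration.

```shell
# Read "fdisk -l" output (old column layout) from stdin or from files and
# print every partition whose Id is not fd (Linux raid autodetect).
# (non_raid_partitions is a hypothetical helper name.)
non_raid_partitions() {
    awk '
        $1 ~ /^\/dev\// {
            id = ($2 == "*") ? $6 : $5    # the boot flag shifts the columns by one
            if (id != "fd") print $1, id
        }
    ' "$@"
}

# Example: fdisk -l /dev/sda /dev/sdb | non_raid_partitions
```

For the two disks above this would print /dev/sda1 and /dev/sda2 with Id 83, and nothing for sdb.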
-
Not a correct state :(
Back up your data.
Use fdisk on /dev/sda to change the partition system ID to fd.
G.
-
How hard would it be to remove sda from the RAID again, change the ID, then add it back to the RAID? That would keep the data safe enough.
-
sits,
To answer your question, remove the disk from the array:
# mdadm /dev/md1 -f /dev/sda1 -r /dev/sda1
# mdadm /dev/md2 -f /dev/sda2 -r /dev/sda2
correct the system ID (fdisk /dev/sda), and join it back:
# mdadm -a /dev/md1 /dev/sda1
# mdadm -a /dev/md2 /dev/sda2
Somehow I wouldn't have split the array (nor do I understand how you reached that state) ... but I might be wrong, and I have NEVER tried this.
Back up your system first.
Does anyone else have any advice?
G.
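The sequence above can be written out as a reviewable plan before anything is run. The sketch below only prints the commands; it executes nothing. It assumes, as on this system, that partition N of the disk belongs to /dev/mdN for N = 1 and 2, and the function name readd_plan is made up for illustration.

```shell
# Print (do not execute) the commands to fail and remove a disk's partitions
# from md1 and md2, fix the partition IDs, and add the partitions back.
# $1 is the bare disk name, for example sda.
# Assumption: partition N of the disk is the member of /dev/mdN.
# (readd_plan is a hypothetical helper name.)
readd_plan() {
    disk=$1
    for n in 1 2; do
        echo "mdadm /dev/md$n -f /dev/$disk$n -r /dev/$disk$n"
    done
    echo "fdisk /dev/$disk    # interactive: t, 1, fd, t, 2, fd, w"
    for n in 1 2; do
        echo "mdadm -a /dev/md$n /dev/$disk$n"
    done
}
```

Running readd_plan sda prints the five steps in order, so they can be checked against the backup status and copied one at a time.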
-
"Does anyone else have any advice?"
I think Shad has already told him how to fix the problem earlier on in the thread. Run some diagnostics on the disk to make sure that it's not faulty, then boot with both disks in the machine, go into the console (su admin) and select option 5 - manage disk redundancy.
Lloyd
-
IDKeen
Try reading a bit, and you would see that option 5 only gives me a report, and then when you press next it just goes back to the Admin menu, not to any other screen, to do anything with the array.
-
"and you would see that option 5 only gives me a report"
Yes, I saw that, but if you had read the forums you would also know that the recommended procedure here is to report that to the bug tracker. You first posted this on 7/3 - it is now 16/3. My guess is that if you had taken this to the bug tracker straight away, it may have been sorted out a lot earlier.
Lloyd
-
Sits,
Have you tried running the "add_drive_to_raid" script from the command line?
# /sbin/e-smith/add_drive_to_raid sda
I think there may be an issue where the script does not complete on the first run, so if it doesn't appear to work, try running the script again and report what happens.
Lloyd
-
No, I haven't tried that, but sda is currently part of the RAID; it just has the wrong ID.
If I run this script, will it change the drive ID to the correct one while this drive is already a RAID member?
-
"If I run this script will it change the drive ID to the correct one"
It should do. I'm fairly sure that this script runs when you select option 5 from the console and is designed to set up the partitioning the correct way.
"while this drive is already a raid member"
I'm not sure, you'll have to try it. If it fails, you may have to manually fail the drive from the array or wipe the drive. I would wipe the drive first, then run the script twice and see if it works.
Lloyd
-
OK, I will give it a go after the full backup is complete.
-
OK, it's all going OK now.
Sorry it took so long to get back to you, but I had to go interstate for a few days.
I had to remove sda from the array, and then run the
/sbin/e-smith/add_drive_to_raid sda
command twice before it was accepted correctly.
Thanks for your help, guys.