
RAID 1 Problem

jimgoode
RAID 1 Problem
« on: April 07, 2009, 08:39:38 PM »
I have a 6.0.1-01 raid 1 problem and am trying to find a way to correct it without doing a fresh install. As soon as I resolve this issue I plan to upgrade to 7.0 and then to 7.4 (which I have tested on a separate computer).

When we upgraded from 5.6 to 6.0.1-01 we decided to use raid 1 (which I think was a new option at that time). Our primary disk was a Seagate PATA 300 GB drive and we installed an identical drive alongside it before doing the upgrade. The drives were connected as /dev/hda and /dev/hdb (which turned out to be a bad idea). After the upgrade and several days of successful running, we noticed that the primary drive reported its capacity as only 32 GB, not 300. After some research, we discovered that the initial raid setup should have had the two drives on separate controllers (/dev/hda and /dev/hdc). We moved the second drive from hdb to hdc, but the problem still exists.

I have attached excerpts from the messages log and the output of several other commands, which seem to indicate: 1) /dev/hda reports only 32 GB capacity; 2) /dev/hdc reports 300 GB capacity; 3) the system does not really recognize a working raid 1 configuration (it sees both drives but doesn't use the second drive as expected).

So: am I interpreting the messages log correctly? Did the original 6.0.1-01 upgrade cause the problem, or am I dealing with multiple problems? Can it be corrected without a complete reinstall?

Here are my thoughts. I would like to be able to...
1) disable raid 1 and boot on a single disk1 (/dev/hda)
2) repartition and reformat the disk2 300 GB drive
3) copy all contents from /dev/hda (disk1) to /dev/hdc (disk2)
4) boot on the single 300 GB (disk2)
5) repartition and reformat the original disk1
6) enable raid 1 and have the system recognize both drives as 300 GB (a rough command sketch for this step follows the list)
7) then proceed with the upgrades
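
For step 6, something like the following is what I have in mind. This is only a sketch, assuming the raidtools that shipped with 6.x are the right tool (please correct me if mdadm is preferred); the device names follow my plan above, where the original disk1 (hda) is the one being re-added:

# hot-add each partition of the freshly repartitioned disk back into
# its mirror and let the kernel resync
raidhotadd /dev/md0 /dev/hda1
raidhotadd /dev/md1 /dev/hda2
raidhotadd /dev/md2 /dev/hda3

# watch the resync until all three arrays show [2/2] [UU]
watch cat /proc/mdstat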

I've tried to find and read everything on the wiki about raid issues, but I'm not sure how much of what is there applies only to 7.4 and how much also applies to earlier versions. So I really need to hear from someone who is familiar with raid 1 across the several releases between 6 and 7.4.

TIA,
Jim

[snip] from messages----------------------------------------------------------------------------------------------------
Apr  7 09:52:26 s01 kernel: Uniform Multi-Platform E-IDE driver Revision: 7.00beta3-.2.4
Apr  7 09:52:26 s01 kernel: ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
Apr  7 09:52:26 s01 kernel: NFORCE2: IDE controller at PCI slot 00:09.0
Apr  7 09:52:26 s01 kernel: NFORCE2: chipset revision 162
Apr  7 09:52:26 s01 kernel: NFORCE2: not 100%% native mode: will probe irqs later
Apr  7 09:52:26 s01 kernel: AMD_IDE: Bios didn't set cable bits corectly. Enabling workaround.
Apr  7 09:52:26 s01 kernel: AMD_IDE: Bios didn't set cable bits corectly. Enabling workaround.
Apr  7 09:52:26 s01 kernel: AMD_IDE: PCI device 10de:0065 (nVidia Corporation) (rev a2) UDMA100 controller on pci00:09.0
Apr  7 09:52:26 s01 random: Initializing random number generator:
Apr  7 09:52:26 s01 kernel:     ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:DMA
Apr  7 09:52:26 s01 kernel:     ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA
Apr  7 09:52:26 s01 kernel: hda: ST3300831A, ATA DISK drive
Apr  7 09:52:26 s01 kernel: blk: queue c0371d00, I/O limit 4095Mb (mask 0xffffffff)
Apr  7 09:52:26 s01 kernel: hdc: ST3300831A, ATA DISK drive
Apr  7 09:52:26 s01 kernel: hdd: IDE-CD R/RW 24x12A, ATAPI CD/DVD-ROM drive
Apr  7 09:52:26 s01 random: Initializing random number generator:  succeeded
Apr  7 09:52:27 s01 kernel: blk: queue c0372164, I/O limit 4095Mb (mask 0xffffffff)
Apr  7 09:52:27 s01 random:
Apr  7 09:52:27 s01 kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Apr  7 09:52:27 s01 kernel: ide1 at 0x170-0x177,0x376 on irq 15
Apr  7 09:52:27 s01 kernel: hda: attached ide-disk driver.
Apr  7 09:52:27 s01 kernel: hda: host protected area => 1
Apr  7 09:52:27 s01 kernel: hda: setmax_ext LBA 586072368, native  66055244
Apr  7 09:52:27 s01 kernel: hda: 66055244 sectors (33820 MB) w/8192KiB Cache, CHS=4111/255/63, UDMA(100)
Apr  7 09:52:27 s01 rc: Starting random:  succeeded
Apr  7 09:52:27 s01 kernel: hdc: attached ide-disk driver.
Apr  7 09:52:27 s01 kernel: hdc: host protected area => 1
Apr  7 09:52:27 s01 kernel: hdc: 586072368 sectors (300069 MB) w/8192KiB Cache, CHS=36481/255/63, UDMA(100)
Apr  7 09:52:27 s01 kernel: ide-floppy driver 0.99.newide
Apr  7 09:52:27 s01 kernel: Partition check:
Apr  7 09:52:27 s01 kernel:  hda: hda1 hda2 hda3
Apr  7 09:52:27 s01 kernel:  hdc: hdc1 hdc2 hdc3
Apr  7 09:52:27 s01 kernel: ide-floppy driver 0.99.newide
Apr  7 09:52:27 s01 kernel: md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
Apr  7 09:52:27 s01 kernel: md: Autodetecting RAID arrays.
Apr  7 09:52:27 s01 kernel:  [events: 0000007e]
Apr  7 09:52:27 s01 last message repeated 2 times
Apr  7 09:52:27 s01 kernel:  [events: 0000005f]
Apr  7 09:52:27 s01 last message repeated 2 times
Apr  7 09:52:27 s01 kernel: md: autorun ...
Apr  7 09:52:27 s01 kernel: md: considering hdc3 ...
Apr  7 09:52:27 s01 kernel: md:  adding hdc3 ...
Apr  7 09:52:27 s01 kernel: md:  adding hda3 ...
Apr  7 09:52:27 s01 kernel: md: created md2
Apr  7 09:52:27 s01 kernel: md: bind<hda3,1>
Apr  7 09:52:27 s01 kernel: md: bind<hdc3,2>
Apr  7 09:52:27 s01 kernel: md: running: <hdc3><hda3>
Apr  7 09:52:27 s01 kernel: md: hdc3's event counter: 0000005f
Apr  7 09:52:27 s01 kernel: md: hda3's event counter: 0000007e
Apr  7 09:52:27 s01 kernel: md: superblock update time inconsistency -- using the most recent one
Apr  7 09:52:27 s01 kernel: md: freshest: hda3
Apr  7 09:52:27 s01 kernel: md: kicking non-fresh hdc3 from array!
Apr  7 09:52:27 s01 kernel: md: unbind<hdc3,1>
Apr  7 09:52:27 s01 kernel: md: export_rdev(hdc3)
Apr  7 09:52:27 s01 kernel: md: RAID level 1 does not need chunksize! Continuing anyway.
Apr  7 09:52:27 s01 kernel: kmod: failed to exec /sbin/modprobe -s -k md-personality-3, errno = 2
Apr  7 09:52:27 s01 kernel: md: personality 3 is not loaded!
Apr  7 09:52:27 s01 kernel: md :do_md_run() returned -22
Apr  7 09:52:27 s01 kernel: md: md2 stopped.
Apr  7 09:52:27 s01 kernel: md: unbind<hda3,0>
Apr  7 09:52:27 s01 kernel: md: export_rdev(hda3)
Apr  7 09:52:27 s01 kernel: md: considering hdc2 ...
Apr  7 09:52:27 s01 kernel: md:  adding hdc2 ...
Apr  7 09:52:27 s01 kernel: md:  adding hda2 ...
Apr  7 09:52:27 s01 kernel: md: created md1
Apr  7 09:52:27 s01 kernel: md: bind<hda2,1>
Apr  7 09:52:27 s01 kernel: md: bind<hdc2,2>
Apr  7 09:52:27 s01 kernel: md: running: <hdc2><hda2>
Apr  7 09:52:27 s01 kernel: md: hdc2's event counter: 0000005f
Apr  7 09:52:27 s01 kernel: md: hda2's event counter: 0000007e
Apr  7 09:52:27 s01 kernel: md: superblock update time inconsistency -- using the most recent one
Apr  7 09:52:27 s01 kernel: md: freshest: hda2
Apr  7 09:52:27 s01 kernel: md: kicking non-fresh hdc2 from array!
Apr  7 09:52:27 s01 kernel: md: unbind<hdc2,1>
Apr  7 09:52:27 s01 kernel: md: export_rdev(hdc2)
Apr  7 09:52:27 s01 kernel: md: RAID level 1 does not need chunksize! Continuing anyway.
Apr  7 09:52:27 s01 kernel: kmod: failed to exec /sbin/modprobe -s -k md-personality-3, errno = 2
Apr  7 09:52:27 s01 kernel: md: personality 3 is not loaded!
Apr  7 09:52:27 s01 kernel: md :do_md_run() returned -22
Apr  7 09:52:27 s01 kernel: md: md1 stopped.
Apr  7 09:52:27 s01 kernel: md: unbind<hda2,0>
Apr  7 09:52:27 s01 kernel: md: export_rdev(hda2)
Apr  7 09:52:27 s01 kernel: md: considering hdc1 ...
Apr  7 09:52:27 s01 kernel: md:  adding hdc1 ...
Apr  7 09:52:27 s01 kernel: md:  adding hda1 ...
Apr  7 09:52:27 s01 kernel: md: created md0
Apr  7 09:52:27 s01 kernel: md: bind<hda1,1>
Apr  7 09:52:27 s01 kernel: md: bind<hdc1,2>
Apr  7 09:52:27 s01 kernel: md: running: <hdc1><hda1>
Apr  7 09:52:27 s01 kernel: md: hdc1's event counter: 0000005f
Apr  7 09:52:27 s01 kernel: md: hda1's event counter: 0000007e
Apr  7 09:52:27 s01 kernel: md: superblock update time inconsistency -- using the most recent one
Apr  7 09:52:27 s01 kernel: md: freshest: hda1
Apr  7 09:52:27 s01 kernel: md: kicking non-fresh hdc1 from array!
Apr  7 09:52:27 s01 kernel: md: unbind<hdc1,1>
Apr  7 09:52:27 s01 kernel: md: export_rdev(hdc1)
Apr  7 09:52:27 s01 kernel: md: RAID level 1 does not need chunksize! Continuing anyway.
Apr  7 09:52:27 s01 kernel: kmod: failed to exec /sbin/modprobe -s -k md-personality-3, errno = 2
Apr  7 09:52:27 s01 kernel: md: personality 3 is not loaded!
Apr  7 09:52:27 s01 kernel: md :do_md_run() returned -22
Apr  7 09:52:27 s01 kernel: md: md0 stopped.
Apr  7 09:52:27 s01 kernel: md: unbind<hda1,0>
Apr  7 09:52:27 s01 kernel: md: export_rdev(hda1)
Apr  7 09:52:27 s01 kernel: md: ... autorun DONE.
Apr  7 09:52:27 s01 kernel: NET4: Linux TCP/IP 1.0 for NET4.0
Apr  7 09:52:27 s01 kernel: IP Protocols: ICMP, UDP, TCP, IGMP
Apr  7 09:52:27 s01 keytable: Loading keymap:
Apr  7 09:52:27 s01 kernel: IP: routing cache hash table of 8192 buckets, 64Kbytes
Apr  7 09:52:27 s01 kernel: TCP: Hash tables configured (established 262144 bind 65536)
Apr  7 09:52:27 s01 kernel: Linux IP multicast router 0.06 plus PIM-SM
Apr  7 09:52:27 s01 kernel: NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
Apr  7 09:52:27 s01 kernel: RAMDISK: Compressed image found at block 0
Apr  7 09:52:27 s01 kernel: Freeing initrd memory: 133k freed
Apr  7 09:52:27 s01 kernel: VFS: Mounted root (ext2 filesystem).
Apr  7 09:52:27 s01 kernel: md: raid1 personality registered as nr 3
Apr  7 09:52:27 s01 kernel: Journalled Block Device driver loaded
Apr  7 09:52:27 s01 kernel: md: Autodetecting RAID arrays.
Apr  7 09:52:27 s01 kernel:  [events: 0000005f]
Apr  7 09:52:27 s01 kernel:  [events: 0000007e]
Apr  7 09:52:27 s01 kernel:  [events: 0000005f]
Apr  7 09:52:27 s01 kernel:  [events: 0000007e]
Apr  7 09:52:27 s01 kernel:  [events: 0000005f]
Apr  7 09:52:27 s01 kernel:  [events: 0000007e]
Apr  7 09:52:27 s01 kernel: md: autorun ...
Apr  7 09:52:27 s01 kernel: md: considering hda1 ...
Apr  7 09:52:27 s01 kernel: md:  adding hda1 ...
Apr  7 09:52:27 s01 kernel: md:  adding hdc1 ...
Apr  7 09:52:27 s01 kernel: md: created md0
Apr  7 09:52:27 s01 kernel: md: bind<hdc1,1>
Apr  7 09:52:27 s01 kernel: md: bind<hda1,2>
Apr  7 09:52:27 s01 kernel: md: running: <hda1><hdc1>
Apr  7 09:52:27 s01 kernel: md: hda1's event counter: 0000007e
Apr  7 09:52:27 s01 kernel: md: hdc1's event counter: 0000005f
Apr  7 09:52:27 s01 kernel: md: superblock update time inconsistency -- using the most recent one
Apr  7 09:52:27 s01 kernel: md: freshest: hda1
Apr  7 09:52:27 s01 kernel: md: kicking non-fresh hdc1 from array!
Apr  7 09:52:27 s01 kernel: md: unbind<hdc1,1>
Apr  7 09:52:27 s01 kernel: md: export_rdev(hdc1)
Apr  7 09:52:27 s01 kernel: md: RAID level 1 does not need chunksize! Continuing anyway.
Apr  7 09:52:27 s01 kernel: md0: max total readahead window set to 124k
Apr  7 09:52:27 s01 kernel: md0: 1 data-disks, max readahead per data-disk: 124k
Apr  7 09:52:27 s01 kernel: raid1: device hda1 operational as mirror 0
Apr  7 09:52:27 s01 kernel: raid1: md0, not all disks are operational -- trying to recover array
Apr  7 09:52:27 s01 kernel: raid1: raid set md0 active with 1 out of 2 mirrors
Apr  7 09:52:27 s01 kernel: md: updating md0 RAID superblock on device
Apr  7 09:52:27 s01 kernel: md: hda1 [events: 0000007f]<6>(write) hda1's sb offset: 104320
Apr  7 09:52:27 s01 kernel: md: recovery thread got woken up ...
Apr  7 09:52:27 s01 kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Apr  7 09:52:27 s01 kernel: md: recovery thread finished ...
Apr  7 09:52:27 s01 kernel: md: considering hda2 ...
Apr  7 09:52:27 s01 kernel: md:  adding hda2 ...
Apr  7 09:52:27 s01 kernel: md:  adding hdc2 ...
Apr  7 09:52:27 s01 kernel: md: created md1
Apr  7 09:52:27 s01 kernel: md: bind<hdc2,1>
Apr  7 09:52:27 s01 kernel: md: bind<hda2,2>
Apr  7 09:52:27 s01 kernel: md: running: <hda2><hdc2>
Apr  7 09:52:27 s01 kernel: md: hda2's event counter: 0000007e
Apr  7 09:52:27 s01 kernel: md: hdc2's event counter: 0000005f
Apr  7 09:52:27 s01 kernel: md: superblock update time inconsistency -- using the most recent one
Apr  7 09:52:27 s01 kernel: md: freshest: hda2
Apr  7 09:52:27 s01 kernel: md: kicking non-fresh hdc2 from array!
Apr  7 09:52:27 s01 kernel: md: unbind<hdc2,1>
Apr  7 09:52:27 s01 keytable: Loading keymap:  succeeded
Apr  7 09:52:27 s01 kernel: md: export_rdev(hdc2)
Apr  7 09:52:27 s01 keytable:
Apr  7 09:52:27 s01 kernel: md: RAID level 1 does not need chunksize! Continuing anyway.
Apr  7 09:52:27 s01 keytable: Loading system font:
Apr  7 09:52:27 s01 kernel: md1: max total readahead window set to 124k
Apr  7 09:52:27 s01 kernel: md1: 1 data-disks, max readahead per data-disk: 124k
Apr  7 09:52:27 s01 kernel: raid1: device hda2 operational as mirror 0
Apr  7 09:52:27 s01 kernel: raid1: md1, not all disks are operational -- trying to recover array
Apr  7 09:52:27 s01 kernel: raid1: raid set md1 active with 1 out of 2 mirrors
Apr  7 09:52:27 s01 kernel: md: updating md1 RAID superblock on device
Apr  7 09:52:27 s01 kernel: md: hda2 [events: 0000007f]<6>(write) hda2's sb offset: 32652032
Apr  7 09:52:27 s01 kernel: md: recovery thread got woken up ...
Apr  7 09:52:27 s01 kernel: md1: no spare disk to reconstruct array! -- continuing in degraded mode
Apr  7 09:52:27 s01 kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Apr  7 09:52:27 s01 kernel: md: recovery thread finished ...
Apr  7 09:52:27 s01 kernel: md: considering hda3 ...
Apr  7 09:52:27 s01 kernel: md:  adding hda3 ...
Apr  7 09:52:27 s01 kernel: md:  adding hdc3 ...
Apr  7 09:52:27 s01 kernel: md: created md2
Apr  7 09:52:27 s01 kernel: md: bind<hdc3,1>
Apr  7 09:52:27 s01 kernel: md: bind<hda3,2>
Apr  7 09:52:27 s01 kernel: md: running: <hda3><hdc3>
Apr  7 09:52:27 s01 kernel: md: hda3's event counter: 0000007e
Apr  7 09:52:27 s01 kernel: md: hdc3's event counter: 0000005f
Apr  7 09:52:27 s01 kernel: md: superblock update time inconsistency -- using the most recent one
Apr  7 09:52:27 s01 kernel: md: freshest: hda3
Apr  7 09:52:27 s01 kernel: md: kicking non-fresh hdc3 from array!
Apr  7 09:52:27 s01 kernel: md: unbind<hdc3,1>
Apr  7 09:52:27 s01 kernel: md: export_rdev(hdc3)
Apr  7 09:52:27 s01 kernel: md: RAID level 1 does not need chunksize! Continuing anyway.
Apr  7 09:52:27 s01 kernel: md2: max total readahead window set to 124k
Apr  7 09:52:27 s01 kernel: md2: 1 data-disks, max readahead per data-disk: 124k
Apr  7 09:52:27 s01 kernel: raid1: device hda3 operational as mirror 0
Apr  7 09:52:27 s01 kernel: raid1: md2, not all disks are operational -- trying to recover array
Apr  7 09:52:27 s01 kernel: raid1: raid set md2 active with 1 out of 2 mirrors
Apr  7 09:52:27 s01 kernel: md: updating md2 RAID superblock on device
Apr  7 09:52:27 s01 kernel: md: hda3 [events: 0000007f]<6>(write) hda3's sb offset: 264960
Apr  7 09:52:27 s01 kernel: md: recovery thread got woken up ...
Apr  7 09:52:27 s01 kernel: md2: no spare disk to reconstruct array! -- continuing in degraded mode
Apr  7 09:52:27 s01 kernel: md1: no spare disk to reconstruct array! -- continuing in degraded mode
Apr  7 09:52:27 s01 kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Apr  7 09:52:27 s01 kernel: md: recovery thread finished ...
Apr  7 09:52:27 s01 kernel: md: ... autorun DONE.
[end snip] from messages------------------------------------------------------------------------------------------------

mdstat------------------------------------------------------------------------------------------------------------------
[root@s01 /]# cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 hda3[0]
      264960 blocks [2/1] [U_]

md1 : active raid1 hda2[0]
      32652032 blocks [2/1] [U_]

md0 : active raid1 hda1[0]
      104320 blocks [2/1] [U_]

unused devices: <none>
[root@s01 /]#
mdstat------------------------------------------------------------------------------------------------------------------

janet
Re: RAID 1 Problem
« Reply #1 on: April 08, 2009, 04:37:42 AM »
jimgoode

To fix your current problem while staying on sme6.x: do a backup, rebuild the sme6.x machine (making sure the full drive size is used), then restore from the backup. Reinstall any add-on contribs.


Your best approach to fixing and upgrading (for a whole variety of reasons) is to back up the sme6.x machine and restore to a newly built sme7.x machine.

There are a couple of approaches to this.

You can use the standard backup and restore tools (though you may run into backup and restore size limits), or use an add-on backup contrib to overcome those limits. Either way, backing up and restoring will take considerably longer.

Probably the easiest approach is to use the CopyFromDisk method outlined here (and do the upgrade to sme7.x at the same time).
http://wiki.contribs.org/UpgradeDisk

The general approach would be:
Remove and erase the 300Gb drive and leave your sme6.x running in degraded mode on one drive.
Build your new machine using sme7.x and the clean 300Gb drive.
Mount the 32Gb sme6.x drive in the new sme7.x box.
Then restore from the sme6.x disk.
Keep in mind the correct procedure to follow, i.e. the steps to perform before removing the drive from the sme6.x machine, and watch out for issues with custom templates and incompatible contribs.

Then erase your 32Gb disk (which you say is really a 300Gb drive) and add it to the RAID array using the admin console in sme7.4.
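
If you prefer the command line to the admin console for that step, the manual equivalent on sme7.x is roughly as below. This is only a sketch: the md numbers are the usual sme7 defaults (md1 for /boot, md2 for the LVM volume), and the device names are examples, so check yours with fdisk -l and cat /proc/mdstat first.

# copy the partition layout from the running disk to the blank disk
sfdisk -d /dev/hda | sfdisk /dev/hdc

# add the new partitions into the existing mirrors
mdadm /dev/md1 --add /dev/hdc1
mdadm /dev/md2 --add /dev/hdc2

# watch the rebuild progress
watch -n 5 cat /proc/mdstat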

This article may also be of interest (but only applicable to sme7.x), although not really needed if you follow the CopyFromDisk method.
http://wiki.contribs.org/Raid



The problem with your drive appearing smaller than expected may be a jumper which limits it to 32Gb (check the physical drive), or perhaps your BIOS does not recognise it correctly (try a BIOS upgrade).
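
If you want to confirm where the 32Gb limit comes from before rebuilding, hdparm can usually tell you. A rough sketch (this assumes an hdparm version new enough to support -N; older versions only have the identify options):

# ask the drive itself what it is; if the identify data reports the
# full ~300Gb, the 32Gb limit is a clip (jumper or host protected
# area), not the drive
hdparm -i /dev/hda

# newer hdparm versions can read the host protected area directly;
# output like "max sectors = 66055244/586072368" would match the
# setmax_ext line in your messages log and confirm an HPA clip
hdparm -N /dev/hda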

It also looks like your 300Gb drive (hdc) has not been correctly added to the RAID array, i.e. only one drive is in each array. Your mdstat shows only hda present: 32652032 blocks [2/1] [U_]
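
You can double-check that from the command line too. On sme6.x something like this should list each array's members and the raw partition sizes the kernel sees (assuming lsraid was installed along with raidtools; if not, /proc/mdstat already tells the same story):

# list the devices md believes belong to each array
lsraid -a /dev/md0
lsraid -a /dev/md1
lsraid -a /dev/md2

# compare the raw partition sizes on the two disks
fdisk -l /dev/hda /dev/hdc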

None of the above really matters if you follow the suggested steps to upgrade & restore.
Please search before asking, an answer may already exist.
The Search & other links to useful information are at top of Forum.