Koozali.org: home of the SME Server
Obsolete Releases => SME Server 7.x => Topic started by: Justin.Wilkinson on December 31, 2010, 10:25:24 AM
-
I have been getting an error message when using my SME RAID
The initial error message occurred when I tried copying a file to my server via my LAN. The error returned roughly stated:
'there is not enough space for the file'
Only 456G of the 1.5T RAID1 mirrored server is currently used, so I could not understand why the error was returned. I have not set any quotas either; I did try to set quotas in my first attempt at fixing the problem, but the process fails. The problem also seems to be getting worse: I can't copy files from the drives, and now can't even see anything other than the top directory when I access the server via the LAN. I believe this is because the system thinks the discs are full.
Background
About 3 months ago I upgraded from RAID 1 (2 x 320 Gig drives) to RAID 1 (2 x 1.5T drives). I was also switching from PATA to SATA drives, and I could not configure the old hardware to run one PATA and one SATA drive at the same time, so I could not use the usual grow method to upgrade my hard drive size. For this reason I was forced to:
a. Copy all saved files to an external 1.5T drive
b. Set up SME on a single 1.5T drive (as a clean install from an install disc)
c. Copy the saved files back to the single 1.5T drive I set up in b. above
d. Add a second 1.5T drive to the SME installation above and wait for the automated mirroring process to complete.
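For reference, SME handled step d itself once the new disk was in (it offers to add the disk to the array from the console). My understanding is that the raw equivalent, assuming the new disk came up as /dev/sdb and had already been partitioned to match, would be roughly:
# rough sketch only - the SME console normally does this for you
mdadm /dev/md1 --add /dev/sdb1    # add the new disk's partitions to each array
mdadm /dev/md2 --add /dev/sdb2
cat /proc/mdstat                  # watch the resync until it finishes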
Things had been running fine for a month or so prior to getting this error. I assume the system was OK until my disc usage grew from the old 320GB level towards 456GB, which is when the problem started.
I have also noticed that a number of boot operations now fail when booting the server that did not fail before this issue occurred. I get a series of messages saying quotas can't be set, and one message in particular stating:
'Can't read lock file /etc/mtab~799: Read-only file system (use -n flag to override)'
Unfortunately I am new to this and can't perform anything other than command line by rote - ie exactly what I am told/read.
I tried running the mdadm grow procedure using the following commands, but they returned what I took to be errors, which I failed to record at the time:
lvresize -l +100%FREE main/root
ext2online -C0 /dev/main/root
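For context, the full grow sequence I was trying to follow from the wiki, as I understand it (and which presumably only works once the underlying partitions and md array really are larger), is roughly:
mdadm --grow /dev/md2 --size=max    # grow the array to the size of its smallest member
pvresize /dev/md2                   # let LVM see the larger physical volume
lvresize -l +100%FREE main/root     # give all free extents to the root logical volume
ext2online -C0 /dev/main/root       # grow the mounted ext3 filesystem online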
The SME bug tracker suggested I boot from the rescue CD and try running a couple of commands. What occurred is outlined below.
Upon booting from the SME CD in rescue mode the system returned the following error just after boot:
"Your system is mounted under the /mnt/sysimage directory. When finished please exit from the shell and your system will reboot.
-/bin/sh-3.00# Traceback (most recent call last):
File "/usr/bin/anaconda", line 607, in ?
File "/var/tmp/anaconda - 10.1.1.103//usr/lib/anaconda/rescue.pyy" line 450, in runRescue
File "/var/tmp/anaconda - 10.1.1.103//usr/lib/anaconda/rescue.pyy" line 66, in MakeMtab IOError: [Errno 5] Input/Output error: '/etc/mtab'
I then ran df -h and it returned the following:
Filesystem            Size  Used Avail Use% Mounted on
rootfs                6.0M  5.0M  668K  89% /
/tmp/loop0            174M  174M     0 100% /mnt/runtime
/dev/main/root        457G  434G     0 100% /mnt/sysimage
/tmp/md1               99M   16M   78M  17% /mnt/sysimage/boot
/dev/root.old         6.0M  5.0M  668K  89% /mnt/sysimage/dev
-/bin/sh-3.00#
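As an aside, I assume the gap between 434G used and 457G total with 0 available is just the default ext3 reserved blocks (about 5% is held back for root), which something like this should confirm:
# assumption: the default ~5% ext3 reserved blocks account for the 'missing' ~23G
tune2fs -l /dev/main/root | grep -i 'reserved block count'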
I then ran cat /proc/mdstat and it returned the following:
Personalities : [raid0] [raid1] [raid5] [raid6]
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
md2 : active raid1 sdb2[1] sda2[0]
      488303616 blocks [2/2] [UU]
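If I am reading /proc/mdstat correctly, md2 itself is only 488303616 1K blocks, i.e. roughly 465 GiB, which matches the LVM figures below, so the limit appears to sit at the array level rather than inside LVM:
# rough check (assumes the mdstat block count is in 1 KiB blocks)
echo $((488303616 / 1024 / 1024))            # => 465 (GiB)
mdadm --detail /dev/md2 | grep 'Array Size'  # should report the same size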
The df -h output would seem to suggest that the /dev/main/root filesystem has maxed out at 457G for some reason. The drives are definitely seen as 1.5T in the system BIOS.
I have now also run a couple of checks on the LVM, which returned data suggesting it is set up for only about 465 GB of space.
When I ran pvdisplay it returned:
  PV Name               /dev/md2
  VG Name               main
  PV Size               465.86 GB / not usable 26.81 MB
  Allocatable           yes (but full)
  PE Size (KByte)       32768
  Total PE              14901
  Free PE               0
  Allocated PE          14901
  PV UUID               LrbK82-biTF-k1EJ-1Nuc-0rpj-UFFr-gvNCzo
I then ran vgdisplay and it returned:
  VG Name               main
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  5
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               465.66 GB
  PE Size               32.00 MB
  Total PE              14901
  Alloc PE / Size       14901 / 465.66 GB
  Free  PE / Size       0 / 0
  VG UUID               2iPFLx-4In9-P7b3-rAS4-nzG1-qa1p-jiZZjp
Finally I ran lvdisplay and it returned:
  --- Logical volume ---
  LV Name                /dev/main/root
  VG Name                main
  LV UUID                Ufivba-C8Bu-hEvF-I1dQ-wh2j-ZAHD-QgOP8r
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                463.72 GB
  Current LE             14839
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0

  --- Logical volume ---
  LV Name                /dev/main/swap
  VG Name                main
  LV UUID                UcXstF-caQh-IDiB-0XoF-Xj33-mnvh-1CfUKj
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                1.94 GB
  Current LE             62
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1
All this seems to suggest that the system is not seeing the full disk space available.
Does anyone know what commands I could run to make the LVM partition use the roughly 1 TB of extra space the disks should have available, so that the server works again?
-
Show us the output of:
fdisk -l /dev/sd[ab]
-
Show us the output of:
fdisk -l /dev/sd[ab]
Disk /dev/sda: 1500.3 GB, 1500300828160 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   fd  Linux RAID autodetect
/dev/sda2              14       60804   488303707+  fd  Linux RAID autodetect

Disk /dev/sdb: 1500.3 GB, 1500300828160 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          13      104384   fd  Linux RAID autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2              13      182401  1465031647   fd  Linux RAID autodetect
So the system sees the full disks, but it looks like there is a partitioning problem with /dev/sdb. Any ideas on how (or whether) I can fix this, or alternatively recover the LVM data?
-
Disk /dev/sda: 1500.3GB 1500300828160 Bytes
/dev/sda2 14 60804 488303707+ fd Linux RAID autodetect
Disk /dev/sdb: 1500.3GB 1500300828160 Bytes
/dev/sdb2 13 182401 1465031647 fd Linux RAID autodetect
This is where the issue is. RAID 1 will take the smaller of the two partitions, and that is the maximum size you will achieve. Because sda2 is only ~488G, that is the maximum space you have available. I'd guess that sdb is the drive you added after setting up the system. If you have an extra 1.5T drive, you can replace sda with that drive and add it back into the array; then the directions on the wiki will apply and you will be able to expand your usable space to about 1.4T.
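Very roughly, and untested here (follow the wiki for the exact procedure), the replace-and-grow steps would look something like this, assuming the replacement disk also comes up as /dev/sda:
# rough outline only - check the wiki before running anything
mdadm /dev/md1 --fail /dev/sda1 --remove /dev/sda1
mdadm /dev/md2 --fail /dev/sda2 --remove /dev/sda2
# power off, swap in the spare 1.5T disk, partition it full-size (type fd) to match sdb, then:
mdadm /dev/md1 --add /dev/sda1
mdadm /dev/md2 --add /dev/sda2
cat /proc/mdstat                    # wait for the resync to finish
mdadm --grow /dev/md2 --size=max    # then pvresize / lvresize / ext2online as usual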
-
This is where the issue is. RAID 1 will take the smaller of the two partitions, and that is the maximum size you will achieve. Because sda2 is only ~488G, that is the maximum space you have available. I'd guess that sdb is the drive you added after setting up the system. If you have an extra 1.5T drive, you can replace sda with that drive and add it back into the array; then the directions on the wiki will apply and you will be able to expand your usable space to about 1.4T.
I have an extra 1.5TB drive; however, I used it earlier to create a spare RAID disk as a backup. It and the sda drive both appear to have had their characteristics modified by SME so that, no matter what I do, they can't be recognised as 1.5T disks on any computer or system I place them in. For example, I ran a disk test on the extra drive, which is a Western Digital 1.5TB Caviar Green. Its serial number has been removed, its firmware number has been deleted (and I guess this may mean the firmware has been modified), it shows as having no SMART functionality (though it normally would), and it now reports a capacity of only 500GB. sda has the same issues, but shows as only having 466GB capacity. How do I undo the modifications to the disk characteristics that have been made by SME? Do these changes represent changes to the firmware of the drive? These discs are virtually brand new.
-
SME didn't (and can't) change anything you've accused it of. You appear to be having issues with your hardware that are outside of SME.
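To see what the drives themselves report to the operating system, something like this is worth a look (assuming smartmontools is installed; the device name is just an example):
# example only - substitute the disk in question
smartctl -i /dev/sdb          # model, serial number, firmware and reported capacity
# older smartmontools on SATA may need: smartctl -d ata -i /dev/sdb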
-
Justin,
What do you mean "It has had its serial number removed, its Firmware number has been deleted.."?
These should be on the label as well as on the disk itself. Check with the manufacturer's website just what a HDD with that serial number is: 1Tb? 500Gb?
Check if there is a disk management utility from the manufacturer to set the disk parameters - such as SATA 1 or 2 (or even 3), or disk capacity.
This definitely sounds like it is a HDD problem, either with the disks themselves or some incompatibility with your motherboard (though my experience with this has been they would not be recognised at all).
Where did you get the disks? A reputable source? Could they be 500Gb disks re-branded as 1Tb? There was a scam a few years ago with dodgy thumb drives being tricked into reporting a larger size and working OK until you tried to exceed the real capacity - e.g. 256Mb being sold as 1Gb (told you it was a few years ago....).
Then again, did you accidentally format the disks to have a 500Gb partition? see http://wiki.contribs.org/AddExtraHardDisk for formatting the disk.
There, that gives you a few more rabbit burrows to chase down......
Cheers and Good Luck.
Ian
-
Justin,
What do you mean "It has had its serial number removed, its Firmware number has been deleted.."?
These should be on the label as well as on the disk itself. Check with the manufacturer's website just what a HDD with that serial number is: 1Tb? 500Gb?
Check if there is a disk management utility from the manufacturer to set the disk parameters - such as SATA 1 or 2 (or even 3), or disk capacity.
This definitely sounds like it is a HDD problem, either with the disks themselves or some incompatibility with your motherboard (though my experience with this has been they would not be recognised at all).
Hi - the drive details are on the outside of the drive, but when I ran a Western Digital 'drive test' application it was unable to extract the serial number, SMART status or correct drive size. When I first obtained the drive it showed as being 1.5TB. Since my last post, and after running the WD test app, the drive has completely failed and will not mount on any system. I don't know what is wrong with it, but it is covered by a warranty and I have arranged for its replacement. This now leaves me with the two original Seagate drives that I set my RAID up with. Of these, one (sda) shows as having only 466GB capacity in the RAID when it is a 1.5TB drive (and shows as such in the BIOS), while the other (sdb) shows as having 1.5TB of space in the RAID but has the partition problem reported by fdisk (its first partition does not end on a cylinder boundary, so sdb1 and sdb2 appear to overlap). I have not run any partition editing apps, so this has somehow occurred during the 'normal' running of the system or after I ran the mdadm grow process. Either way, what I am experiencing seems to have some hardware aspect to it, though I have not been able to figure out what. I tried booting off the sdb drive alone but the system won't boot.
I think I might try booting off the rescue disk and see if I can delete some of the files from the sda drive alone. Perhaps it is a different issue on each of the drives that is causing the problem. What worries me is the existence of the issues at all; I can't imagine what created them.
Where did you get the disks? A reputable source? Could they be 500Gb disks re-branded as 1Tb? There was a scam a few years ago with dodgy thumb drives being tricked into reporting a larger size and working OK until you tried to exceed the real capacity - e.g. 256Mb being sold as 1Gb (told you it was a few years ago....).
Then again, did you accidentally format the disks to have a 500Gb partition? see http://wiki.contribs.org/AddExtraHardDisk for formatting the disk.
There, that gives you a few more rabbit burrows to chase down......
Cheers and Good Luck.
Ian