Hardware failure SME server not accessible anymore

janet

Re: Hardware failure SME server not accessible anymore

« Reply #15 on: October 23, 2010, 06:43:16 AM »

twijtzes

Quote

I haven't done anything but put in a SME install disk in the CD drive and choose rescue.
Can you please explain how I could delete remaining data when using the CD, as I have done that...

Do you really mean to ask that question? It does not make sense to me.
Please explain better what it is that you want to do.

« Last Edit: October 23, 2010, 06:54:46 AM by mary »

Please search before asking, an answer may already exist.
The Search & other links to useful information are at top of Forum.

twijtzes

« Reply #16 on: October 23, 2010, 06:59:40 AM »

So far,

I downloaded the latest 7.5 ISO and burned it to a CD, booted the server from this new CD and chose rescue. I did this one week ago. Ever since, it has been displaying the text "loading SCSI driver" and something about the cciss driver loading. There has been a huge amount of disk activity, but the boot seems to have stalled.

I have a replacement HP DL380 G4 for the old server, also running SME Server 7.5, and this new one is running like sunshine.

I would like to recover my old data; I don't care about the old server (though it would be nice if it survived). I have backups of all the databases, as these were/are stored in two different locations: one in a DAR backup, the other in a MySQL backup (run on a client computer) on a different USB drive. These have been restored.

I could, as you suggest, take out the SCSI drives from the old server and connect them externally to the new SME box. However, I have the feeling that this would permanently kill the old server, or am I wrong?

What would be the best way to take the disks out and mount them externally?

« Last Edit: October 23, 2010, 07:09:55 AM by twijtzes »

janet

« Reply #17 on: October 23, 2010, 09:09:31 AM »

twijtzes

Your situation is difficult to accurately diagnose remotely, especially with the very little information you have provided.

If your server will not boot from the SME CD, then I assume there must be a hardware issue that is stopping it from booting up. Maybe the motherboard is faulty, maybe the drive controller card is faulty, maybe something else. If you are unable to determine that yourself, perhaps you should take the system to a technician to check for a hardware fault.

Again, I'm guessing, but one possibility is that when you put the replacement drive in and started rebuilding the array, the system wiped out the data, i.e. it resynced to a blank drive. It's hard to tell what has happened at this stage.

Are the drives using hardware RAID1, i.e. not software RAID1? If so, then you need to keep the drives connected to that specific (functional) drive controller for the system to work correctly.
If you are using software RAID1, then you should be able to remove the drives and connect them to a machine with a similar CPU type, and if the drives are OK and contain data, the server should boot up OK.
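
On a Linux box that still boots, one rough way to tell the two apart is /proc/mdstat: software RAID (md) arrays are listed there, while drives behind a hardware controller are not (the cciss driver mentioned earlier is the driver for HP Smart Array hardware RAID controllers). The following is only a sketch; the function name list_md_arrays is made up for illustration and is not an SME Server command.

```shell
#!/bin/sh
# List Linux software RAID (md) arrays and their state.
# list_md_arrays is a hypothetical helper name, not an SME Server command.
# Optional argument: a file in /proc/mdstat format (defaults to /proc/mdstat).
list_md_arrays() {
    mdstat="${1:-/proc/mdstat}"
    if [ ! -r "$mdstat" ]; then
        echo "no md information: $mdstat is not readable" >&2
        return 1
    fi
    # Array lines look like: "md1 : active raid1 sda1[0] sdb1[1]"
    # Print the array name and its state (active / inactive).
    awk '$1 ~ /^md[0-9]+$/ && $2 == ":" { print $1, $3 }' "$mdstat"
}

# Usage (no output means no software RAID arrays were found):
# list_md_arrays
```

If this prints nothing on the working server, the mirror is most likely being handled by the controller rather than by Linux.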

As I have said already, anything you do with the original drives carries a high risk that something will go wrong and destroy whatever data still remains on them. Typically you would make a bare-metal clone of each disk using the dd command or similar, or compatible cloning software. That way you can test on the copied drives to see what data still exists, without tampering with the original drives.
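
As a rough illustration of the cloning step, here is a minimal sketch that copies a whole source device (or file) to an image file with dd and then compares the two byte for byte. The function name clone_to_image and the example paths are assumptions for illustration only. It uses plain dd, which stops at the first read error, so it is not a substitute for proper recovery tools on a failing drive.

```shell
#!/bin/sh
# clone_to_image SRC DST
# Raw copy of SRC (e.g. /dev/sdc) to image file DST, then a byte-for-byte
# comparison. clone_to_image is a hypothetical helper name for this sketch.
clone_to_image() {
    src="$1"
    dst="$2"
    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 1
    fi
    # Refuse to overwrite anything that already exists.
    if [ -e "$dst" ]; then
        echo "$dst already exists, not overwriting" >&2
        return 1
    fi
    # The source is only ever opened for reading (if=).
    dd if="$src" of="$dst" bs=64k || return 1
    # Succeeds only if the image is identical to the source.
    cmp -s "$src" "$dst"
}

# Usage (paths are examples only):
# clone_to_image /dev/sdc /mnt/spare/old-sdc.img
```

The image should be written to a different, known good disk with at least as much free space as the source drive.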

If you want to play "slightly" dangerously, and take great care, then you can put one of the original drives into the good server, where it should show up as e.g. /dev/sdc, and mount it.
Then you can use the working system to interrogate the drive. You can then attach another known good blank drive and copy the valuable data to it.

I think the viability of doing these steps will depend on whether the drive(s) are only readable when connected to their proprietary hardware disk controller.

See the various Howtos about disks which will assist you with testing, using, mounting, rebuilding etc, some are applicable, some will have ideas & techniques you can use, eg
http://wiki.contribs.org/AddExtraHardDisk
http://wiki.contribs.org/AddExtraHardDisk_-_SCSI
http://wiki.contribs.org/Disk_Manager
http://wiki.contribs.org/Booting
http://wiki.contribs.org/Monitor_Disk_Health
http://wiki.contribs.org/Raid
http://wiki.contribs.org/Raid:LSI_Monitoring
http://wiki.contribs.org/Raid:Manual_Rebuild
http://wiki.contribs.org/Recovering_SME_Server_with_lvm_drives
http://wiki.contribs.org/USBDisks

If one of your drives is readable in another (cleanly installed OS) machine, then you could try using this Howto to recover everything
http://wiki.contribs.org/UpgradeDisk

Honestly if you are unsure about how to do all the above, then take the equipment to a Linux expert and get him/her to see if your data is recoverable.

Quote

I could, as you suggest, take out the SCSI drives from the old server and connect them externally to the new SME box. However, I have the feeling that this would permanently kill the old server, or am I wrong? What would be the best way to take the disks out and mount them externally ?

I do not think doing that will "kill" the drives. It may break the RAID array, but the data should still be intact on each disk. The real issue may be that the drive is not readable because it needs to be connected to the specific controller card (if it is hardware RAID). I rarely deal with hardware RAID so cannot comment more.

Try connecting an original drive from the faulty server to sdc (or the next spare port) on the other good server. This assumes sda & sdb are already being used for the RAID1 software array, so adjust accordingly.

Verify which drives are connected with
fdisk -l | more
Verify details of the old drive
fdisk -l /dev/sdc
Make a mount point, e.g.
mkdir -p /mnt/olddrive
Mount the drive
mount /dev/sdc1 /mnt/olddrive
For your drive it might be sdc2 or sdc3 depending on the drive setup

Use Linux commands or mc (Midnight Commander) to read the old drive and see what the contents are, e.g.
ls -al /mnt/olddrive
If OK you should see the directory listing of a typical SME Server
Copy files and data etc, see this howto for what is in a normal backup
http://wiki.contribs.org/Backup_server_config#Standard_backup_.26_restore_inclusions

Then unmount the drive
umount /mnt/olddrive

Then, if all went well, and depending on what you copied and where, run on the new server
signal-event post-upgrade
reboot

All the above is generic advice; your specific situation may require it to be adjusted to suit.
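
The mount-and-inspect steps above can be sketched as a small script. This is only an illustration under assumptions: the names inspect_old_drive, run and DRY_RUN are hypothetical, the partition is assumed to hold a plain filesystem rather than an LVM volume, and mounting with -o ro (read-only) is an added precaution that is not part of the steps above. By default the commands are only printed, not executed.

```shell
#!/bin/sh
# inspect_old_drive PARTITION [MOUNTPOINT]
# Mounts PARTITION read-only, lists its top level, and unmounts it again.
# With DRY_RUN=1 (the default here) the commands are only printed.
# inspect_old_drive, run and DRY_RUN are hypothetical names for this sketch.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

inspect_old_drive() {
    part="$1"
    mnt="${2:-/mnt/olddrive}"
    case "$part" in
        /dev/*) ;;
        *) echo "expected a /dev/... partition, got '$part'" >&2; return 1 ;;
    esac
    run mkdir -p "$mnt" || return 1
    # -o ro: read-only mount, to reduce the chance of writing to the old drive.
    run mount -o ro "$part" "$mnt" || return 1
    run ls -al "$mnt"
    run umount "$mnt"
}

# Usage: print the plan first, then run it for real as root.
# inspect_old_drive /dev/sdc1
# DRY_RUN=0 inspect_old_drive /dev/sdc1
```

Printing the commands first makes it easier to double-check the device name before anything touches the old drive.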

« Last Edit: October 23, 2010, 09:12:53 AM by mary »

twijtzes

« Reply #18 on: October 23, 2010, 09:45:10 AM »

The first thing to do, then, is to stop the old server; I'll do that right away.

As the disks are hot-swap hardware RAID, I need to figure out whether there is a hardware malfunction; if so, I will repair the old server.

What about the 8.0 beta from CD option?

janet

« Reply #19 on: October 23, 2010, 09:57:51 AM »

twijtzes

Quote

What about the 8.0 beta from CD option ?

What about it? You are only introducing another variable if you try using the SME 8 CD.

Supposedly your system was running OK with SME7.x so it should run the 7.5 CD OK.

Better that you test your SME7.5 CD on another server box, to see if the system will boot up to that disk OK (ie the same CD you tried to boot to for the last week).

« Last Edit: October 23, 2010, 10:01:47 AM by mary »

twijtzes

« Reply #20 on: October 23, 2010, 10:59:20 AM »

I missed your last post, sorry

Booted with the SME 8 disk and guess what, everything is still there!!! Thank God, and you guys. Where can I send the wine/beer/whiskey? Please send me an e-mail; my address can be found under my name.

How do I get the data from the box onto a USB disk?

Taco

janet

« Reply #21 on: October 23, 2010, 11:13:05 AM »

twijtzes

Quote

How do I get the data from the box on to a USB disk ?

There are various instructions, and links to instructions, in my earlier posts to steer you in the right direction on how to do that.

In short
connect USB drive
mount drive
copy data
unmount drive
Then use USB to restore data to another server.

Actually it is probably best/easiest if you do an actual backup.
Boot to CD
at command prompt type
console
select the backup option
connect USB when asked
wait for backup to finish (can take from an hour to quite a few hours depending on how much data is on your system, say 8 hours for 350 GB)

On the good server restore from the USB
Normally done to a clean install of the OS on first boot, when you are given the option to restore from a backup
Otherwise, if your system is clean (i.e. no data or users etc.), then follow these instructions to reset the restore-on-first-boot option
http://wiki.contribs.org/Backup_server_config#Restore_on_initial_reboot_after_fresh_OS_install_-_How_to_Reset_option

twijtzes

« Reply #22 on: October 25, 2010, 10:25:38 AM »

Hello All,

Thanks so far. You may have noticed that I don't know anything about Linux, so I'm stuck again!

I've started up using the boot CD of version 8 in rescue mode.

Once it's up, starting mc doesn't work; the system gives a segmentation fault. Maybe that is because I'm running version 8 on a system installed with version 7.

I've tried to run console in rescue mode; however, it says command not found.

I'm trying to mount a USB disk in the version 8 rescue mode, so far unsuccessfully. The USB disk is ext3 and tested OK on another server. I followed the howto at wiki.contribs.org/usbdisks, the part that deals with SME 8, but when I run the command line

Code:

hal-find-by-property --key volume.fsusage --string filesystem

I also get a command not found.

Now I really have no clue where to start. My preferred solution would be to mount a USB disk and copy the data in /mnt/sysimage/home/e-smith to the USB disk. What should I do?

« Last Edit: October 25, 2010, 11:29:31 AM by twijtzes »

janet

« Reply #23 on: October 25, 2010, 12:39:52 PM »

twijtzes

A search on
segmentation fault
shows many results, one is
http://forums.contribs.org/index.php/topic,31381.msg131933.html#msg131933
which advises you have either a hardware problem (especially bad RAM), a software bug or disk corruption.
Have you tested your drives one at a time with the manufacturer's disk diagnostic utility?

None of the tasks you tried worked, which suggests a common underlying problem.

I think you should test your system more thoroughly, and you should really stop using it until you can prove that the various system components are working correctly. By continuing to use it you may actually end up erasing data that is still on the drives.

You have been given suggestions in earlier posts re mounting the disks on another known working server, try that.

twijtzes

« Reply #24 on: October 25, 2010, 01:18:28 PM »

Hi Mary,

I've installed SME Server 7 and 8 on four different computers now. After a clean, successful setup I boot from the CD in Linux rescue mode or SME rescue mode (for SME 7.5.1). Midnight Commander never starts in rescue mode, neither in version 8.0 nor in 7.5.1 nor in 7.5.

The cp command will do the trick anyway, so I don't anticipate a real problem there.

I would like to know how I can mount a USB disk from the command line under SME Server 8.0 beta while running in rescue mode. I've read that this is not the same as in version 7.5.1. I don't succeed in 7.5.1 either (in rescue mode); /etc/fstab is always missing. The mount command shows a USB disk; however, mounting it is a completely different matter.

Stefano

« Reply #25 on: October 25, 2010, 02:05:16 PM »

Boot your server from the SME CD in rescue mode with the external USB disk connected.

You should see it once the boot process ends.

Then mount it with

Code:

mount /dev/sdX1 /mnt/yourmountpoint

where sdX and yourmountpoint should be self-explanatory.

Then you can copy your data with cp -a or rsync.

NOTE: not tested, should work
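
A minimal sketch of the copy step, assuming the USB disk is already mounted: it uses rsync when that is available and falls back to cp -a otherwise, since a rescue environment may not ship every tool. The function name copy_tree and the /mnt/usb mount point are made up for illustration; /mnt/sysimage/home/e-smith is the source path mentioned earlier in the thread.

```shell
#!/bin/sh
# copy_tree SRC DST
# Copy the contents of directory SRC into DST, preserving permissions and
# timestamps (and ownership when run as root).
# copy_tree is a hypothetical helper name for this sketch.
copy_tree() {
    src="$1"
    dst="$2"
    if [ ! -d "$src" ]; then
        echo "no such directory: $src" >&2
        return 1
    fi
    mkdir -p "$dst" || return 1
    if command -v rsync >/dev/null 2>&1; then
        # Trailing slash: copy the contents of SRC, not SRC itself.
        rsync -a "$src"/ "$dst"/
    else
        # "/." has the same effect for cp, and includes hidden files.
        cp -a "$src"/. "$dst"/
    fi
}

# Usage (the /mnt/usb mount point is an example only):
# copy_tree /mnt/sysimage/home/e-smith /mnt/usb/e-smith
```

Both variants can be run again after an interruption; rsync will skip files that are already up to date on the USB disk.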

janet

« Reply #26 on: October 25, 2010, 02:21:30 PM »

twijtzes

Quote

I would like to know how I can mount a USB disk in commandline mode under SME server 8.0 beta while running in rescue mode. I've read that this is not the same as in version 7.5.1. I don't succeed in 7.5.1 either (in rescue mode) /etc/fstab is allways missing. The mount command shows a usb disk, however mounting it is a complete different chapter.

Commands already provided in post #17 of this thread. Please read the answers you are given.
