Koozali.org: home of the SME Server

how to repair a bad SME after problems

Jáder Marasca

how to repair a bad SME after problems
« on: August 02, 2002, 09:44:08 PM »
I'm installing SME 5.1.2 at home to become familiar with it before installing it for clients.
But this is now the second time I have LOST the /boot partition because of a power failure or an improper shutdown.
The first time I reinstalled without thinking about it!
This time I ran:
* fsck.ext2 /dev/[hdd] -p -c -f -v
* e2fsck to repair a superblock (something about -b 8193...)
and the repair froze my computer...
Now I'm able to boot SME, but there are so many errors that I'll reformat it again.
But I would just like to know how to proceed when Linux has filesystem problems (a step-by-step to-do list) to avoid REINSTALLING everything.
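
On the "-b 8193" step mentioned above: 8193 is where the first backup superblock normally sits on an ext2 filesystem with 1 KiB blocks; with larger blocks it sits elsewhere. The sketch below is only an illustration under the assumption that the filesystem was created with default settings. The helper names are hypothetical, and the functions print the e2fsck command instead of running it.

```shell
#!/bin/sh
# Sketch only. first_backup_superblock and backup_superblock_cmd are
# hypothetical helpers, not part of e2fsprogs.

# Prints the block number of the first backup superblock for an ext2
# filesystem with default layout and the given block size in bytes.
first_backup_superblock() {
    case "$1" in
        1024) echo 8193 ;;    # 8192 blocks per group, first data block is 1
        2048) echo 16384 ;;
        4096) echo 32768 ;;
        *) echo "unsupported block size: $1" >&2; return 1 ;;
    esac
}

# Prints (does not run) an e2fsck command that uses that backup superblock.
# $1 = device (assumed name, e.g. /dev/hda1), $2 = block size in bytes.
backup_superblock_cmd() {
    dev="$1"; bs="$2"
    sb=$(first_backup_superblock "$bs") || return 1
    echo "e2fsck -f -b $sb -B $bs $dev"
}
```

For example, `backup_superblock_cmd /dev/hda1 1024` prints `e2fsck -f -b 8193 -B 1024 /dev/hda1`. If the block size is unknown, `mke2fs -n` on the device lists where the backups would be without writing anything, provided it is given the same parameters as the original format.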

Right now it's a test installation, but on clients...
I read about generating a reinstall disk, and (with a nightly backup) I would be able to recover everything...
I suspect I should boot from the reinstall disk, install a new SME, and restore the last backup. Is that right?

Thanks!

Jáder

ralph

Re: how to repair a bad SME after problems
« Reply #1 on: August 02, 2002, 10:22:18 PM »
Hi Jáder,

Try running fsck without the drive being mounted and WITHOUT -p (which means you will have to type "yes" quite often).
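
A minimal sketch of that check, assuming a Linux system where mounted devices are listed in /proc/mounts. The function names and the MOUNTS_FILE variable are made up for illustration, and the script only prints the e2fsck command rather than running it.

```shell
#!/bin/sh
# Sketch only: refuse to suggest e2fsck for a partition that is still mounted.
# MOUNTS_FILE, is_mounted and safe_fsck_cmd are hypothetical names.
MOUNTS_FILE="${MOUNTS_FILE:-/proc/mounts}"

# Succeeds (exit 0) if device $1 appears in the first column of the mounts table.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "$MOUNTS_FILE"
}

# Prints the command to run, or fails with a message if the device is mounted.
safe_fsck_cmd() {
    if is_mounted "$1"; then
        echo "refusing: $1 is mounted; unmount it or boot a rescue disk first" >&2
        return 1
    fi
    # -f forces a check even if the filesystem is marked clean.
    # Without -p or -y, e2fsck asks before each repair.
    echo "e2fsck -f $1"
}
```

Note that the root filesystem may show up in /proc/mounts under a different name (for example /dev/root), so treat a "not mounted" answer with some care.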

Hope it helps,

ciao,

ralph

Jáder Marasca

Re: how to repair a bad SME after problems
« Reply #2 on: August 02, 2002, 11:24:44 PM »
Hi Ralph
> try running fsck without the drive being mounted and WITHOUT
> -p ( which means you will have to type "yes" quite often ).

I tried booting from the RH72 CD in rescue mode. It automounts the FS on a temp dir (I can't remember exactly, but I think it is /mnt/sysmount) and says to issue chroot /mnt/sysmount to make this FS the default FS. I did that!

I ran fsck without "-p", but after a while I just LEFT my finger on Y... TOO MANY errors!

After a while I got lots of numbers scrolling on the screen and the computer froze.
So I rebooted it! In /etc/lost+found I have 4 entries (all unusable): 2 files and 2 dirs!
They were named #11, #12, #59 and #60 and have a lot of garbage inside!

I have already reformatted the HDD (it was in my Thinkpad 770, P233MMX) and installed SME on it again. Now I'm running Partition Magic to shrink the SME partition to half and make space for W2K!
But I'm getting scared about FS problems with SME/Linux RH71 on client machines!
And reinstalling from ZERO isn't my preferred approach... nor is it a good way to fix things... clients don't like it! It's too much time offline!!!

What do you do when you have a serious problem with your /boot partition (where this has happened to me twice!)?

Thanks!

Kelvin

Re: how to repair a bad SME after problems
« Reply #3 on: August 03, 2002, 04:22:34 AM »
Hi Jader,

I'm sure you already know this, but I'll stress the point here. Invest in a supported UPS (APC ones will work with the apcupsd add-on, but not the USB ones) and a tape / drive backup system. The chances of a filesystem corruption due to power failures / improper shutdown are fairly great (I've personally experienced it a couple of times myself and a few times at other sites), sometimes requiring a full reinstall, other times not. It's one thing to corrupt data files or open documents when a system goes down during a power failure. It's a far greater hassle if your server won't boot because the FS got slightly screwed up! FSCK does not always bring the FS back to operational status.

Hmm... in all these many years with Windoze servers, I've NEVER had this problem - lose open files yes, lose the server itself - no.

Just my 2 bits worth.

Kelvin

Jáder Marasca

Re: how to repair a bad SME after problems
« Reply #4 on: August 03, 2002, 06:03:23 AM »
Kelvin wrote:

> The chances of a filesystem corruption due to
> powerfailures / improper shutdown are fairly great

I'm very surprised at how unstable EXT2 is after an improper shutdown!
I'm from the Novell world, and its FS is very stable and has auto-correct!
In almost 10 years I can remember losing an FS only twice!
 
> Hmm... in all these many years with Windoze servers, I've
> NEVER had this problem - lose open files yes, lose the server
> itself - no.
 
I'm sorry, my English is too bad... I couldn't understand... have you NEVER lost an FS on a Win server? You're a lucky guy!
I have had a couple of Win servers that NEVER came up again... for SEVERAL reasons.
I had to go back, reinstall them and restore backups.
I was just hoping EXT2 (and the new EXT3 w/journaling) was MUCH SAFER!

Thanks!

Tom Keiser

Re: how to repair a bad SME after problems
« Reply #5 on: August 03, 2002, 08:14:07 AM »
Here's what I do, for what it's worth:

1. Separate boot drive and data drive.
2. Mirror or Raid 5 the data drive, back it up remotely to tape or to another raided disk.
3. Copy the boot drive to a spare boot drive. (To do this, I boot a Win9 diskette, and run Powerquest's "drivecopy" software. Both drives must be connected and running to do this copy; the spare is disconnected after the copy is finished.)

If the boot drive gets corrupted or hacked or just fails, then, using an inexpensive removable drive tray, swap in the good boot drive and you are up and running again in minutes. Make another spare boot drive the next time you can take the server down.
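
The copy in step 3 does not have to be done with DriveCopy. As an alternative (not what is described above), a raw copy with dd from a rescue boot does a similar job. The sketch below assumes that neither drive is mounted and that the spare is at least as large as the original; clone_drive is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: raw-copy one drive (or image file) to another and verify it.
# clone_drive is a hypothetical helper; device names are up to the caller.
clone_drive() {
    src="$1"; dst="$2"
    if [ "$src" = "$dst" ]; then
        echo "source and target must differ" >&2
        return 1
    fi
    # Copy in 1 MiB blocks; dd's transfer statistics on stderr are discarded.
    dd if="$src" of="$dst" bs=1048576 2>/dev/null || return 1
    sync
    # Compare only as many bytes as the source holds, since the spare may be larger.
    size=$(($(wc -c < "$src")))
    head -c "$size" "$dst" | cmp -s - "$src"
}
```

On a typical IDE setup of this kind the call might look like `clone_drive /dev/hda /dev/hdc`, but those device names are assumptions; getting source and target the wrong way round overwrites the good drive, so check them twice.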

This is not the best procedure, but it works. What I'd rather see is an SME function to copy to a client hard disk everything *EXCEPT* the actual data on the ibays. This info would be easy to store on most any client hard drive, easy to restore, and fast, too, and would avoid the file size limitations under Windows so often complained of on this forum.

Just my $.02.



Jáder Marasca

Re: how to repair a bad SME after problems
« Reply #6 on: August 03, 2002, 04:17:15 PM »
Hi Tom

I'm impressed with your solution: thanks!

How do you separate the boot and data drives?
Are you putting your boot drive (where SME is installed) in a removable drive tray? Is there a tray with a UDMA cable (80-wire)? Here in Brazil I have only seen the old ones without UDMA support.

I'm just starting with Linux (I come from the Novell world) and don't have a bag of tips and tricks yet... I really appreciate your help!

Thanks!

Jáder

Tom Keiser

Re: how to repair a bad SME after problems
« Reply #7 on: August 03, 2002, 09:27:10 PM »
Jader:

There's an old how-to that explains how to do it. It is at:
http://www.e-smith.org/docs/howto/contrib/e-smith_adaptec_raid.htm. It is for E-smith 4, but works just the same on SME 5.

The inexpensive removable drive trays carry the "Genica Mobile Media" brand name, and can be found on the i'net for $16 or less. They are ATA-100 capable, but some old inventory may not be, so check on that when you buy.

Regards,

Tom




Jáder Marasca

Re: how to repair a bad SME after problems
« Reply #8 on: August 03, 2002, 09:54:21 PM »
Tom,

Thanks for the link to the how-to! It's really very easy... I'll try it someday!

Thanks!