Koozali.org: home of the SME Server

[solved] ext2 inconsistency error (3 in a month) raid 1 UU hd smart ok

Jáder

I have seen this only once, and the cause was an undersized PSU.
My tip: try your hardware with a REALLY oversized power supply. Borrow one if you need to.
Or unplug everything that is not necessary (leave just the motherboard and one HDD) and try again.

BTW: have you installed sysmon or SME7admin?
Both of them generate nice graphs, so you can see how your server behaves over time.
Don't be alarmed if all the memory appears to be in use... that's the Linux way.

purvis

I had some video memory problems lately that were hard to detect. Most memory also has gold contacts. Do not mix two different types of metal surface where your memory connects to the mobo.

lightman

Hello
First of all: thank you all for your help.
I finally decided to go radical. The motherboard/processor/memory were OK, since they didn't fail at all at home after 2 days, so I bought a brand new power supply, case, and one 1TB hard drive, and it's working perfectly so far with SME 7.5.

So it has something to do with the power supply or the disks. I will dedicate this weekend to finding out, since my client is now happy with his new working server (well... I didn't tell him that the motherboard/memory was the same one :D, he just saw a new enclosure and figured out the rest by himself hehe), so now there is no rush to figure out what is going on.

I did change the PSU a couple of days ago for an ANTEC BASQ 450W, which I think is a very good PSU. Remember the motherboard is ATOM dual core based, with 2 WD Green drives, so power consumption is FAR below 100 watts, and it failed after 8 hours in the same way (with only one drive connected). That's why I ruled it out. (It's good info BTW, I never knew that an undersized PSU could cause hangups like this, thank you.)

I kept the new motherboard (Intel D525MW with 2GB DDR3), so I will use it to run tests and will post the results here later :) (My client wants RAID 1, but I will not take any chances: I will test both drives very well with SME here before they go into the production server again.)
Thank you all,
light
« Last Edit: September 01, 2011, 03:12:02 AM by lightman »

Jáder

Quote
Hello
First of all: thank you all for your help.
I finally decided to go radical. The motherboard/processor/memory were OK, since they didn't fail at all at home after 2 days, so I bought a brand new power supply, case, and one 1TB hard drive, and it's working perfectly so far with SME 7.5.

Glad to know you fixed the issue.
You're welcome.
But now please change the subject to include [Solved].
And if you can, donate some money to this project... everyone will thank you.
 
Quote
So it has something to do with the power supply or the disks. I will dedicate this weekend to finding out, since my client is now happy with his new working server (well... I didn't tell him that the motherboard/memory was the same one :D, he just saw a new enclosure and figured out the rest by himself hehe), so now there is no rush to figure out what is going on.

I did change the PSU a couple of days ago for an ANTEC BASQ 450W, which I think is a very good PSU. Remember the motherboard is ATOM dual core based, with 2 WD Green drives, so power consumption is FAR below 100 watts, and it failed after 8 hours in the same way (with only one drive connected). That's why I ruled it out. (It's good info BTW, I never knew that an undersized PSU could cause hangups like this, thank you.)

You're welcome.
I had this kind of problem several years ago... it was a pain...

Jáder

lightman

Hello
It happened AGAIN!
How is it possible? It is a new machine: new disk, new enclosure, new power supply.
Now it's worse, I cannot fix the / partition.

I booted with SME RESCUE and tried to run e2fsck on /dev/sda2 with no success (sda1 is fine).
It says: e2fsck: Bad magic number in super-block while trying to open /dev/sda2

I tried -b 8193 and some other values I found in posts here, with no success. I do have a backup, but it is 2 days old, and losing 2 days of work because of this would be a problem.

I read here: http://wiki.contribs.org/Recovering_SME_Server_with_lvm_drives

that I need to enable RAID and LVM in order to fix the filesystem. Is this correct?

I ran mdadm -AR /dev/md8 /dev/sda2 and now md8 has been created,
but in RESCUE mode I don't have any LVM utilities: when I try vgs or vgchange, the commands don't exist.

I'm lost again. What can I do? I need to recover those files but I have no idea how. I don't have another SME server where I could mount the disk as a secondary.

Any advice will be very much appreciated.
Thank you

Jáder

If you had installed sysmon or sme7admin and kept an eye on your server graphs, you could have seen it coming.

Now it's a scary scenario: I would start over with all new parts... ALL NEW this time... and be sure to put the server on a UPS (a good one, please!).

Good luck

Jáder

lightman

Hello
The UPS is a brand new backup UPS from APC.
Now I have put the new motherboard in again, so it's all new equipment, nothing used.

I was able to revive the partition using another live CD (SME RESCUE was USELESS), so I could start LVM and do the filesystem repair from there. It worked, but I had TONS of errors.
Now the server starts but I cannot access it via CIFS (I can still access it via FTP). When I try to modify, add or delete any user, ibay or group, it gives me TONS of errors, like: failed to initialize group mapping, adding entry for group user failed no rid or sid specified.
I have no idea what this is.
It seems that the only way to extract the data will be via FTP :( It will take HOURS, since the files are all very small but there are hundreds of thousands of them.
But well, at least I have the data back :)
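
For anyone who lands here with the same "Bad magic number" message: on a default SME install the second partition is usually a RAID 1 member that carries an LVM physical volume rather than an ext filesystem, which would explain why e2fsck fails on /dev/sda2 while sda1 checks fine. The check has to run against the logical volume once RAID and LVM are up. Below is a rough sketch of that sequence for a live CD that ships mdadm and the LVM tools. /dev/md8 and /dev/sda2 are taken from the earlier post; the logical volume path /dev/main/root is an assumption (confirm it with the output of lvs), and the helper names run and recover are made up for this example. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: assemble the single RAID 1 member, activate LVM, then check
# the filesystem on the logical volume (not on the raw partition).
# ASSUMPTIONS: MEMBER and ARRAY come from the post above; LV is a guess at
# the volume group / logical volume name and may differ on your system.

MEMBER=${MEMBER:-/dev/sda2}
ARRAY=${ARRAY:-/dev/md8}
LV=${LV:-/dev/main/root}
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = really run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

recover() {
    # Start the array even though only one mirror half is present.
    run mdadm --assemble --run "$ARRAY" "$MEMBER" || return 1
    # Find volume groups on the assembled array and activate them.
    run vgscan || return 1
    run vgchange -ay || return 1
    # List logical volumes so the LV path can be confirmed before fsck.
    run lvs
    # Force a full check of the (unmounted) root filesystem.
    run e2fsck -f "$LV"
}
```

Calling recover first shows the commands for review; DRY_RUN=0 recover (as root, with the filesystem unmounted) executes them. e2fsck returns a non-zero status when it has corrected errors, so its exit code is deliberately not treated as a failure here.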

Jáder

Good to know you got your data back.
Search for AFFA in the contribs and put it to work... offsite backups every day over the web!
Please do install sysmon and/or sme7admin on this server, so you can watch graphs of the server status.

Keep us informed.
Jáder