Koozali.org: home of the SME Server

SME 8.1 is "hanging" to some extent even after reboot

turandot
« on: May 16, 2014, 09:24:59 PM »
All,

I am currently experiencing strange things with my SME that I have never seen before. Setup: server only; private IP address 192.168.x.2; no DHCP; the server is connected to the internet through a separate Linux-based firewall, which provides DHCP and DNS services.
  • Server-manager WebGUI cannot be reached. (This is what I noticed first.) To be more precise, the browser tries to connect, but the connection is dropped again and the browser finally times out. Restarting the httpd-e-smith daemon does not help, and neither does rebooting the entire SME. httpd regularly pops up in top eating CPU resources. There is a huge number of error messages in one of the latest log files, /var/log/httpd/error_log.20140515011204. Its size is ~20 MB, where it is normally up to 10 kB. Error messages always contain the following:
Code:
[Thu May 15 01:12:24 2014] [notice] SSL FIPS mode disabled
[Thu May 15 01:12:24 2014] [warn] RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Thu May 15 01:12:24 2014] [warn] RSA server certificate CommonName (CN) `smeserver.home.xx' does NOT match server name!?
[Thu May 15 01:12:24 2014] [notice] Digest: generating secret for digest authentication ...
[Thu May 15 01:12:24 2014] [notice] Digest: done
  • DNS resolution is lost, although pinging IP addresses of servers on the internet works.
  • Reboot takes quite a long time, although all services start up without error messages (local console).
  • Yum seems to run "forever", blocking attempts to update manually: "another app is currently holding the yum lock; the other application is: yum". The latest entry in /var/log/yum/yum.log is from April 27.
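
For the yum symptom, one way to tell a stale lock from a yum process that is genuinely still running is to check whether the PID recorded in the lock file is alive. A minimal sketch, assuming yum keeps its lock in /var/run/yum.pid (the usual location on CentOS-based systems, not confirmed in this thread); the function name lock_holder_status is made up for this example:

```shell
# Report whether the PID stored in a lock file belongs to a running process.
# $1: path to the lock file (for yum this is assumed to be /var/run/yum.pid)
lock_holder_status() {
    lockfile=$1
    if [ ! -f "$lockfile" ]; then
        echo "no lock file"
        return 0
    fi
    pid=$(cat "$lockfile")
    # kill -0 sends no signal; it only tests whether the process exists
    if kill -0 "$pid" 2>/dev/null; then
        echo "lock held by running process $pid"
    else
        echo "stale lock (process $pid is gone)"
    fi
}

# Example (run as root): lock_holder_status /var/run/yum.pid
```

If the lock holder is still running but makes no progress, that points to a hung process rather than a leftover lock file, which would match the other symptoms described here.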

I should also list what still works:
  • File services through Samba (which is the main service)
  • SSH access (both PuTTY for the remote console and FileZilla), but not very stable: the connection is lost once in a while. I tried to back up the SME using Affa, see http://wiki.contribs.org/Moving_SME_to_new_Hardware , but hardly any data seems to be transferred.
  • Local console
  • the file system is clean (according to message during boot)
  • the HDD seems to be in good health (according to smartctl)
  • Lots of disk space available

I don't have a clue what is going on. Where should I have a look? Any recommendations?

Regards, turandot

CharlieBrady
« Reply #1 on: May 16, 2014, 10:37:21 PM »
Quote
I am currently experiencing strange things with my SME
...
Any recommendations?

I recommend that you report your issues to the bug tracker.

CharlieBrady
« Reply #2 on: May 16, 2014, 10:41:30 PM »
Quote
Error messages always contain the following:

Those aren't unusual. If you see them frequently, then that's because apache is restarting frequently, which will chew up CPU and disk I/O bandwidth, which will probably explain your other symptoms.
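
To gauge how often Apache is actually restarting, one could count startup banners per minute in the error log. A minimal sketch, assuming each start logs exactly one "SSL FIPS mode disabled" notice in the timestamp format shown in the excerpt from the first post; the function name count_startups_per_minute is hypothetical:

```shell
# Count Apache startups per minute in an error log.
# Assumes lines like: [Thu May 15 01:12:24 2014] [notice] SSL FIPS mode disabled
# $1: path to an Apache error log
count_startups_per_minute() {
    grep -F 'SSL FIPS mode disabled' "$1" |
        # fields: $2 = month, $3 = day, $4 = HH:MM:SS; keep only HH:MM
        awk '{ split($4, t, ":"); print $2, $3, t[1] ":" t[2] }' |
        sort | uniq -c
}

# Example: count_startups_per_minute /var/log/httpd/error_log.20140515011204
```

Each output line is a count followed by the minute it applies to. Several starts per minute over a long period would be consistent with a supervised service that crashes right after startup and is restarted immediately.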

To investigate further, do:

cd /service/httpd-e-smith
sv d .
./run

and see why apache is crashing on startup.

turandot
« Reply #3 on: May 16, 2014, 11:43:04 PM »
Quote
I recommend that you report your issues to the bug tracker.

OK... I wasn't expecting a bug, but rather something I couldn't put a name to.

Quote
To investigate further, do:

cd /service/httpd-e-smith
sv d .
./run

and see why apache is crashing on startup.

The console output is "Bus-Zugriffsfehler" (German for "Bus error").

Thanks, turandot

CharlieBrady
« Reply #4 on: May 17, 2014, 12:13:38 AM »
Quote
OK... I wasn't expecting a bug, but rather something I couldn't put a name to.

You should assume bad behaviour is a bug, until proven otherwise.

Quote
The console output is "Bus-Zugriffsfehler" (German for "Bus error").

Sounds like you have a hardware problem. Do a memory test and check your disk hardware.

http://stackoverflow.com/questions/212466/what-is-a-bus-error

turandot
« Reply #5 on: May 18, 2014, 11:17:59 AM »
Quote
Sounds like you have a hardware problem. Do a memory test and check your disk hardware.

You seem to be right...

I finally succeeded in retrieving a backup with Affa. However, the final Affa sync http://wiki.contribs.org/Moving_SME_to_new_Hardware#Final_data_synchronization was no longer possible, because the SSH service wasn't running anymore either. File shares through Samba were no longer available as well. Login on the local console still worked, but lots of basic shell commands didn't. So I rebooted my SME. I then found an inconsistent file system, and fsck couldn't be found either. Booting a Linux live CD through a USB CD drive failed as well. I suspect that the mainboard is faulty.

So it looked like I had been able to retrieve a backup with Affa, and the production SME seemed to have died completely during the backup. What came next was quite obvious: http://wiki.contribs.org/Moving_SME_to_new_Hardware#Switch_over_to_the_new_hardware

And yeah, my SME is up again! No data loss at all! I am lucky! Now the time has come for a thorough hardware inspection of the old box.

Charlie, thanks a lot for your advice.

turandot

turandot
Solved: SME 8.1 is "hanging" to some extent even after reboot
« Reply #6 on: May 20, 2014, 07:32:08 PM »
Addendum...

Following a memory check, I made a fresh install on the existing hardware. It seemed to work, but udev took a loooong time before it actually failed to start. After changing the HDD and making another fresh install, I don't observe the same behaviour: udev starts properly. Although the smartctl diagnosis does not indicate an issue with the affected HDD, I think the HDD should go to the happy hunting ground.

Thanks, turandot

turandot
« Reply #7 on: June 18, 2014, 09:21:44 PM »
Just to confirm: the HDD is completely dead now, even though no errors were shown by smartctl.

turandot