Koozali.org: home of the SME Server

system hangs when CPU is 100% used for couple of hours.

Offline cactus

  • *
  • 4,880
  • +3/-0
    • http://www.snetram.nl
Re: system hangs when CPU is 100% used for couple of hours.
« Reply #15 on: August 30, 2008, 09:47:05 AM »
See the excerpts from the messages log below, which certainly show fax activity prior to the hang.

It looks like a "hang" here:

Aug 23 00:09:52 admin-svr init: cannot execute "/usr/sbin/faxgetty"
Aug 23 00:09:52 admin-svr last message repeated 9 times
Aug 23 00:09:52 admin-svr init: Id "fax" respawning too fast: disabled for 5 minutes
Aug 23 00:14:53 admin-svr init: cannot execute "/usr/sbin/faxgetty"
Aug 23 00:14:53 admin-svr last message repeated 9 times
Aug 23 00:14:53 admin-svr init: Id "fax" respawning too fast: disabled for 5 minutes
Aug 23 00:19:54 admin-svr init: cannot execute "/usr/sbin/faxgetty"
Aug 23 00:19:54 admin-svr last message repeated 9 times
Aug 23 00:19:54 admin-svr init: Id "fax" respawning too fast: disabled for 5 minutes
Aug 23 09:09:59 admin-svr syslogd 1.4.1: restart.
Aug 23 09:09:59 admin-svr syslog: syslogd startup succeeded
Aug 23 09:09:59 admin-svr kernel: klogd 1.4.1, log source = /proc/kmsg started.
Aug 23 09:09:59 admin-svr syslog: klogd startup succeeded

and here:

Aug 27 19:03:15 admin-svr init: cannot execute "/usr/sbin/faxgetty"
Aug 27 19:03:15 admin-svr last message repeated 9 times
Aug 27 19:03:15 admin-svr init: Id "fax" respawning too fast: disabled for 5 minutes
Aug 28 07:41:26 admin-svr syslogd 1.4.1: restart.
Aug 28 07:41:26 admin-svr syslog: syslogd startup succeeded
You are right in stating that the lock-up is likely here...
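
For anyone who wants to find such silent periods without reading the whole messages log by eye, here is a minimal sketch in Python. It assumes classic syslog lines that start with a 15-character timestamp such as "Aug 23 00:19:54", an English locale for the month names, and the year 2008 (syslog does not record the year, so that value is taken from the dates of this thread). The function names and the 30 minute threshold are only illustrative choices.

```python
from datetime import datetime, timedelta

# Lines copied from the excerpt quoted above.
SAMPLE = """\
Aug 23 00:14:53 admin-svr init: Id "fax" respawning too fast: disabled for 5 minutes
Aug 23 00:19:54 admin-svr init: cannot execute "/usr/sbin/faxgetty"
Aug 23 00:19:54 admin-svr last message repeated 9 times
Aug 23 00:19:54 admin-svr init: Id "fax" respawning too fast: disabled for 5 minutes
Aug 23 09:09:59 admin-svr syslogd 1.4.1: restart.
Aug 23 09:09:59 admin-svr syslog: syslogd startup succeeded
""".splitlines()


def parse_stamp(line, year=2008):
    """Parse the leading syslog timestamp; the year is an assumption."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def find_gaps(lines, threshold=timedelta(minutes=30)):
    """Return (last_seen, next_seen) pairs where the log was silent too long."""
    gaps = []
    previous = None
    for line in lines:
        try:
            current = parse_stamp(line)
        except ValueError:
            continue  # not a timestamped line, ignore it
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current))
        previous = current
    return gaps


for start, end in find_gaps(SAMPLE):
    print(start, "->", end, "silent for", end - start)
```

On the sample lines this reports a single gap, from 00:19:54 to 09:09:59 on Aug 23. The faxgetty messages repeat every five minutes, so a long silence followed by a syslogd restart is a reasonable marker for the hang. A log that crosses New Year would need extra handling for the year.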

You should remove sme7admin, there have been many reports of inappropriate settings causing trouble.
If you totally uninstall it, then sme7admin cannot be causing the problem.

According to the wiki, do the following (although yum remove should be used carefully because it can remove more than you really want, i.e. rpm -e packagename is safer until yum remove gets fixed, at least for sme7):

yum remove sysstat
yum remove hddtemp
yum remove rrdtool
... but although sme7admin is dodgy, nothing here seems to indicate that it is the problem. I think hylafax should be removed from this machine first, as that is what is generating the noise and perhaps the problems as well.
Be careful whose advice you buy, but be patient with those who supply it. Advice is a form of nostalgia, dispensing it is a way of fishing the past from the disposal, wiping it off, painting over the ugly parts and recycling it for more than its worth ~ Baz Luhrmann - Everybody's Free (To Wear Sunscreen)

Offline janet

  • *****
  • 4,812
  • +0/-0
Re: system hangs when CPU is 100% used for couple of hours.
« Reply #16 on: August 30, 2008, 12:06:43 PM »
cactus

Quote
I think hylafax should be removed from this machine first, as that is what is generating the noise and perhaps the problems as well.

I'm glad you agree, as I did suggest that also.

Quote
At a quick look, I see plenty of errors regarding fax system, perhaps you have left behind various templates that should have been removed. I think you should clean up the old fax install.

I thought tropicalview had said that hylafax had already been removed, but obviously a lot of "dregs" were left behind due to a less than thorough removal.
Please search before asking, an answer may already exist.
The Search & other links to useful information are at top of Forum.

Offline zatnikatel

  • *****
  • 190
  • +0/-0
Re: system hangs when CPU is 100% used for couple of hours.
« Reply #17 on: August 30, 2008, 06:07:28 PM »
one thing on has talked about the networking maybe the is a problem with the switch seeing that 3 sme server are doing it or a bad cable some were that is causing it some times it can be a simple thing that causes it do all the sme server connect to the same switch try a copy a large 4 gb file over the network and see if it hangs the network share the dar files go to maybe faulty bad network card etc can not handle the high network band width and hands try an etc haard disk if you have one to do the dar back up on and see if you still have the same problem not every thing has to be super complex to cause a problem as an old boss of mine use to say look for the simple things first

Offline cactus

Re: system hangs when CPU is 100% used for couple of hours.
« Reply #18 on: August 30, 2008, 06:34:13 PM »
Next time try to make use of punctuation as this should not all have been one sentence. Makes reading and understanding your reply a lot easier.

Offline zatnikatel

Re: system hangs when CPU is 100% used for couple of hours.
« Reply #19 on: August 30, 2008, 09:09:05 PM »
Sorry, I will next time. I was tired when I was writing it.

Offline tropicalview

  • *****
  • 196
  • +0/-0
    • http://www.tropicalview.net
Re: system hangs when CPU is 100% used for couple of hours.
« Reply #20 on: August 31, 2008, 05:21:38 AM »
Hi All,

Thanks for the many replies.

And mary, sorry if I did not understand all of your instructions and did not execute them all.
I still do not have that much experience with Linux.

I checked out the cron folders but I'm afraid to edit something there.
And I don't think it's a good idea to copy the text of all the files and place it on the forum.

I just made a compressed file of the whole /etc folder and of the /var/log folder. I hope you will be able to take the time to look into those files:

http://www.tropicalview.net/smeserver/etc.tgz
http://www.tropicalview.net/smeserver/log.tgz

About the hylafax issue: I installed hylafax on the test server, and with a dar backup / restore to the production server (after a crash) it was copied along with everything else.

I installed it on the test server to try it out and to consider buying a modem for it,
but that didn't work out so well, and I uninstalled it following the instructions on the contrib page.

Somewhere some leftovers were still on the system, but as I already mentioned, I'm very careful in what I do with the command line on Linux (for sure after the last crash, which was my error :?).

After all, I do not think the hylafax leftovers are causing this problem, because my second server has the same problem and there I do not have hylafax at all.

As far as I can pinpoint it, it could be more related to vmware.
When vmware is active, the risk of hanging is bigger than when I shut down the vmware machines.

I hope the logs and the configs I just sent will shed some more light on this error. If not, please let me know what I need to do to further pinpoint the problem.

The sky is not the limit, But when I reach the sky, for sure I will not try to go to the limit.... (donated $25,- upto now)

Offline CharlieBrady

  • *
  • 6,918
  • +3/-0
Re: system hangs when CPU is 100% used for couple of hours.
« Reply #21 on: August 31, 2008, 05:40:38 AM »
Quote
As far as I can pinpoint it, it could be more related to vmware.

Do not install or use vmware then. vmware is not part of SME server.

Offline CharlieBrady

Re: system hangs when CPU is 100% used for couple of hours.
« Reply #22 on: August 31, 2008, 05:44:26 AM »
Quote
I just made a compressed file of the whole /etc folder and of the /var/log folder. I hope you will be able to take the time to look into those files:

http://www.tropicalview.net/smeserver/etc.tgz
http://www.tropicalview.net/smeserver/log.tgz

Those files are corrupted. Perhaps you uploaded them via ftp in text mode. Use binary mode, or use sftp or WinSCP.
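
A quick way to check an archive before posting the link is to see whether it still starts with the gzip magic bytes and whether every member can be read back. The sketch below is only an illustration using Python's standard tarfile module; the function names are made up for this example.

```python
import tarfile
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(path):
    """Return True if the file starts with the two gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def archive_is_readable(path):
    """Try to read every member of a tar archive, whatever its extension."""
    try:
        # "r:*" lets tarfile detect the compression itself, so a .tgz that
        # ends up with a .tar name is still opened correctly.
        with tarfile.open(path, "r:*") as archive:
            for member in archive:
                if member.isfile():
                    data = archive.extractfile(member)
                    while data.read(65536):
                        pass  # read to the end so damage is noticed
        return True
    except (tarfile.TarError, OSError, EOFError, zlib.error):
        return False
```

Calling archive_is_readable("etc.tgz") on the local copy before and after the upload should give True both times. A file damaged by a text-mode transfer will normally fail the second check, and often the first one as well.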

Offline tropicalview

Re: system hangs when CPU is 100% used for couple of hours.
« Reply #23 on: August 31, 2008, 06:09:08 AM »
Hi CharlieBrady,

I re-uploaded the files in binary mode.
Now they work.
But I noticed that when I download them, for some reason I get .tar files instead of .tgz,
and they do not work like that, so I have to rename them.

Perhaps that will also happen at your end (but the files were certainly corrupted before).

Offline janet

Re: system hangs when CPU is 100% used for couple of hours.
« Reply #24 on: August 31, 2008, 06:27:50 AM »
tropicalview

Quote
I checked out the cron folders but I'm afraid to edit something there.

You were not asked to edit them, just to check (ie read) them, to see what cron jobs ran at what time, to see if they are causing the problem. If they seemed to be a problem, then those cron jobs could be disabled temporarily (or permanently).

You can spend the time going through them !
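
As a rough aid for that reading, the sketch below lists the entries of a system crontab and picks out those that may run in a given hour, so they can be compared with the time of a hang. It assumes the /etc/crontab layout (five time fields, a user, then the command). The sample text is only an illustrative default-style crontab, not the poster's actual file, and the function names are made up for this example.

```python
# Illustrative sample only; on a real server read /etc/crontab instead.
SAMPLE_CRONTAB = """\
SHELL=/bin/bash
MAILTO=root
# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
"""


def list_cron_jobs(text):
    """Return one dict per job line, skipping comments and VAR=value lines."""
    jobs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 6)
        if len(fields) < 7:
            continue  # environment settings or malformed lines
        minute, hour, dom, month, dow, user, command = fields
        jobs.append({"minute": minute, "hour": hour, "dom": dom,
                     "month": month, "dow": dow, "user": user,
                     "command": command})
    return jobs


def may_run_in_hour(job, hour):
    """Only '*' and plain numbers are evaluated; anything else is kept."""
    field = job["hour"]
    if field == "*":
        return True
    if field.isdigit():
        return int(field) == hour
    return True  # lists, ranges and steps are not evaluated in this sketch


jobs = list_cron_jobs(SAMPLE_CRONTAB)
for job in jobs:
    if may_run_in_hour(job, 4):
        print(job["minute"], job["hour"], job["command"])
```

The same functions can be pointed at the files under /etc/cron.d, which use the same layout. The scripts inside /etc/cron.daily and similar folders are started by the run-parts lines, so those folders still need to be read by hand.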

Quote
About the hylafax issue: I installed hylafax on the test server, and with a dar backup / restore to the production server (after a crash) it was copied along with everything else.
Somewhere some leftovers were still on the system, but as I already mentioned, I'm very careful in what I do with the command line on Linux (for sure after the last crash, which was my error :?).

You need to learn what you are doing then. Those errors could well be affecting things, at least on the server(s) you are seeing these errors on, i.e. adding to the server's processor and memory load.


Quote
If not, please let me know what I need to do to further pinpoint the problem.

Did you do all the steps I suggested earlier? I still see the antivirus scan running as a cron job, so you have not disabled that. What about email virus scanning and spam filtering, did you disable those?
Did you enable RBL? Have you removed sme7admin?

How many times do you have to be asked/told, and you give no feedback, as well as expecting others to trawl through Mbytes of log files !!! Stop being lazy and learn to do your own troubleshooting hack work.

Offline CharlieBrady

Re: system hangs when CPU is 100% used for couple of hours.
« Reply #25 on: August 31, 2008, 05:15:51 PM »
Quote
Stop being lazy and learn to do your own troubleshooting hack work.

Or pay a competent consultant to help you to keep your systems running smoothly.

Offline tropicalview

Re: system hangs when CPU is 100% used for couple of hours.
« Reply #26 on: September 04, 2008, 03:30:41 PM »
Hi All,

Sorry it took some time to get back again.
Due to serious weather conditions I wasn't able to connect to the server from my location.

Yesterday evening I did a successful backup with the antivirus, antispam and vmware services disabled and sme7admin uninstalled.

I hope this will stay the case, because before I was also sometimes able to get through the process.
Tonight I plan to do a backup again with only the vmware service enabled and the Windows machine running in it.

Thanks for all the help, and if I get to know where the problem is I will report back.

Kind regards,

Offline tropicalview

Re: system hangs when CPU is 100% used for couple of hours.
« Reply #27 on: September 18, 2008, 10:02:54 PM »
Dear all,

I did a lot of troubleshooting on this issue, mostly by trial and error.

Finally I stripped off everything that I could, and the backup ran fine only if I completely stopped the vmware service!!

Just powering down the Windows server did not help, but ending the service with /etc/init.d/vmwared stop worked.

After that I put another 2 gigabytes of memory in the machine. Now everything works great, even with the Windows machine running!!!!
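
Since extra memory made the difference, it may help others with the same symptoms to look at memory and swap use while the backup and vmware run together. The sketch below is only an illustration: it parses text in the format of /proc/meminfo on Linux, the sample numbers are invented, and the function names are made up for this example.

```python
# Invented sample values in the /proc/meminfo format (sizes in kB).
SAMPLE_MEMINFO = """\
MemTotal:      2075464 kB
MemFree:         31240 kB
SwapTotal:     4192956 kB
SwapFree:      1048576 kB
"""


def parse_meminfo(text):
    """Return a dict of field name to value in kB."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swap_used_kb(values):
    """Swap in use, computed from the SwapTotal and SwapFree fields."""
    return values["SwapTotal"] - values["SwapFree"]


info = parse_meminfo(SAMPLE_MEMINFO)
print("swap in use (kB):", swap_used_kb(info))
```

On a real server the text would come from open("/proc/meminfo").read(). A swap figure that keeps climbing during the backup would point towards memory pressure, although this check alone does not prove that it causes the hang.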


So I would like to thank you all for the support and, if allowed, I would like to ask one little question.

As far as I can see, DAR should send e-mails out to the admin,
but somehow I do not get any mail. Is there any setting needed for that?

Kind regards,


PS. Please give me an indication of what a good donation amount for this issue would be, so I can address that in our administration.....

