tropicalview
> Does anybody have an idea what is happening on my machine?
I gave you some suggestions but you totally overlooked my post.
When I drew it to your attention again, you appear to have answered only one of the points I raised.
If you don't follow up on and clearly answer all the suggestions, there is not much point in us helping you!
My suggestions are really a means of eliminating possibilities.
If the problem goes away, then look at what you removed as being the likely source of the problem. If the problem remains, then the suggestions I made are not the cause, and you would need to get more aggressive in your troubleshooting.
I suggested:
> Take a look to see if any other cron jobs run during the period dar is running,
> and see if they relate to the lockup time.
Look in:
/etc/cron.d
/etc/cron.daily
/etc/cron.hourly
/etc/cron.weekly
/etc/cron.monthly
Also look in
/var/log/cron
to see what is actually happening
There could also be crontab entries that don't appear in the locations mentioned above. Read up & search on crontab.
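If it helps, here is a rough sketch of a shell function that lists every file in the cron directories mentioned above, so you can compare them against the lockup times. The name list_cron_files and the optional root argument are my own, purely for illustration; the function only prints file names and changes nothing.

```shell
#!/bin/sh
# list_cron_files [ROOT]
# Print every regular file found in the standard cron directories.
# ROOT is a hypothetical convenience argument: leave it empty to inspect
# the live system, or point it at a mounted backup copy.
list_cron_files() {
    root="${1:-}"
    for dir in cron.d cron.daily cron.hourly cron.weekly cron.monthly; do
        path="$root/etc/$dir"
        # Skip directories that do not exist on this machine.
        [ -d "$path" ] || continue
        for f in "$path"/*; do
            [ -f "$f" ] && echo "$f"
        done
    done
    return 0
}
```

Run list_cron_files with no argument to check the running server. Also check /etc/crontab and the output of crontab -l for each user, since those entries do not live in the directories above.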
> Also try (temporarily at least) disabling spam filtering,
You did not answer whether spam filtering is enabled or not
> and also mail virus scanning,
You did not mention if virus scanning on incoming mail is enabled or not
> and virus scanning of your system,
You did say you disabled virus scanning, and I assume you mean only daily/weekly scanning of the system & data files.
> Do you have RBL rejection enabled to reduce spam load on your systems.
You did not answer this. Find out with:
config show qpsmtpd
Email virus scanning & spam filtering can put a big load on your system depending on the type of mail you receive etc.
> Look at the messages log file (and other log files for that matter) and see what was happening prior to the lockup.
You provided a copy of the messages log file but expect us to go through it with a fine-tooth comb. That's your job.
At a quick look, I see plenty of errors regarding the fax system. Perhaps you have left behind various templates that should have been removed. I think you should clean up the old fax install.
I was suggesting you look for the point when your system supposedly hangs. Does that show in the log, i.e. are there no more entries after the "hang" time? Then look prior to that time and see what else has been happening.
Also look in all the other log files around and prior to the time of the "hang"; they are accessible from server manager.
See the excerpts below from the messages log, which certainly show fax activity prior to the hang.
It looks like a "hang" here:
Aug 23 00:09:52 admin-svr init: cannot execute "/usr/sbin/faxgetty"
Aug 23 00:09:52 admin-svr last message repeated 9 times
Aug 23 00:09:52 admin-svr init: Id "fax" respawning too fast: disabled for 5 minutes
Aug 23 00:14:53 admin-svr init: cannot execute "/usr/sbin/faxgetty"
Aug 23 00:14:53 admin-svr last message repeated 9 times
Aug 23 00:14:53 admin-svr init: Id "fax" respawning too fast: disabled for 5 minutes
Aug 23 00:19:54 admin-svr init: cannot execute "/usr/sbin/faxgetty"
Aug 23 00:19:54 admin-svr last message repeated 9 times
Aug 23 00:19:54 admin-svr init: Id "fax" respawning too fast: disabled for 5 minutes
Aug 23 09:09:59 admin-svr syslogd 1.4.1: restart.
Aug 23 09:09:59 admin-svr syslog: syslogd startup succeeded
Aug 23 09:09:59 admin-svr kernel: klogd 1.4.1, log source = /proc/kmsg started.
Aug 23 09:09:59 admin-svr syslog: klogd startup succeeded
and here:
Aug 27 19:03:15 admin-svr init: cannot execute "/usr/sbin/faxgetty"
Aug 27 19:03:15 admin-svr last message repeated 9 times
Aug 27 19:03:15 admin-svr init: Id "fax" respawning too fast: disabled for 5 minutes
Aug 28 07:41:26 admin-svr syslogd 1.4.1: restart.
Aug 28 07:41:26 admin-svr syslog: syslogd startup succeeded
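Rather than reading the whole log by eye, you could let a small script find the places where logging stopped for a long time. The following is only a sketch: the function name find_log_gaps and the 600 second default are my assumptions, and it expects the usual "Mon DD HH:MM:SS" syslog timestamp at the start of each line. Because those timestamps carry no year, it assumes a non-leap year and will not report a gap that spans New Year.

```shell
#!/bin/sh
# find_log_gaps LOGFILE [MAX_GAP_SECONDS]
# Report consecutive log entries that are more than MAX_GAP_SECONDS apart
# (default 600), which is where a hang or power-off would show up.
find_log_gaps() {
    logfile="$1"
    max_gap="${2:-600}"
    awk -v max="$max_gap" '
    BEGIN {
        # Days elapsed before the start of each month (non-leap year).
        split("0 31 59 90 120 151 181 212 243 273 304 334", cum, " ")
    }
    NF < 3 { next }
    {
        # Map the month abbreviation to 1..12; skip lines without one.
        m = (index("JanFebMarAprMayJunJulAugSepOctNovDec", $1) + 2) / 3
        if (m < 1 || m != int(m)) next
        split($3, t, ":")
        now = ((cum[m] + $2) * 24 + t[1]) * 3600 + t[2] * 60 + t[3]
        stamp = $1 " " $2 " " $3
        if (seen && now - prev > max)
            printf "gap of %d s: %s -> %s\n", now - prev, prev_stamp, stamp
        prev = now; prev_stamp = stamp; seen = 1
    }' "$logfile"
}
```

On the first excerpt above, find_log_gaps /var/log/messages would flag the jump from Aug 23 00:19:54 to Aug 23 09:09:59, while the regular 5 minute faxgetty respawn messages stay under the threshold. Whatever was logged just before each reported gap is the place to start looking.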
You should remove sme7admin; there have been many reports of inappropriate settings causing trouble.
If you totally uninstall it, then sme7admin cannot be causing the problem.
According to the wiki, do the following. Use yum remove carefully, though, because it can remove more than you really want; rpm -e packagename is safer until yum remove gets fixed (at least for sme7).
yum remove sysstat
yum remove hddtemp
yum remove rrdtool
To monitor what is going on, use:
top
top -i
htop
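Since nobody is watching top at the moment the machine hangs, it can help to record snapshots to a file and read the last one after the reboot. This is a minimal sketch under my own assumptions: the function name log_snapshot is made up, and it relies on top's batch mode (top -b -n 1) as provided by the procps version of top.

```shell
#!/bin/sh
# log_snapshot FILE [COMMAND...]
# Append a timestamp and the first 20 lines of COMMAND's output to FILE.
# With no COMMAND, take one batch-mode snapshot from top.
log_snapshot() {
    outfile="$1"
    shift
    # Default to a single non-interactive iteration of top.
    [ "$#" -gt 0 ] || set -- top -b -n 1
    {
        echo "=== $(date) ==="
        "$@" | head -n 20
        echo
    } >> "$outfile"
}
```

For example, leaving "while true; do log_snapshot /root/top-history.log; sleep 60; done" running on a spare console (the path is just an example) gives you a record of what was busy in the minute before the hang.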
You also do not say what version of SME you are running, whether your servers are fully up to date, what other contribs are installed, or what other modifications you have made to the servers.
If you stop dar from running, does the system still hang?