Koozali.org: home of the SME Server
Legacy Forums => Experienced User Forum => Topic started by: penguinzrool on January 30, 2005, 02:53:08 PM
-
I've been running SME 6.0.1 happily for about a month now; the last uptime was about 2 weeks. I came downstairs this morning, though, and the hard drive light was pretty much solidly on, with the machine making a racket and not responding to anything.
I can ping it, but can't connect with Putty or browse any of the shares. DNS services aren't working and I even connected a keyboard and monitor to it, tried to log in and it just sat there without even giving me the 'password' prompt.
In the end I had to reset it, but this is the second time this has happened, the first was on a new install..
Setup is as follows:
VIA Mini-ITX mainboard with 500 MHz C3 processor
256MB RAM
2 x Seagate Barracuda 7200.7 ATA 120GB HDDs set up in software RAID 1
Jesper's Spamfilter and Clam contribs
Fetchmail contrib
Advanced workgroup contrib
The system is running as an email server (SMTP) and domain controller and fileserver to our 3 Windows boxes (NT4, 2k, XP).
Any ideas?
Thanks,
Chris.
-
Just had a look at the logs and saw this:
Jan 30 09:54:09 server kernel: Out of Memory: Killed process 32622 (httpd).
Jan 30 10:04:32 server kernel: Out of Memory: Killed process 28252 (httpd).
Jan 30 10:54:47 server kernel: Out of Memory: Killed process 32639 (httpd).
Jan 30 10:54:52 server kernel: Out of Memory: Killed process 28879 (httpd).
Jan 30 11:11:29 server kernel: Out of Memory: Killed process 32634 (httpd).
Jan 30 11:11:35 server kernel: Out of Memory: Killed process 28253 (httpd).
which has been repeating for about 3 hours before I had to do a hard reset... :-(
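To see how many processes the kernel killed, and which ones, the log can be summarised rather than read line by line. This is only a rough sketch: count_oom_kills is a made-up helper name, the pattern matches the exact wording of the log lines above (other kernels word the message differently), and /var/log/messages is assumed to be where the kernel messages end up.

```shell
# Hypothetical helper: reads syslog lines on stdin and prints
# "<count> <process name>" for every process name the OOM killer hit.
# The pattern matches lines like:
#   ... kernel: Out of Memory: Killed process 32622 (httpd).
count_oom_kills() {
    sed -n 's/.*Out of Memory: Killed process [0-9][0-9]* (\([^)]*\)).*/\1/p' |
        sort | uniq -c | awk '{print $1, $2}' | sort -rn
}
```

Typical use would be: count_oom_kills < /var/log/messages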
-
.. well that error is pretty definitive!
Your problem would seem to be a virus or spam flood that is using up all your system's memory (real and swap).
a) add more memory (always useful)
b) add Gordon Rowell's mailfront contrib to kill viruses BEFORE they need to be scanned (at the SMTP level)
c) reduce the number of threads the server will use to process mail, which reduces the peak load but not the total load on the server:
/sbin/e-smith/db configuration setprop qmail ConcurrencyRemote 10
/sbin/e-smith/db configuration setprop qmail ConcurrencyLocal 10
/sbin/e-smith/signal-event email-update
/etc/init.d/qmail restart
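To confirm the new values actually took, they can be read back from the configuration database. A minimal sketch, assuming the same db tool also supports getprop alongside setprop; show_qmail_concurrency and the DB variable are names invented for this example.

```shell
# DB defaults to the e-smith db tool used in the commands above.
DB="${DB:-/sbin/e-smith/db}"

# Hypothetical helper: print both qmail concurrency properties
# as "Name=value", one per line.
show_qmail_concurrency() {
    for prop in ConcurrencyRemote ConcurrencyLocal; do
        printf '%s=%s\n' "$prop" "$("$DB" configuration getprop qmail "$prop")"
    done
}
```

Running show_qmail_concurrency after the setprop commands should list the values that were just set.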
I had a client a coupla months ago who was getting 4000 viruses an hour & it crippled their server; did the above and now it just has the occasional slowdown for a coupla minutes here and there, sweet!
-
Thanks a lot for the reply!
I've run those commands and we'll see how it goes. It seems odd that email could be the problem, though, as this system is only hosting a family website and a very small company, so it gets about 10 emails per day at most!
Also, the reference to lots of httpds is a bit worrying. Perhaps the website has been under attack, but that seems a bit odd...
-
hmm, skim reading and I missed the httpd - damn senility catching up with me.
The httpd tasks may be some sort of DoS attack or even someone trying to brute force their way into your system. I'd dig through all the logs and check carefully for any other clues.
-
The httpd tasks being killed doesn't mean that httpd is under attack. The system is out of memory, and the kernel starts killing processes of its choice. The httpd processes are large and inactive, so they get bumped on the head.
Poster still needs to diagnose the cause of the memory exhaustion. I agree with you that excessive email scanning could be the root cause. What do the logs say? If the system gets busy again, disconnect it from the net and wait for it to slow down before poking around for any relevant logs. You'll lose those logs if you reset it.
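Once the box is responsive again, one rough way to see what is holding the memory is to total resident memory per command name. This is a sketch only: rss_by_command is a made-up helper, it assumes a procps-style ps that accepts -eo rss=,comm=, it only uses the first word of a command name, and summing RSS overcounts pages shared between processes (such as the httpd children), so treat the numbers as a ranking rather than exact usage.

```shell
# Hypothetical helper: sum resident memory (RSS, in KiB) per command name.
# Input: lines of "<rss_kib> <command>", as printed by `ps -eo rss=,comm=`.
# Output: "<total_kib> <command>", largest total first.
rss_by_command() {
    awk '{ total[$2] += $1 } END { for (c in total) print total[c], c }' |
        sort -rn
}
```

Typical use would be: ps -eo rss=,comm= | rss_by_command | head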
-
I tried disconnecting it from the network while it was doing this, and it made no difference. I would have looked at the logs before resetting it but I couldn't even log into the system with a keyboard plugged in - after typing 'root' and pressing enter I didn't even get a password: prompt :cry:
Thanks for the responses guys!
-
I tried disconnecting it from the network while it was doing this, and it made no difference.
Maybe you didn't wait long enough. Linux is amazingly resilient. You do sometimes have to be very patient for it to get done with its overload.
-
maybe, I suppose I only gave it 5 mins or so after disconnecting the network...
surely, though, if it was just killing the httpds to save memory there wouldn't be that many - the logfile I posted was just an excerpt, there were at least 50-odd httpds being killed, which can't be normal!
and the first time this happened, before I reinstalled the system, was after having the server up and running for only about 3 hours, and at that point it wasn't working as a webserver and email was being collected via POP3...
strange :-?
-
surely, though, if it was just killing the httpds to save memory there wouldn't be that many - the logfile I posted was just an excerpt, there were at least 50-odd httpds being killed, which can't be normal!
You saw a power struggle happening. Apache tries to keep at least MinSpareServers idle servers available, so it starts new ones whenever any die, and the kernel then kills those too.
-
The reason for the same behaviour on one of my servers was a whole bunch of incorrectly addressed spam to non-existent users.
ClamAV and SpamAssassin try to scan all these messages and use up all the memory.
My suggestion would be one of two solutions:
1) Change your ConcurrencyRemote value (search for file) to 10 and monitor.
2) Disable doublebounce messages - there are existing posts with howtos in this forum as well
Keep us posted