Koozali.org: home of the SME Server

Legacy Forums => Experienced User Forum => Topic started by: Dan Brown on September 02, 2002, 08:04:49 PM

Title: Kernel panic
Post by: Dan Brown on September 02, 2002, 08:04:49 PM
This morning, after 43 days of uptime, I found my SME 5.1.2 server unresponsive to any services except ping and packet filtering--no SSH, POP, IMAP, www, even the console was not working.  Displayed on the screen was the message "Kernel panic: Free list empty".

I rebooted the server and it seems to be functioning properly.  The only relevant entry in /var/log/messages is this line:

Sep  2 04:47:41 e-smith kernel: Kernel panic: Free list empty

The immediately preceding entry is two hours earlier, and the following one is almost six hours later (when I restarted the machine).  I don't see any other log files which look like they'd be relevant.  Any ideas what could have caused this?  I didn't find anything in a forum search.
Title: Re: Kernel panic
Post by: Greg Zartman on September 02, 2002, 08:52:28 PM
Dan,

I've read that the log message you've listed COULD be an indication of a DoS attack.

Read the following (quick Google search):

http://msgs.securepoint.com/cgi-bin/get/bugtraq0001/203.html
http://www.securiteam.com/unixfocus/5YP0I000DG.html
http://packetstormsecurity.nl/0001-exploits/krnl110.htm


Greg
Title: Re: Kernel panic
Post by: Dan Brown on September 02, 2002, 09:15:27 PM
Interesting...  Any way to confirm (or deny) whether this is what happened?  My system monitor doesn't show any indication of unusual bandwidth usage before the outage.  It does, however, seem to have happened at about the time of day that the normal daily functions run.  Thanks for the Google pointer; I should have tried that myself.

Sounds like I should also drop a line to security@e-smith.com on this one.
Title: Re: Kernel panic
Post by: Greg Zartman on September 02, 2002, 09:55:23 PM
> Interesting...  Any way to confirm (or deny) whether this is
> what happened?  My system monitor doesn't show any indication

With SME only, I think it would be difficult to back track something like this.  

You'd think people could find a better use of their time and talents than something like this.

Good luck.

Greg
Title: Re: Kernel panic
Post by: GW on September 02, 2002, 10:17:21 PM
Do we all have to be worried about this??

Rgrds


GW
Title: Re: Kernel panic
Post by: Greg Zartman on September 02, 2002, 11:24:23 PM
> Do we all  have to be worried about this?

In my original post, I stated that the error in question "COULD" be the result of a DoS attack.  The references that I came across indicated that this was an issue with FreeBSD.   The FreeBSD and Linux kernels are cousins, so to speak, so it isn't out of the question that this could now be an issue with Linux.

It could also be something else completely unrelated to a DoS attack, such as a hardware issue.

Dan indicated that he is going to send a message to Mitel's security address with his issue (which is what Mitel has asked us to do).  If Mitel finds something, I'm certain they'll let us know of any vulnerabilities as soon as they can.

I wouldn't lose any sleep over it.  Most of us who use SME are pretty low profile when compared to many of the other companies with a net presence.

Greg
Title: Re: Kernel panic
Post by: Scott Smith on September 03, 2002, 12:52:27 AM
To ease minds (or cloud the issue) I saw this a couple of times a while back with two 4.x servers, both connected to a test network with only one PC attached. We were doing some mail development, with one server setup to serve the PC and the other setup as the recipient system. In both instances, we had inadvertently left the systems in a state where they were generating increasingly large messages (multi-MB) to each other, and then let them sit over a weekend. When I came in after the weekend, one e-smith system was crashed just as Dan describes.

Whether this was mail related or not was never confirmed. However, there was no external access to the network. Just two e-smith servers and one PC. The system that did not crash had thousands of files in the qmail queue, though, and was getting very sluggish when I shut it down. Unfortunately, at the time it was not a critical issue and received no further investigation.
Title: Re: Kernel panic
Post by: Charlie Brady on September 03, 2002, 07:09:19 AM
Dan Brown wrote:

> Displayed on the screen was the message "Kernel
> panic: Free list empty".

You've probably encountered a kernel bug, and as Scott suggests, possibly triggered because of a resource starvation issue(*). The kernel has tried to remove a disk block from the free list, but the free list is empty.

It could be a race condition between different kernel threads - which is much more likely if your system is multi-processor. Is it possible that (one of) your disk(s) became full?

Other possible causes are hardware problems in RAM or disk, but I think those less likely.

Charlie

(*) And a DoS attack is one possible cause. I don't believe we yet have any strong evidence of one.
Title: Re: Kernel panic
Post by: Dan Brown on September 03, 2002, 08:09:25 AM
The machine is a single processor Celeron 300A.  It is overclocked to 450 MHz, but it's been running e-smith/SME in that configuration 24/7 for almost the last two years, with the only downtime prior to today being for power outages, hardware changes, or system upgrades.

I suppose it's possible that the hard drive (single "11 GB" (manufacturer's GB) drive) became full, but I don't see any indication of that.  The system monitor indicates a small bump in usage (about 100 MB) shortly before the crash, but then that freed up.  Might some process have gone wild and filled up the drive in a short enough timeframe that it wouldn't show on the chart?  I guess it's possible, but it doesn't sound too probable to me (but what do I know?).  df -h reports:

[dan@e-smith dan]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda6              11G  7.7G  2.4G  77% /
/dev/hda1              15M  3.6M   10M  25% /boot

I'm certainly not married to the idea of its being a DoS attack.  I also don't think it'd be possible to confirm that if it were the case, as there don't appear to be any logs which would show it.  Is there anything else we can do at this point to troubleshoot further?
Title: Re: Kernel panic
Post by: Scott Smith on September 03, 2002, 06:17:08 PM
Available disk free space isn't necessarily an indicator. To put it in DOS/Windows OS terms, *nix systems are subject to the same "too many open files" issues -- that is, you can run out of file handles long before you run out of disk space. Or, to relate it to the older Windows versions, you can use up the resource block of memory and get an "out of memory" error, even though you might have several MB of unused RAM.

If you've had some situation that results in a large number of files being opened -- runaway spoolers are noted for this -- then you can run out of (again, in DOS/Windows terms) "file handles". This, I believe, is what the "free list empty" error is reporting -- the fact that the *nix kernel can have only a finite number of files open, and for whatever reason that limit has been reached.
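
Whether or not that is what the panic message actually refers to, system-wide file handle usage is easy to check on a running Linux box. A minimal sketch in Python, assuming the kernel exposes /proc/sys/fs/file-nr as three numbers (handles allocated, allocated-but-unused handles, system-wide maximum); the function names are made up for illustration:

```python
def parse_file_nr(text):
    """Parse the contents of /proc/sys/fs/file-nr: 'allocated unused maximum'."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return allocated, unused, maximum


def handles_in_use(text):
    """Return (in_use, maximum, percent_used) for the system-wide file table."""
    allocated, unused, maximum = parse_file_nr(text)
    in_use = allocated - unused
    return in_use, maximum, 100.0 * in_use / maximum


def read_file_nr(path="/proc/sys/fs/file-nr"):
    """Read the live counters (assumes a Linux system with /proc mounted)."""
    with open(path) as f:
        return handles_in_use(f.read())
```

Calling read_file_nr() periodically (from cron, say) and logging the percentage would show whether usage was creeping toward the limit before a crash.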

Note that it is also possible to run out of inodes, which are roughly (not exactly) equivalent to directory entries. Think of them as directory entries, except without the name itself, which resides elsewhere. There are a finite number of inodes available on a file system. If you install a gazillion small files, or have an application that goes wild creating tmp files, then you can easily run out of inodes before you run out of free disk space.
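
Inode usage can be checked the same way as disk space. From the shell, df -i reports it per filesystem; the sketch below does the equivalent in Python using os.statvfs, which is a standard library call. The function names are hypothetical, and some filesystems report an inode total of zero, which is handled as "unknown":

```python
import os


def inode_percent_used(total, free):
    """Percentage of inodes in use; None if the filesystem reports no inode count."""
    if total == 0:
        return None
    return 100.0 * (total - free) / total


def inode_usage(path="/"):
    """Return (total_inodes, free_inodes, percent_used) for the filesystem holding path."""
    st = os.statvfs(path)
    return st.f_files, st.f_ffree, inode_percent_used(st.f_files, st.f_ffree)
```

A filesystem showing plenty of free space but an inode percentage near 100 is the situation described above.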

Both the free list and inode table are static, meaning the kernel and filesystem have the size established ahead of time and do not dynamically grow them to fit. Tunable parameters, as it were. Usually, changing the free list table requires recompiling the kernel, and setting the available inodes can only be done when the filesystem is created.

In your case, I'd be looking for something that is opening and holding open lots of files. Maybe something breaking a large file into smaller chunks, or a daemon that is spawning and hanging prolifically, etc.
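
One way to look for such a process, sketched in Python under the assumption that /proc/<pid>/fd is available (it is on Linux) and that the script runs as root so it can see every process's descriptors. The function names are invented for this example:

```python
import os


def open_fd_counts(proc_root="/proc"):
    """Map PID -> number of open file descriptors, for processes we may inspect."""
    counts = {}
    for name in os.listdir(proc_root):
        if not name.isdigit():  # skip non-process entries such as 'self'
            continue
        try:
            counts[int(name)] = len(os.listdir(os.path.join(proc_root, name, "fd")))
        except OSError:  # process exited meanwhile, or permission denied
            continue
    return counts


def top_fd_holders(counts, n=5):
    """Return the n (pid, fd_count) pairs with the most open descriptors."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:n]
```

Running top_fd_holders(open_fd_counts()) now and again would show whether one daemon's descriptor count keeps climbing.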

Scott