Koozali.org: home of the SME Server

SME Server falling over

Antarius

SME Server falling over
« on: April 30, 2004, 03:46:03 PM »
G'day all,

I started using E-Smith around version 3 and have kept up-to-date over the years. Well, except for one box (which was a 5.5 box and some lovely person got root. That'll teach me to be slack!)

I reinstalled the compromised box (with a new, clean hard drive - I wanted to do some forensics and catch the little star bard that took me down in the first place) with SME 6.0.

I then went through the long, arduous process of adding all of the usual tools, antivirus, dialin, hylafax, etc.

It wasn't long before I noticed that the server was unusable in its new form - and that it was swapping to buggery. RAM upgrades solved that, of course.

But a new problem reared its ugly head - and it has been getting worse. Periodically the internet access falls over and won't respond to even the most violent of kicks.

When this happens, Web access is impossible & DNS lookups fail, meanwhile internal mail works perfectly - and external mail is received okay.
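One way to narrow down symptoms like these is to probe name resolution and raw IP connectivity separately, since "DNS lookups fail" and "nothing routes out" call for different fixes. The following is only a rough sketch: the function names are made up for illustration, and the hostname and IP address you feed it are your own choice (a site you know is up, and its address noted beforehand).

```python
import socket


def can_resolve(hostname):
    """Return True if the local resolver can look up hostname."""
    try:
        socket.getaddrinfo(hostname, 80)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def can_connect(ip, port, timeout=5.0):
    """Return True if a TCP connection to ip:port succeeds (no DNS involved)."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:  # covers refusals, unreachable networks and timeouts
        return False


def classify(dns_ok, ip_ok):
    """Turn the two probe results into a rough hint about where to look."""
    if dns_ok and ip_ok:
        return "ok"
    if ip_ok:
        return "dns-only failure"
    if dns_ok:
        return "no route out (DNS answer may be cached)"
    return "dns and routing both failing"
```

Running `classify(can_resolve(host), can_connect(ip, 80))` from cron and logging the result with a timestamp would at least show which half breaks first.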

The queer thing is that getting it "happy" again is not consistent. Sometimes restarting tinydns will get it back in action. Sometimes restarting dnscache is the key. Sometimes it's ipmasq, sometimes it's squid, sometimes it requires a reboot! (When rebooting, I always reset the ADSL modem to get a new IP address - dynamic IP).

To make matters worse, when these problems occur, SSH slows down to a crawl (takes 30-90 seconds for a "password" prompt to appear after giving the username of root.)  Sometimes SSH connections are even refused! After the last little darling, I'm extremely paranoid that there might be an intruder.

(Unfortunately, after the SME6.0 acted up, I wanted to have a "bare minimum install," so didn't re-install Snort and Acid. As such, I've only got a string of rather large Deny logs to wade through!)

If one of these solutions was consistent, at least I'd have something to start looking at.

Thinking that something went screwy with the add-ons, I backed up my data, re-installed (from scratch) with a vanilla install. Seemed okay for a while, then the same problems appeared.

Getting really irked by this time, I then used the 6.01-custom ISO (which is great!) and replaced the entire box. (All new hardware). Same problem - even with all updates installed via YUM later.


I've searched the forums with no joy; I've googled with no joy. (Assuming my choice of keywords is okay - hard to know what to search for with such stupid symptoms.)

Has anyone had similar problems? Any suggestions for places to start looking?

I'm getting utterly frustrated to the point that I'm considering migrating my gateway to Debian or OpenBSD...  I don't want to stop using SME - I've been an advocate for too long!

Thanks in advance for any advice.

Cheers,

Ant


Current hardware:
AMD Athlon 2000+ XP
RAM: 512Mb
Ethernet 1: Realtek 8139
Ethernet 2: Realtek 8139
Modem: Alcatel SpeedTouch Pro (not USB! - uses PPPoA, not requiring any special configs on SME)

Anonymous

Re: SME Server falling over
« Reply #1 on: May 10, 2004, 10:24:15 PM »
Hi,

Did you find the problem with this?

I read a post where an individual was having a
similar problem. He had webmail active and
about every two weeks the system ran out
of memory and he had to do a reboot.

Do you have webmail activated?

So if you have not found the problem yet, you
might want to try disabling webmail and see
how that affects the system.

Scott

RayG

SME Server falling over
« Reply #2 on: May 12, 2004, 07:43:36 PM »
I've had similar problems in the past and still do but far less frequently with SME 6. The reduction in frequency may be a coincidence rather than version related.

I tracked the issue to a problem where SME requests a lease renewal and the ISP denies the request. I haven't bothered to track it further. In my case, doing an ifdown/ifup sequence on the external interface got me back online every time.
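For anyone who wants to script that ifdown/ifup sequence, here is a minimal sketch. It assumes the external interface is called eth1 (check yours, it may differ) and that it is run as root; the function name and the `run` parameter are just illustration, the latter so the logic can be tried without touching a real interface.

```python
import subprocess


def bounce(iface, run=subprocess.run):
    """Take iface down and bring it back up; return True if ifup succeeded.

    A failing ifdown is ignored on purpose: the interface may already be
    down, and we still want to attempt ifup.
    """
    run(["ifdown", iface])
    result = run(["ifup", iface])
    return result.returncode == 0
```

Calling `bounce("eth1")` from a watchdog script is one option, though it only papers over the problem rather than explaining why the renewal is refused.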

Antarius

Re: SME Server falling over
« Reply #3 on: May 28, 2004, 07:08:20 AM »
Quote from: "Scott"
Did you find the problem with this?


Not at this stage; I was actually getting to the point of debating a separate firewall/gateway and web/mail server. (A colleague even suggested Exchange - but I doubt that they'll find his body any time soon...)  :evil:


Quote
Do you have webmail activated?


I did have it active, as I periodically found it useful, however I have just blown it away and I will see if the problem disappears.

(If it does, I'll just look at an alternative webmail platform. I've never been a huge fan of horde anyway...)

Thanks for the suggestion.


Quote from: "RayG"
I've had similar problems in the past and still do but far less frequently with SME 6. The reduction in frequency may be a coincidence rather than version related.


Strange; I had begun to think it was version related, as I never had the problem with any of the 3, 4 or 5 series releases (and the same hardware was used on each of those 3 series - with each release too! Worked like a dream.)


Quote from: "RayG"
I tracked the issue to a problem where SME requests a lease renewal and the ISP denies the request. I haven't bothered to track it further. In my case, doing an ifdown/ifup sequence on the external interface got me back online every time.


The first thing I tried, of course, was to simply restart the interface. Unlike you, though, this brought me no joy.

As stated above, I've disabled Webmail now, so we'll see if the problem goes away. I'm not expecting that it will; I've thrown in several hardware upgrades and the machine is only using a fraction of its resources.

If the problem persists, I'll take your lead of the lease renewals and dig a little further - either way, I'll track what happens over the next month or two and report back to this thread. :-D

PhilV

SME Server falling over
« Reply #4 on: May 28, 2004, 11:18:22 AM »
I too am having this problem. I am running version 6.0 on a very low end machine, but it is doing very little other than acting as a gateway, with the occasional use as a server. I will be interested to see what you find. I find that sometimes doing the down/up on the external interface works... sometimes... other times a whole reboot is needed. I AM running webmail, how do I stop this? (I still want to be able to use mail tho, tho not thru a web-interface).

Thanks and let us know how you get on Antarius!

Phil

NickR

SME Server falling over
« Reply #5 on: May 28, 2004, 10:33:52 PM »
Quote from: "PhilV"
I AM running webmail, how do I stop this? (I still want to be able to use mail tho, tho not thru a web-interface).


Configuration / E-mail - then set enable/disable webmail in the panel.
--
Nick......

NickR

Re: SME Server falling over
« Reply #6 on: May 28, 2004, 10:43:25 PM »
Quote from: "Antarius"

To make matters worse, when these problems occur, SSH slows down to a crawl (takes 30-90 seconds for a "password" prompt to appear after giving the username of root.)  Sometimes SSH connections are even refused! After the last little darling, I'm extremely paranoid that there might be an intruder.

Has anyone had similar problems? Any suggestions for places to start looking?



Funnily enough, yes! It was your SSH problems I was seeing (within minutes of the box being up). Restarting sshd would cure it for a few more minutes but that was all. In my case, it turned out to be the NIC sharing an interrupt with a SCSI controller. I gave the NIC an interrupt of its own & it has been happy ever since.

This may not be your problem, but it's worth checking what /proc/pci is telling you.
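As a complement to eyeballing /proc/pci, a rough sketch like the one below can pick out shared interrupts from /proc/interrupts, where devices sharing an IRQ are listed comma-separated on one line. The sample text and the device names in it are made up for illustration, and the function name is hypothetical. Note that only devices with a loaded driver show up there, so /proc/pci is still worth a look.

```python
# Illustrative sample only; real output differs from machine to machine.
SAMPLE = """\
           CPU0
  0:    1234567          XT-PIC  timer
 10:      54321          XT-PIC  eth0, usb-uhci
 11:       9876          XT-PIC  eth1
NMI:          0
"""


def shared_irqs(text):
    """Return {irq_number: [device, ...]} for IRQs used by 2+ devices.

    Limitation: a first device name containing spaces is cut down to
    its last word, which is enough to flag the sharing.
    """
    shared = {}
    for line in text.splitlines():
        irq, sep, rest = line.partition(":")
        if not sep or not irq.strip().isdigit():
            continue  # header line, or named rows such as NMI/ERR
        parts = [p.strip() for p in rest.split(",")]
        if len(parts) < 2:
            continue  # only one device on this IRQ
        head = parts[0].split()
        if not head:
            continue
        # The last word before the first comma is the first device name;
        # the words before it are the counters and the controller type.
        shared[int(irq)] = [head[-1]] + parts[1:]
    return shared
```

On a live box, `shared_irqs(open("/proc/interrupts").read())` would be the call; an empty result means nothing with a loaded driver is sharing.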
--
Nick......

Antarius

Re: SME Server falling over
« Reply #7 on: June 23, 2004, 03:19:54 PM »
Quote from: "NickR"
In my case, it turned out to be the NIC sharing an interrupt with a SCSI controller. I gave the NIC an interrupt of its own & it has been happy ever since.

This may not be your problem, but it's worth checking what /proc/pci is telling you.


Ye Ghods, why didn't I think of that? I'm telling you that some days Plug'n'Pray makes me too complacent...

Anyway, I introduced my cat to /proc/pci and found that one NIC is indeed sharing an interrupt (specifically IRQ 10 with a <shudder> Trident Blade video card.)

It also reminded me that this beast is running with two of those insidious RealTek 8139s in it.

I'll try playing with the interrupts first (hell, the video card doesn't even need one!) and if that doesn't work, I'll look at putting a decent NIC in it! :)


Thanks for the thoughts - I'll again let you know how I go. :)