Koozali.org: home of the SME Server

pppoe service does not restart although supervised

pwalter

« on: May 07, 2005, 07:16:27 PM »
All,

Environment: SME 6.01 with smeplus installed
From time to time, the pppoe service terminates, apparently either because the bridged dsl modem resets (power failure) or the dsl service provider stops responding.

The server log contains the following entries:
Code:
pppoe[1734]: Session terminated -- received PADT from peer
pppd[1696]: Script /usr/sbin/pppoe -I eth1 -T 120 -U -m 1412 finished (pid 1734), status = 0x0

or
Code:
pppd[1695]: No response to 2 echo-requests
pppd[1695]: Serial link appears to be disconnected.
pppd[1695]: sent [LCP TermReq id=0x2 "Peer not responding"]
pppd[1695]: sent [LCP TermReq id=0x3 "Peer not responding"]
pppd[1695]: Connection terminated.


According to the daemontools documentation, the supervised pppoe service should be restarted automatically, but it is not.

Although the daemon appears to shut down, svstat /service/pppoe reports:
Code:
/service/pppoe: up (pid xxxx), xxxx seconds, normally down


The only ways I can get it to start again are to:

1) determine the supervised pid reported by svstat and manually kill it - it restarts almost immediately
2) restart the server
3) manually run the script /service/pppoe/run

All three restore order, but I would really like to figure out why the process is not terminating cleanly in the first place.

Any suggestions, anyone?

Peter

pwalter

« Reply #1 on: May 18, 2005, 08:04:02 PM »
The problem is still not fixed. In the interim, I have implemented a temporary cron job which checks every minute whether the ppp0 device exists and, if it does not, kills the supervised pppoe process so that it is restarted automatically. I did this to give me time to investigate what the problem really is. I suspect the cause is the following template changes I had made to SME:
Code:

<Script (smtp-redir located in /etc/e-smith/events/actions):>
#!/bin/sh

#------------------------------------------------------------
# Hacked script to try to redirect smtp traffic from external
# port 2525 to internal port 25
#------------------------------------------------------------

# description: Configures IP redirection from an external port to an alternate internal port.

OUTERNET=$(/sbin/e-smith/db configuration get ExternalIP)
/usr/local/bin/redir --lport=2525 --laddr="$OUTERNET" --cport=25 --caddr=127.0.0.1

exit 0
</script>

Change the permissions on the script to 0554, and the owner and group to root.
Create symlinks named S88-smtp-redir in /etc/e-smith/events/ip-up and /etc/e-smith/events/ip-change that point at this script:
ln -s /etc/e-smith/events/actions/smtp-redir /etc/e-smith/events/ip-up/S88-smtp-redir
ln -s /etc/e-smith/events/actions/smtp-redir /etc/e-smith/events/ip-change/S88-smtp-redir

What I suspect is happening is that when pppoe fails and terminates, it is not automatically restarting because the changes I made above mean that the redir program is still running in the same "supervised" process as was pppoe, therefore supervise thinks that the process is still active, and does not restart it.

Is my analysis correct? If so, what would be the "best" way to move the "redir" script so that it is supervised separately, but started after pppoe?
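
The interim cron check mentioned at the start of this post could look roughly like the sketch below. It is a guess at the approach, not the script actually in use: the function names, the RUN_WATCHDOG switch and the use of /proc/net/dev to detect ppp0 are all assumptions. Only svstat, /service/pppoe and ppp0 come from the thread.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: restart the supervised pppoe service when
# the ppp0 interface has disappeared. All names here are illustrative.

SERVICE=${SERVICE:-/service/pppoe}
IFACE=${IFACE:-ppp0}

# Succeeds if the named interface is listed in /proc/net/dev (assumption:
# Linux; older kernels print "ppp0:123" with no space after the colon).
iface_exists() {
    grep -q "^ *$1:" /proc/net/dev 2>/dev/null
}

# Reads svstat output on stdin and prints the supervised pid, if any.
supervised_pid() {
    sed -n 's/.*(pid \([0-9][0-9]*\)).*/\1/p'
}

watchdog() {
    iface_exists "$IFACE" && return 0      # link is up, nothing to do
    pid=$(svstat "$SERVICE" 2>/dev/null | supervised_pid)
    [ -n "$pid" ] || return 0              # service not reported as up
    kill "$pid"                            # supervise then restarts it
}

# Only act when explicitly asked to, so the file can be sourced safely.
if [ "${RUN_WATCHDOG:-0}" = 1 ]; then
    watchdog
fi
```

Run from cron once a minute with RUN_WATCHDOG=1 set, this would do what the workaround describes. The daemontools command svc -t /service/pppoe also sends TERM to the supervised process and may be a simpler alternative to parsing the pid.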

CharlieBrady

« Reply #2 on: May 20, 2005, 03:09:12 AM »
Quote from: "pwalter"

What I suspect is happening is that when pppoe fails and terminates, it is not automatically restarting because the changes I made above mean that the redir program is still running in the same "supervised" process as was pppoe, therefore supervise thinks that the process is still active, and does not restart it.

Is my analysis correct?

Yes, I would think so. Any script in the ip-up event directory is expected to terminate as soon as possible.
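
To illustrate why a long-running command in an event action matters, here is a small sketch in which sleep stands in for redir. It is not SME code: the function names are made up, and it assumes only that whatever invokes the action waits for it to return.

```shell
#!/bin/sh
# Illustration only: "sleep 2" plays the part of a program such as redir
# that keeps running after the action has done its setup work.

# Runs the stand-in in the foreground: the caller is blocked until it exits.
action_foreground() {
    sleep 2
}

# Runs the stand-in in the background with output detached: returns at once.
action_background() {
    sleep 2 >/dev/null 2>&1 &
}

# Prints how many whole seconds the given command took to return.
elapsed() {
    start=$(date +%s)
    "$@"
    end=$(date +%s)
    echo $((end - start))
}

echo "foreground action returned after $(elapsed action_foreground)s"
echo "background action returned after $(elapsed action_background)s"
```

With redir in place of sleep, the foreground form never returns while redir is alive, so the caller stays blocked for as long as redir keeps running. Backgrounding avoids the block but leaves the program unsupervised.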

Quote

If so, what would be the "best" way to move the "redir" script so that it is supervised separately, but started after pppoe?


Have a look in the bug tracker. There might be a fix which allows port forwarding to localhost. If not, then yes, you could supervise redir separately. Remember to use "exec redir" rather than "redir" as the last line of the run script.

You shouldn't really need to start it after pppoe. You could have it redirect from 0.0.0.0. Or you could drop in a script to restart it during the ip-change event. I'd try the former first.
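
A separately supervised redir along those lines might be set up as sketched here. The service directory and the helper function name are assumptions. The redir path and options are copied from the script earlier in the thread, with the listen address changed to 0.0.0.0 as suggested.

```shell
#!/bin/sh
# Sketch: write a daemontools run script for redir into a service
# directory. make_redir_service is a hypothetical helper name.

make_redir_service() {
    dir=$1
    mkdir -p "$dir" || return 1
    # Quoted heredoc: the body is written out verbatim, nothing is expanded.
    cat > "$dir/run" <<'EOF'
#!/bin/sh
# exec replaces this shell with redir, so supervise tracks redir's own pid
# and restarts it if it exits.
exec /usr/local/bin/redir --lport=2525 --laddr=0.0.0.0 --cport=25 --caddr=127.0.0.1
EOF
    chmod 755 "$dir/run"
}

# Example (the directory is an assumption; adjust to where services live):
# make_redir_service /path/to/services/redir
```

The generated directory would then need to be made visible to svscan, for example through a symlink under /service. Listening on 0.0.0.0 means the service does not depend on the external address being known when it starts.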

You might also look at running redir from /etc/inittab.

pwalter

« Reply #3 on: May 20, 2005, 03:20:40 AM »
Charlie,
Thank you very much.

Peter