Koozali.org: home of the SME Server
Legacy Forums => Experienced User Forum => Topic started by: Vince Levalois on January 24, 2003, 03:33:02 AM
-
When connecting via PPTP VPN to an SME v5.6 we've found issues related to GRE (Protocol 47) errors. This was reported to Mitel 3 weeks ago and since then we've been trying to reproduce the problem with an XP Workstation (on which the problems have been consistent). Whether from a NAT'ed remote station, from a dialup or public IP, the problem was the same.
Analysis of simultaneous tcpdumps from the SME Server and the XP machine revealed little since we could not reproduce the problem.... until this morning :)
Constantly pinging the SME server on the local side, then stopping and initiating a browser session of the Server Manager was enough to cause a GRE error in the messages log on the SME and halt ALL activity on the PPTP connection to the point of disconnection if degraded enough.
The close analysis found that the XP machine was giving out a consistent Call ID set to 0 on all packets. The SME was set to Call ID 16384. When the GRE error occurs, the XP shows sending 0 but the SME is receiving 16384... Coincidence? We don't think so.
So the solution, it seems, is to roll the PPTP RPM package back one level to see if the problem disappears. That would expose the PPTP RPM as the culprit, and the balance of the 5.6 system could continue working as is.
I'll update this post tomorrow after testing.
Vince L.
-
Vince Levalois wrote:
> So the solution, it seems, is to roll the PPTP RPM package
> back one level to see if the problem disappears. That would
> expose the PPTP RPM as the culprit, and the balance
> of the 5.6 system could continue working as is.
Vince, I now suspect that the kernel modules which do NAT and connection tracking for PPTP, to support outbound connections, are the most likely cause. The first thing I'd suggest is that you unload the modules, by doing:
/sbin/rmmod ip_nat_pptp
/sbin/rmmod ip_conntrack_pptp
and then try a remote connection. You'll need to make sure that you don't have any outbound connections before you will be able to unload the modules.
Alternatively you could temporarily delete the module load lines from /etc/rc.d/init.d/masq and then reboot.
If this works we can make that change semi-permanent, and that'll solve your problem, but it doesn't help those who want to masquerade outbound connections :-(
Regards
Charlie
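Before running rmmod it can help to see which of the two modules are actually loaded. The following is only a sketch: the function name loaded_pptp_modules and the sample listing (sizes and use counts) are made up for illustration; only the two module names come from the post above.

```shell
#!/bin/sh
# Print whichever PPTP helper modules appear in a /proc/modules-style
# listing read from stdin. Reading stdin (rather than /proc/modules
# directly) lets the filter be tried against saved output.
loaded_pptp_modules() {
    awk '$1 == "ip_nat_pptp" || $1 == "ip_conntrack_pptp" { print $1 }'
}

# Example with an invented listing (all numbers are placeholders):
printf '%s\n' \
    'ip_nat_pptp 2336 0 (unused)' \
    'ip_conntrack_pptp 3120 1 [ip_nat_pptp]' \
    'ip_tables 11072 5' | loaded_pptp_modules
```

On the server itself the input would be the live list: loaded_pptp_modules < /proc/modules. An empty result would suggest that neither module is loaded.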
-
We went to a prior version of pptp and no change...
It's up to you now Charlie! :)
Vince Levalois
-
If it helps, I get the same problem on Win 98 2nd Ed. and on Windows 2000 Pro.
Also, when I could not pptp into my server across the Internet, I tried to connect to some other services that were running and could not access them across the Internet either.
I know the firewall has changed, could it have to do with the firewall or packet filtering??
steve
-
Charlie Brady wrote:
> Vince, I now suspect that the kernel modules which do NAT and
> connection tracking for PPTP, to support outbound
> connections, are the most likely cause. The first thing I'd
> suggest is that you unload the modules, by doing:
...
I suspect you are right ... I can't see anything in the iptables rules that would affect the masquerading.
> If this works we can make that change semi-permanent, and
> that'll solve your problem, but it doesn't help those who
> want to masquerade outbound connections :-(
That would be truly unfortunate
Steve
---
Steve Landers Scripting Design Studio
Digital Smarties steve@digital-smarties.com
Perth, Western Australia DigitalSmarties.com
-
steve wrote:
> I know the firewall has changed, could it have to do with the
> firewall or packet filtering??
Possibly. I tried to replicate the problem this afternoon without success. But I am running a development version of the firewall code. You're welcome to try that. You can get it from ftp://ftp.e-smith.org/pub/e-smith/contrib/CharlieBrady/5.6-PPTP/.
You'll also find there an RPM of the pptp nat/connection tracking kernel modules which has debugging code compiled in.
To activate, upgrade the RPMs, then do:
/sbin/e-smith/db config setprop masq Logging all
/sbin/e-smith/signal-event post-upgrade
/sbin/e-smith/signal-event reboot
The modules do quite verbose logging. If the packet filter changes help you, and you don't want the logging, you might want to roll back to the production version of the rpm:
mount /mnt/cdrom
rpm -Uhv --oldpackage /mnt/cdrom/e-smith/RPMS/\
pptp-conntrack-nat-1.0.0-3es.i686.rpm
umount /mnt/cdrom
and then you'll need to unload the ip_nat_pptp and ip_conntrack_pptp modules. The modules will need to be idle for you to unload them - it might be easier just to reboot.
Regards
Charlie
-
Charlie Brady wrote:
> Possibly. I tried to replicate the problem this afternoon
> without success. But I am running a development version of
> the firewall code. You're welcome to try that.
...
> You'll also find there an RPM of the pptp nat/connection
> tracking kernel modules which has debugging code compiled in.
I first installed the e-smith-packetfilter-1.13.0-07.noarch.rpm RPM, ran post-upgrade and rebooted but no change.
Then did the same with pptp-conntrack-nat-1.0.0-4es.i686.rpm and had success. Can now have an outgoing pptp connection through the SME 5.6 box.
I haven't tried regressing and just installing the second RPM - I'm just happy it is working :-) Let me know if there's anything else I can do to help track this down.
Cheers
Steve
-
I wrote:
> Then did the same with pptp-conntrack-nat-1.0.0-4es.i686.rpm
> and had success. Can now have an outgoing pptp connection
> through the SME 5.6 box.
Not quite ....
The outgoing connection stayed up for around 4 minutes, then disconnected and couldn't be re-established until I rebooted the server.
So, I've turned on tracing. The results of tracing an attempted connection are at http://www.digital-smarties.com/pub/pptp-trace
HTH
Steve
-
I'm having problems with INBOUND connections on 5.6 as well as outgoing.
Since the upgrade to 5.6, inbound PPTP has been difficult to establish, and once established it's been flaky at best.
In order to establish a connection in either direction I have to disable the auto negotiation on the W2K client and fix the setting at PPTP.
I won't be able to try Charlie's beta RPMs until Tuesday as the box is at another site.
-
Steve Landers wrote:
> Let me know if there's
> anything else I can do to help track this down.
Steve, if you could run three simultaneous tcpdumps, one on your client machine, and one on the internal and one on the external interface of your gateway, then send the three files, together with the relevant section of /var/log/messages, and a description of exactly what you were trying to do, and what you observed, to bugs@e-smith.com, then that would be a great help.
When you run tcpdump, please use "-s 0" to capture full packets, "-w xxx" to write to a file, and "-i ethX" to select the interface.
Thanks
Charlie
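As a rough sketch of the capture request above, the helper below only prints one tcpdump command per interface, using the three options named in the post (-s 0, -w, -i). The function name, the interface names eth0 and eth1, and the output file naming are assumptions made for illustration; check which interface is internal and which is external on your own gateway.

```shell
#!/bin/sh
# Print (do not run) one tcpdump command per interface given as argument.
#   -s 0     capture full packets
#   -w FILE  write raw packets to FILE
#   -i IFACE select the interface
# The pptp-IFACE.pcap file name is a placeholder chosen for this sketch.
tcpdump_commands() {
    for iface in "$@"; do
        echo "tcpdump -s 0 -i $iface -w pptp-$iface.pcap"
    done
}

# Hypothetical gateway with two interfaces:
tcpdump_commands eth0 eth1
```

Each printed command would be started as root in its own terminal before the connection attempt and stopped with Ctrl-C afterwards; the capture on the client machine has to be started separately there.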
-
The great support team at Mitel has band-aided this situation successfully by removing (remming)
/sbin/rmmod ip_nat_pptp
/sbin/rmmod ip_conntrack_pptp
from the loadup script (masq). Trying to unload the services simply crashed the server likely because of some dependency.
Once this was accomplished the connections are smooth as silk. As stated though, no outbound masquerading of pptp connections.
Vince Levalois
-
Could you give brief instructions on how to do this for a running e-smith 5.6?
-
pico /etc/rc.d/init.d/masq
add the lines
/sbin/rmmod ip_nat_pptp
/sbin/rmmod ip_conntrack_pptp
to be the first actual commands in the file
save
reboot
Millsey
-
Brian Mills wrote:
>
> pico /etc/rc.d/init.d/masq
>
> add the lines
>
> /sbin/rmmod ip_nat_pptp
> /sbin/rmmod ip_conntrack_pptp
>
> to be the first actual commands in the file
>
> save
> reboot
I doubt that will help. That will try to rmmod some modules which aren't loaded, and then will load those very modules. I recommend instead commenting out the "/sbin/modprobe ip_nat_pptp" and "/sbin/modprobe ip_conntrack_pptp" lines, then reboot.
And remember that this file is templated, so you should create a custom template fragment to prevent the loss of your change on reconfiguration.
Charlie
-
OMG what a pratt!
I could not find the existence of these lines in the masq file but I found them as soon as you replied! I have now commented out the lines and will try connecting later, thanks!
I have no idea how to set this up as a template fragment, could you give me a little info on how to do this?
Millsey
-
Could someone please give me easy instructions to fix this?
Thank you
-
This is how I did it:
1) mkdir /etc/e-smith/templates-custom/etc/rc.d/init.d/masq
2) cp /etc/e-smith/templates/etc/rc.d/init.d/masq/10masq.pptp /etc/e-smith/templates-custom/etc/rc.d/init.d/masq
3) edit the 10masq.pptp in the templates-custom directory and comment out the two lines:
# /sbin/modprobe ip_nat_pptp
# /sbin/modprobe ip_conntrack_pptp
4) remove the 10masq.pptp from the templates directory
5) run /sbin/e-smith/expand-template /etc/rc.d/init.d/masq
6) reboot
The /etc/init.d/masq should now reflect the custom template changes. Hope this is easy enough.
_Mark
-
Thanks, I will be trying this later when I finish work
Millsey
-
Yes, setting up exactly as Mark Otoupal listed works just fine for my PPTP dial IN to the server. Thanks for all input.
Millsey
-
Mark Otoupal wrote:
>
> 4) remove the 10masq.pptp from the templates directory
Custom templates take precedence over the defaults. You do not need to remove the default template, and its absence could cause problems if you later remove your custom template.
http://www.e-smith.org/custom/
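Putting the steps and the correction together, a custom fragment can be produced without touching the default template. This is a sketch under assumptions: the function name and its optional root-prefix argument are hypothetical (the prefix exists only so the function can be tried in a scratch directory), and it assumes the fragment contains the two modprobe lines in the form quoted in this thread.

```shell
#!/bin/sh
# Build a custom copy of the 10masq.pptp fragment with the two modprobe
# lines commented out. The default template is read but never modified
# or removed.
# $1 is an optional root prefix (hypothetical, for dry runs); omit it or
# pass "" to work on the real filesystem.
make_custom_masq_fragment() {
    root=${1:-}
    rel=etc/rc.d/init.d/masq
    src=$root/etc/e-smith/templates/$rel/10masq.pptp
    dst_dir=$root/etc/e-smith/templates-custom/$rel

    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    mkdir -p "$dst_dir" || return 1

    # Prefix "# " to the two module load lines, keeping any indentation.
    # Lines that are already commented out are left alone.
    sed -e 's|^\([[:space:]]*\)\(/sbin/modprobe ip_nat_pptp\)|\1# \2|' \
        -e 's|^\([[:space:]]*\)\(/sbin/modprobe ip_conntrack_pptp\)|\1# \2|' \
        "$src" > "$dst_dir/10masq.pptp"
}
```

After calling make_custom_masq_fragment on the server, the remaining steps are the ones Mark listed: run /sbin/e-smith/expand-template /etc/rc.d/init.d/masq and reboot.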
-
I installed the most recent RPMs from your contrib directory, and my outbound PPTP connections seem to run a LOT better. Haven't tested inbound. Do you have the code for the modules without debugging included? My log is being filled with too much information for me :-) Every time I open a message in Exchange (connected through the VPN), I hear my SME's disk going :-)
Joost
PS: What I installed:
e-smith-packetfilte~3.0-07.noarch.rpm
ppp-mppe-modules-2.4.2-4es.i686.rpm
pptp-conntrack-nat-1.0.1-1es.i686.rpm
-
I also installed the dev PPTP contribs (ftp://ftp.e-smith.org/pub/e-smith/contrib/CharlieBrady/5.6-PPTP/) on a 5.6 test server. Unfortunately, I used it as a gateway (an XP Pro machine connected to the 5.6 server/gateway) to connect to a client's production SME 5.5U2 server using PPTP. I crashed the production server twice within a few hours (see my previous post http://www.e-smith.org/bboard//read.php?v=t&f=1&i=24464&t=23997). I saw that the message log on the production server was filled with hundreds if not thousands of PPTP messages. Below is a section of the message log from the second crash. After I reverted to a 5.5 server/gateway to make the connection to the 5.5 server, the server has run continuously with no problems.
I thought I was safe going OUT to connect to the production server. This behavior concerns me because of the possibility that anyone with this faulty 5.6 setup may be able to crash a production SME 5.5 server just by making a PPTP connection. Perhaps there was some other issue. Unfortunately, it is not something I want to try again to test with the client's server. If anyone else has enough test servers to try this on, the results could be useful in debugging this problem or concluding that my case was a fluke.
Thanks,
jim
MESSAGE LOG extract: (Note that the first 4 lines had been repeated hundreds of times, as though some sort of negotiation was taking place - sorry, I am not an expert on the proper PPTP terminology.)
------------------------------------
Feb 19 02:52:38 gateway pptpd[1740]: CTRL: Received PPTP Control Message (type: 5)
Feb 19 02:52:38 gateway pptpd[1740]: CTRL: Made a ECHO RPLY packet
Feb 19 02:52:38 gateway pptpd[1740]: CTRL: I wrote 20 bytes to the client.
Feb 19 02:52:38 gateway pptpd[1740]: CTRL: Sent packet to client
Feb 19 02:52:58 gateway kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Feb 19 02:52:58 gateway kernel: current->tss.cr3 = 2ca55000, %%cr3 = 2ca55000
Feb 19 02:52:58 gateway kernel: *pde = 00000000
Feb 19 02:52:58 gateway kernel: Oops: 0000
Feb 19 02:52:58 gateway kernel: CPU: 0
Feb 19 02:52:58 gateway kernel: EIP: 0010:[kmalloc+81/348]
Feb 19 02:52:58 gateway kernel: EIP: 0010:[]
Feb 19 02:52:58 gateway kernel: EFLAGS: 00010006
Feb 19 02:52:58 gateway kernel: eax: f7ff86a0 ebx: f7ff86a0 ecx: 00000000 edx: 00000550
Feb 19 02:52:58 gateway kernel: esi: f7fff2c0 edi: 00000015 ebp: 00000286 esp: eca57d98
Feb 19 02:52:58 gateway kernel: ds: 0018 es: 0018 ss: 0018
Feb 19 02:52:58 gateway pptpd[1596]: MGR: Reaped child 1740
Feb 19 02:52:58 gateway pppd[1741]: Modem hangup
Feb 19 02:52:58 gateway kernel: Process smbd (pid: 1771, process nr: 82, stackpage=eca57000)
Feb 19 02:52:58 gateway kernel: Stack: 00000015 eca57e34 c014f549 00000604 00000015 e9ffeb80 eca57e30 e9ffeb80
Feb 19 02:52:58 gateway kernel: c014ed4c 00000600 00000015 0000ff00 00000550 c01692c5 e9ffeb80 00000600
Feb 19 02:52:58 gateway kernel: 00000000 00000015 eca57e30 eca57ed4 e9ffeb80 c01783ec 00000001 00000600
Feb 19 02:52:58 gateway kernel: Call Trace: [alloc_skb+113/220] [sock_wmalloc_err+52/104] [tcp_do_sendmsg+965/2168] [inet_sendmsg+0/144] [tcp_v4_sendmsg+92/104] [inet_sendmsg+131/144] [sock_sendmsg+136/172]
Feb 19 02:52:58 gateway kernel: Call Trace: [] [] [] [] [] [] []
Feb 19 02:52:58 gateway kernel: [inet_sendmsg+0/144] [sys_sendto+199/236] [update_atime+94/100] [do_generic_file_read+1492/1504] [generic_file_read+99/124] [sys_send+30/36] [sys_socketcall+269/480] [system_call+52/56]
Feb 19 02:52:58 gateway kernel: [] [] [] [] [] [] [] []
Feb 19 02:52:58 gateway kernel: Code: 8b 01 89 03 85 c0 74 2b 8b 7b 04 85 ff 75 10 89 19 89 c8 2b
Feb 19 02:52:58 gateway kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Feb 19 02:52:58 gateway kernel: current->tss.cr3 = 2c804000, %%cr3 = 2c804000
Feb 19 02:52:58 gateway kernel: *pde = 00000000
Feb 19 02:52:58 gateway kernel: Oops: 0000
Feb 19 02:52:58 gateway kernel: CPU: 0
Feb 19 02:52:58 gateway kernel: EIP: 0010:[kmalloc+81/348]
Feb 19 02:52:58 gateway kernel: EIP: 0010:[]
Feb 19 02:52:58 gateway kernel: EFLAGS: 00010006
Feb 19 02:52:58 gateway kernel: eax: f7ff86a0 ebx: f7ff86a0 ecx: 00000000 edx: ffffffe0
Feb 19 02:52:58 gateway kernel: esi: f7fff2c0 edi: 00000015 ebp: 00000286 esp: ec819e18
Feb 19 02:52:58 gateway kernel: ds: 0018 es: 0018 ss: 0018
Feb 19 02:52:58 gateway kernel: Process pptpctrl (pid: 1740, process nr: 81, stackpage=ec819000)
Feb 19 02:52:58 gateway kernel: Stack: 00000015 00000040 c014f549 000005c4 00000015 e9ffe340 00000000 ec818000
Feb 19 02:52:58 gateway kernel: c014eceb 000005bc 00000015 e9ffe340 c014ef5f e9ffe340 000005bc 00000000
Feb 19 02:52:58 gateway kernel: 00000015 00000010 00000040 0000059d e5781a20 ec818000 c0167144 e9ffe340
Feb 19 02:52:58 gateway kernel: Call Trace: [alloc_skb+113/220] [sock_wmalloc+35/80] [sock_alloc_send_skb+99/168] [ip_build_xmit+228/780] [raw_sendmsg+556/612] [raw_getfrag+0/36] [inet_sendmsg+0/144]
Feb 19 02:52:58 gateway kernel: Call Trace: [] [] [] [] [] [] []
Feb 19 02:52:58 gateway kernel: [inet_sendmsg+131/144] [sock_sendmsg+136/172] [inet_sendmsg+0/144] [sock_write+146/156] [sys_write+229/280] [sock_write+0/156] [system_call+52/56]
Feb 19 02:52:58 gateway kernel: [] [] [] [] [] [] []
Feb 19 02:52:58 gateway kernel: Code: 8b 01 89 03 85 c0 74 2b 8b 7b 04 85 ff 75 10 89 19 89 c8 2b
Feb 19 02:52:58 gateway pppd[1741]: Connection terminated.
Feb 19 02:52:58 gateway pppd[1741]: Connect time 151.4 minutes.
Feb 19 02:52:58 gateway pppd[1741]: Sent 830358081 bytes, received 22174413 bytes.
Feb 19 08:39:32 gateway syslogd 1.4.1: restart.
Feb 19 08:39:32 gateway syslog: syslogd startup succeeded
-------------------------------------------------
MESSAGE LOG EXTRACT ENDS
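To see at a glance which processes were current when the kernel oopsed, a small filter can be run over such a log. This is only a sketch; the function name oops_processes is made up, and the two sample lines are copied from the extract above.

```shell
#!/bin/sh
# From syslog text on stdin, print "name pid" for every kernel
# "Process NAME (pid: N, ...)" line that accompanies an oops report.
oops_processes() {
    sed -n 's/.*kernel: Process \([^ ]*\) (pid: \([0-9]*\),.*/\1 \2/p'
}

# Two lines taken from the extract above:
printf '%s\n' \
    'Feb 19 02:52:58 gateway kernel: Process smbd (pid: 1771, process nr: 82, stackpage=eca57000)' \
    'Feb 19 02:52:58 gateway kernel: Process pptpctrl (pid: 1740, process nr: 81, stackpage=ec819000)' \
    | oops_processes
```

For the extract above this lists smbd 1771 and pptpctrl 1740, the two processes named in the oops reports. On a live system the input would be /var/log/messages.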
-
After running an outgoing PPTP session for a day from behind a 5.6 SME with the latest modules from Charlie Brady, I still have problems, but fewer than before.
- If the link is up for a long time without traffic, it just stays up, but no traffic can flow across it.
- If I ping something on the other side from my Windows box (ping xx.xx.xx.xx -t -l 1), it also stays up.
I suspect it might be iptables forgetting about gre and closing the hole, or am I completely missing the point here?
Joost
-
Joost De Raeymaeker wrote:
> I suspect it might be iptables forgetting about gre and
> closing the hole, or am I completely missing the point here?
No, you are likely to be correct in your diagnosis. Do you have firewall (masq) logging enabled? You might be able to confirm your diagnosis if it is, by seeing blocked packets being logged.
Try adding:
lcp-echo-interval 30
lcp-echo-failure 4
to /etc/ppp/options.pptpd
Thanks for your testing and clear result reporting, BTW.
Charlie
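One way to look for blocked GRE packets in the log is sketched below. It assumes the firewall logging writes lines in the usual iptables LOG format, where GRE shows up as PROTO=47; whatever prefix the masq script puts on its log lines is not known here, so the filter does not rely on one. The function name and the sample lines are invented for illustration.

```shell
#!/bin/sh
# Keep only log lines that mention IP protocol 47 (GRE).
# Assumes iptables LOG-style fields such as "PROTO=47".
blocked_gre_lines() {
    grep -E 'PROTO=47( |$)'
}

# Invented sample lines (addresses are documentation placeholders):
printf '%s\n' \
    'Feb 20 10:00:00 gateway kernel: IN=eth1 OUT= SRC=192.0.2.10 DST=192.0.2.1 PROTO=47' \
    'Feb 20 10:00:01 gateway kernel: IN=eth1 OUT= SRC=192.0.2.10 DST=192.0.2.1 PROTO=TCP SPT=1030 DPT=80' \
    | blocked_gre_lines
```

On the server the input would be the real log: blocked_gre_lines < /var/log/messages. Matches that appear around the time the tunnel goes quiet would support the diagnosis above.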
-
Charlie Brady wrote:
> Try adding:
>
> lcp-echo-interval 30
> lcp-echo-failure 4
>
> to /etc/ppp/options.pptpd
Don't bother. That will enable lcp-echo/keepalive packets for inbound PPTP connections, but won't do anything for masqueraded connections.
See if you can find any configuration options for the VPN client. Or enable lcp-echo on the remote server you are connecting to. The packets every thirty seconds should keep the firewall pinhole open for you.
Let me know how you get on.
Charlie
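For reference, the two options combine as a simple product: pppd sends an LCP echo-request every lcp-echo-interval seconds and presumes the peer dead after lcp-echo-failure requests go unanswered. The sketch below works that window out from options text on stdin; the function name lcp_dead_after is made up for illustration.

```shell
#!/bin/sh
# Read pppd options on stdin and print the approximate number of seconds
# of silence after which pppd would give up on the peer:
#   lcp-echo-interval * lcp-echo-failure
# Prints nothing if either option is missing.
lcp_dead_after() {
    awk '$1 == "lcp-echo-interval" { i = $2 }
         $1 == "lcp-echo-failure"  { f = $2 }
         END { if (i != "" && f != "") print i * f }'
}

# The values suggested in this thread:
printf '%s\n' 'lcp-echo-interval 30' 'lcp-echo-failure 4' | lcp_dead_after
```

With the values from the thread this prints 120, so an unresponsive peer would be dropped after roughly two minutes, while the echo traffic itself goes out every 30 seconds.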
-
I really need some help getting my VPN to work.
I have tried Mark Otoupal's post but without any luck.
Would someone mind helping me? I really need to get my VPN to work.
Thanks...
-
Update:
I went into Charlie's contrib directory and downloaded and installed pptp-conntrack-nat-1.0.1-2es, apart from the other updates that are also in there. This doesn't seem to have debugging enabled :-)
The problems remain:
- If I run a PPTP session from behind an SME server it fails pretty fast (usually after only a couple of minutes), unless you ping a host on the other side: ping host -t -l 1 (from a Windows machine).
If I comment out the following two lines in /etc/rc.d/init.d
# /sbin/modprobe ip_nat_pptp
# /sbin/modprobe ip_conntrack_pptp
and reboot the server, the connection seems to be more stable, although I must confess I never really tried to shut down the ping for too long.
- Of course, if you do that, only 1 machine from behind the SME server can connect via PPTP to the outside world, so now I'm back to loading these modules and pinging from both machines.
Any other news on this?
Joost