Koozali.org: home of the SME Server

SME server-only mode does not resolve names

Anonymous

SME server-only mode does not resolve names
« on: April 15, 2004, 12:58:59 AM »
I have been scratching my head over this one for days. In short, I can't resolve DNS names on a client's SME 6.0, and I can't work out why.

We are using a Draytek 2600Plus ADSL modem-router. Its local IP is 192.168.2.1. The SME is running in server-only mode, on IP 192.168.2.4, pointing to the modem-router as the default router.

The modem-router provides DHCP for IPs 10 to 64.

The ADSL service has five public IP addresses available to it. I am using one of the public IPs to forward specific ports to the SME server (HTTP, HTTPS and SMTP, port 25).

Now, local PCs on the DHCP part of the network can access the Internet fine. All names are resolved. However, I can't ping from the SME server. 'route', 'traceroute' and 'dig' fail to provide any reverse lookups. 'ping' fails to resolve any names.

I can log onto the modem-router and ping any IP, and also can see that it has a DNS supplied by the ISP for resolving names.

So - why can Windows PCs on the network resolve names, but the SME server - also on the network (though using a fixed IP, rather than a DHCP IP) not resolve any names?

I can send e-mail into the server, through a domain pointing to the public IP forwarded to the server, but I can't send out (presumably, if names are not resolving, then nothing will work properly).

Any advice on where I should look for a solution? Everything looks 'just fine' to me - but perhaps I am completely overlooking some obvious setting?

Thanks,

-- Jason Judge

Anonymous

Also...
« Reply #1 on: April 15, 2004, 01:01:09 AM »
I also have two extra local networks set up, so I can manage the server over a VPN: 192.168.1.0 and 192.168.5.0. Not sure if that makes a difference.

Anonymous

Further information
« Reply #2 on: April 15, 2004, 02:08:36 AM »
The plot thickens...

I can ping some addresses, but not everything. For example, I can ping www.contribs.org, but it pauses about 20 seconds between each packet.

I can ping my own SME box on a cable connection (and that comes back immediately).

If I try to ping, say, database.clamav.net, it fails to retrieve an IP address, times out, and then attempts to ping database.clamav.net.my.clients.domain, which resolves to the internal network IP of the local router-modem (with a 500 ms round trip; again, there could be a clue there).

I don't understand this. It should either work or not. What I am finding is that some routes ping, some do not, some names resolve to public IPs, and some do not. In each of these cases, I can resolve and ping from a Windows PC connected through DHCP to the same network, or from another machine connected to my own cable connection.

-- Jason

Anonymous

More info, and still pulling my hair out...
« Reply #3 on: April 15, 2004, 03:58:30 PM »
If I run, say, dig www.google.co.uk, I get this error:

# dig www.google.co.uk
; <<>> DiG 9.2.1 <<>> www.google.co.uk
;; global options:  printcmd
;; connection timed out; no servers could be reached

Trying ping, after a delay, the ping gets bounced from the cable router (xx.xx.xx.xx is the public IP of the cable router).

# ping www.google.co.uk
PING www.google.co.uk.my.domain.uk (xx.xx.xx.xx) from 192.168.2.4 : 56(84) bytes of data.
64 bytes from hostxx-xx-xx-xx.in-addr.btopenworld.com (xx.xx.xx.xx): icmp_seq=1 ttl=255 time=0.495 ms

If I do the same thing on another machine to find the real IP address of Google, I can then ping it directly using the IP. I know, therefore, that the routing is working okay. It just appears to be name resolution that is not working.
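
The .my.domain.uk suffix in the ping output comes from the resolver's search list: when a lookup of the bare name fails, the stub resolver retries with each search domain appended. As a rough sketch, the following prints the lines that control this. The function name resolver_settings is made up for illustration; /etc/resolv.conf is the usual location on Linux.

```shell
# Print the nameserver, search and domain lines from a resolv.conf-style file.
# Falls back to /etc/resolv.conf when no path is given.
resolver_settings() {
    awk '$1 == "nameserver" || $1 == "search" || $1 == "domain"' "${1:-/etc/resolv.conf}"
}

# Usage on the server itself:
#   resolver_settings
```

dig can also be pointed at one server at a time, for example "dig @192.168.2.1 www.google.co.uk" to ask the router directly, which helps separate "the local cache is broken" from "nothing upstream answers".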

I've never had this problem on e-smith 5.x - only on 6.x, so it is likely (?) to be a problem with tinydns. e-smith 5.x was a pleasure to set up and administer, but 6.x has been proving to be a bit of a nightmare so far.

I still don't know where to look to solve this problem. I am not a DNS expert, and do not know what tools I should be running to analyse the DNS stuff, nor how to interpret the information that those tools return. Any help would be much appreciated.

-- Jason

Anonymous

More info...
« Reply #4 on: April 15, 2004, 11:49:00 PM »
It looks like I'm trying to bump this message - but I'm not, honest. I hope that by the time I find a fix, there will be a lot of information in this thread to help anyone else to fix a similar problem.

The DNS name resolution worked fine until the normal system restart on 18 March 2004. From that point on, the DNS cache has been reporting this error when it needs to go out to a remote DNS:

2004-03-18 00:47:00.270946500 starting
2004-03-18 00:48:15.478242500 query 1 c0a8014c:0400:0069 1 servertest.e-smith.com.
2004-03-18 00:48:15.478266500 tx 0 1 servertest.e-smith.com. . c0a80108
2004-03-18 00:48:20.405080500 query 2 c0a8014c:0400:0069 1 servertest.e-smith.com.
2004-03-18 00:48:20.405104500 tx 0 1 servertest.e-smith.com. . c0a80108
2004-03-18 00:48:25.415432500 query 3 c0a8014c:0400:006a 1 servertest.e-smith.com.my.domain.
2004-03-18 00:48:25.415458500 tx 0 1 servertest.e-smith.com.my.domain. . c0a80108
2004-03-18 00:48:30.425058500 query 4 c0a8014c:0400:006a 1 servertest.e-smith.com.my.domain.
2004-03-18 00:48:30.425085500 tx 0 1 servertest.e-smith.com.my.domain. . c0a80108
2004-03-18 00:49:14.564942500 servfail servertest.e-smith.com. input/output error
2004-03-18 00:49:14.564967500 sent 1 40
2004-03-18 00:49:19.494778500 servfail servertest.e-smith.com. input/output error
2004-03-18 00:49:19.494803500 sent 2 40
2004-03-18 00:49:24.504774500 servfail servertest.e-smith.com.my.domain. input/output error
2004-03-18 00:49:24.504800500 sent 3 54
2004-03-18 00:49:29.514765500 servfail servertest.e-smith.com.my.domain. input/output error
2004-03-18 00:49:29.514792500 sent 4 54

Obviously something changed that day (17 March), but I have no idea what. Clam AV had been installed and running for a number of weeks, and there was nothing else of significance installed on the server. I'll check through the logs and see what I can find.

The following reference: http://dqd.com/~mayoff/notes/djbdns/dnscache-log.html

lists these possible causes of that error (which all sound more like configuration or permissions problems):

Some of the errors that can make dnscache do this:


- failure to allocate storage for a received DNS packet
- failure to create a UDP socket
- failure to set the O_NONBLOCK flag on the UDP socket
- failure to bind the UDP socket to a port
- failure to transmit a packet to any of up to 16 nameservers and receive a response packet with an rcode of 0 (no error) or 3 (NXDOMAIN), with four attempts per nameserver
- failure to create a TCP socket
- failure to set the O_NONBLOCK flag on the TCP socket
- failure to bind the TCP socket to a port
- failure to connect the TCP socket to any of up to 16 nameservers (one attempt per nameserver), transmit a query to the nameserver, and receive a response packet with an rcode of 0 (no error) or 3 (NXDOMAIN)

Trouble is, I can ping most of the nameservers listed in /etc/dnsroots.global, though that file on SME 6 is actually out of date (based on what I understand - at least four or five of the IPs listed need changing).

-- Jason

CharlieBrady

Re: More info...
« Reply #5 on: April 16, 2004, 06:06:18 PM »
Your log file can be better interpreted after processing by dnscache-log.pl (use google to find) and tai64nlocal:

03-18 00:47:00 starting
03-18 00:48:15 query 1 192.168.1.76:1024:105 a servertest.e-smith.com.
03-18 00:48:15 tx 0 a servertest.e-smith.com. . 192.168.1.8
03-18 00:48:20 query 2 192.168.1.76:1024:105 a servertest.e-smith.com.
03-18 00:48:20 tx 0 a servertest.e-smith.com. . 192.168.1.8
03-18 00:48:25 query 3 192.168.1.76:1024:106 a servertest.e-smith.com.my.domain.
03-18 00:48:25 tx 0 a servertest.e-smith.com.my.domain. . 192.168.1.8
03-18 00:48:30 query 4 192.168.1.76:1024:106 a servertest.e-smith.com.my.domain.
03-18 00:48:30 tx 0 a servertest.e-smith.com.my.domain. . 192.168.1.8
03-18 00:49:14 servfail servertest.e-smith.com. input/output error
03-18 00:49:14 sent 1
03-18 00:49:19 servfail servertest.e-smith.com. input/output error
03-18 00:49:19 sent 2
03-18 00:49:24 servfail servertest.e-smith.com.my.domain. input/output error
03-18 00:49:24 sent 3
03-18 00:49:29 servfail servertest.e-smith.com.my.domain. input/output error
03-18 00:49:29 sent 4

So, why is dnscache asking 192.168.1.8 (which doesn't answer)? My guess is you'll work that one out, and it's not a problem with dnscache.
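
For anyone without dnscache-log.pl to hand, the addresses in the raw log can be decoded by hand: each is four bytes written as eight hex digits. A minimal sketch (the function name hex_to_ip is just an illustration, not part of djbdns):

```shell
# Decode an 8-digit hex IPv4 address, as written in raw dnscache logs,
# into dotted-quad form.
hex_to_ip() {
    hex=$1
    # Cut out the four two-digit octets and let printf convert each from hex.
    printf '%d.%d.%d.%d\n' \
        "0x$(printf '%s' "$hex" | cut -c1-2)" \
        "0x$(printf '%s' "$hex" | cut -c3-4)" \
        "0x$(printf '%s' "$hex" | cut -c5-6)" \
        "0x$(printf '%s' "$hex" | cut -c7-8)"
}

hex_to_ip c0a80108   # server the query was sent to: 192.168.1.8
hex_to_ip c0a8014c   # client that asked:            192.168.1.76
```

The port and query id after the client address are hex too: 0400 is 1024 and 0069 is 105, which matches the processed log above.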

Anonymous

Re: More info...
« Reply #6 on: April 16, 2004, 07:33:50 PM »
Quote from: "CharlieBrady"

So, why is dnscache asking 192.168.1.8 (which doesn't answer)? My guess is you'll work that one out, and it's not a problem with dnscache.


Thanks Charlie - that is a very good question. That is the local address for the SME server on the network on which the duff server was originally set up (and worked).

That IP does not figure in any admin or setup screen now (on the console setup, I've tried pointing to <blank> and to the local router - 192.168.2.1 - with no improvement), so I have no idea why the server is trying to use it as the DNS. That IP must be stuck somewhere, and isn't getting reset or flushed. I'll search the machine and see if I can see it in any config files.

Thanks for that - I would never have thought that those hex numbers were IPs.

-- JJ

Anonymous

Re: More info...
« Reply #7 on: April 16, 2004, 08:23:27 PM »
Thinking about it, on the 18th of March, the server *was* still on my old network, so that IP would have been correct, and there is no reason why that server (at 192.168.1.8) would not have responded. That means it stopped working before I took it to the client's site.

I am therefore looking for a fault of some sort on the SME server, and not necessarily with the routeing on the external network (i.e. the router or the ISP).

I think dnscache is simply not getting out of the machine. Perhaps it is not binding to its required port? I don't know if this looks right:

# netstat -an | grep 53
tcp        0      0 192.168.2.4:53          0.0.0.0:*               LISTEN
udp        0      0 192.168.2.4:53          0.0.0.0:*
udp        0      0 192.168.2.4:53          0.0.0.0:*

The tcp binding and *one* of the udp bindings are dnscache; the other udp binding is tinydns. I can tell that by shutting the services down one at a time. Won't dnscache and tinydns conflict, both listening on the same IP and port? Perhaps I am just misunderstanding how the bindings work...?
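
One caveat with "grep 53" is that it matches any line containing 53 anywhere (port 5353, an address ending in .53, and so on). A slightly stricter filter, as a sketch, looks only at the local-address column; the function name port53_only is invented for this example.

```shell
# Keep only netstat lines whose local address (4th column) ends in ":53".
port53_only() {
    awk '$4 ~ /:53$/'
}

# Usage on a live system:
#   netstat -an | port53_only
```

On Linux, running netstat with -p as root adds a PID/program column, which shows which process owns each socket without having to stop services one at a time.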

-- Jason

Anonymous

Re: More info...
« Reply #8 on: April 16, 2004, 11:18:01 PM »
Quote from: "CharlieBrady"

So, why is dnscache asking 192.168.1.8 (which doesn't answer)? My guess is you'll work that one out, and it's not a problem with dnscache.


My next question would be: why doesn't dnscache then go to the root servers when it fails to get any reply from the local server? Could I work around this by dumping the IPs of the 13 root servers into /etc/resolv.conf? (I guess then there would be a good chance of skipping over tinydns, but at least it would work better than not at all.)

-- Jason

Anonymous

Re: More info...
« Reply #9 on: April 17, 2004, 01:09:39 AM »
Quote from: "Anonymous"

My next question would be: why doesn't dnscache then go to the root servers when it fails to get any reply from the local server?


Hmmm. DNS resolution works if I do this:

cat /etc/dnsroots.conf >> /var/service/dnscache/root/servers/@

I'm sure that is not a permanent solution, but it works for now (until I next configure the server). According to the template that creates that file, this should already be done, provided the DNS is not set up as a 'forwarder' (I'm sure it is not, in my case - but it obviously thinks it is).
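
A slightly more cautious variant of the command above, as a sketch: it copies the existing list somewhere outside the servers directory before appending, so the old contents can be inspected or restored later. The function name append_roots and the backup location are my own choices; the first two paths in the usage line are the ones quoted in this post.

```shell
# Append a list of server IPs to dnscache's server list, saving a copy first.
#   $1 = file with one IP per line
#   $2 = dnscache server list (the file named "@")
#   $3 = where to save a copy of the old list
append_roots() {
    roots=$1
    servers=$2
    backup=$3
    # Stop here if the copy fails, so the list is never changed without a backup.
    cp -p "$servers" "$backup" || return 1
    cat "$roots" >> "$servers"
}

# Usage with the paths from this thread (backup path is an arbitrary example):
#   append_roots /etc/dnsroots.conf /var/service/dnscache/root/servers/@ /root/servers-at.bak
```

The backup is kept out of root/servers/ on purpose, because dnscache treats the file names in that directory as domain names.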

That template also contains a copy of the contents of /etc/dnsroots.conf. Better, I would have thought, to define that list in one place, especially since the root server list supplied with SME 6.0.1 is out-of-date anyway. But ho-hum, getting closer.

-- Jason

Anonymous

Re: More info...
« Reply #10 on: April 17, 2004, 01:19:14 AM »
Quote from: "Anonymous"

That template also contains a copy of the contents of /etc/dnsroots.conf. Better, I would have thought, to define that list in one place, especially since the root server list supplied with SME 6.0.1 is out-of-date anyway. But ho-hum, getting closer.


I believe this is the current list:

198.41.0.4
192.33.4.12
192.228.79.201
128.8.10.90
192.203.230.10
192.5.5.241
192.112.36.4
128.63.2.53
192.36.148.17
192.58.128.30
193.0.14.129
198.32.64.12
202.12.27.33

And can be obtained using this pipeline:

dnsip `dnsqr ns . | awk '/answer:/ { print $5 ; }' | sort`

-- Jason

mark_53

Server-only no DNS to local net
« Reply #11 on: April 18, 2004, 10:20:57 PM »
I'm having a similar problem in server-only mode.  If I try and ping a local host from another local host I get no reply.  Yes, DNS is enabled on the local machines and points to the SME as the DNS server.  I can ping yahoo.com but can't ping host1.mydomain.com (unless I add to the hosts file).

Anyone have a fix?

Mark

nomis911

SME server-only mode does not resolve names
« Reply #12 on: April 30, 2004, 05:55:22 AM »
Have the same problem with SME 6.

Haven't found a fix, but if someone has, then please post a reply!

goodevans

SME server-only mode does not resolve names
« Reply #13 on: August 16, 2004, 12:44:22 PM »
I've had a similar problem over the past couple of weeks with our DNS settings. Mostly it was fine, but certain domains just would not look up. Our main problem was that our parent company, a .is address, just wasn't there.

I am running 2 SME 6.0 servers, one as server+gateway, the other server only, both behind a Zoom X3 dsl router. The problem existed on both of them.

I finally found it thanks to this thread, but my fix was slightly different. I updated the @ list of name servers, but the problem was still there. Instead, my fix was to replace the entire list with the IP of my router. A quick restart of dnscache, and everything has worked perfectly since (touch wood).
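
Roughly, that change can be sketched as below. The function name use_single_upstream is made up, the @ path is the one quoted earlier in this thread, and 192.168.2.1 is only an example router address taken from the first post; substitute your own.

```shell
# Replace dnscache's server list with a single upstream resolver.
#   $1 = dnscache server list (the file named "@")
#   $2 = IP of the upstream resolver, e.g. the router
use_single_upstream() {
    servers=$1
    upstream=$2
    # Overwrite (not append) so the upstream is the only entry left.
    printf '%s\n' "$upstream" > "$servers"
}

# Usage (example values), followed by a restart through daemontools:
#   use_single_upstream /var/service/dnscache/root/servers/@ 192.168.2.1
#   svc -t /var/service/dnscache
```

In stock djbdns, dnscache only treats the listed addresses as caches to forward to when FORWARDONLY is set in its environment. Whether SME 6 sets that for you is something I have not checked, and a later reconfigure may regenerate the file from its template.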

Hope this helps someone.