Koozali.org: home of the SME Server

dnscache won't run from service script

Offline timb

dnscache won't run from service script
« on: April 13, 2005, 09:13:23 AM »
Just upgraded a series of 601 boxes to e-smith-dnscache-0.5.0-07.noarch.rpm and e-smith-lib-1.15.1-36.noarch.rpm

Only one is giving any issues (the last one I did)

Running "service dnscache start" seems to start the service, but no DNS is available.

If, as root, I change to /service/dnscache and run "service dnscache stop" to kill off any running but non-functioning dnscache, I can then run ./run and DNS works a treat.

So it's got to be a file permissions problem... but all files are owned by dnslog:dnslog or root:root, exactly the same as on two other functioning sister servers.
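
One way to make an ownership comparison between sister servers less error-prone is to dump owner, group and mode for the whole service directory on each box and diff the results. A minimal sketch, assuming GNU find (for -printf); the function name perm_listing and the file names in the example are made up for illustration:

```shell
# Hypothetical helper: print "path mode owner:group" for everything under a
# directory, sorted by path, so listings from two machines can be diffed.
# The trailing slash makes find descend into the directory even if the
# argument happens to be a symlink.
perm_listing() {
    find "$1/" -printf '%P %M %u:%g\n' | sort
}

# Example use (run on each server, copy one file across, then compare):
#   perm_listing /service/dnscache > /tmp/perms.$(hostname)
#   diff /tmp/perms.goodbox /tmp/perms.badbox
```

An empty diff would suggest the problem lies somewhere other than file ownership or modes.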

Nothing appears to be written to /var/log/dnscache/current, and when I use ./run, current is being written to /service/dnscache

I certainly don't want to continue to run dnscache as root.

What have I missed?

Are there any other places I can go to get dnscache to tell me what it's doing when it's started by the service script?

thanks
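
On the question of where else to look: since /service/dnscache has a ./run script, the service is presumably managed by daemontools, in which case svstat should show whether supervise keeps restarting it. A service that never gets past a few seconds of uptime is probably dying and being restarted in a loop. A rough sketch, assuming the usual svstat line format ("/service/x: up (pid N) S seconds"); the function name flapping and the 5-second default are arbitrary choices:

```shell
# Hypothetical filter: read svstat output on stdin and print the lines for
# services that are "up" but have been up for fewer than $1 seconds
# (default 5). Field 5 is the uptime in the assumed svstat format.
flapping() {
    awk -v max="${1:-5}" '$2 == "up" && ($5 + 0) < (max + 0)'
}

# Example use (run it a few times in a row):
#   svstat /service/dnscache | flapping
```

If the same service shows up on every run, the ./run script is most likely exiting straight away when started by supervise.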

Offline CharlieBrady

Re: dnscache won't run from service script
« Reply #1 on: April 13, 2005, 08:53:42 PM »
Quote from: "timb"
Just upgraded a series of 601 boxes to e-smith-dnscache-0.5.0-07.noarch.rpm and e-smith-lib-1.15.1-36.noarch.rpm


No wonder you are having problems. e-smith-dnscache-0.5.0-07 is an alpha test version, with known problems on 7.0alpha, and completely untested on 6.x.

I'd suggest you go back to a known working dnscache configuration.

Offline timb

Problem solved - what was it?
« Reply #2 on: April 15, 2005, 09:12:14 AM »
First, Charlie - you are right about installing untested packages etc.

I had checked this out on other machines prior - and thought I was ok. However, I think I fell foul of something altogether different.


What happened?

I ran "signal-event post-upgrade", then rebooted at 6am the next morning (remotely), and the machine never came back online... AAHHGHGHHG! I drove the 100+ km to see what had happened.

When I got there - it was as though it was the very first boot of the sme installation - I was asked for domain machine name and root password ONLY and up she came. BTW I could login as root in ALT-F2

All networking was dead.

I stepped through the configuration - all settings were lost! I certainly didn't see this on any of my test machines!

I reset the configuration by going through all the screens, rebooted and away she ran.

I downgraded to the recommended dnscache 304 just in case. It's 100 km away, after all.

What happened? Dunno - I am a bit lost. Prior to my reboot the server had been up 67 days.

I confess that I have 2 boxes running the alpha 7 versions of dnscache, seemingly just fine, for at least a couple of weeks.


So in the end, what I didn't know, until a reboot in desperation revealed it, was that something was amiss.

Surely simply putting in the alpha 7 version of dnscache doesn't cause what I saw?

Offline CharlieBrady

Re: Problem solved - what was it?
« Reply #3 on: April 15, 2005, 04:15:11 PM »
Quote from: "timb"

When I got there - it was as though it was the very first boot of the sme installation - I was asked for domain machine name and root password ONLY and up she came. BTW I could login as root in ALT-F2

All networking was dead.

I stepped through the configuration - all settings were lost! I certainly didn't see this on any of my test machines!

I reset the configuration by going through all the screens, rebooted and away she ran.


So apparently something deleted your configuration database. I'd advise you to thoroughly scour the logs looking for evidence of bugs, and also to check the security of the server.

Unless you just happened to type "rm configuration" and have forgotten about it.

Offline timb

deleted configuration database
« Reply #4 on: April 16, 2005, 01:14:24 AM »
Charlie,
My thoughts EXACTLY - Except for one thing. In the madness of an organisation being down (one full of computer controlled lathes etc that need files from the network etc) and me seemingly having no configuration settings, a flu headache and not having a clue what the internet settings were (all written down and safely stored 100kms away :-{) a flash of inspiration struck!

"I will get the configuration from the database" - it hadn't occurred to me that the configuration was "lost" - it doesn't read like I mean it to, trust me. I did think they were lost - but years of computers have told me that when the crisis strikes, use 50% logic, 50% instinct... Worry about why LATER.

So I did a /sbin/e-smith/db configuration show >14Mar05Configuration.txt and re-entered the settings by reading my dump of the settings - so surely they were not lost! Why was I asked for all of them?

So I think you can now see why I have NFI why this all happened, and am now inclined to believe I stumbled into a trap and it had nothing to do with dnscache at all.

Did someone say hacked? I must admit one of today's activities is to go and run some rootkit tools and look around... but dare I say... the machine doesn't FEEL hacked - some readers will know what I mean, others will perhaps want to flame me - please don't bother....

Certainly the logs show plenty of attempts - like all servers - and there is no sudden change in disk space or other stats... I know this doesn't mean definitely not hacked... Anyway, I will check today.

Any other things you can think of Charlie?

I would love a tool that checked the configuration database - yes, the dump followed by a read in less would seem enough - but in this case it wasn't...

I am still damaged by the whole affair   :-o

Offline timb

ps
« Reply #5 on: April 16, 2005, 01:15:48 AM »
I also did an rpm --verify, and only the things one would expect to have changed had... e.g. the odd conf file etc.
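
When reading rpm verify output with a possible break-in in mind, changed config files are mostly noise; it is changed binaries and libraries that deserve a closer look. A small sketch that drops the lines marked "c" (config file) from verify output. The function name non_config_changes is made up, and it assumes the standard layout of flags, optional file-type marker, then path:

```shell
# Hypothetical filter: read "rpm -V" style output on stdin and keep only the
# lines whose second field is not the "c" (config file) marker.
non_config_changes() {
    awk '$2 != "c"'
}

# Example use:
#   rpm -Va | non_config_changes
```

Bear in mind that rpm verifies against the package database on the same machine, so a clean result is reassuring but not proof.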

Offline CharlieBrady

Re: deleted configuration database
« Reply #6 on: April 16, 2005, 01:27:32 AM »
Quote from: "timb"

So I did a /sbin/e-smith/db configuration show >14Mar05Configuration.txt and re-entered the settings by reading my dump of the settings - so surely they were not lost!


If "db configuration show" showed you all the settings, then they weren't lost, and in fact there was no need to re-enter them.

Quote

Why was I asked for all of them?


Because the console application (running as bootstrap-console) couldn't read them (for a reason we don't know). Unless it finds "PasswordSet = yes" then it will go into the first-time-boot stuff.
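
A quick way to test that explanation against a saved dump is to look for the PasswordSet entry directly. A minimal sketch, assuming the dump has one "key=value" pair per line for simple entries (the exact spacing around "=" is a guess, so the pattern tolerates both forms); the function name has_password_set and the dump path are made up:

```shell
# Hypothetical check: succeed if the given dump file contains a line of the
# form "PasswordSet=yes" (optional whitespace around "=" is allowed).
has_password_set() {
    grep -Eq '^PasswordSet[[:space:]]*=[[:space:]]*yes[[:space:]]*$' "$1"
}

# Example use:
#   /sbin/e-smith/db configuration show > /tmp/config-dump.txt
#   has_password_set /tmp/config-dump.txt && echo "PasswordSet is yes"
```

If this reports yes while the console still runs the first-boot screens, the database itself is probably intact and the problem is more likely in how the console reads it.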

The lack of networking looks to be just because bootstrap-console hadn't finished executing.

Nothing relevant in syslog?

Offline timb

what motherboard fry ups can teach you
« Reply #7 on: April 19, 2005, 06:42:09 AM »
Read all the above saga first - if you haven't already.

Then at about 2pm on Sunday afternoon the motherboard in my server fries. IO chip burns fingers - nasty.

I replace it and boot my machine up. Lo and behold same problems as above.


Sheesh - final conclusion....
I had installed perl-Authen-PAM ...15 from an rpm from Dag Wieers for rh73 because I didn't want to use the CentOS-compatible one from the Mitel alpha 7 build - I was trying to do the right thing!

It seems that the Dag Wieers rh73 rpm broke e-smith, so the problem was not dnscache or e-smith-lib 1.15; it was that PAM/e-smith communication wasn't working. So PAM didn't allow anything much.

This is why the server console played up. I confess I don't know how the server console works - but I sure know about PAM.

So why did my server exhibit the symptoms only after a motherboard fry-up? Because the different model m/b required a configuration change.

Lesson learnt

Send all unkind comments to /dev/null

Offline CharlieBrady

Re: what motherboard fry ups can teach you
« Reply #8 on: April 19, 2005, 04:47:14 PM »
Quote from: "timb"

Sheesh - final conclusion....
I had installed perl-Authen-PAM ...15 from an rpm from Dag Wieers for rh73 because I didn't want to use the CentOS-compatible one from the Mitel alpha 7 build - I was trying to do the right thing!


I still haven't worked out why you are using bleeding edge packages on a 6.x server. What do you think is wrong with e-smith-dnscache on 6.x? And for that matter, what's wrong with perl-Authen-PAM on 6.x?

Puzzled.