Koozali.org: home of the SME Server

Server locks up with NIC panic after upgrade from 6.0.1

Offline jimgoode

Server locks up with NIC panic after upgrade from 6.0.1
« on: May 18, 2009, 06:53:06 PM »
Was running 6.0.1-01 as 'server and gateway'.
Made backups of 6.0.1-01.
Ran '/sbin/e-smith/signal-event pre-backup'.
Did a shutdown, removed primary disk, installed a different disk.
Installed 7.4 as 'server and gateway' on the new disk.
Did a shutdown, installed and mounted old primary disk and ran
    '/sbin/e-smith/signal-event restore-tape /mnt/tmp' to copy data and settings.
Installed software updates and modified some configuration via user-panels.

PROBLEM: Each day since the upgrade, the server has locked up at least once. Each time, the keyboard and mouse have been unresponsive, and both NICs have shown solid lights with no flickering. On two of the lockups I was able to see about a page of errors (which I cannot find duplicated in any log file) that indicate a panic_epic100 error. I have two SMSC EPIC/100 NICs in use. I have looked at the message logs from both before and after the upgrade, and they look pretty similar except for the IRQ assignments: pre-upgrade used IRQs 5 and 10, post-upgrade uses IRQs 209 and 217 (which is pretty strange).
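
To compare the IRQ assignments side by side, something along these lines might help. It is only a rough sketch: it assumes copies of /proc/interrupts were saved from the old and new installs, and that the NICs show up as eth0 and eth1. The function name nic_irqs and the sample text are made up for illustration. Only the IRQ numbers 5, 10, 209 and 217 come from the logs; which NIC had which IRQ, the counters, and the controller column are placeholders.

```python
def nic_irqs(interrupts_text, name_prefix="eth"):
    """Map interface name -> IRQ number from /proc/interrupts-style text.

    name_prefix is an assumption: it presumes the NICs are named eth0, eth1, ...
    """
    irqs = {}
    for line in interrupts_text.splitlines():
        head, sep, rest = line.partition(":")
        head = head.strip()
        if not sep or not head.isdigit():
            continue  # skip the CPU header row and non-numeric rows such as NMI
        for field in rest.split():
            name = field.rstrip(",")  # shared IRQs list devices comma-separated
            if name.startswith(name_prefix):
                irqs[name] = int(head)
    return irqs


# Placeholder samples (not real logs): only the IRQ numbers match the post.
BEFORE = """\
           CPU0
  5:       1000          XT-PIC  eth0
 10:       2000          XT-PIC  eth1
NMI:          0
"""

AFTER = """\
           CPU0
209:       1000   IO-APIC-level  eth0
217:       2000   IO-APIC-level  eth1
NMI:          0
"""

if __name__ == "__main__":
    old, new = nic_irqs(BEFORE), nic_irqs(AFTER)
    for nic in sorted(set(old) | set(new)):
        print(nic, old.get(nic), "->", new.get(nic))
```

On a live box the text could be read from /proc/interrupts instead of the sample strings.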

Has anyone else encountered this issue? Any suggestions on troubleshooting? I can post more data from the message logs and ifconfig.

Thanks,
Jim

Offline uniqsys

Re: Server locks up with NIC panic after upgrade from 6.0.1
« Reply #1 on: May 19, 2009, 03:19:51 AM »
I suggest you post this to the bug tracker.
...

Offline janet

Re: Server locks up with NIC panic after upgrade from 6.0.1
« Reply #2 on: May 19, 2009, 05:00:48 AM »
jimgoode

SME 7.4 is a very different "beast" from SME 6.0.1, and uses a very different underlying OS version. You should not have waited so long to update, as there will be little interest in resolving upgrade issues now. SME 7.0 was released nearly 3 years ago, on 1 July 2006!

Ensure your hardware (i.e. NICs and everything else) is compatible with CentOS 4.6.

If you suspect the NICs are the problem, then it would be very easy to replace those NICs with known compatible brands/models, and then see what happens.
Please search before asking, an answer may already exist.
The Search & other links to useful information are at top of Forum.

Offline cactus

Re: Server locks up with NIC panic after upgrade from 6.0.1
« Reply #3 on: May 19, 2009, 09:21:41 AM »
Quote
Ensure your hardware (i.e. NICs and everything else) is compatible with CentOS 4.6.

I believe it is even CentOS 4.7, not that it would matter too much for HW compatibility.

Offline janet

Re: Server locks up with NIC panic after upgrade from 6.0.1
« Reply #4 on: May 19, 2009, 11:18:46 PM »
cactus

Quote
I believe it is even CentOS 4.7....

Yes I did "think" that at the time of writing, but did not consider it overly important to check as it just uses a few more minutes of my time, and as you say, should not impact greatly on the compatibility issue.

At times it's hard to keep up with the frequent changes, thanks to the work of those involved in the upgrade releases! (That's meant as a good comment.)

Offline CharlieBrady

Re: Server locks up with NIC panic after upgrade from 6.0.1
« Reply #5 on: May 20, 2009, 03:35:39 AM »
Quote
On two of the lockups I was able to see about a page of errors (which I cannot find duplicated in any log file) that indicate a panic_epic100 error.

When the kernel panics, it stops writing to disk. If you get a kernel panic, you should take a digital photo of the screen and/or write down exactly what you see.

Google doesn't know of panic_epic100, so I doubt that is exactly what you saw.