Koozali.org: home of the SME Server

Error-Help!

noah genner

Error-Help!
« on: December 14, 2000, 08:01:55 PM »
Here's the deal. I'm running e-smith 4.0 (gateway/file server mode) on a PII 400 with 128 MB RAM and a 13 GB Fujitsu IDE drive. Everything is operating great except for one big problem. On the console I get the error below. Things are still working all right, but every so often the server freezes completely (i.e. I can't even reboot it) and the following message (or a variant of it) is printed to the screen and written to the messages log:

Dec 12 04:03:00 e-smith kernel: kfree: Bad obj c16afe60
Dec 12 04:03:00 e-smith kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Dec 12 04:03:00 e-smith kernel: current->tss.cr3 = 00ef7000, %cr3 = 00ef7000
Dec 12 04:03:00 e-smith kernel: *pde = 00000000
Dec 12 04:03:00 e-smith kernel: Oops: 0002
Dec 12 04:03:00 e-smith kernel: CPU:    0
Dec 12 04:03:00 e-smith kernel: EIP:    0010:[kfree+403/424]
Dec 12 04:03:00 e-smith kernel: EFLAGS: 00010286
Dec 12 04:03:00 e-smith kernel: eax: 0000001b   ebx: c7fb2620   ecx: 0000001a   edx: 00000021
Dec 12 04:03:00 e-smith kernel: esi: c16afe60   edi: c2e80550   ebp: 00000688   esp: c09f7e68
Dec 12 04:03:00 e-smith kernel: ds: 0018   es: 0018   ss: 0018
Dec 12 04:03:00 e-smith kernel: Process slocate (pid: 4258, process nr: 83, stackpage=c09f7000)
Dec 12 04:03:00 e-smith kernel: Stack: c7fb3060 c16afa00 c2e80550 00000688 c16afa00 c2e80550 c01317c4 c16afe60  
Dec 12 04:03:00 e-smith kernel:        c09f7ed0 c09f7ed0 c021ba64 00001006 c09f7ed0 00000001 00001006 c013275b  
Dec 12 04:03:00 e-smith kernel:        fffff682 00001006 00000000 c0258450 c021ba64 c0258450 c3db9640 c3f03cb0  
Dec 12 04:03:00 e-smith kernel: Call Trace: [prune_dcache+220/300] [try_to_free_inodes+199/264] [grow_inodes+30/384] [get_new_inode+173/280] [get_new_inode+185/280] [iget+88/96] [ext2_lookup+84/124]  
Dec 12 04:03:00 e-smith kernel:        [real_lookup+79/160] [lookup_dentry+296/488] [__namei+40/88] [sys_newlstat+14/96] [system_call+52/56] [startup_32+43/285]  
Dec 12 04:03:00 e-smith kernel: Code: c7 05 00 00 00 00 00 00 00 00 83 c4 08 5b 5e 5f 5d 83 c4 08  

The only way to recover is a hard reboot. This forces me into fsck, which finds numerous errors in the filesystem. I have the system fix these and everything appears normal again. A couple of days later the same problem comes back, and it is getting worse.

I'm not really sure what is causing the error (I'm still trying to trace it). It might be happening during reasonably large (10 MB+) file transfers.

Any ideas?

Thanks
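
One way to start tracing is to pull every oops out of the messages log and see which process and time of day each one hit. The following is only a rough sketch in Python: it assumes the log lines look exactly like the ones pasted above, and the function name `summarize_oopses` is made up for illustration.

```python
import re

# Matches syslog kernel lines such as:
#   Dec 12 04:03:00 e-smith kernel: Oops: 0002
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]+)\s+\S+\s+kernel:\s+(.*)$")
# Matches the payload of the "Process ..." line inside an oops report.
PROC_RE = re.compile(r"^Process (\S+) \(pid: (\d+)")


def summarize_oopses(lines):
    """Return one dict per oops: timestamp, oops code, process name, pid."""
    events = []
    current = None  # the oops whose "Process" line we are still waiting for
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a kernel message
        stamp, msg = m.groups()
        if msg.startswith("Oops:"):
            current = {
                "time": stamp,
                "oops": msg.split(":", 1)[1].strip(),
                "process": None,
                "pid": None,
            }
            events.append(current)
        elif current is not None:
            p = PROC_RE.match(msg)
            if p:
                current["process"] = p.group(1)
                current["pid"] = int(p.group(2))
                current = None  # this oops is complete
    return events
```

Pass it an open file handle for the messages log (the path depends on the system) and print the resulting list. If the same process or the same time of day keeps showing up, that is a lead; in the trace above the process is slocate, not a file transfer.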

Charlie Brady

RE: Error-Help!
« Reply #1 on: December 14, 2000, 08:18:59 PM »
noah genner wrote:

> Everything is operating great except for one big problem. On
> the console I get the error below.

You've either hit a kernel bug, or have bad hardware (memory/motherboard). The latter is more likely, I think.

Charlie

noah genner

RE: Error-Help!
« Reply #2 on: December 14, 2000, 08:24:22 PM »
I don't like either option. Both make it very hard to trace. Any ideas?

alejandro

RE: Error-Help!
« Reply #3 on: December 14, 2000, 11:09:59 PM »
It really sounds like a hardware problem.
Try moving the disk to another, similar system.
I have done this between my home and office systems and only had to reconfigure the NIC; everything else worked OK!
Alejandro

SKIP

RE: Error-Help!
« Reply #4 on: December 19, 2000, 10:47:49 AM »
You can almost bet it is a bad memory stick. One hint is that the numbers in the trace are all memory addresses. The other hint is that downloading large files puts a strain on your memory system. If your system also seems to crash when you are untarring, then yes, it is your memory. Get a new stick! It could possibly be bad sectors on your disk, but that usually produces a different error. Good luck!
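
If you want a quick check before buying a new stick, a crude pattern test can sometimes catch badly failing RAM. This is only a sketch in Python with a made-up function name (`pattern_test`); it assumes nothing beyond the standard library. It can only touch whatever memory the OS hands to the process, so a clean result proves little, and a dedicated boot-time tester such as memtest86 is far more thorough.

```python
def pattern_test(size_bytes=16 * 1024 * 1024, passes=3):
    """Write fixed bit patterns into a buffer and read them back.

    Returns a list of (pass_number, pattern, bad_byte_count) tuples;
    an empty list means every byte read back as written.
    """
    patterns = (0x00, 0xFF, 0xAA, 0x55)  # all zeros, all ones, alternating bits
    failures = []
    buf = bytearray(size_bytes)
    for n in range(passes):
        for pat in patterns:
            buf[:] = bytes([pat]) * size_bytes  # write the pattern everywhere
            good = buf.count(pat)               # read back: count matching bytes
            if good != size_bytes:
                failures.append((n, pat, size_bytes - good))
    return failures
```

Run it with a buffer size close to the free RAM on the box and several passes. Any non-empty result points at hardware (memory, motherboard or CPU), which fits the suggestions earlier in the thread.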