Koozali.org: home of the SME Server

hw or sw

frisk

hw or sw
« on: October 20, 2003, 09:24:42 PM »
Is this a hardware or software problem?  I've swapped out the RAM.  That didn't seem to make a difference.

Unable to handle kernel paging request at virtual address 80000000
Oct 18 09:56:49 e-smith kernel: current->tss.cr3 = 039ba000, %%cr3 = 039ba000
Oct 18 09:56:49 e-smith kernel: *pde = 00000000
Oct 18 09:56:49 e-smith kernel: Oops: 0000
Oct 18 09:56:49 e-smith kernel: CPU:    0
Oct 18 09:56:49 e-smith kernel: EIP:    0010:[kmem_cache_alloc+49/292]
Oct 18 09:56:49 e-smith kernel: EFLAGS: 00010002
Oct 18 09:56:49 e-smith kernel: eax: c3faafe0   ebx: c3faafe0   ecx: 80000000   edx: c24e405c
Oct 18 09:56:49 e-smith kernel: esi: c167fdb0   edi: c7edf680   ebp: 00000282   esp: c167fc78
Oct 18 09:56:49 e-smith kernel: ds: 0018   es: 0018   ss: 0018
Oct 18 09:56:49 e-smith kernel: Process qmail-queue (pid: 1190, process nr: 26, stackpage=c167f000)
Oct 18 09:56:49 e-smith kernel: Stack: c167fe68 bffe0000 c012b321 c7edf680 00000015 00000000 c167fdb0 c01de857
Oct 18 09:56:49 e-smith kernel:        c3f9fc00 bffffe62 c0134591 0001fe62 c167fe68 c021e330 fffffff8 c167e000
Oct 18 09:56:49 e-smith kernel:        c167fe68 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Oct 18 09:56:49 e-smith kernel: Call Trace: [setup_arg_pages+69/264] [cprt+1399/20000] [load_elf_binary+1729/3480] [do_generic_file_read+1496/1508] [cprt+1396/20000] [read_exec+194/316] [search_binary_handler+71/288]
Oct 18 09:56:49 e-smith kernel:        [do_execve+383/480] [do_execve+417/480] [sys_execve+47/88] [system_call+52/56] [startup_32+43/285]
Oct 18 09:56:49 e-smith kernel: Code: 8b 01 89 03 85 c0 74 2b 8b 73 04 85 f6 75 10 89 19 89 c8 2b
Oct 18 09:56:49 e-smith kernel: Unable to handle kernel paging request at virtual address 80000000
Oct 18 09:56:49 e-smith kernel: current->tss.cr3 = 039ba000, %%cr3 = 039ba000
Oct 18 09:56:49 e-smith kernel: *pde = 00000000
Oct 18 09:56:49 e-smith kernel: Oops: 0000
Oct 18 09:56:49 e-smith kernel: CPU:    0
Oct 18 09:56:49 e-smith kernel: EIP:    0010:[kmem_cache_alloc+49/292]
Oct 18 09:56:49 e-smith kernel: EFLAGS: 00010006
Oct 18 09:56:49 e-smith kernel: eax: c3faafe0   ebx: c3faafe0   ecx: 80000000   edx: c24e405c
Oct 18 09:56:49 e-smith kernel: esi: c167fdb0   edi: c7edf680   ebp: 00000282   esp: c167fc78
Oct 18 09:56:49 e-smith kernel: ds: 0018   es: 0018   ss: 0018
Oct 18 09:56:49 e-smith kernel: Process qmail-queue (pid: 1191, process nr: 26, stackpage=c167f000)
Oct 18 09:56:49 e-smith kernel: Stack: c167fe68 bffe0000 c012b321 c7edf680 00000015 00000000 c167fdb0 c01de857
Oct 18 09:56:49 e-smith kernel:        c3f10680 bffffe62 c0134591 0001fe62 c167fe68 c021e330 fffffff8 c167e000
Oct 18 09:56:49 e-smith kernel:        c167fe68 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Oct 18 09:56:49 e-smith kernel: Call Trace: [setup_arg_pages+69/264] [cprt+1399/20000] [load_elf_binary+1729/3480] [do_generic_file_read+1496/1508] [cprt+1396/20000] [read_exec+194/316] [search_binary_handler+71/288]
Oct 18 09:56:49 e-smith kernel:        [do_execve+383/480] [do_execve+417/480] [sys_execve+47/88] [system_call+52/56] [startup_32+43/285]
Oct 18 09:56:49 e-smith kernel: Code: 8b 01 89 03 85 c0 74 2b 8b 73 04 85 f6 75 10 89 19 89 c8 2b
Oct 18 09:56:49 e-smith kernel: Unable to handle kernel paging request at virtual address 80000000
Oct 18 09:56:49 e-smith kernel: current->tss.cr3 = 039ba000, %%cr3 = 039ba000
Oct 18 09:56:49 e-smith kernel: *pde = 00000000
Oct 18 09:56:49 e-smith kernel: Oops: 0000
Oct 18 09:56:49 e-smith kernel: CPU:    0
Oct 18 09:56:49 e-smith kernel: EIP:    0010:[kmem_cache_alloc+49/292]
Oct 18 09:56:49 e-smith kernel: EFLAGS: 00010002

frisk

Re: hw or sw
« Reply #1 on: October 20, 2003, 09:29:27 PM »
And I also got this message:

Oct 18 22:49:19 e-smith kernel: swap_duplicate: entry 08000000, offset exceeds max
Oct 18 22:49:19 e-smith kernel: VM: killing process perl5.6.0
Oct 18 22:49:19 e-smith kernel: swap_free: offset exceeds max
Oct 18 22:49:43 e-smith kernel: swap_duplicate: entry 08000000, offset exceeds max
Oct 18 22:49:43 e-smith kernel: VM: killing process perl5.6.0
Oct 18 22:49:43 e-smith kernel: swap_free: offset exceeds max
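A side note on the numbers in these messages: the swap entry 08000000 here and the faulting virtual address 80000000 in the first post each have exactly one bit set. The short Python sketch below only checks that fact. Reading a lone set bit as a hint of a flipped bit in memory is an assumption, not something the log proves, and set_bits is a helper name made up for this example.

```python
# Values copied from the kernel messages in this thread (hexadecimal).
values = {
    "oops virtual address": 0x80000000,
    "swap_duplicate entry": 0x08000000,
}


def set_bits(n):
    """Return the positions (0 = least significant) of the bits set in n."""
    return [i for i in range(n.bit_length()) if (n >> i) & 1]


for name, v in values.items():
    # A single position in the list means the value is a pure power of two.
    print(f"{name}: 0x{v:08x} -> bits set: {set_bits(v)}")
```

A value with a single stray bit is what a one-bit memory error would typically look like, but a software bug can produce the same pattern, so this narrows nothing down on its own.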

frisk

Re: hw or sw
« Reply #2 on: October 20, 2003, 09:32:24 PM »
And this:

Oct 20 11:20:10 e-smith kernel: attempt to access beyond end of device
Oct 20 11:20:10 e-smith kernel: 03:06: rw=0, want=544869896, limit=39736746
Oct 20 11:20:10 e-smith kernel: dev 03:06 blksize=4096 blocknr=136217473 sector=1089739784 size=4096 count=1
Oct 20 11:20:10 e-smith kernel: attempt to access beyond end of device
Oct 20 11:20:10 e-smith kernel: 03:06: rw=0, want=544869896, limit=39736746
Oct 20 11:20:10 e-smith kernel: dev 03:06 blksize=4096 blocknr=136217473 sector=1089739784 size=4096 count=1
Oct 20 11:20:10 e-smith e-smith-bg: depmod: error reading ELF section data /lib/modules/2.2.16-22/misc/buz.o: Function not implemented
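To put the "beyond end of device" numbers in perspective, here is a small Python sketch that redoes the arithmetic from the log lines above. It assumes 512-byte sectors and that want and limit are reported in the same units (likely 1 KiB blocks, though the log does not say). The last step, clearing bit 27, is purely a guess prompted by the swap entry 08000000 in the earlier message.

```python
# Numbers copied from the "attempt to access beyond end of device" lines.
blksize = 4096        # bytes per block, from the log
blocknr = 136217473   # block number the kernel was asked to read
sector = 1089739784   # sector reported in the log
want = 544869896      # requested position reported in the log
limit = 39736746      # device size reported in the log

SECTOR_BYTES = 512    # assumption: classic 512-byte sectors

# The reported sector should be the block number scaled to 512-byte units.
computed_sector = blocknr * (blksize // SECTOR_BYTES)
print("sector matches blocknr:", computed_sector == sector)

# Assumption: want and limit share the same units, so the ratio is meaningful.
print(f"requested position is {want / limit:.1f}x the device size")

# Guess: if bit 27 (0x08000000) of the block number were a flipped bit,
# what block would the request have been for?
unflipped = blocknr & ~0x08000000
in_range = unflipped * (blksize // 1024) < limit  # assumes limit is in 1 KiB units
print(f"blocknr = 0x{blocknr:08x}, without bit 27 = {unflipped}, in range: {in_range}")
```

The request lands far outside the partition, so the block number itself is bogus rather than slightly off. Whether it was corrupted in RAM or read that way from a damaged filesystem cannot be told from this message alone.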

neurotecimbecile

hw., sw
« Reply #3 on: October 20, 2003, 11:50:09 PM »
Hi there,
   Actually, you cannot just swap the RAM in your system, even for RAM of the same amount and brand. Your system has crashed because the RAM was swapped. Every RAM module installed at setup time is identified by the system, so swapping RAM, or any other hardware, is a no-no. Your system can be repaired by booting from the boot disk and debugging the problem.

frisk

Re: hw., sw
« Reply #4 on: October 20, 2003, 11:54:46 PM »
Hey neuro,

The problem appeared about a month BEFORE I swapped out the ram.  So, ram isn't to blame.

Could it be the processor or MB?  The server needs to be restarted once a day.

Klaus Eckert

Re: hw., sw
« Reply #5 on: October 21, 2003, 04:40:56 AM »
Try running the memory test to check the RAM.
Search the forum for "memory-test" to find the program.

cheers klaus

Tom Keiser

Re: hw., sw
« Reply #6 on: October 21, 2003, 05:52:13 AM »
This is a devilish problem to isolate and fix. In addition to memory, motherboard and CPU, you could also have developed a problem with the code on your boot drive. I know I've fixed this same problem more than once with a fresh install on a new boot drive. If you don't use a separate boot drive from your data drive(s), then you'll have to back up and replace your "main" drive, then install SME and restore your data.

Ultimately, you may just have to replace things until the problem stops.

Good luck.

Tom

Reinhold

Re: hw., sw
« Reply #7 on: October 21, 2003, 02:18:06 PM »
"devilish problem "... Tom's right..

Since I looked up C.Brady's stuff just the other day:
info: http://forums.contribs.org/index.php?topic=16744.msg64825#msg64825
dl:   http://www.ibiblio.org/pub/linux/distributions/e-smith/contrib/CharlieBrady/memtest/

You might want to take a close look at your power supply (maybe via the system monitor you hopefully installed).
You might also want to inspect the mainboard, especially the CPU fan(s) and the capacitors near the CPU.
Good luck.

frisk

Re: hw., sw
« Reply #8 on: October 21, 2003, 08:31:21 PM »
Thank you all for your insight.  Yes, I have reinstalled several RPMs that had corrupt code (maybe from bad RAM?) and that seems to have helped; however, I still get the odd error message now and again, like:

Oct 21 07:00:00 e-smith kernel: swap_free: offset exceeds max

But at least there are fewer and fewer error messages now, and the system seems more stable (after I swapped the RAM out and fixed the corrupt RPMs).

I'll keep you posted.
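For anyone wanting to check whether the errors really are tapering off, a rough Python sketch that tallies the kernel messages seen in this thread per day. The line format is taken from the log excerpts above. The function name, the pattern list and the idea of feeding it /var/log/messages are assumptions for illustration only.

```python
import re
from collections import Counter

# Syslog-style line as posted in this thread:
# "Oct 21 07:00:00 e-smith kernel: swap_free: offset exceeds max"
LINE_RE = re.compile(r"^(\w{3}\s+\d+) \d\d:\d\d:\d\d \S+ kernel: (.*)$")

# Message fragments copied from the logs in this thread.
PATTERNS = (
    "Unable to handle kernel paging request",
    "offset exceeds max",
    "attempt to access beyond end of device",
)


def count_errors_per_day(lines):
    """Count kernel lines containing any known fragment, keyed by 'Mon DD'."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and any(p in m.group(2) for p in PATTERNS):
            counts[m.group(1)] += 1
    return counts


# A few lines copied from the posts above, used as sample input.
sample = [
    "Oct 18 22:49:19 e-smith kernel: swap_duplicate: entry 08000000, offset exceeds max",
    "Oct 18 22:49:19 e-smith kernel: VM: killing process perl5.6.0",
    "Oct 18 22:49:19 e-smith kernel: swap_free: offset exceeds max",
    "Oct 20 11:20:10 e-smith kernel: attempt to access beyond end of device",
    "Oct 21 07:00:00 e-smith kernel: swap_free: offset exceeds max",
]

for day, n in count_errors_per_day(sample).items():
    print(day, n)
```

On a real system the same function could be fed the lines of the syslog file (commonly /var/log/messages, though the path is an assumption here) instead of the sample list.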