Koozali.org: home of the SME Server

Ethernet Bonding: NIC failure - legit hardware issue?

NTILZA
« on: December 16, 2009, 06:43:27 PM »
Greetings, all.

I've researched this issue extensively over the last couple of days, but have not located anything remotely similar.  If there is something, either in the forums or in Bugzilla, I'd appreciate the pointer.

I configured the dual LAN ports on a server motherboard as a bonded interface for failover purposes, and plugged both into the gigabit ports on the switch.  At some point, eth0 failed, but the network continued to run on eth1.  The messages log was flooded with these messages (at a rate of 10 per SECOND) for weeks until I was called in for maintenance:

Code:
Dec 10 18:50:34 server kernel: sky2 eth0: transmit descriptor error (hardware problem)
Dec 10 18:50:34 server kernel: sky2 eth0: Link is down.
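
For anyone wanting to put a number on a flood like this, here is a minimal Python sketch that counts sky2 kernel messages per second and per interface. It assumes the syslog line layout shown above; the function name count_per_second and the SAMPLE list are made up for illustration, and on a real system the lines would be read from the messages log instead.

```python
import re
from collections import Counter

# Matches syslog lines such as:
#   Dec 10 18:50:34 server kernel: sky2 eth0: Link is down.
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"sky2\s+(?P<iface>\w+):\s+(?P<msg>.*)$"
)


def count_per_second(lines):
    """Return a Counter keyed by (timestamp, interface) for sky2 kernel lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group("stamp"), match.group("iface"))] += 1
    return counts


# The two lines quoted in the post, used here as sample input.
SAMPLE = [
    "Dec 10 18:50:34 server kernel: sky2 eth0: transmit descriptor error (hardware problem)",
    "Dec 10 18:50:34 server kernel: sky2 eth0: Link is down.",
]

for (stamp, iface), n in sorted(count_per_second(SAMPLE).items()):
    print(f"{stamp} {iface}: {n} message(s)")
```

Lines that do not come from the sky2 driver are simply skipped, so the same function can be fed an entire log file.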

Power cycling the machine brought the offending NIC back, but only for a few minutes.  Then the console display (where I was logged in) was flooded with the above messages, the console became unusable, and access to the web console and SSH was impossible.  A hard reset was required.  I disabled the NIC in the BIOS and tried to reconfigure the system, but I couldn't disable bonding without killing the working NIC, so I re-enabled it and left well enough alone.  (Yes, I will be filling out a bug report about the message flood and the configuration issue.)

So this screams NIC failure to me.

But after researching Ethernet channel bonding, other questions were raised which force me to ask the question:

Is Ethernet bonding in SME really for fault-tolerance/failover, or for one of the other bonding scenarios, like load balancing, which would require switch configuration??  If the latter, then did my NIC really fail, or is it just a software/switch issue??

Insight appreciated.  Thanks in advance.

James



*edited to reflect proper use of switch
« Last Edit: December 17, 2009, 09:04:08 AM by NTILZA »

janet
« Reply #1 on: December 17, 2009, 12:09:54 AM »
NTILZA

Bonding in SME is primarily for throughput speed on a server-only installation with 2 NICs, where the rest of your system will support it.

NTILZA
« Reply #2 on: December 17, 2009, 08:54:58 AM »
The messages log indicates that the two NICs are being 'enslaved' "as a backup interface"...

This strongly implies that the NICs are configured for failover - not throughput.

I've skimmed over the documentation for NIC Bonding, and there are a lot of options.  For SME to 'just work', I would suspect bonding would be configured as simply as possible (failover).  If these are indeed configured for throughput, then what configuration is being used?  Switch configuration may be necessary in some modes from what I have read, and I find that a little much for an out-of-box SME installation...
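
One way to settle the mode question without guessing from log wording: the Linux bonding driver exposes a status file under /proc/net/bonding/ that names the active mode and the state of each slave. Below is a rough Python sketch that parses it. The interface name bond0, the function name parse_bonding_status, and the SAMPLE text are assumptions for illustration only; the sample follows the usual layout of that file and is not output from this server.

```python
import os

BOND_STATUS_PATH = "/proc/net/bonding/bond0"  # assumed bond interface name

# Illustrative sample in the usual layout of the status file (not real output).
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth1
MII Status: up

Slave Interface: eth0
MII Status: down
Link Failure Count: 1

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
"""


def parse_bonding_status(text):
    """Extract the bonding mode, the active slave, and per-slave link state."""
    info = {"mode": None, "active_slave": None, "slaves": {}}
    current = None  # slave section currently being read, if any
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            info["mode"] = value
        elif key == "Currently Active Slave":
            info["active_slave"] = value
        elif key == "Slave Interface":
            current = value
            info["slaves"][current] = {}
        elif current is not None and key in ("MII Status", "Link Failure Count"):
            info["slaves"][current][key] = value
    return info


# Use the live file when it exists, otherwise fall back to the sample.
if os.path.exists(BOND_STATUS_PATH):
    with open(BOND_STATUS_PATH) as handle:
        status = parse_bonding_status(handle.read())
else:
    status = parse_bonding_status(SAMPLE)

print("mode:", status["mode"])
print("active slave:", status["active_slave"])
for name, state in status["slaves"].items():
    print(name, state)
```

If the reported mode is an active-backup (fault-tolerance) one, no switch configuration should be needed; round-robin, XOR, and 802.3ad modes are the ones that generally expect matching configuration on the switch side.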

Thoughts??

Thanks again.

James