Koozali.org: home of the SME Server

eno1: Detected Hardware Unit Hang: LAN card Driver problem?

Offline stavi

  • 11
  • +0/-0
eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« on: December 29, 2022, 04:39:36 PM »
Hello,
It seems that my network card is not working 100%. Sometimes the incoming internet connection is interrupted.
The SME messages log shows:


Dec 29 08:41:52 jarvis kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:   TDH                  <cc>   TDT                  <e4>   next_to_use          <e4>   next_to_clean        <c9> buffer_info[next_to_clean]:   time_stamp           <1527b89ca>   next_to_watch        <cc>   jiffies              <1527b989c>   next_to_watch.status <0> MAC Status             <80083> PHY Status             <796d> PHY 1000BASE-T Status  <3800> PHY Extended Status    <3000> PCI Status             <10>
Dec 29 08:41:54 jarvis kernel: [1384066.200720] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
Dec 29 08:41:54 jarvis kernel: [1384066.200720]   TDH                  <cc>
Dec 29 08:41:54 jarvis kernel: [1384066.200720]   TDT                  <e4>
Dec 29 08:41:54 jarvis kernel: [1384066.200720]   next_to_use          <e4>
Dec 29 08:41:54 jarvis kernel: [1384066.200720]   next_to_clean        <c9>
Dec 29 08:41:54 jarvis kernel: [1384066.200720] buffer_info[next_to_clean]:
Dec 29 08:41:54 jarvis kernel: [1384066.200720]   time_stamp           <1527b89ca>
Dec 29 08:41:54 jarvis kernel: [1384066.200720]   next_to_watch        <cc>
Dec 29 08:41:54 jarvis kernel: [1384066.200720]   jiffies              <1527ba06c>
Dec 29 08:41:54 jarvis kernel: [1384066.200720]   next_to_watch.status <0>
Dec 29 08:41:54 jarvis kernel: [1384066.200720] MAC Status             <80083>
Dec 29 08:41:54 jarvis kernel: [1384066.200720] PHY Status             <796d>
Dec 29 08:41:54 jarvis kernel: [1384066.200720] PHY 1000BASE-T Status  <3800>
Dec 29 08:41:54 jarvis kernel: [1384066.200720] PHY Extended Status    <3000>
Dec 29 08:41:54 jarvis kernel: [1384066.200720] PCI Status             <10>
Dec 29 08:41:54 jarvis kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:   TDH                  <cc>   TDT                  <e4>   next_to_use          <e4>   next_to_clean        <c9> buffer_info[next_to_clean]:   time_stamp           <1527b89ca>   next_to_watch        <cc>   jiffies              <1527ba06c>   next_to_watch.status <0> MAC Status             <80083> PHY Status             <796d> PHY 1000BASE-T Status  <3800> PHY Extended Status    <3000> PCI Status             <10>
Dec 29 08:41:55 jarvis kernel: [1384067.212222] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
Dec 29 08:41:55 jarvis kernel: e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
Dec 29 08:41:59 jarvis kernel: [1384070.441241] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Dec 29 08:41:59 jarvis kernel: e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Dec 29 08:41:59 jarvis kernel: [1384071.266124] e1000e: eno1 NIC Link is Down
Dec 29 08:41:59 jarvis kernel: e1000e: eno1 NIC Link is Down
Dec 29 08:42:02 jarvis kernel: [1384074.273124] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Dec 29 08:42:02 jarvis kernel: e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
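
To see how often the hang occurs, entries like the ones above can be counted per day. This is only a minimal sketch: the helper name count_unit_hangs and the log path /var/log/messages are assumptions, not taken from the post. Because this log records each event twice (with and without the kernel timestamp), the raw count overstates the number of hangs.

```shell
# Count "Detected Hardware Unit Hang" lines per day in a syslog-style file.
# count_unit_hangs is a hypothetical helper name; the log path is an assumption.
count_unit_hangs() {
    grep 'Detected Hardware Unit Hang' "$1" \
        | awk '{ count[$1 " " $2]++ } END { for (day in count) print day, count[day] }' \
        | sort
}

# Example (path assumed): count_unit_hangs /var/log/messages
```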


I suspect the network card driver is at fault. It is the base driver installed by SME. How can I update it?
Here is what I have found so far:


WAN:
eno1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet ------------  netmask 255.255.255.128  broadcast --.---.--.---
        ether 34:17:eb:a1:53:9c  txqueuelen 1000  (Ethernet)
        RX packets 131794932  bytes 111342822731 (103.6 GiB)
        RX errors 0  dropped 4276  overruns 0  frame 0
        TX packets 210651591  bytes 256347429783 (238.7 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 20  memory 0xf7d00000-f7d20000

Lots of dropped RX packets (4276).
LAN:
enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.8.1  netmask 255.255.255.0  broadcast 192.168.8.255
        ether e8:de:27:01:4c:63  txqueuelen 1000  (Ethernet)
        RX packets 213078452  bytes 256094133341 (238.5 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 128300984  bytes 110600305563 (103.0 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


[root@jarvis ~]# lspci | egrep -i --color 'network|ethernet'
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-LM (rev 04)
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)



[root@jarvis ~]# lshw -class network -short
H/W path         Device      Class          Description
=======================================================
/0/100/19        eno1        network        Ethernet Connection I217-LM
/0/100/1c.4/0    enp2s0      network        RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
/1               dummy0      network        Ethernet interface



[root@jarvis ~]# ethtool -i eno1
driver: e1000e
version: 3.2.6-k
firmware-version: 0.13-4
expansion-rom-version:
bus-info: 0000:00:19.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

It's part of a home network, just a "sandbox".
Server type: Dell OptiPlex 9020 https://www.dell.com/support/home/en-uk/product-support/product/optiplex-9020-desktop/drivers

WAN card:

https://ark.intel.com/content/www/us/en/ark/products/60019/intel-ethernet-connection-i217lm.html

driver from intel:
https://www.intel.com/content/www/us/en/products/sku/60019/intel-ethernet-connection-i217lm/downloads.html
https://www.intel.com/content/www/us/en/download/15084/intel-ethernet-adapter-complete-driver-pack.html


I would like some help moving forward. Is this driver pack the right one for my card?
How can I install it?

Thank you for your answers in advance.

Hi Stavi
« Last Edit: December 29, 2022, 04:44:40 PM by stavi »

Offline ReetP

  • *
  • 3,783
  • +5/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #1 on: December 29, 2022, 11:03:35 PM »
Check the RH/CentOS 7 hardware compatibility list.

It isn't recommended to use other drivers.
...
1. Read the Manual
2. Read the Wiki
3. Don't ask for support on Unsupported versions of software
4. I have a job, wife, and kids and do this in my spare time. If you want something fixed, please help.

Bugs are easier than you think: http://wiki.contribs.org/Bugzilla_Help

If you love SME and don't want to lose it, join in: http://wiki.contribs.org/Koozali_Foundation

Offline TerryF

  • grumpy old man
  • *
  • 1,829
  • +6/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #2 on: December 30, 2022, 01:30:10 AM »
So, not that unusual. Other threads turned up by Mr Google:

https://access.redhat.com/discussions/5928261
« Last Edit: December 30, 2022, 01:34:02 AM by TerryF »
--
qui scribit bis legit

Offline bunkobugsy

  • *
  • 281
  • +4/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #3 on: December 31, 2022, 02:32:14 AM »

Offline stavi

  • 11
  • +0/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #4 on: January 02, 2023, 08:40:20 AM »
Happy New Year to Everyone!

Thanks for the help.
Strange error, previously it had Win 2012 Srv, it did not produce such an error.
I will try this driver replacement. I've never done that before. What should I watch out for? If it still doesn't work, is it possible to restore the previous state?

hello Stavi

Offline ReetP

  • *
  • 3,783
  • +5/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #5 on: January 02, 2023, 11:29:39 AM »
Quote
Strange error, previously it had Win 2012 Srv, it did not produce such an error.

But this is not Windows..... Quite possibly had some hacky Windows driver.

Quote
If it still doesn't work, is it possible to restore the previous state?

Yup. Have a read on installing, uninstalling & downgrading rpms.

Also you can buy decent compatible second hand cards for pennies.

Offline bunkobugsy

  • *
  • 281
  • +4/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #6 on: January 02, 2023, 06:57:43 PM »
wget http://elrepo.reloumirrors.net/elrepo/el7/x86_64/RPMS/kmod-e1000e-3.8.7-1.el7_9.elrepo.x86_64.rpm
yum localinstall kmod-e1000e-3.8.7-1.el7_9.elrepo.x86_64.rpm
signal-event reboot

To revert the driver (see https://www.tecmint.com/view-yum-history-to-find-packages-info/):
yum history
yum history undo top_ID_from_above

or simply
yum erase kmod-e1000e

and
signal-event reboot
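
After the reboot it is worth confirming that the new module is actually the one in use. A minimal sketch, assuming the interface is eno1 as in this thread; driver_version is a hypothetical helper that just picks the "version:" field out of `ethtool -i` output:

```shell
# Print the value of the "version:" line from `ethtool -i` output on stdin.
# driver_version is a hypothetical helper name, not part of ethtool.
driver_version() {
    awk -F': *' '$1 == "version" { print $2 }'
}

# Example usage on the server (interface name taken from this thread):
#   ethtool -i eno1 | driver_version
#   modinfo -F version e1000e    # version of the module file on disk
```

If the two values differ, the old module is probably still loaded and the machine has not been rebooted since the install.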

Offline stavi

  • 11
  • +0/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #7 on: January 03, 2023, 12:00:21 PM »
Hi,
Thank you very much for all your help.
It seems to have worked.

[root@jarvis cucc]# sudo ethtool -i eno1
driver: e1000e
version: 3.2.6-k
firmware-version: 0.13-4

I wonder if there will be any more errors.
I'm new to Linux, but I have to start somewhere :)

Offline bunkobugsy

  • *
  • 281
  • +4/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #8 on: January 03, 2023, 01:04:30 PM »
Actually not; the version is the same as before. Did you reboot?

Offline stavi

  • 11
  • +0/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #9 on: January 03, 2023, 02:05:32 PM »
ohh,  wrong clipboard :)

[root@jarvis ~]# sudo ethtool -i eno1
driver: e1000e
version: 3.8.7-NAPI

Offline stavi

  • 11
  • +0/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #10 on: January 04, 2023, 03:26:40 PM »
Damn it.

Unfortunately the error persists  :(

Jan  4 10:58:37 jarvis kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:   TDH                  <96>   TDT                  <bf>   next_to_use          <bf>   next_to_clean        <94> buffer_info[next_to_clean]:   time_stamp           <1054ffcff>   next_to_watch        <96>   jiffies              <10550128c>   next_to_watch.status <0> MAC Status             <80083> PHY Status             <796d> PHY 1000BASE-T Status  <3800> PHY Extended Status    <3000> PCI Status             <10>
Jan  4 10:58:39 jarvis kernel: [89436.537790] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
Jan  4 10:58:39 jarvis kernel: [89436.537790]   TDH                  <96>
Jan  4 10:58:39 jarvis kernel: [89436.537790]   TDT                  <bf>
Jan  4 10:58:39 jarvis kernel: [89436.537790]   next_to_use          <bf>
Jan  4 10:58:39 jarvis kernel: [89436.537790]   next_to_clean        <94>
Jan  4 10:58:39 jarvis kernel: [89436.537790] buffer_info[next_to_clean]:
Jan  4 10:58:39 jarvis kernel: [89436.537790]   time_stamp           <1054ffcff>
Jan  4 10:58:39 jarvis kernel: [89436.537790]   next_to_watch        <96>
Jan  4 10:58:39 jarvis kernel: [89436.537790]   jiffies              <105501a5c>
Jan  4 10:58:39 jarvis kernel: [89436.537790]   next_to_watch.status <0>
Jan  4 10:58:39 jarvis kernel: [89436.537790] MAC Status             <80083>
Jan  4 10:58:39 jarvis kernel: [89436.537790] PHY Status             <796d>
Jan  4 10:58:39 jarvis kernel: [89436.537790] PHY 1000BASE-T Status  <3800>
Jan  4 10:58:39 jarvis kernel: [89436.537790] PHY Extended Status    <3000>
Jan  4 10:58:39 jarvis kernel: [89436.537790] PCI Status             <10>
Jan  4 10:58:39 jarvis kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:   TDH                  <96>   TDT                  <bf>   next_to_use          <bf>   next_to_clean        <94> buffer_info[next_to_clean]:   time_stamp           <1054ffcff>   next_to_watch        <96>   jiffies              <105501a5c>   next_to_watch.status <0> MAC Status             <80083> PHY Status             <796d> PHY 1000BASE-T Status  <3800> PHY Extended Status    <3000> PCI Status             <10>
Jan  4 10:58:40 jarvis kernel: [89437.549525] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
Jan  4 10:58:40 jarvis kernel: e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
Jan  4 10:58:43 jarvis kernel: [89440.546745] e1000e 0000:00:19.0 eno1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Jan  4 10:58:43 jarvis kernel: e1000e 0000:00:19.0 eno1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None

I will try it with another NIC.

Offline ReetP

  • *
  • 3,783
  • +5/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #11 on: January 04, 2023, 03:37:18 PM »
Quote
Damn it.

Unfortunately the error persists  :(

I will try it with another NIC.

:-(

Yes, save yourself the hassle.....

Offline smeghead

  • *
  • 562
  • +0/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #12 on: May 20, 2024, 10:46:08 AM »
Holy thread resurrection batman!

I'm fighting this issue on a Dell Optiplex 9020 with its embedded NIC.

After much research & digging the general consensus is:

. use a different NIC
. use the latest driver, done!
. turn off NIC offloading (https://forum.proxmox.com/threads/intel-nic-e1000e-hardware-unit-hang.106001/)

It's a long way to go to get to the server so replacing the NIC is my last resort.

This leaves me with the offloading option. For now I've applied the offloading commands manually and it's looking promising, so I'm looking at how to run the offloading command as part of server startup. I would prefer something attached or linked to the ifup command, so that it is also invoked when ifup is run manually.

The command I need to run is:  ethtool -K eno1 gso off gro off tso off tx off rx off
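
To check whether the offloads really are off after running that command, the feature list from `ethtool -k` (lowercase k) can be filtered. A minimal sketch; offloads_still_on is a hypothetical helper name, and the feature names are the ones ethtool normally prints for these settings:

```shell
# Read `ethtool -k <iface>` output on stdin and print the offload features
# from the command above that are still reported as "on".
# offloads_still_on is a hypothetical helper name.
offloads_still_on() {
    grep -E '^(rx-checksumming|tx-checksumming|tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload): on' \
        | cut -d: -f1
}

# Example usage (interface name taken from this thread):
#   ethtool -k eno1 | offloads_still_on
```

Empty output means all five features are off.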

Of course, a way to make these settings permanent would be optimal, but as far as I can see that is only possible with nmcli. NetworkManager is available in the updates repo if that is the best way to do it.

Any suggestions or recommendations on where and how best to implement this?

Cheers all
..................

Offline mmccarn

  • *
  • 2,635
  • +10/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #13 on: May 20, 2024, 01:11:50 PM »
It looks to me like ethtool options are supported in the config database.

Quote from: /etc/e-smith/templates/etc/sysconfig/network-scripts/ifcfg-ethX/10ETHTOOL
{
    my $this_device;

    $this_device = \%InternalInterface if $is_internal;
    $this_device = \%ExternalInterface if $is_external;
    return unless $this_device;

    my $ethtool_options = $this_device->{EthtoolOpts};
    return unless $ethtool_options;

    return "ETHTOOL_OPTS=\"$ethtool_options\"";
}

Combining that with this post from serverfault:
https://serverfault.com/questions/463111/how-to-persist-ethtool-settings-through-reboot

These commands insert the indicated values into ifcfg-xxxx, but I don't have an offloading NIC to test with:

Code: (WAN)
config setprop ExternalInterface EthtoolOpts "-K $(config getprop ExternalInterface Name) gso off gro off tso off tx off rx off"
signal-event console-save

Code: (LAN)
config setprop InternalInterface EthtoolOpts "-K $(config getprop InternalInterface Name) gso off gro off tso off tx off rx off"
signal-event console-save
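
To confirm that the template wrote the option, the generated ifcfg file can be inspected once console-save has run. A minimal sketch: ethtool_opts_of is a hypothetical helper, and the file path in the example is an assumption based on the template location quoted above and the interface name eno1.

```shell
# Print the value of ETHTOOL_OPTS from an ifcfg-style file.
# ethtool_opts_of is a hypothetical helper name.
ethtool_opts_of() {
    sed -n 's/^ETHTOOL_OPTS="\(.*\)"$/\1/p' "$1"
}

# Example usage (path assumed):
#   ethtool_opts_of /etc/sysconfig/network-scripts/ifcfg-eno1
```

No output means the ETHTOOL_OPTS line was not generated, for example because the EthtoolOpts property is empty.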

Note (for others who might end up here) -
I assume that "ETHTOOL_OPTS" would not do anything unless ethtool has been installed with yum install ethtool...
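
A quick way to check that assumption on a given server is to test whether the ethtool binary is on the PATH. A minimal sketch; have_cmd is a hypothetical helper name:

```shell
# Return success if the named command is available on the PATH.
# have_cmd is a hypothetical helper name.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if ! have_cmd ethtool; then
    echo "ethtool is not installed; try: yum install ethtool"
fi
```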

Offline smeghead

  • *
  • 562
  • +0/-0
Re: eno1: Detected Hardware Unit Hang: LAN card Driver problem?
« Reply #14 on: May 21, 2024, 12:08:44 AM »
Awesome, thanks.

I didn't expect ethtool to already be supported
..................