Subject: | RE: Ethernet question |
From: | Mark Rivers via Tech-talk <[email protected]> |
To: | "'J. Lewis Muir'" <[email protected]> |
Cc: | "[email protected]" <[email protected]> |
Date: | Tue, 7 Jan 2020 00:28:29 +0000 |
> All of the packets are still going through the same 10 GbE switch, though, right (i.e., the one labeled "10 Gbit switch #1" in the network path diagram you included in a message upthread)?
> So, that switch has been involved in all of the tests conducted so far, so it still could be the problem, right?

No, the two different Linux systems have different switches. The one I showed in a previous message is on the APS experiment hall floor and is our CentOS 7 server corvette. The second Linux machine I have tested is an Ubuntu 18 system in my office. Its topology is:

Linux machine (has both 10 Gbit and 1 Gbit NICs)
    |
10 Gbit switch #1 (in my office)
    |  (1 Gbit uplink)
1 Gbit switch #2 (in APS network closet)
    |
(possibly additional switches in here, I'm not sure)
    |
1 Gbit switch
    |
Device (10 Mbit AUI)

> Also, are the 10 GbE switches labeled "10 Gbit switch #1" and "10 Gbit switch #2" in the network path diagram identical and running the same OS or firmware version, or are they different?

Different OS and firmware. Switch #1 in the system on the floor is a Dell N1548, while switch #1 in my office is a Netgear X5712T. Switch #2 in both cases is managed by the APS in the network closet. It is definitely not a Dell, but is probably an HP or Cisco, and they may or may not be the same switch.

> Also, are those two switches managed or unmanaged? If managed, can you find out what the switch has set the speed and duplex to for the ports involved to ensure that it set them correctly?

Switches #1 and #2 are managed for both configurations. The link between them is definitely 10 Gbit full-duplex. The final 1 Gbit switch to the device is unmanaged. But it clearly is configured correctly because when the Linux NIC is 1 Gbit it works fine. And using a different NIC on Linux has no effect on the configuration of the port on the final 1 Gbit switch. And I have tested 2 devices that are connected to 2 different 1 Gbit switches. They are both Dell but different models.

> Could you show the output of "ethtool -S p5p1" just in case it shows more detail about exactly what it means by RX "frame"?

Here is the current output of ifconfig and ethtool (abbreviated). ifconfig "frame" is 235, which is the same as ethtool "rx_length_errors", so those are the same thing. They are not CRC errors, which is what I think Michael was assuming.

corvette:areaDetector/ADCore/iocBoot>/sbin/ifconfig p5p1
p5p1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 164.54.160.82  netmask 255.255.255.0  broadcast 164.54.160.255
        inet6 fe80::3efd:feff:fea3:f258  prefixlen 64  scopeid 0x20<link>
        ether 3c:fd:fe:a3:f2:58  txqueuelen 1000  (Ethernet)
        RX packets 147929456684  bytes 111438844085337 (101.3 TiB)
        RX errors 0  dropped 920  overruns 0  frame 235
        TX packets 100625271243  bytes 29596808110595 (26.9 TiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

corvette:areaDetector/ADCore/iocBoot>/sbin/ethtool -S p5p1
NIC statistics:
     rx_packets: 147929466577
     tx_packets: 100625281563
     rx_bytes: 111438844918608
     tx_bytes: 29596809166293
     rx_errors: 0
     tx_errors: 0
     rx_dropped: 891
     tx_dropped: 0
     collisions: 0
     rx_length_errors: 235
     rx_crc_errors: 0
     rx_unicast: 147182042495
     tx_unicast: 100555363340
     rx_multicast: 5115
     tx_multicast: 412
     rx_broadcast: 747419793
     tx_broadcast: 69917765
     rx_unknown_protocol: 0
     tx_linearize: 2166
     tx_force_wb: 0
     rx_alloc_fail: 0
     rx_pg_alloc_fail: 0

> Is the NIC driver the same for the 1 GbE and the 10 GbE NICs on Linux?

I'm not sure. How can I tell that?

> Do you have a Windows machine with a 10 GbE NIC that you could try?

Yes, I could try that, but I have not yet.

> Do you have a Mac with a 10 GbE NIC that you could try?

No.

I should add that I normally actually communicate with these devices from vxWorks, going through the same switches, and that is working fine and has been for 20 years (with switch upgrades over the years of course). I would like to move from vxWorks to Linux but have hit this problem with the 10 Gbit NICs.

Thanks,
Mark

-----Original Message-----

On 12/23, Mark Rivers via Tech-talk wrote:
> On further investigation I found that the problem only occurs when using 10 Gbit Ethernet adapters. When using a 1 Gbit adapter it works fine. I was able to see this in a single machine that has both 10 Gbit and 1 Gbit adapters.
> If I use the 1 Gbit adapter it works fine, if I use the 10 Gbit adapter it fails (but kind of works as described above). On 3 separate systems 1 Gbit works, and 3 other systems 10 Gbit fails.

Is the NIC driver the same for the 1 GbE and the 10 GbE NICs on Linux?

Do you have a Windows machine with a 10 GbE NIC that you could try?

Do you have a Mac with a 10 GbE NIC that you could try?

Lewis
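
On the question of how to tell whether the 1 GbE and 10 GbE NICs use the same driver: on Linux, `ethtool -i <interface>` prints a `driver:` line, and it uses the same "name: value" layout as the `ethtool -S` counters quoted above. The following is only a rough Python sketch of how the two outputs could be read and compared; it is not from the thread. The function names are made up for illustration, and it assumes `ethtool` is installed and on the PATH.

```python
import subprocess


def parse_ethtool(text):
    """Parse the 'name: value' lines printed by `ethtool -i` or `ethtool -S`."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as PCI addresses
        # (which contain colons themselves) stay intact.
        key, sep, value = line.partition(":")
        value = value.strip()
        # Skip blank lines and headers such as "NIC statistics:".
        if not sep or not value:
            continue
        fields[key.strip()] = value
    return fields


def nonzero_error_counters(stats):
    """Return counters whose name mentions errors or drops and whose value is not 0."""
    return {
        name: int(value)
        for name, value in stats.items()
        if ("error" in name or "dropped" in name) and int(value) != 0
    }


def run_ethtool(option, interface):
    """Run `ethtool <option> <interface>` and return the parsed output."""
    result = subprocess.run(
        ["ethtool", option, interface],
        capture_output=True, text=True, check=True,
    )
    return parse_ethtool(result.stdout)


def same_driver(interface_a, interface_b):
    """Compare the 'driver' field reported by `ethtool -i` for two interfaces."""
    driver_a = run_ethtool("-i", interface_a)["driver"]
    driver_b = run_ethtool("-i", interface_b)["driver"]
    return driver_a == driver_b
```

For example, `same_driver("p5p1", "em1")` would compare the 10 Gbit interface with a second one; `em1` is only a placeholder, since the thread does not name the 1 Gbit interface. Likewise, `nonzero_error_counters(run_ethtool("-S", "p5p1"))` would list only the error and drop counters that are not zero.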