Bill,
The UDP server is receiving undecipherable protocol from port
1030 from several different IP addresses (the R3.12 message
prints the source IP address as a hexadecimal 32 bit word which
may need to be byte swapped).
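To make the address decoding concrete, here is a minimal Python sketch that turns the hexadecimal word from the "bad UDP msg" line into dotted-quad form, both as printed and byte swapped. The helper name decode_addr is my own, made up for illustration; it is not part of EPICS or vxWorks.

```python
import socket
import struct

def decode_addr(hex_word):
    """Return (as_printed, byte_swapped) dotted-quad readings of a 32-bit hex word."""
    value = int(hex_word, 16)
    # Read the word most-significant byte first, i.e. in the order printed.
    as_printed = socket.inet_ntoa(struct.pack(">I", value))
    # Read the same word with its four bytes reversed.
    swapped = socket.inet_ntoa(struct.pack("<I", value))
    return as_printed, swapped

# Address words copied from the messages quoted below.
for word in ("7f000001", "82c73cc1", "82c73cbe"):
    print(word, *decode_addr(word))
```

Read in the order printed, 7f000001 is 127.0.0.1 and 82c73cc1 is 130.199.60.193, which falls inside the subnet that ifShow reports in the quoted message. That suggests these particular words do not need to be swapped.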
For CA, the fact that the client's UDP port does not vary is odd,
as client ports are typically dynamically assigned and would not
normally be the same for so many different source addresses.
What if anything might I do to debug?
o You might have some rogue program that is using CA's ports, but
why from so many different source addresses? Therefore I am
inclined to guess that something else is occurring - probably
packet damage.
o We have seen trouble with corrupt UDP messages in the past when
an IOC has failed its 10/100 auto-negotiation. This typically
occurs (with vxWorks) if the IOC is turned on before the switch
is turned on. If you have the old thin-wire Ethernet, then packet
damage might also occur because of improper thin-wire branching:
signals reflected off the ends of the branches arrive with time
delays (phase errors) and are summed with the original signal.
There are Ethernet-level CRCs and IP-level checksums for the
purpose of catching and discarding bad frames, but they are not
particularly effective if many bits in the frame are damaged. In
that scenario the source address, destination address, source
port, and destination port might be damaged and the packet might
still pass through the CRC and checksum filters.
You are not seeing any errors at the Ethernet level based on
"ifShow" (unless of course your vxWorks interface driver does not
bother to increment the error counters). You might also have a
look at the IP level error counters by typing ipstatShow.
o I sometimes employ sniffer programs like etherfind, tcpdump, or
a standalone packet sniffer to investigate this sort of issue.
You appear to have a corrupt Ethernet broadcast packet, based on
the number of systems involved, so you might set your sniffer to
filter for Ethernet broadcasts. If you see bogus packet header
fields, that might point to corruption.
o You might also try selectively disconnecting portions of your
network, while waiting to see if the corrupt messages stop, to
isolate the sender.
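On the checksum remark above, here is a small Python sketch of why a damaged header can still pass. It assumes the 16-bit ones' complement sum used at the IP/UDP level (in the style of RFC 1071) and does not model the Ethernet CRC-32, which is a separate and stronger check. The byte values are hypothetical, chosen only to look like two addresses on this subnet.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Hypothetical 8-byte field: a source address followed by a destination address.
good = bytes([130, 199, 60, 193, 130, 199, 61, 255])
# "Damage" it by exchanging the second and fourth 16-bit words.
damaged = good[0:2] + good[6:8] + good[4:6] + good[2:4]

print(good != damaged)
print(internet_checksum(good) == internet_checksum(damaged))
```

Because the sum does not depend on the order of the 16-bit words, the two different byte strings produce the same checksum. Errors that cancel each other in the sum behave the same way.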
Jeff
-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Thursday, April 28, 2005 12:04 PM
To: [email protected]
Cc: [email protected]
Subject: bad UDP msg
Hi,
I see that this subject is in the archives, but I am not having
much luck tracking this down.
A few days back we started seeing (from a window where the alh is
started):
CAC: post_msg(): Corrupt cmd in msg d00
../iocinf.c: bad UDP msg from port=47585 addr=7f000001
Of course any window that opens an medm also shows the message:
CAC: Undecipherable UDP message from 127.0.0.1:47615
CAC: Undecipherable ( bad msg code 3328 ) UDP message from 127.0.0.1:47615 at Thu Apr 28 2005 13:10:25
(Of course, the port in 127.0.0.1:47615 is dynamically assigned,
so 47615 varies.)
A telnet to some (not all) of the processors shows:
CAC: post_msg(): Corrupt cmd in msg d00
../iocinf.c: bad UDP msg from port=1030 addr=82c73cc1
iocLogtext shows
vtpc2.starp.bnl.gov Thu Apr 28 13:32:23 2005 ../iocinf.c: bad UDP msg from port=1030 addr=82c73cbe
vtpc5.starp.bnl.gov Thu Apr 28 13:32:23 2005 ../iocinf.c: bad UDP msg from port=1030 addr=82c73cc1
stargate.starp.bnl.gov Thu Apr 28 13:32:23 2005 ../iocinf.c: bad UDP msg from port=1030 addr=82c73d30
vtpc7.starp.bnl.gov Thu Apr 28 13:32:23 2005 ../iocinf.c: bad UDP msg from port=1030 addr=82c73d4e
vtpc4.starp.bnl.gov Thu Apr 28 13:32:23 2005 ../iocinf.c: bad UDP msg from port=1030 addr=82c73cc0
creighton5.starp.bnl.gov Thu Apr 28 13:32:23 2005 ../iocinf.c: bad UDP msg from port=1030 addr=82c73ce5
One processor gives this on port=1029, several others on 1031,
and one on both 1031 and 1032.
For this processor
-> ifShow
ei (unit number 0):
Flags: (0x63) UP BROADCAST ARP RUNNING
Internet address: 130.199.60.188
Broadcast address: 130.199.61.255
Netmask 0xffff0000 Subnetmask 0xfffffe00
Ethernet address is 08:00:3e:28:f1:f5
Metric is 0
Maximum Transfer Unit size is 1500
3758343 packets received; 1173400 packets sent
0 input errors; 0 output errors
5579 collisions
lo (unit number 0):
Flags: (0x69) UP LOOPBACK ARP RUNNING
Internet address: 127.0.0.1
Netmask 0xff000000 Subnetmask 0xff000000
Metric is 0
Maximum Transfer Unit size is 4096
433567 packets received; 433567 packets sent
0 input errors; 0 output errors
0 collisions
value = 18 = 0x12
inetstatShow
Active Internet connections (including servers)
PCB      Proto Recv-Q Send-Q Local Address      Foreign Address    (state)
-------- ----- ------ ------ ------------------ ------------------ -------
f8860c   TCP        0      0 130.199.60.188.506 130.199.60.238.526 ESTABLISHED
cb688c   TCP        0      0 130.199.60.188.23  130.199.60.27.4141 ESTABLISHED
fb710c   TCP        0      0 130.199.60.188.506 130.199.60.27.4022 ESTABLISHED
fb640c   TCP        0      0 130.199.60.188.506 130.199.60.27.6085 ESTABLISHED
cc858c   TCP        0      0 130.199.60.188.103 130.199.61.48.5064 ESTABLISHED
f8858c   TCP        0      0 130.199.60.188.506 130.199.60.27.6085 ESTABLISHED
fb5e8c   TCP        0      0 0.0.0.0.5064       0.0.0.0.0          LISTEN
fb650c   TCP        0      0 130.199.60.188.103 130.199.60.27.7004 CLOSE_WAIT
fb688c   TCP        0      0 0.0.0.0.111        0.0.0.0.0          LISTEN
fb6c8c   TCP        0      0 0.0.0.0.21         0.0.0.0.0          LISTEN
fb6a0c   TCP        0      0 0.0.0.0.1008       0.0.0.0.0          LISTEN
fb6b0c   TCP        0      0 0.0.0.0.23         0.0.0.0.0          LISTEN
fb6c0c   TCP        0      0 0.0.0.0.513        0.0.0.0.0          LISTEN
cb670c   UDP        0      0 0.0.0.0.978        0.0.0.0.0
fb610c   UDP        0      0 0.0.0.0.980        0.0.0.0.0
fb608c   UDP        0      0 0.0.0.0.982        0.0.0.0.0
fb678c   UDP        0      0 130.199.60.188.103 130.199.60.188.103
f8850c   UDP        0      0 130.199.60.188.103 130.199.60.188.102
cc838c   UDP        0      0 0.0.0.0.1030       0.0.0.0.0
fb630c   UDP        0      0 0.0.0.0.1029       0.0.0.0.0
fb658c   UDP        0      0 0.0.0.0.5064       0.0.0.0.0
fb660c   UDP        0      0 0.0.0.0.5065       0.0.0.0.0
f8818c   UDP        0      0 0.0.0.0.1028       0.0.0.0.0
fb5d8c   UDP        0      0 0.0.0.0.1027       0.0.0.0.0
f8870c   UDP        0      0 0.0.0.0.1026       0.0.0.0.0
fb6d0c   UDP        0      0 0.0.0.0.111        0.0.0.0.0
value = 1 = 0x1
Another processor suspiciously (?) shows
ifShow
ei (unit number 0):
Flags: (0x63) UP BROADCAST ARP RUNNING
Internet address: 130.199.60.193
Broadcast address: 130.199.61.255
Netmask 0xffff0000 Subnetmask 0xfffffe00
Ethernet address is 08:00:3e:28:ed:fa
Metric is 0
Maximum Transfer Unit size is 1500
3192815 packets received; 233539 packets sent
4 input errors; 0 output errors
1667 collisions
lo (unit number 0):
Flags: (0x69) UP LOOPBACK ARP RUNNING
Internet address: 127.0.0.1
Netmask 0xff000000 Subnetmask 0xff000000
Metric is 0
Maximum Transfer Unit size is 4096
168135 packets received; 168135 packets sent
0 input errors; 0 output errors
0 collisions
value = 18 = 0x12
inetstatShow
Active Internet connections (including servers)
PCB      Proto Recv-Q Send-Q Local Address      Foreign Address    (state)
-------- ----- ------ ------ ------------------ ------------------ -------
a2fc0c   TCP        0      0 130.199.60.193.506 130.199.60.27.4021 ESTABLISHED
fb620c   TCP        0      0 130.199.60.193.506 130.199.60.27.3368 ESTABLISHED
f6258c   TCP        0      0 130.199.60.193.506 130.199.61.48.1038 ESTABLISHED
fb610c   TCP        0      0 130.199.60.193.506 130.199.60.159.264 ESTABLISHED
f6368c   TCP        0      0 130.199.60.193.506 130.199.60.27.5899 ESTABLISHED
f6240c   TCP        0      0 130.199.60.193.506 130.199.60.27.5898 ESTABLISHED
f5f58c   TCP        0      0 0.0.0.0.5064       0.0.0.0.0          LISTEN
f6358c   TCP        0      0 130.199.60.193.103 130.199.60.27.7004 CLOSE_WAIT
fb688c   TCP        0      0 0.0.0.0.111        0.0.0.0.0          LISTEN
fb6c8c   TCP        0      0 0.0.0.0.21         0.0.0.0.0          LISTEN
fb6a0c   TCP        0      0 0.0.0.0.1008       0.0.0.0.0          LISTEN
fb6b0c   TCP        0      0 0.0.0.0.23         0.0.0.0.0          LISTEN
fb6c0c   TCP        0      0 0.0.0.0.513        0.0.0.0.0          LISTEN
a2fd0c   UDP        0      0 0.0.0.0.926        0.0.0.0.0
fb668c   UDP        0      0 0.0.0.0.928        0.0.0.0.0
f6228c   UDP        0      0 0.0.0.0.930        0.0.0.0.0
fb690c   UDP        0      0 0.0.0.0.932        0.0.0.0.0
a2fe0c   UDP        0      0 0.0.0.0.934        0.0.0.0.0
f6248c   UDP        0      0 0.0.0.0.936        0.0.0.0.0
a0b80c   UDP        0      0 0.0.0.0.938        0.0.0.0.0
f5f38c   UDP        0      0 0.0.0.0.940        0.0.0.0.0
f5f20c   UDP        0      0 0.0.0.0.942        0.0.0.0.0
fb648c   UDP        0      0 0.0.0.0.944        0.0.0.0.0
f6348c   UDP        0      0 0.0.0.0.946        0.0.0.0.0
f60d8c   UDP        0      0 0.0.0.0.948        0.0.0.0.0
fb658c   UDP        0      0 0.0.0.0.950        0.0.0.0.0
fb6e8c   UDP        0      0 130.199.60.193.103 130.199.60.193.102
f5f50c   UDP        0      0 0.0.0.0.1029       0.0.0.0.0
f60c0c   UDP        0      0 0.0.0.0.5064       0.0.0.0.0
f5f40c   UDP        0      0 0.0.0.0.5065       0.0.0.0.0
f60c8c   UDP        0      0 0.0.0.0.1028       0.0.0.0.0
f60b8c   UDP        0      0 0.0.0.0.1027       0.0.0.0.0
fb638c   UDP        0      0 0.0.0.0.1026       0.0.0.0.0
fb6d0c   UDP        0      0 0.0.0.0.111        0.0.0.0.0
value = 1 = 0x1
----
So I have rebooted the host, and also many of the processors.
We are running, and I won't have long access to these processors
or the terminal server for a few days. What if anything might I
do to debug?
Note this is old EPICS 3.12, and I also don't have caSnooper etc.
These messages are only annoying right now, as they don't seem to
have any effect on our controls.
Right now only I lose sleep over this!
I greatly appreciate any advice or suggestions.
Thanks,
Bill Waggoner
STAR
Creighton University
BNL
[email protected]