On 3/18/25 09:41, Wells, Alex (DLSLtd,RAL,LSCI) wrote:
> Hi Michael,
>
> I can confirm the CPU use goes to zero when the network cable is unplugged. We do image and file loading over the network, but once all that has completed I can pull the cable out and observe no CPU use by that thread. The system also becomes fully responsive on the console, which correlates with the significant drop in CPU use.
>
> I have only tested RTEMS 5.1, which I believe is using the "legacy" IP stack. In a few months, during the next shutdown, I may be able to test RTEMS 6, but we have no particular reason to expect the newer stack to work better: the MVME5500 is over 20 years old now, so the new stack is unlikely to be optimized for its use case.
I concur with your assessment. Part of the reason that RTEMS
is juggling two different IP stacks is that not all of the
NIC drivers exist in the newer libbsd stack.
The fact that the CPU load under VxWorks was apparently lower
does suggest that the RTEMS NIC driver could be improved,
although whether that would be cost-effective for such an old
board is debatable.
I think the network code in VxWorks also delegates the job of examining and dispatching incoming packets to a lower-priority thread ("tNet0" IIRC). RTEMS may be doing that in the ISR, or in a thread at a higher priority than the shell.
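If that is what's happening, the legacy stack at least makes the network task's priority configurable, which might be worth an experiment. A sketch only, assuming the legacy rtems_bsdnet configuration (netdriver_config stands in for the board's NIC attach entry, and the right value relative to the IOC's CA threads would need testing):

    #include <rtems/rtems_bsdnet.h>

    /* Board-specific NIC attach configuration; the name is a placeholder. */
    extern struct rtems_bsdnet_ifconfig netdriver_config;

    struct rtems_bsdnet_config rtems_bsdnet_config = {
        .ifconfig = &netdriver_config,
        .bootp = rtems_bsdnet_do_bootp,
        /* Default is 100; in RTEMS a larger number means lower urgency,
         * so this moves packet dispatch below more time-critical tasks. */
        .network_task_priority = 150,
    };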
Failing that, your practical choices may be limited to isolating this crate behind a CA gateway to limit broadcast traffic.
We use CA name-servers for that. Our IOCs live in a different subnet to our client workstations, so they don't see the client broadcasts, but the clients do connect directly to the IOCs over TCP, so there are no delays from an intermediate CA gateway. This is probably most important for IOCs that provide large arrays, since the gateway's server side is single-threaded. We do use gateways to protect IOCs that don't have much memory and are easily overloaded with too many clients, but those don't have high bandwidth needs.
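On the client side the configuration is small. As a sketch in C (the host names are made up; in practice we just set these two EPICS_CA_* variables in the clients' environment rather than in code):

    /* A CA client that resolves PV names via name servers over TCP
     * instead of UDP broadcast searches. */
    #include <envDefs.h>
    #include <cadef.h>

    int main(void)
    {
        /* Suppress the automatic broadcast search address list... */
        epicsEnvSet("EPICS_CA_AUTO_ADDR_LIST", "NO");
        /* ...and send name lookups over TCP to these (hypothetical) hosts. */
        epicsEnvSet("EPICS_CA_NAME_SERVERS",
                    "cans1.example.org cans2.example.org");

        SEVCHK(ca_context_create(ca_disable_preemptive_callback),
               "ca_context_create");
        /* ca_create_channel(), ca_pend_io() etc. as usual from here. */
        ca_context_destroy();
        return 0;
    }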
Here are some results of snooping for searches for 1 minute in our IOC and client subnets simultaneously; "Rate" is PV searches per second:
IOC subnet:    2025-03-18 17:01:48.223 Clients= 72 Searches= 3397 Rate= 56.59
Client subnet: 2025-03-18 17:01:48.440 Clients=484 Searches=23277 Rate= 387.79
We could further reduce the searches in the IOC subnet by moving our beamline gateways to the client subnet.
- Andrew
--
Complexity comes for free, Simplicity you have to work for.
> *From:* Michael Davidsaver <mdavidsaver at gmail.com>
> *Sent:* Friday, March 14, 2025 4:39 PM
> *To:* Wells, Alex (DLSLtd,RAL,LSCI) <alex.wells at diamond.ac.uk>
> *Cc:* tech-talk at aps.anl.gov <tech-talk at aps.anl.gov>
> *Subject:* Re: Channel Access performance with RTEMS on MVME5500
> Hello Alex,
>
> As a quick test, have you tried unplugging the ethernet cable to ensure that
> the reported CPU usage for CAS-UDP goes to zero?
>
> Also, which RTEMS version(s) have you tested?
>
> Is this the RTEMS "legacy" IP stack, or the newer libbsd stack?
>
>
> On 3/14/25 06:30, Wells, Alex (DLSLtd,RAL,LSCI) via Tech-talk wrote:
>> Hello Tech Talk,
>>
>> I'm in the process of moving some VME crates from VxWorks to RTEMS. This also moves from EPICS 3.14.12.7 to EPICS 7.0.7. I'm seeing significant amounts of CPU activity in the "CAS-UDP" thread, and was hoping someone may be able to explain it.
>>
>> Averaged over time, RTEMS reports that the CAS-UDP thread is using ~40% of the CPU. This persists even when removing all the processing from all records (thus removing all other possible CPU use). During initialization of the IOC, and for several minutes after initialization has finished, I see 80%+ CPU usage on the IOC overall, mostly on this one thread, and it makes the overall IOC very unresponsive. This same behaviour also happens intermittently during operation, where even record processing seems to become delayed due to Channel Access processing - for example, the devIocStats Heartbeat record that should just increase by 1 per second sometimes pauses for multiple seconds.
>>
>> The network the IOC is on is requesting a fair number of PVs from this IOC. CaSnooper shows between 20 and 40 requests per second to PV(s) on this IOC. Approximately 500 PVs are being requested from the IOC. These requests are spread across approximately 30 separate clients.
>>
>> When I compare this to the VxWorks version, running on the same hardware and network, I see almost 0% CPU usage on the CAS-UDP thread, and no record processing hitches. The CPU load at idle is also significantly lower than under RTEMS.
>>
>> Our current theory is that the RTEMS network stack for UDP processing is much less efficient, and is struggling with the volume of requests. Is anyone able to confirm/deny this, and does anyone have any potential workarounds?
>>
>> Thanks,
>> Alex Wells
>> Diamond Light Source