On Thursday, 13 October 2011, at 20:27:44, Shankar, Murali wrote:
> Thank you. I tried this out on my dev box and /proc/cpuinfo happily reports
> only one CPU.
>
> The test with CA alone performs better in that it succeeds about 50% of the
> time. But a little less than 50% of the time, I still get monitors that
> have not received callbacks.
>
> The original test with sequence programs performs much better. So far, I
> have been able to reproduce the issue only once in about 10 tests. But the
> issue still occurs.
>
> I have attached the thread dumps as well.
Sorry for keeping silent so long; I was sick for the last few days.
Jeff, you should note that one peculiarity of the problem is that those
channels that miss the initial event after connecting *never* again get one.
The "undelivered" number (from dbel output) just keeps increasing until it
stabilizes at
epics> dbel t1218 10
VAL { VALUE ALARM } undelivered=96, thread=0x11214b0, unused entries=32,
discarded by replacement=150, duplicate count =95
when CA starts discarding new entries. At least that is what I observe here
(on a dual core AMD Phenom(tm) II X2 550 Processor running linux kernel
2.6.35-30). It should be possible to find out what is happening by looking
(with gdb) at the task that supposedly takes entries off the queue and sends
them to the client.
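To check whether the "undelivered" counter keeps growing across repeated dbel calls, the counters can be pulled out of the output with a few lines of Python. This is only a sketch: the function name parse_dbel_counters is made up for illustration, and the line format is assumed to be exactly that of the sample quoted above.

```python
import re

# One "name=value" pair; names may contain spaces, values are decimal or 0x-hex.
_FIELD = re.compile(r"([A-Za-z][A-Za-z ]*?)\s*=\s*(0x[0-9a-fA-F]+|\d+)")


def parse_dbel_counters(text):
    """Return the counters of one dbel event entry as a dict of ints.

    Assumes the layout of the sample above: a field name, an event
    mask in braces, then comma-separated "name=value" pairs.
    """
    # Ignore the field name and the "{ VALUE ALARM }" mask, if present.
    _, _, tail = text.rpartition("}")
    counters = {}
    for key, raw in _FIELD.findall(tail):
        base = 16 if raw.lower().startswith("0x") else 10
        counters[key.strip()] = int(raw, base)
    return counters


sample = (
    "VAL { VALUE ALARM } undelivered=96, thread=0x11214b0, unused entries=32,\n"
    "discarded by replacement=150, duplicate count =95"
)
counters = parse_dbel_counters(sample)
```

Comparing counters["undelivered"] between two calls a few seconds apart shows whether the queue is still filling up or has already reached the point where new entries are discarded.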
Anyway, if the client waits for all channels to receive their initial monitor
event before doing anything else, then the problem seems to disappear. In my
tests, I can decrease the probability of the bug happening by setting the SCAN
field of every record that has received one monitor to Passive. The dbel
output I pasted above comes from such a run: there was exactly this one record
remaining that produces events at all, all other 1999 records have been set to
Passive.
On the other hand, if I disconnect the channel after it has received its first
monitor event (using pvAssign(VAL,"") or by exiting the program) the
probability /increases/ greatly (i.e. I get many more channels that are in the
buggy state of never again receiving an event).
BTW, one of the reasons this was never observed before is that waiting for
initial monitor events is the default behaviour in SNL and practically
everyone leaves it at that. It is only because this feature is broken in
seq-2.1.2 that Mike couldn't "fix" this simply by enabling the +c option (i.e.
reverting to the default). The bug in seq-2.1.2 is that (by default or if
option +c is given) the sequencer only waits for connections to be established,
but not for the initial monitors. I will publish a new version that fixes this
problem (the trivial bug in the sequencer, not the subtle one in CA) as soon
as I feel fit enough.
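The difference between waiting for connections and waiting for initial monitors can be illustrated with a small Python sketch. It does not use the CA or sequencer API: ChannelGate and its methods are hypothetical names, and the two callbacks are assumed to be invoked by whatever client library is in use.

```python
import threading


class ChannelGate:
    """Tracks which channels have connected and which have seen a first monitor event."""

    def __init__(self, names):
        self._names = set(names)
        self._connected = set()
        self._monitored = set()
        self._cond = threading.Condition()

    def on_connect(self, name):
        # Assumed to be called from the client library's connection callback.
        with self._cond:
            self._connected.add(name)
            self._cond.notify_all()

    def on_monitor(self, name, value):
        # Assumed to be called from the monitor callback; only the first event matters here.
        with self._cond:
            self._monitored.add(name)
            self._cond.notify_all()

    def wait_connected(self, timeout=None):
        # Waiting only for this condition is what the text attributes to seq-2.1.2.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._connected >= self._names, timeout)

    def wait_first_monitors(self, timeout=None):
        # The intended behaviour: every channel has delivered one event.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._monitored >= self._names, timeout)

    def missing_monitors(self):
        with self._cond:
            return sorted(self._names - self._monitored)
```

A client would call wait_first_monitors() with a timeout before entering its main loop and log missing_monitors() if the wait fails, which also gives a quick list of the channels stuck in the buggy state.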
Cheers
Ben