EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: crash in CAS-event (high load?)
From: Heinz Junkes via Tech-talk <tech-talk at aps.anl.gov>
To: EPICS Tech Talk <tech-talk at aps.anl.gov>
Date: Tue, 21 Sep 2021 12:03:45 +0200
I am having problems with an RTEMS (5.1) system running EPICS 7.

The system runs on an MVME6100.

A Struck SIS3316 (digitizer, 16 channels) is read out at 10 Hz.
The data is stored in PVs (4000 values of type DOUBLE per channel),
which amounts to approx. 5 MByte/s.
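
As a rough check of the quoted rate, the arithmetic can be sketched as follows. This assumes 8 bytes per DOUBLE value and that all 16 channels are updated on every 10 Hz readout; the constant names are illustrative only and protocol overhead is ignored.

```python
# Back-of-the-envelope data rate for the readout described above.
# Assumptions: one DOUBLE element occupies 8 bytes, every channel is
# updated on each readout, and protocol overhead is not counted.
CHANNELS = 16
VALUES_PER_CHANNEL = 4000
BYTES_PER_DOUBLE = 8
READOUT_HZ = 10

bytes_per_readout = CHANNELS * VALUES_PER_CHANNEL * BYTES_PER_DOUBLE
bytes_per_second = bytes_per_readout * READOUT_HZ

print(f"{bytes_per_readout} bytes per readout")
print(f"{bytes_per_second / 1e6:.2f} MByte/s of waveform data")
```

This is the payload produced by the IOC itself. Each Channel Access subscriber receives its own copy of every update, so the outgoing traffic grows with the number of subscribed clients.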

Several clients subscribe to these PVs.

Unfortunately, the system crashes after a variable amount of time (sometimes after 5 minutes,
sometimes after 2 hours).

It looks to me as if the crash happens in a CAS-event task.

I don't know the best way to track down the problem (how should I debug this?).
Setting CASDEBUG produces more output than I can work through.

Regards, Heinz


… 
         CAS-TCP      0x1bf14d8    184614936     18      46       OK
         CAS-UDP      0x1bf1640    184614937     16      41       OK
      CAS-beacon      0x1bf19c0    184614938     17      44       OK
       CAS-event      0x1d85440    184614939     19      49       OK
      CAS-client      0x1d855b0    184614940     20      51       OK
    save_restore       0xc8f208    184614941     20      51       OK
       CAS-event      0x1fcac68    184614942     35      89       OK
      CAS-client      0x1fcb260    184614943     36      92       OK
       CAS-event      0x1fda060    184614944     19      49       OK
      CAS-client      0x1fda1d0    184614914     20      51       OK
       CAC-event      0x1fd84e8    184614918     21      54       OK
       CAS-event      0x202fcf8    184614919     19      49       OK
      CAS-client      0x202fe68    184614945     20      51       OK
…

! The dump does not match the thread IDs listed above; it is from a previous crash.
! 0x0b010024 was a CAS-event thread.
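
To compare the thread ID in the dump with the listing, it may help to put both in the same base. The sketch below assumes that the third column of the listing holds the thread IDs in decimal (the column header is not shown) and that the IDs follow the 32-bit RTEMS object ID layout (class in bits 27-31, API in bits 24-26, node in bits 16-23, index in bits 0-15). The function name `decode_rtems_id` is hypothetical.

```python
# Decode a 32-bit RTEMS object ID into its bit fields.
# Assumed layout: class 27-31, API 24-26, node 16-23, index 0-15.
def decode_rtems_id(object_id: int) -> dict:
    return {
        "class": (object_id >> 27) & 0x1F,
        "api": (object_id >> 24) & 0x07,
        "node": (object_id >> 16) & 0xFF,
        "index": object_id & 0xFFFF,
    }

# Thread ID reported in the exception dump.
crashed = 0x0B010024

# Third column of the thread listing, assumed to be decimal thread IDs.
listed = [
    184614936, 184614937, 184614938, 184614939, 184614940,
    184614941, 184614942, 184614943, 184614944, 184614914,
    184614918, 184614919, 184614945,
]

print(f"crashed thread: {crashed:#010x} = {crashed} -> {decode_rtems_id(crashed)}")
for tid in listed:
    print(f"{tid} = {tid:#010x}")
print("crashed thread appears in listing:", crashed in listed)
```

Under these assumptions the listed IDs all fall in the same 0x0b01xxxx range as the ID from the dump, and the ID from the dump itself is absent from the listing, which is consistent with the dump coming from an earlier crash.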

> fatal source: INTERNAL_ERROR_CORE
bsp_fatal_extension(): RTEMS terminated -- no way back to MotLoad so I reset the card
Printing a stack trace for your convenience :-)
fatal source: RTEMS_FATAL_SOURCE_EXCEPTION
exception vector 3 (0x3)
  next PC or address of fault = 0x00230314
  saved MSR = 0x02009032
  context = interrupt, ISR nest level = 1
  thread dispatch disable level = 2
  R0  = 0x0022a67c R1  = 0x002fe690 R2  = 0x00000000 R3  = 0x00000030
  R4  = 0x0000000a R5  = 0x0028b37c R6  = 0x002fe6e8 R7  = 0x002272c4
  R8  = 0x002fe6c8 R9  = 0xa5a5a5a5 R10 = 0x002fe6c8 R11 = 0x00000000
  R12 = 0x40842202 R13 = 0x002bfc98 R14 = 0x002f4e00 R15 = 0x002c0000
  R16 = 0x00000000 R17 = 0x00000055 R18 = 0x002f4648 R19 = 0x002f0000
  R20 = 0x00000019 R21 = 0x0028ce74 R22 = 0x00000004 R23 = 0x002b8d50
  R24 = 0x002f6814 R25 = 0x00000000 R26 = 0x002747e0 R27 = 0x0014cdf0
  R28 = 0x002fe7c8 R29 = 0x002fe698 R30 = 0x0027484c R31 = 0x01fe9208
  CR  = 0x40842208
  CTR = 0x0000001b
  XER = 0x00000000
  LR  = 0x0022a680
  DAR = 0xa5a5a5a9
  executing thread ID = 0x0b010024, name =
Stack Trace:
  IP: 0x00230314, LR: 0x0022a680
--^ 0x0022a67c--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x0014d144--^ 0x00149df8
--^ 0x00142e50--^ 0x001217c0--^ 0x0011a704--^ 0x00231b64--^ 0x0022b0ec
--^ 0x002272c4
fatal source: RTEMS_FATAL_SOURCE_EXCEPTION
exception vector 3 (0x3)
  next PC or address of fault = 0x002267dc
  saved MSR = 0x02001032
  context = interrupt, ISR nest level = 1
  thread dispatch disable level = 2
  R0  = 0x002267e8 R1  = 0x002fe380 R2  = 0x00000000 R3  = 0x0028a594
  R4  = 0x00000034 R5  = 0x00000000 R6  = 0x80000000 R7  = 0x0028a59f
  R8  = 0x0028a59e R9  = 0xa5a5a5a5 R10 = 0xf1120005 R11 = 0x00000000
  R12 = 0x40442804 R13 = 0x002bfc98 R14 = 0x002fe460 R15 = 0x002c0000
  R16 = 0x00000000 R17 = 0x00000055 R18 = 0x002f4648 R19 = 0x002f0000
  R20 = 0x00000019 R21 = 0x0028ce74 R22 = 0x00000004 R23 = 0x002b8d50
  R24 = 0x002f6814 R25 = 0x00000000 R26 = 0x002747e0 R27 = 0x0024ff10
  R28 = 0xcccccccd R29 = 0x0028a594 R30 = 0x002fe950 R31 = 0x0000000c
  CR  = 0x40442808
  CTR = 0x0021f9c8
  XER = 0x20000000
  LR  = 0x002267e8
  DAR = 0xa5a5a5a9
  executing thread ID = 0x0b010024, name =
Stack Trace:
  IP: 0x002267dc, LR: 0x002267e8
--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438
--^ 0x40842202--^ 0x0022a67c--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x0014d144
--^ 0x00149df8--^ 0x00142e50--^ 0x001217c0--^ 0x0011a704--^ 0x00231b64
--^ 0x0022b0ec--^ 0x002272c4
fatal source: RTEMS_FATAL_SOURCE_EXCEPTION
exception vector 3 (0x3)
  next PC or address of fault = 0x002267dc
  saved MSR = 0x02001032
  context = interrupt, ISR nest level = 1
  thread dispatch disable level = 2
  R0  = 0x002267e8 R1  = 0x002fe070 R2  = 0x00000000 R3  = 0x0028a594
  R4  = 0x00000034 R5  = 0x00000000 R6  = 0x80000000 R7  = 0x0028a59f
  R8  = 0x0028a59e R9  = 0xa5a5a5a5 R10 = 0xf1120005 R11 = 0x00000000
  R12 = 0x40442804 R13 = 0x002bfc98 R14 = 0x002fe150 R15 = 0x002c0000
  R16 = 0x00000000 R17 = 0x00000055 R18 = 0x002f4648 R19 = 0x002f0000
  R20 = 0x00000019 R21 = 0x0028ce74 R22 = 0x00000004 R23 = 0x002b8d50
  R24 = 0x002f6814 R25 = 0x00000000 R26 = 0x002747e0 R27 = 0x0024ff10
  R28 = 0xcccccccd R29 = 0x0028a594 R30 = 0x002fe950 R31 = 0x00000012
  CR  = 0x40442808
  CTR = 0x0021f9c8
  XER = 0x20000000
  LR  = 0x002267e8
  DAR = 0xa5a5a5a9
  executing thread ID = 0x0b010024, name =
Stack Trace:
  IP: 0x002267dc, LR: 0x002267e8
--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438
--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914
--^ 0x00227438--^ 0x40842202--^ 0x0022a67c--^ 0x0014ceb0--^ 0x0014d0ec
--^ 0x0014d144--^ 0x00149df8--^ 0x00142e50--^ 0x001217c0--^ 0x0011a704
--^ 0x00231b64--^ 0x0022b0ec--^ 0x002272c4
fatal source: RTEMS_FATAL_SOURCE_EXCEPTION
exception vector 3 (0x3)
  next PC or address of fault = 0x002267dc
  saved MSR = 0x02001032
  context = interrupt, ISR nest level = 1
  thread dispatch disable level = 2
  R0  = 0x002267e8 R1  = 0x002fdd60 R2  = 0x00000000 R3  = 0x0028a594
  R4  = 0x00000034 R5  = 0x00000000 R6  = 0x80000000 R7  = 0x0028a59f
  R8  = 0x0028a59e R9  = 0xa5a5a5a5 R10 = 0xf1120005 R11 = 0x00000000
  R12 = 0x40442804 R13 = 0x002bfc98 R14 = 0x002fde40 R15 = 0x002c0000
  R16 = 0x00000000 R17 = 0x00000055 R18 = 0x002f4648 R19 = 0x002f0000
  R20 = 0x00000019 R21 = 0x0028ce74 R22 = 0x00000004 R23 = 0x002b8d50
  R24 = 0x002f6814 R25 = 0x00000000 R26 = 0x002747e0 R27 = 0x0024ff10
  R28 = 0xcccccccd R29 = 0x0028a594 R30 = 0x002fe950 R31 = 0x00000018
  CR  = 0x40442808
  CTR = 0x0021f9c8
  XER = 0x20000000
  LR  = 0x002267e8
  DAR = 0xa5a5a5a9
  executing thread ID = 0x0b010024, name =
Stack Trace:
  IP: 0x002267dc, LR: 0x002267e8
--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438
--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914
--^ 0x00227438--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec
--^ 0x00230914--^ 0x00227438--^ 0x40842202--^ 0x0022a67c--^ 0x0014ceb0
--^ 0x0014d0ec--^ 0x0014d144--^ 0x00149df8--^ 0x00142e50--^ 0x001217c0
--^ 0x0011a704--^ 0x00231b64--^ 0x0022b0ec--^ 0x002272c4
fatal source: RTEMS_FATAL_SOURCE_EXCEPTION
exception vector 3 (0x3)
  next PC or address of fault = 0x002267dc
  saved MSR = 0x02001032
  context = interrupt, ISR nest level = 1
  thread dispatch disable level = 2
  R0  = 0x002267e8 R1  = 0x002fda50 R2  = 0x00000000 R3  = 0x0028a594
  R4  = 0x00000034 R5  = 0x00000000 R6  = 0x80000000 R7  = 0x0028a59f
  R8  = 0x0028a59e R9  = 0xa5a5a5a5 R10 = 0xf1120005 R11 = 0x00000000
  R12 = 0x40442804 R13 = 0x002bfc98 R14 = 0x002fdb30 R15 = 0x002c0000
  R16 = 0x00000000 R17 = 0x00000055 R18 = 0x002f4648 R19 = 0x002f0000
  R20 = 0x00000019 R21 = 0x0028ce74 R22 = 0x00000004 R23 = 0x002b8d50
  R24 = 0x002f6814 R25 = 0x00000000 R26 = 0x002747e0 R27 = 0x0024ff10
  R28 = 0xcccccccd R29 = 0x0028a594 R30 = 0x002fe950 R31 = 0x0000001e
  CR  = 0x40442808
  CTR = 0x0021f9c8
  XER = 0x20000000
  LR  = 0x002267e8
  DAR = 0xa5a5a5a9
  executing thread ID = 0x0b010024, name =
Stack Trace:
  IP: 0x002267dc, LR: 0x002267e8
--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438
--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914
--^ 0x00227438--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec
--^ 0x00230914--^ 0x00227438--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0
--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438--^ 0x40842202--^ 0x0022a67c
--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x0014d144--^ 0x00149df8--^ 0x00142e50
--^ 0x001217c0--^ 0x0011a704--^ 0x00231b64--^ 0x0022b0ec--^ 0x002272c4
fatal source: RTEMS_FATAL_SOURCE_EXCEPTION
exception vector 3 (0x3)
  next PC or address of fault = 0x002267dc
  saved MSR = 0x02001032
  context = interrupt, ISR nest level = 1
  thread dispatch disable level = 2
  R0  = 0x00226808 R1  = 0x002fd740 R2  = 0x00000000 R3  = 0x0028a594
  R4  = 0x0000000a R5  = 0x0024ff10 R6  = 0x002fd6c8 R7  = 0x0024ff12
  R8  = 0x0024ff11 R9  = 0xa5a5a5a5 R10 = 0xf1120005 R11 = 0x00000000
  R12 = 0x40442202 R13 = 0x002bfc98 R14 = 0x002fd820 R15 = 0x002c0000
  R16 = 0x00000000 R17 = 0x00000055 R18 = 0x002f4648 R19 = 0x002f0000
  R20 = 0x00000019 R21 = 0x0028ce74 R22 = 0x00000004 R23 = 0x002b8d50
  R24 = 0x002f6814 R25 = 0x00000000 R26 = 0x002747e0 R27 = 0x0024ff10
  R28 = 0xcccccccd R29 = 0x0028a594 R30 = 0x002fe950 R31 = 0x00000024
  CR  = 0x40442808
  CTR = 0x0021f9c8
  XER = 0x20000000
  LR  = 0x00226808
  DAR = 0xa5a5a5a9
  executing thread ID = 0x0b010024, name =
Stack Trace:
  IP: 0x002267dc, LR: 0x00226808
--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438
--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914
--^ 0x00227438--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec
--^ 0x00230914--^ 0x00227438--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0
--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438--^ 0x002267e8--^ 0x0021f4ec
--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438--^ 0x40842202
--^ 0x0022a67c--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x0014d144--^ 0x00149df8
--^ 0x00142e50--^ 0x001217c0--^ 0x0011a704--^ 0x00231b64--^ 0x0022b0ec
--^ 0x002272c4
fatal source: RTEMS_FATAL_SOURCE_EXCEPTION
exception vector 3 (0x3)
  next PC or address of fault = 0x002267dc
  saved MSR = 0x02001032
  context = interrupt, ISR nest level = 1
  thread dispatch disable level = 2
  R0  = 0x002267e8 R1  = 0x002fd430 R2  = 0x00000000 R3  = 0x0028a594
  R4  = 0x00000034 R5  = 0x00000000 R6  = 0x80000000 R7  = 0x0028a59f
  R8  = 0x0028a59e R9  = 0xa5a5a5a5 R10 = 0xf1120005 R11 = 0x00000000
  R12 = 0x40442804 R13 = 0x002bfc98 R14 = 0x002fd510 R15 = 0x002c0000
  R16 = 0x00000000 R17 = 0x00000055 R18 = 0x002f4648 R19 = 0x002f0000
  R20 = 0x00000019 R21 = 0x0028ce74 R22 = 0x00000004 R23 = 0x002b8d50
  R24 = 0x002f6814 R25 = 0x00000000 R26 = 0x002747e0 R27 = 0x0024ff10
  R28 = 0xcccccccd R29 = 0x0028a594 R30 = 0x002fe950 R31 = 0x0000002a
  CR  = 0x40442808
  CTR = 0x0021f9c8
  XER = 0x20000000
  LR  = 0x002267e8
  DAR = 0xa5a5a5a9
  executing thread ID = 0x0b010024, name =
Stack Trace:
  IP: 0x002267dc, LR: 0x002267e8
--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438
--^ 0x00226808--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914
--^ 0x00227438--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec
--^ 0x00230914--^ 0x00227438--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0
--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438--^ 0x002267e8--^ 0x0021f4ec
--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438--^ 0x002267e8
--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438
--^ 0x40842202--^ 0x0022a67c--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x0014d144
--^ 0x00149df8--^ 0x00142e50--^ 0x001217c0--^ 0x0011a704--^ 0x00231b64
--^ 0x0022b0ec--^ 0x002272c4
fatal source: RTEMS_FATAL_SOURCE_EXCEPTION
exception vector 3 (0x3)
  next PC or address of fault = 0x002267dc
  saved MSR = 0x02001032
  context = interrupt, ISR nest level = 1
  thread dispatch disable level = 2
  R0  = 0x002267e8 R1  = 0x002fd120 R2  = 0x00000000 R3  = 0x0028a594
  R4  = 0x00000034 R5  = 0x00000000 R6  = 0x80000000 R7  = 0x0028a59f
  R8  = 0x0028a59e R9  = 0xa5a5a5a5 R10 = 0xf1120005 R11 = 0x00000000
  R12 = 0x40442804 R13 = 0x002bfc98 R14 = 0x002fd200 R15 = 0x002c0000
  R16 = 0x00000000 R17 = 0x00000055 R18 = 0x002f4648 R19 = 0x002f0000
  R20 = 0x00000019 R21 = 0x0028ce74 R22 = 0x00000004 R23 = 0x002b8d50
  R24 = 0x002f6814 R25 = 0x00000000 R26 = 0x002747e0 R27 = 0x0024ff10
  R28 = 0xcccccccd R29 = 0x0028a594 R30 = 0x002fe950 R31 = 0x00000030
  CR  = 0x40442808
  CTR = 0x0021f9c8
  XER = 0x20000000
  LR  = 0x002267e8
  DAR = 0xa5a5a5a9
  executing thread ID = 0x0b010024, name =
Stack Trace:
  IP: 0x002267dc, LR: 0x002267e8
--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438
--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914
--^ 0x00227438--^ 0x00226808--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec
--^ 0x00230914--^ 0x00227438--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0
--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438--^ 0x002267e8--^ 0x0021f4ec
--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438--^ 0x002267e8
--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914--^ 0x00227438
--^ 0x002267e8--^ 0x0021f4ec--^ 0x0014ceb0--^ 0x0014d0ec--^ 0x00230914
--^ 0x00227438--^ 0x40842202--^ 0x0022a67c--^ 0x0014ceb0--^ 0x0014d0ec
--^ 0x0014d144--^ 0x00149df8--^ 0x00142e50--^ 0x001217c0--^ 0x0011a704
Too many stack frames (stack possibly corrupted), giving up...
bsp_fatal_extension(): RTEMS terminated -- no way back to MotLoad so I reset the card
Printing a stack trace for your convenience :-)

0x02270848--> 0x02270844--> 0x01363632--> 0x01364204--> 0x02296084
0x02257976--> 0x02254824--> 0x02225388--> 0x01363632--> 0x01364204
0x02296084--> 0x02257976--> 0x02254824--> 0x02225388--> 0x01363632
0x01364204--> 0x02296084--> 0x02257976--> 0x02254856--> 0x02225388
0x01363632--> 0x01364204--> 0x02296084--> 0x02257976--> 0x02254824
0x02225388--> 0x01363632--> 0x01364204--> 0x02296084--> 0x02257976
0x02254824--> 0x02225388--> 0x01363632--> 0x01364204--> 0x02296084
0x02257976--> 0x02254824--> 0x02225388--> 0x01363632
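
The addresses in this last trace do not look like the ones in the exception dump, but they appear to be the same return addresses printed in decimal behind a "0x" prefix. The following sketch checks that reading; it is an interpretation of the output, not something the BSP documents, and the variable names are illustrative.

```python
# Assumption: the values in the final trace are decimal numbers that were
# printed with a misleading "0x" prefix.
final_trace = [
    "0x02270848", "0x02270844", "0x01363632", "0x01364204", "0x02296084",
    "0x02257976", "0x02254824", "0x02254856", "0x02225388",
]

# Addresses copied from the hexadecimal IP/LR values and stack traces
# in the exception dump.
dump_addresses = {
    0x0022A680, 0x0022A67C, 0x0014CEB0, 0x0014D0EC, 0x00230914,
    0x00227438, 0x002267E8, 0x00226808, 0x0021F4EC,
}

# Strip the prefix and parse the remaining digits as base 10.
converted = [int(text[2:], 10) for text in final_trace]

for text, value in zip(final_trace, converted):
    status = "seen in dump" if value in dump_addresses else "not in dump"
    print(f"{text} -> {value:#010x} ({status})")
```

If the assumption holds, every distinct value in the final trace maps onto an address that already occurs in the exception dump, so the last trace shows the same repeating call chain rather than a different code path.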






Best regards
Heinz Junkes
--
Experience directly varies with equipment ruined.





Replies:
Re: crash in CAS-event (high load?) Michael Davidsaver via Tech-talk
