EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: Investigating Archiver Appliance lost events
From: Abdalla Ahmad via Tech-talk <tech-talk at aps.anl.gov>
To: "'Hu, Yong'" <yhu at bnl.gov>, "'tech-talk at aps.anl.gov'" <tech-talk at aps.anl.gov>
Date: Tue, 27 Jan 2026 05:53:03 +0000

Unfortunately, the script did not work for some PVs, but this might be a separate issue. The script pauses archiving, consolidates data, and then resumes archiving. On a number of PVs, the pause and resume requests fail with HTTP 500 returned from the server. The error is caused by a Java NullPointerException (attached), which is printed in the MGMT catalina.err. After some troubleshooting, I found that the failing PVs are all aliases. I was not able to figure out why this happens, so I ended up building the master branch, and it is working so far.
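
For readers following the thread, the pause / consolidate / resume sequence described above can be sketched roughly as follows. This is not the script used at SESAME, only a minimal illustration. The host in MGMT_BPL is hypothetical (port 17665 is taken from the attached logs), the helper names bpl_url, http_get and consolidate_pv are made up for this example, and the sketch assumes the mgmt BPL actions pauseArchivingPV, consolidateDataForPV (with a storage parameter) and resumeArchivingPV are reachable under /mgmt/bpl.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumption: mgmt BPL base URL. The host is hypothetical; port 17665 appears in the logs.
MGMT_BPL = "http://localhost:17665/mgmt/bpl"


def bpl_url(action, **params):
    """Build a BPL URL with properly escaped query parameters (PV names contain ':')."""
    return f"{MGMT_BPL}/{action}?{urllib.parse.urlencode(params)}"


def http_get(url, timeout=30):
    """Fetch a URL and return (status, body). HTTP errors are returned, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace")


def consolidate_pv(pv, storage="LTS", get=http_get):
    """Pause, consolidate and resume one PV; return a list of (action, HTTP status)."""
    results = []
    status, _ = get(bpl_url("pauseArchivingPV", pv=pv))
    results.append(("pauseArchivingPV", status))
    if status != 200:
        # Pause failed (e.g. the HTTP 500 seen for aliases): skip this PV and report it.
        return results
    try:
        status, _ = get(bpl_url("consolidateDataForPV", pv=pv, storage=storage))
        results.append(("consolidateDataForPV", status))
    finally:
        # Always try to resume, so a failed consolidation does not leave the PV paused.
        status, _ = get(bpl_url("resumeArchivingPV", pv=pv))
        results.append(("resumeArchivingPV", status))
    return results
```

Recording the HTTP status of every step makes it easy to list afterwards which PVs returned 500, and the try/finally reflects one possible design choice: a PV that was paused successfully is always asked to resume.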

 

Best Regards,

Abdalla Al-Dalleh

Control Engineer

SESAME

 

From: Abdalla Ahmad
Sent: Monday, January 26, 2026 8:57 AM
To: 'Hu, Yong' <yhu at bnl.gov>; tech-talk at aps.anl.gov
Subject: RE: Investigating Archiver Appliance lost events

 

Hi Yong

 

Your observation is valid, and we have faced it before, but it is not the case this time: the fastest PV we have is archived at 10 Hz, and the issue affects a large number of PVs. We are archiving 18K+ PVs, and I think there were fewer than 2K PVs in the STS. I am not sure how valid this is, but I remember the issue happened when I executed a script that fetches the list of PV names and consolidates data for all PVs to the LTS (the management BPL consolidateDataForPV). Now that the server is working, I will execute the script again and see how it goes.

 

Best Regards,

Abdalla Al-Dalleh

Control Engineer

SESAME

 

From: Hu, Yong <yhu at bnl.gov>
Sent: Sunday, January 25, 2026 4:41 PM
To: Abdalla Ahmad <Abdalla.Ahmad at sesame.org.jo>; tech-talk at aps.anl.gov
Subject: Re: Investigating Archiver Appliance lost events

 

Hi Abdalla,

 

I recall that we had similar issues at NSLS-2. The buffer refers to the troublesome PV on the AA, not the OS. For us, 'buffer full' means the PV updates too fast (maybe someone has changed the PV). I remember the default monitor-based archiving rate is limited to 1 Hz for scalar PVs. I am guessing your problematic PVs update faster than the default rate. You can use camonitor (or caEventRate) to verify the update rate.
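
As an illustration of the check suggested above, the average update rate can be estimated from the timestamps in saved camonitor output. The sketch below assumes each line carries a timestamp of the form YYYY-MM-DD HH:MM:SS.ffffff, which is camonitor's usual default but may differ with other formatting options. The function name update_rate_hz is made up for this example.

```python
import re
from datetime import datetime

# Assumption: camonitor's default timestamp layout, e.g. "2026-01-25 07:47:01.123456".
STAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{1,6}")


def update_rate_hz(lines):
    """Estimate the mean update rate (Hz) from the timestamps found in camonitor lines."""
    stamps = []
    for line in lines:
        match = STAMP.search(line)
        if match:
            stamps.append(datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S.%f"))
    if len(stamps) < 2:
        return None  # not enough samples to form an interval
    span = (stamps[-1] - stamps[0]).total_seconds()
    if span <= 0:
        return None  # identical or out-of-order timestamps
    # N samples span N-1 intervals.
    return (len(stamps) - 1) / span
```

A result well above 1 for a scalar PV would be consistent with the "updates faster than the default rate" explanation; a result at or below the configured sampling rate would point elsewhere.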

 

Cheers,

Yong

 

From:  Abdalla Ahmad via Tech-talk <tech-talk at aps.anl.gov>
Date: Sunday, January 25, 2026 at 7:47 AM

 

Hi

 

This morning we noticed strange behavior on the archiver: it was unable to write PB files for a large number of PVs. We noticed this when doing live plots, where the plotter showed only the first 3 or 4 samples and no archived data at all. During troubleshooting, I noticed that the Details page of one of the problematic PVs showed the following parameters increasing every second (i.e. with every sample received):

 

  • How many events lost because the sample buffer is full so far?
  • How many events lost totally so far?

 

Both the AA and OS logs were not helpful, so I ended up rebooting the server itself. My question is: which buffer is the Details page referring to? Is it a buffer within AA itself or in the OS?
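
When many PVs are affected, it may be easier to read the two counters above programmatically than to open each Details page. The sketch below is only an illustration: it assumes the getPVDetails BPL returns a JSON list of objects with "name" and "value" keys, which should be verified against the installed version. The function name lost_event_counters is made up, and the sample values are invented purely to show the shape of the data.

```python
import json


def lost_event_counters(details):
    """Return {row name: value} for PV Details rows whose name mentions lost events."""
    return {
        row["name"]: row["value"]
        for row in details
        if "events lost" in row.get("name", "").lower()
    }


# Invented sample in the assumed getPVDetails layout; the values are not real measurements.
sample_json = """[
  {"name": "PV name", "value": "DEMO:pv"},
  {"name": "How many events lost because the sample buffer is full so far?", "value": "120"},
  {"name": "How many events lost totally so far?", "value": "120"}
]"""

counters = lost_event_counters(json.loads(sample_json))
for name, value in counters.items():
    print(f"{name} {value}")
```

Polling these values twice, a few seconds apart, shows whether the counters are still increasing for a given PV.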

 

Best Regards,

Abdalla Al-Dalleh

Control Engineer

SESAME

P.O. Box 7, Allan 19252, Jordan
Tel: +96253511348 , ext. 265

Fax: +96253511423

Email : abdalla.ahmad at sesame.org.jo
Website: www.sesame.org.jo

 

// mgmt/logs/catalina.err contents when doing pause or resume.
Jan 26, 2026 10:03:08 AM org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet [BPLServlet] in context with path [/mgmt] threw exception
java.io.IOException: java.lang.NullPointerException
	at org.epics.archiverappliance.common.BasicDispatcher.handleBPLAction(BasicDispatcher.java:85)
	at org.epics.archiverappliance.common.BasicDispatcher.dispatch(BasicDispatcher.java:50)
	at org.epics.archiverappliance.mgmt.BPLServlet.doGet(BPLServlet.java:214)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:529)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:623)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:197)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:142)
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:166)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:142)
	at org.apache.catalina.filters.CorsFilter.handleNonCORS(CorsFilter.java:334)
	at org.apache.catalina.filters.CorsFilter.doFilter(CorsFilter.java:161)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:166)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:142)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:166)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:88)
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:481)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:90)
	at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:653)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:72)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:344)
	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:398)
	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63)
	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:935)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1833)
	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52)
	at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:975)
	at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:493)
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:63)
	at java.base/java.lang.Thread.run(Thread.java:1474)
Caused by: java.lang.NullPointerException

// mgmt/logs/catalina.out contents when doing pause and resume on a problematic PV.
2026-01-26 10:05:36,880 INFO  [http-nio-17665-exec-9] common.BasicDispatcher (BasicDispatcher.java:44) - Servicing /pauseArchivingPV
2026-01-26 10:05:36,890 ERROR [http-nio-17665-exec-9] common.BasicDispatcher (BasicDispatcher.java:84) - null
java.lang.NullPointerException: null

2026-01-26 10:41:21,519 INFO  [http-nio-17665-exec-1] common.BasicDispatcher (BasicDispatcher.java:44) - Servicing /resumeArchivingPV
2026-01-26 10:41:21,530 ERROR [http-nio-17665-exec-1] common.BasicDispatcher (BasicDispatcher.java:84) - null
java.lang.NullPointerException: null
2026-01-26 10:41:21,532 WARN  [hz.appliance0.event-4] persistence.MySQLPersistence (MySQLPersistence.java:360) - 2 rows changed when updating key  SR-PS-GW5-CH11-PS2:getIload in putTypeInfo


// mgmt/logs/catalina.out contents when doing pause and resume on a working PV.
2026-01-26 10:20:10,383 INFO  [http-nio-17665-exec-5] common.BasicDispatcher (BasicDispatcher.java:44) - Servicing /pauseArchivingPV
2026-01-26 10:20:10,395 WARN  [hz.appliance0.event-5] persistence.MySQLPersistence (MySQLPersistence.java:360) - 2 rows changed when updating key  SRC01-VA-IMG1:getPressure in putTypeInfo
2026-01-26 10:20:10,405 INFO  [http-nio-17665-exec-2] common.BasicDispatcher (BasicDispatcher.java:44) - Servicing /getPVDetails
2026-01-26 10:20:10,405 INFO  [http-nio-17665-exec-2] reports.PVDetails (PVDetails.java:70) - Getting the detailed status for PV SRC01-VA-IMG1:getPressure

2026-01-26 10:20:30,243 INFO  [http-nio-17665-exec-6] common.BasicDispatcher (BasicDispatcher.java:44) - Servicing /resumeArchivingPV
2026-01-26 10:20:30,257 WARN  [hz.appliance0.event-5] persistence.MySQLPersistence (MySQLPersistence.java:360) - 2 rows changed when updating key  SRC01-VA-IMG1:getPressure in putTypeInfo
2026-01-26 10:20:30,268 INFO  [http-nio-17665-exec-1] common.BasicDispatcher (BasicDispatcher.java:44) - Servicing /getPVDetails
2026-01-26 10:20:30,268 INFO  [http-nio-17665-exec-1] reports.PVDetails (PVDetails.java:70) - Getting the detailed status for PV SRC01-VA-IMG1:getPressure
