EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: Investigating Archiver Appliance lost events
From: Sky Brewer via Tech-talk <tech-talk at aps.anl.gov>
To: "tech-talk at aps.anl.gov" <tech-talk at aps.anl.gov>, "abdalla.ahmad at sesame.org.jo" <abdalla.ahmad at sesame.org.jo>, "yhu at bnl.gov" <yhu at bnl.gov>
Cc: "tech-talk at aps.anl.gov" <tech-talk at aps.anl.gov>
Date: Thu, 29 Jan 2026 13:57:14 +0000
Hi all,

Here is some background on the lost events issue.

Each PV has a buffer of events. The size of this buffer is determined by the formula:

BUFFER_CAP_ADJUSTMENT*WRITE_PERIOD/SAMPLING_PERIOD 

  • BUFFER_CAP_ADJUSTMENT is the config parameter org.epics.archiverappliance.config.PVTypeInfo.sampleBufferCapacityAdjustment
  • WRITE_PERIOD is ten seconds
  • SAMPLING_PERIOD comes from the policy file.

(This is documented a bit at https://epicsarchiver.readthedocs.io/en/latest/faq.html)

The resulting buffer size is shown in the metadata for the PV on the pvdetails page, under "Sample buffer capacity".
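As a rough illustration of the formula above, here is a minimal Python sketch. The function name and argument names are made up for this example, and the engine may well round the result or apply a minimum, so treat the value on the pvdetails page as the authoritative one.

```python
WRITE_PERIOD_S = 10.0  # WRITE_PERIOD is ten seconds, as stated above


def sample_buffer_capacity(buffer_cap_adjustment, sampling_period_s):
    """Evaluate BUFFER_CAP_ADJUSTMENT * WRITE_PERIOD / SAMPLING_PERIOD.

    Hypothetical helper: it only evaluates the formula from this message
    and does not reproduce any rounding the engine might apply.
    """
    if sampling_period_s <= 0:
        raise ValueError("sampling period must be positive")
    return buffer_cap_adjustment * WRITE_PERIOD_S / sampling_period_s


# Example: a PV with a 0.5 s sampling period in the policy file,
# first with an adjustment of 1.0, then with 2.0.
print(sample_buffer_capacity(1.0, 0.5))
print(sample_buffer_capacity(2.0, 0.5))
```

Doubling the adjustment doubles the capacity, which is why raising it gives the write thread more headroom.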

The buffer is then emptied to the STS every WRITE_PERIOD, which can take some time. On the metrics page, check the Engine write thread(s) time; it should be well below WRITE_PERIOD. It can be affected by:

  • the write speed to the STS storage
  • the memory allocated to the engine Tomcat (I think the -Xmx parameter)
  • General CPU load

You can then lose events, counted under the metric "How many events lost because the sample buffer is full so far?", if:

  • The write thread takes longer than WRITE_PERIOD (the next write won’t start until the buffer is already full)
  • The PV updates faster than the buffer can absorb, i.e. more events arrive within one WRITE_PERIOD than the buffer can hold
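To make the two conditions concrete, here is a toy Python model of a single write cycle. It is only a sketch under simplifying assumptions that are mine, not the engine's actual logic: the buffer is drained once per cycle, a cycle lasts WRITE_PERIOD or the write thread time (whichever is longer), and the PV updates at a steady rate. The function and argument names are hypothetical.

```python
WRITE_PERIOD_S = 10.0  # WRITE_PERIOD is ten seconds, as stated above


def estimate_lost_events(update_rate_hz, buffer_capacity, write_thread_s):
    """Toy estimate of the events dropped during one write cycle.

    Assumption: the time between two drains of the buffer is the larger
    of WRITE_PERIOD and the time the write thread actually takes.
    """
    drain_interval_s = max(WRITE_PERIOD_S, write_thread_s)
    events_arrived = update_rate_hz * drain_interval_s
    # Anything beyond the buffer capacity is counted as lost.
    return max(0.0, events_arrived - buffer_capacity)


# 10 Hz PV, capacity 100, write thread well below WRITE_PERIOD: no loss.
print(estimate_lost_events(10, 100, 2.0))
# Same PV, but the write thread takes 15 s: the buffer overflows.
print(estimate_lost_events(10, 100, 15.0))
# PV updating at 20 Hz against a buffer sized for 10 Hz: overflow again.
print(estimate_lost_events(20, 100, 2.0))
```

In this simplified model, doubling the capacity (for instance via the adjustment parameter) would absorb the 15 s overrun in the second case.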

From your description, it sounds like consolidating all the data to the LTS at once overloaded writes to the storage, so the Engine write thread time exceeded WRITE_PERIOD.

Solutions I can think of are:

  • Consolidate in chunks of PVs, i.e. not all at once.
  • Increase org.epics.archiverappliance.config.PVTypeInfo.sampleBufferCapacityAdjustment to something like 2.0.
  • Change the STS to be faster to write to (or just different from the LTS). For instance, if the STS is currently network attached storage, make it local. Or use separate storage attachments so that writing to one doesn’t affect the other.

Hope that helps
Sky

