>> When I restart the archiver, the "broken" PVs go back to normal.
>> I will keep monitoring them
This is always a good idea. I have a script that uses the getCurrentlyDisconnectedPVs BPL and validates these on a path independent of whatever the archiver is using. That is, if the archiver goes through a gateway, this monitoring script checks the liveness of the PVs directly on the VLAN.
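A minimal sketch of such a monitoring script follows. It is not the actual script mentioned above. It assumes that the BPL returns a JSON list whose entries carry a "pvName" key, that the mgmt URL is the one used elsewhere in this thread, and that the `caget` command-line tool is on the PATH and exits non-zero when the channel does not connect. To make the check independent of the archiver's path, run it with EPICS_CA_ADDR_LIST pointing directly at the IOC VLAN rather than at the gateway.

```python
import json
import subprocess
from urllib.request import urlopen

# Assumption: same mgmt URL as used elsewhere in this thread.
MGMT_URL = "http://localhost:17665/mgmt"


def fetch_disconnected(mgmt_url=MGMT_URL):
    """Return the archiver's report of currently disconnected PVs (parsed JSON)."""
    with urlopen(f"{mgmt_url}/bpl/getCurrentlyDisconnectedPVs", timeout=30) as resp:
        return json.load(resp)


def caget_probe(pv, timeout_s=2.0):
    """True if caget can read the PV from this host.

    Assumption: caget exits non-zero when the channel does not connect.
    """
    result = subprocess.run(
        ["caget", "-w", str(timeout_s), pv],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def live_but_disconnected(report, probe=caget_probe):
    """PVs the archiver reports as disconnected but that answer the probe."""
    # Assumption: each report entry is a dict with a "pvName" key.
    return [entry["pvName"] for entry in report if probe(entry["pvName"])]
```

Calling `live_but_disconnected(fetch_disconnected())` periodically (for example from cron) gives the list of PVs that are alive on the network but that the archiver has failed to reconnect to. The probe is a parameter so that a different client, or a fake for testing, can be swapped in.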
Please let me know if you find any more details.
Regards,
Murali
________________________________________
From: Gabriel de Souza Fedel <[email protected]>
Sent: Monday, September 4, 2017 5:28 AM
To: Shankar, Murali; [email protected]
Cc: ([email protected])
Subject: Re: Archiver: Problems with disconnected PVs
Hi,
When I restart the archiver, the "broken" PVs go back to normal. I will keep
monitoring them.
On 01-09-2017 16:43, Shankar, Murali wrote:
> Also, which version is this? Apologies if you mentioned it before but I naturally assume you have a reasonably recent version.
Our version is Oct/2016
>
> Regards,
> Murali
Regards, and thank you again for the help.
>
> ________________________________________
> From: Shankar, Murali
> Sent: Friday, September 1, 2017 12:23 PM
> To: Gabriel de Souza Fedel; [email protected]
> Cc: ([email protected])
> Subject: Re: Archiver: Problems with disconnected PVs
>
>>> One thing looks strange, max_array_bytes.
> This would probably affect your waveforms mostly...
>
>>> and a few have problems (they eventually disconnect).
> Are all these PVs from one or two IOCs? In this case I would look at the IOC.
>
>>>>> Your PV Details page for this PV could have information
> There should be lines here for "When did we request CA to make a connection to this PV?" and "Time elapsed since search request (s)". Do these look OK? Since you paused/resumed the PV, the connection request time should be the time you resumed the PV.
>
> The "Currently Disconnected PVs" report also has some additional information. This is not always helpful, but if all your live but disconnected PVs are in the same CAJ context ID, then a restart may be required. It might be useful to get a stack trace of all the threads; there might be some clue there about where the search thread for that context is stuck.
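The "same CAJ context" check described above could be sketched as follows. This is only an illustration: the report is assumed to be the parsed JSON list from the currently disconnected PVs report, and the key that carries the context ID ("commandThreadID" here) is an assumption, so check the actual key name in your own report and pass it in.

```python
from collections import defaultdict


def group_by_context(report, key="commandThreadID"):
    """Group PV names by the (assumed) context ID field of each report entry."""
    groups = defaultdict(list)
    for entry in report:
        # Entries lacking the key are collected under "unknown".
        groups[entry.get(key, "unknown")].append(entry["pvName"])
    return dict(groups)


def all_in_one_context(report, key="commandThreadID"):
    """True when the report is non-empty and every entry shares one context ID."""
    return len(report) > 0 and len(group_by_context(report, key)) == 1
```

If `all_in_one_context` returns True for the PVs that are known to be alive, that points at a stuck context rather than at the IOCs. A thread dump of the engine JVM can then be taken, for example with the JDK's `jstack` tool.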
>
>>> I took a look, but I can't see anything strange
> I think after this we get into Wireshark territory (aka I'm out of ideas). You'll need to see if we are issuing search requests properly and getting proper responses etc.
>
> Regards,
> Murali
> ________________________________________
> From: Gabriel de Souza Fedel <[email protected]>
> Sent: Friday, September 1, 2017 11:59 AM
> To: Shankar, Murali; [email protected]
> Cc: ([email protected])
> Subject: Re: Archiver: Problems with disconnected PVs
>
> On 01-09-2017 15:10, Shankar, Murali wrote:
>>>> is there another location
>> Depends on your setup of course but I see something like this in my arch.log right at the beginning.
>>
>> <context class="com.cosylab.epics.caj.CAJContext">
>> <preemptive_callback>true</preemptive_callback>
>> <addr_list>gateway:5076 gateway:5077 gateway:5078 other-gateway:5064</addr_list>
>> <auto_addr_list>false</auto_addr_list>
>> <connection_timeout>30.0</connection_timeout>
>> <beacon_period>15.0</beacon_period>
>> <repeater_port>5069</repeater_port>
>> <server_port>5076</server_port>
>> <max_array_bytes>80000000</max_array_bytes>
>> <event_dispatcher class="org.epics.archiverappliance.engine.epics.JCAEventDispatcherBasedOnPVName"/>
>> </context>
>>
> I found it:
>
> <context class="com.cosylab.epics.caj.CAJContext">
> <preemptive_callback>true</preemptive_callback>
> <addr_list></addr_list>
> <auto_addr_list>true</auto_addr_list>
> <connection_timeout>30.0</connection_timeout>
> <beacon_period>30.0</beacon_period>
> <repeater_port>5065</repeater_port>
> <server_port>5064</server_port>
> <max_array_bytes>30.0</max_array_bytes>
> <event_dispatcher class="org.epics.archiverappliance.engine.epics.JCAEventDispatcherBasedOnPVName"/>
> </context>
>
> One thing looks strange: max_array_bytes looks a bit low. Could that be the
> problem?
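A quick way to sanity-check a logged CAJ context block is to parse it and compare values against a floor. The sketch below uses only the Python standard library. The floor of 16384 is the Channel Access default for EPICS_CA_MAX_ARRAY_BYTES and is used here only as an illustrative threshold; the function names are hypothetical and not part of the archiver.

```python
import xml.etree.ElementTree as ET

# Channel Access default for EPICS_CA_MAX_ARRAY_BYTES, used only as a sanity floor.
CA_DEFAULT_MAX_ARRAY_BYTES = 16384


def parse_caj_context(xml_text):
    """Map each child element of <context> to its text (empty string if none)."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


def max_array_bytes_looks_low(settings, floor=CA_DEFAULT_MAX_ARRAY_BYTES):
    """True when the logged max_array_bytes is below the chosen floor."""
    return float(settings["max_array_bytes"]) < floor


# Trimmed copy of the context block quoted in this message.
SAMPLE = """<context class="com.cosylab.epics.caj.CAJContext">
<addr_list></addr_list>
<auto_addr_list>true</auto_addr_list>
<max_array_bytes>30.0</max_array_bytes>
</context>"""

settings = parse_caj_context(SAMPLE)
print(settings["auto_addr_list"], max_array_bytes_looks_low(settings))
```

A low max_array_bytes would mostly be expected to affect waveform PVs, so a flag here does not by itself explain scalar PVs that disconnect.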
>
>>>> pause/resume I tried
>> Pausing/resuming tears down and recreates the CAJ channel for the PV so if you are unable to connect even after this you probably have some misconfiguration; that is, you can rule out most transient errors.
>>
>>>> IOC's log on ioc machine right
>> Yes; sometimes there could be stuck tasks on the IOC side; you can check for that.
>>
>> Your PV Details page for this PV could have information that could help. You can get to it with a URL like this: http://localhost:17665/mgmt/bpl/getPVDetails?pv=Your_PV
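The PV Details call can also be scripted. The sketch below builds the URL shown above and picks out the two connection-related rows discussed in this thread. It assumes the BPL returns a JSON list of objects with "name" and "value" keys; that shape is an assumption, so adjust it to what your server actually returns.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

MGMT_URL = "http://localhost:17665/mgmt"

# Row labels quoted in this thread.
CONNECTION_ROWS = (
    "When did we request CA to make a connection to this PV?",
    "Time elapsed since search request (s)",
)


def pv_details_url(pv, mgmt_url=MGMT_URL):
    """Build the getPVDetails URL, URL-encoding the PV name."""
    return f"{mgmt_url}/bpl/getPVDetails?{urlencode({'pv': pv})}"


def fetch_pv_details(pv, mgmt_url=MGMT_URL):
    """Fetch and parse the PV Details response for one PV."""
    with urlopen(pv_details_url(pv, mgmt_url), timeout=30) as resp:
        return json.load(resp)


def connection_rows(details, wanted=CONNECTION_ROWS):
    """Keep only the rows about the CA connection request and search timing."""
    # Assumption: each row is a dict with "name" and "value" keys.
    return {row["name"]: row["value"] for row in details if row.get("name") in wanted}
```

For example, `connection_rows(fetch_pv_details("Your_PV"))` returns just the connection request time and the elapsed search time, which is convenient when checking many PVs in a loop.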
>>
> I took a look, but I can't see anything strange.
>> Finally, would you be in a position to attempt a restart?
> I will try it, but next week.
>
> The strangest thing is that a lot of PVs work well, and a few have
> problems (they eventually disconnect).
>
> Thank you again
>
> Regards
>
>>
>> Regards,
>> Murali
>>
>>
>>
>>
>> ________________________________________
>> From: Gabriel de Souza Fedel <[email protected]>
>> Sent: Friday, September 1, 2017 10:52 AM
>> To: Shankar, Murali; [email protected]
>> Cc: ([email protected])
>> Subject: Re: Archiver: Problems with disconnected PVs
>>
>> On 01-09-2017 13:39, Shankar, Murali wrote:
>>>>> This seems like it might be a CA client configuration issue
>>>
>>> This is the most likely case. The engine prints out its CAJ
>>> configuration on startup and you should be able to see this in your logs.
>>>
>> I didn't find it. I found engine/logs/catalina.err; is there another
>> location?
>>
>>>
>>> You can try a couple of things. You can pause/resume the PVs in
>>> question and see if they reconnect. You can also look in the IOC's
>>> logs and see if there is anything interesting going on there.
>>>
>> I tried pause/resume. The IOC's log is on the IOC machine, right? Apparently there
>> are no errors (on the EPICS console).
>>
>>
>>>
>>> Regards,
>>
>> Regards and thank you for the answer
>>>
>>> Murali
>>>
>>>
>>
>> --
>> Gabriel Fedel
>> Software de Operação das Linhas de Luz
>> Laboratório Nacional de Luz Síncrotron – (LNLS)
>> Centro Nacional de Pesquisa em Energia e Materiais (CNPEM)
>> [email protected] | +55 (19) 3512 1226
>> www.lnls.cnpem.br
>>