Hi Mirek.
I have investigated this deadlock and concluded that it is a problem
with matlab, probably only under RHEL7 (though I have no way to test on other
systems, in particular Windows).
When I debugged the deadlock I found that some matlab threads deadlock in a library called
glibc-2.17_shim.so
This (mathworks proprietary) library is LD_PRELOADed from the 'matlab' driver script where we find a comment:
# Preload glibc_shim in case of RHLE7 variants
test -e /usr/bin/ldd && ldd --version | grep -q "(GNU libc) 2\.17" \
&& export LD_PRELOAD="$LD_PRELOAD:$MATLAB/bin/glnxa64/glibc-2.17_shim.so" \
&& export MW_GLIBC_SHIM="$MATLAB/bin/glnxa64/glibc-2.17_shim.so"
which suggests that only RHEL7 (i.e., systems with glibc 2.17) may be affected.
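To check from inside a process whether the shim is likely in play, one could look at the two environment variables the driver script exports. The following is only a sketch: the function name 'glibc_shim_in_environment' is made up for illustration, and it inspects the environment only, so it cannot prove that the library was actually loaded.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper (name is an assumption, not part of matlab or EPICS).
// Returns true if the environment suggests that the 'matlab' driver script
// preloaded glibc-2.17_shim.so: either MW_GLIBC_SHIM is set and non-empty,
// or LD_PRELOAD mentions the shim library by file name.
extern "C" bool glibc_shim_in_environment() {
    const char *shim = std::getenv("MW_GLIBC_SHIM");
    if (shim != nullptr && shim[0] != '\0') {
        return true;
    }
    const char *preload = std::getenv("LD_PRELOAD");
    return preload != nullptr &&
           std::strstr(preload, "glibc-2.17_shim.so") != nullptr;
}
```

Calling such a helper from a mex file would at least tell whether a given matlab session was started through the code path quoted above.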
The deadlock happens when matlab
- loads a shared object (or library)
- AND the shared object executes some initialization code (e.g., constructors of global objects defined in the library)
- AND the initialization code calls 'pthread_join()'. 'pthread_join()' then never returns.
Note that if 'ordinary' code in the shared object (i.e., as opposed to initialization code) uses 'pthread_join()' then
that works fine.
The attached simple example mex file, which uses neither labca nor EPICS, reproduces the described behaviour.
EPICS' libCom does use 'pthread_join()' during initialization and is therefore affected.
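The pattern described above can be sketched roughly as follows. This is not the attached mex file, only a minimal illustration with made-up names ('JoinAtLoad', 'spawn_and_join', 'join_from_ordinary_code', 'join_at_load_status'); it assumes the code is built as a shared object, e.g. with 'g++ -shared -fPIC -pthread'. In an ordinary process it runs to completion; the hang was only observed when matlab (with the shim preloaded) loads such an object.

```cpp
#include <pthread.h>

namespace {

// Trivial thread body; it only needs to exist so there is something to join.
void *helper(void *) {
    return nullptr;
}

// Start a thread and wait for it with pthread_join().
// Returns 0 on success, otherwise the pthread error code.
int spawn_and_join() {
    pthread_t tid;
    int err = pthread_create(&tid, nullptr, helper, nullptr);
    if (err != 0) {
        return err;
    }
    return pthread_join(tid, nullptr);
}

// Global object: its constructor runs as initialization code when the
// shared object is loaded. This is the case reported to never return
// from pthread_join() under matlab on RHEL7.
struct JoinAtLoad {
    int status;
    JoinAtLoad() : status(spawn_and_join()) {}
};

JoinAtLoad join_at_load;

}  // namespace

// 'Ordinary' (non-initialization) code doing the same thing; per the
// description above this variant works fine.
extern "C" int join_from_ordinary_code() {
    return spawn_and_join();
}

// Result of the join performed during initialization.
extern "C" int join_at_load_status() {
    return join_at_load.status;
}
```

A mex file would additionally need a 'mexFunction' entry point; that part is omitted here so the sketch compiles without the matlab headers.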
At this point I can suggest two possible work-arounds (using one of them is sufficient):
1.) Use an EPICS-base build with POSIX priority scheduling disabled. This avoids a section of initialization
code which calls 'pthread_join()'.
E.g., in configure/CONFIG_SITE:
USE_POSIX_THREAD_PRIORITY_SCHEDULING = NO
2.) LD_PRELOAD EPICS' libCom.so *before* starting matlab
LD_PRELOAD=<path_to_my_epics_lib>/libCom.so matlab
HTH
- Till
On 2/19/21 4:20 AM, Miroslaw Dach wrote: