EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: Matlab 2020b crashes with labCA 3.7.2
From: Till Straumann via Tech-talk <tech-talk at aps.anl.gov>
To: Miroslaw Dach <mdach at lbl.gov>
Cc: Gregory Portmann <gjportmann at lbl.gov>, "Corbett, William J." <corbett at slac.stanford.edu>, EPICS Techtalk <tech-talk at aps.anl.gov>
Date: Fri, 5 Mar 2021 15:50:51 +0100
Hi All.

It seems my original answer never made it to tech-talk. It can still be found
below, but here is some more data I gathered:

Even with the suggested work-arounds I could not get labca-3.7.2 and
epics-7.0.4.1 or 7.0.5 to work with matlab 2020b. It would either hang or crash when
quitting matlab.

I cut a new [labca 3.8.0 release](https://github.com/till-s/epics-labca/releases/tag/labca_3_8_0)

which addresses this problem but you still need one (no need for both) of the following
two work-arounds:

a) use a build of epics base with posix priority scheduling disabled. In configure/CONFIG_SITE set
       USE_POSIX_THREAD_PRIORITY_SCHEDULING=NO
    then 'make clean' and 'make'. Obviously, make sure no real-time systems are using this new build.
b) use LD_PRELOAD to load and initialize libCom before starting matlab

    LD_PRELOAD=<path_to_base>/lib/<arch>/libCom.so  matlab <options>
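
When the preload work-around does not seem to take effect, it can help to confirm that the environment matlab was started with actually names libCom.so. The following is only a sketch of such a check; the helper names `preload_lists()` and `libcom_preloaded()` are made up for illustration and are not part of labCA or EPICS base. It inspects the `LD_PRELOAD` variable only (entries separated by colons or spaces) and compares exact base names, so a versioned file name such as `libCom.so.<version>` would not match, and a match does not prove the loader succeeded.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: true if any entry of an LD_PRELOAD-style list
// has the base name 'lib_name'. Entries are separated by ':' or ' '.
bool preload_lists(const std::string &preload, const std::string &lib_name)
{
    std::string::size_type start = 0;
    while (start <= preload.size()) {
        std::string::size_type end = preload.find_first_of(": ", start);
        if (end == std::string::npos) {
            end = preload.size();
        }
        const std::string entry = preload.substr(start, end - start);
        // Strip the directory part so only the file name is compared.
        const std::string::size_type slash = entry.rfind('/');
        const std::string base =
            (slash == std::string::npos) ? entry : entry.substr(slash + 1);
        if (base == lib_name) {
            return true;
        }
        start = end + 1;
    }
    return false;
}

// Hypothetical helper: checks the LD_PRELOAD of the current process.
bool libcom_preloaded()
{
    const char *env = std::getenv("LD_PRELOAD");
    return env != nullptr && preload_lists(env, "libCom.so");
}
```

Splitting on both separators matters here because the 'matlab' driver script may append its own entry to `LD_PRELOAD` using a colon.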

If someone has good connections to MathWorks (or a MW engineer is reading this)
they could use the attached simple (and standalone) 'mex' file to reproduce the problem
(w/o any epics).

Cheers
- Till


On 3/1/21 5:56 PM, Miroslaw Dach wrote:
Hi Till,

Thank you very much for the in-depth study of the problem. It looks like MathWorks has changed something in the code and, even worse, introduced an "unwanted feature" which affects Matlab 2020b and labCA users on RHEL7.

We will try one of your suggestions and let you know how things are.

Many Thanks
Mirek



On Mon, Mar 1, 2021 at 2:01 AM Till Straumann <till.straumann at psi.ch> wrote:
Hi Mirek.

I have investigated this deadlock and come to the conclusion that it is a problem
with matlab, probably under RHEL7 (though I have no way to test under other systems,
Windows in particular).

When I debugged the deadlock I found that some matlab threads deadlock in a library called

glibc-2.17_shim.so

This (mathworks proprietary) library is LD_PRELOADed from the 'matlab' driver script where we find a comment:

    # Preload glibc_shim in case of RHLE7 variants
    test -e /usr/bin/ldd &&  ldd --version |  grep -q "(GNU libc) 2\.17"  \
            && export LD_PRELOAD="$LD_PRELOAD:$MATLAB/bin/glnxa64/glibc-2.17_shim.so" \
            && export MW_GLIBC_SHIM="$MATLAB/bin/glnxa64/glibc-2.17_shim.so"

which leads to the hypothesis that only RHEL7 may be affected.
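
The same glibc version test can be made from within a process, which may be handy for deciding at run time whether a host falls into the suspect category. This is a sketch under the assumption that the program runs against glibc (it relies on the glibc-specific `gnu_get_libc_version()`); the function names `is_glibc_2_17()` and `running_on_glibc_2_17()` are hypothetical. It only reports the C library version and says nothing about whether the shim is actually preloaded.

```cpp
#include <gnu/libc-version.h>  // glibc-specific header
#include <cstdio>
#include <string>

// Hypothetical helper: true if 'version' (e.g. "2.17") denotes glibc 2.17.
bool is_glibc_2_17(const std::string &version)
{
    int major = 0;
    int minor = 0;
    if (std::sscanf(version.c_str(), "%d.%d", &major, &minor) != 2) {
        return false;
    }
    return major == 2 && minor == 17;
}

// Hypothetical helper: applies the check to the C library in use.
bool running_on_glibc_2_17()
{
    return is_glibc_2_17(gnu_get_libc_version());
}
```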

The deadlock happens when matlab
 - loads a shared object (or library)
 - AND the shared object executes some initialization code (e.g., constructors of global objects defined in the library)
 - AND the initialization code calls 'pthread_join()'. 'pthread_join()' then never returns.

Note that if 'ordinary' code in the shared object (i.e., as opposed to initialization code) uses 'pthread_join()' then
that works fine.
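
The distinction between initialization code and 'ordinary' code suggests a pattern for libraries whose source you control: move the create-and-join step out of global constructors and run it on first use instead. The following is a minimal sketch of that idea only; it is not how EPICS base or labCA are implemented, and the names `ensure_initialized()`, `create_and_join()` and `init_runs` are invented for this example. It assumes a C++11 compiler, where initialization of a function-local static happens once, on the first call.

```cpp
#include <pthread.h>

// Counts how often the initialization body ran (for illustration only).
static int init_runs = 0;

static void *helper_thread(void *)
{
    return nullptr;
}

// Creates a thread and joins it; returns 0 on success, -1 on failure.
static int create_and_join()
{
    ++init_runs;
    pthread_t id;
    if (pthread_create(&id, nullptr, helper_thread, nullptr)) {
        return -1;
    }
    if (pthread_join(id, nullptr)) {
        return -1;
    }
    return 0;
}

// Call this at the top of every public entry point. The static below is
// initialized on the first call, i.e. from ordinary code rather than from
// a global constructor that runs while the shared object is being loaded.
int ensure_initialized()
{
    static const int status = create_and_join();
    return status;
}
```

The price is that every entry point has to call `ensure_initialized()`, which is why this only applies to code you can modify.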

A simple example mex file (attached), which does not use labca or epics, reproduces the described behaviour.

EPICS' libCom does use 'pthread_join()' during initialization and is therefore affected.

At this point I can suggest two possible work-arounds (using one of them is sufficient):

1.) Use an EPICS-base build with posix priority scheduling disabled. This avoids a section of initialization
     code which calls 'pthread_join()'

     E.g., in configure/CONFIG_SITE:

    USE_POSIX_THREAD_PRIORITY_SCHEDULING = NO

2.) LD_PRELOAD EPICS' libCom.so *before* starting matlab

    LD_PRELOAD=<path_to_my_epics_lib>/libCom.so  matlab

HTH
- Till



On 2/19/21 4:20 AM, Miroslaw Dach wrote:
Hi Till,

We have crossed each other. You came to PSI from the US and I did the opposite. I moved to work in LBL.

Are you still maintaining labCA?
We are facing a problem with Matlab 2020b crashing when using labCA 3.7.2.
It looks like an incompatibility between Matlab 2020b and the latest official labCA version.
labCA 3.7.2 seems to be the latest version, unless you have a newer one?

Best Regards
Mirek



/* Demonstrate a problem with pthread_join() from initialization
 * code under matlab2020b
 *
 * For a more realistic example you can compile this file into
 * a library (mimics an external library):
 *
 *   g++ -DBUILD_AS_LIBRARY -shared -fPIC mexJoin.cc -o libXXX.so
 * 
 * and a separate mex-file:
 *
 *   mex -cxx -DBUILD_AS_MEXFILE mexJoin.cc -L. -lXXX
 *
 * Alternatively, the code can be compiled into a single mexfile:
 *
 *   mex -cxx mexJoin.cc
 */

/* Use stdio for printing messages to ensure matlab is not interfering.
 * Must use matlab in CLI mode, however, in order to see the messages:
 *
 *   matlab -nodisplay -nosplash -nojvm
 *
 * (or watch the terminal window from where the matlab GUI was started)
 */

/* Author: Till Straumann <till.straumann at psi.ch>, 2021 */

#include <string.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>

#if ! defined(BUILD_AS_LIBRARY) && ! defined(BUILD_AS_MEXFILE)
#define BUILD_AS_LIBRARY
#define BUILD_AS_MEXFILE
#endif

extern "C" {
	int join_a_thread(int skip);
};

#ifdef BUILD_AS_LIBRARY
static void * some_thread(void *arg)
{
	fprintf(stderr, "Thread terminated\n");
	return 0;
}

extern "C" {

int join_a_thread(int skip)
{
	pthread_t id;
	switch ( skip ) {
		case 1:
			fprintf(stderr, "Skipping thread creation and joining during initialization phase\n\n");
		return -1;

		case 2:
			fprintf(stderr, "Skipping thread creation and joining during finalization phase\n\n");
		return -1;

		case -1:
			fprintf(stderr, "Attempting thread creation and joining during initialization phase\n");
		break;

		case -2:
			fprintf(stderr, "Attempting thread creation and joining during finalization phase\n");
		break;

		default:
			fprintf(stderr, "Attempting thread creation and joining from mexFunction\n");
		break;
	}

	if ( pthread_create( &id, 0, some_thread, 0 ) ) {
		fprintf(stderr, "Unable to create thread\n\n");
		return -1;
	}
	fprintf(stderr, "Created a thread\n");
	// 2020b under RHEL7 (compiled with g++ 9.3.0)
	// deadlocks in 'pthread_join' -- somewhere in
	// MW's glibc-2.17_shim.so. The same happens
	// when code which attempts 'pthread_join' from
	// a global initializer is loaded with dlopen().
	//
	// NOTE: 'join_a_thread()' works fine when executed
	//       from the mex-function itself; the deadlock
	//       occurs only when this is attempted during
	//       library initialization!
	fprintf(stderr, "Attempting to join the thread\n");
	if ( pthread_join( id, 0 ) ) {
		fprintf(stderr, "Unable to join thread\n\n");
		return -1;
	}
	fprintf(stderr, "Successfully joined!\n\n");
	return 0;
}

}

// Set the environment variable 'SKIP_JOIN_DURING_INIT' (prior to starting
// matlab) to verify that joining works when executed from the mexFunction
// itself.

class Initializer {
public:
	Initializer()
	{
		join_a_thread( ( (!! getenv("SKIP_JOIN_DURING_INIT")) ? 1 : -1) );
	}

	~Initializer()
	{
		join_a_thread( ( (!! getenv("SKIP_JOIN_DURING_EXIT")) ? 2 : -2) );
	}
};

static Initializer v;

#endif

#ifdef BUILD_AS_MEXFILE
#include <mex.hpp>
#include <mexAdapter.hpp>

using namespace matlab::data;
using matlab::mex::ArgumentList;

class MexFunction : public matlab::mex::Function {
public:
	void operator()(matlab::mex::ArgumentList o, matlab::mex::ArgumentList i)
	{
		//mexLock(); -- could prevent unloading
		join_a_thread(0);
	}
};
#endif

Replies:
Re: Matlab 2020b crashes with labCA 3.7.2 White, Greg via Tech-talk
Re: Matlab 2020b crashes with labCA 3.7.2 Till Straumann via Tech-talk

ANJ, 22 Apr 2021