Hello everyone,
Recently I have been tracing a “Segmentation fault” that appeared after a Debian server was upgraded.
First, I ran the IOC under GDB:
$ gdb bin/linux-x86_64-debug/fe-stage
GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
……
(gdb) run
Starting program: /epics/iocs/fe-stage-20210415/bin/linux-x86_64-debug/fe-stage
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff2eac700 (LWP 17330)]
[Thread 0x7ffff2eac700 (LWP 17330) exited]
epics> dbLoadDatabase("dbd/fe-stage.dbd",0,0)
epics> fe_stage_registerRecordDeviceDriver(pdbbase)
epics> pmacAsynIPConfigure("P0","10.0.58.90:1025")
[New Thread 0x7ffff26ab700 (LWP 17362)]
[New Thread 0x7ffff25aa700 (LWP 17363)]
[New Thread 0x7ffff24a9700 (LWP 17364)]
[New Thread 0x7ffff23a8700 (LWP 17365)]
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7312e8f in findDpCommon (puserPvt=0x70e08c)
at ../../asyn/asynDriver/asynManager.c:526
526 ../../asyn/asynDriver/asynManager.c: No such file or directory.
(gdb)
(gdb) bt
#0 0x00007ffff7312e8f in findDpCommon (puserPvt=0x70e08c)
at ../../asyn/asynDriver/asynManager.c:526
#1 0x00007ffff73184a3 in exceptionDisconnect (pasynUser=0x70e0f4)
at ../../asyn/asynDriver/asynManager.c:2076
#2 0x00007ffff79cf4a9 in pmacAsynIPPortCommon (portName=0x70e0f4 "P0",
addr=0, pPmacPvt=0x7fffffffe380, plowerLevelInterface=0x7fffffffe388,
pasynUser=0x7fffffffe390) at ../pmacAsynIPPort.c:318
#3 0x00007ffff79ceefa in pmacAsynIPPortConfigureEos (
portName=0x70e0f4 "P0", addr=0) at ../pmacAsynIPPort.c:218
#4 0x00007ffff79cee79 in pmacAsynIPConfigure (portName=0x70e0f4 "P0",
hostInfo=0x70e0f7 "10.0.58.90:1025") at ../pmacAsynIPPort.c:192
#5 0x00007ffff79d0cc9 in pmacAsynIPConfigureCallFunc (args=0x630320)
at ../pmacAsynIPPort.c:761
#6 0x00007ffff5ed36a4 in iocshBody (pathname=0x0, commandLine=0x0,
macros=0x0) at ../../../src/libCom/iocsh/iocsh.cpp:813
#7 0x00007ffff5ed3a60 in iocshLoad (pathname=0x0, macros=0x0)
at ../../../src/libCom/iocsh/iocsh.cpp:895
#8 0x00007ffff5ed3a00 in iocsh (pathname=0x0)
at ../../../src/libCom/iocsh/iocsh.cpp:881
#9 0x0000000000407c40 in main (argc=1, argv=0x7fffffffe668)
at ../fe-stageMain.cpp:20
The message “526 ../../asyn/asynDriver/asynManager.c: No such file or directory.” suggested that something was wrong with the support module “asyn”, which was confusing because I have “asyn” installed from the NSLS-2 Debian packages.
In the end, I was surprised to find that two different versions of asyn (4.31 and 4.33) are linked into the IOC binary “fe-stage”:
softioc@feioc02:~/fe-stage-20210415$ ldd bin/linux-x86_64-debug/fe-stage | sort
/lib64/ld-linux-x86-64.so.2 (0x00007f48d21f0000)
libasyn.so.4.31 => /usr/lib/x86_64-linux-gnu/libasyn.so.4.31 (0x00007f48cfc62000)
libasyn.so.4.33 => /usr/lib/epics/lib/linux-x86_64-debug/libasyn.so.4.33 (0x00007f48d170b000)
libautosave.so.5.7.1 => /usr/lib/epics/lib/linux-x86_64-debug/libautosave.so.5.7.1 (0x00007f48d14df000)
libbusy.so.1.6.1 => /usr/lib/epics/lib/linux-x86_64-debug/libbusy.so.1.6.1 (0x00007f48d0e2d000)
……
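A check like the one above can be automated. The following is only a minimal sketch, assuming the usual `name => path (address)` layout of ldd output; the function name `find_conflicting_sonames` and the sample text are illustrative and not part of EPICS or any Debian tool.

```python
import re
from collections import defaultdict

# Matches lines such as "libasyn.so.4.31 => /usr/lib/... (0x...)"
# and captures the base name ("libasyn.so") and the version ("4.31").
_SONAME = re.compile(r"^\s*(\S+?\.so)\.([0-9][0-9.]*)\s+=>")


def find_conflicting_sonames(ldd_output):
    """Return {base_name: [versions]} for libraries loaded in more than one version."""
    versions = defaultdict(set)
    for line in ldd_output.splitlines():
        match = _SONAME.match(line)
        if match:
            versions[match.group(1)].add(match.group(2))
    # Keep only libraries that appear with two or more distinct versions.
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


# Sample text copied from the ldd output in this post.
sample_ldd = """\
/lib64/ld-linux-x86-64.so.2 (0x00007f48d21f0000)
libasyn.so.4.31 => /usr/lib/x86_64-linux-gnu/libasyn.so.4.31 (0x00007f48cfc62000)
libasyn.so.4.33 => /usr/lib/epics/lib/linux-x86_64-debug/libasyn.so.4.33 (0x00007f48d170b000)
libautosave.so.5.7.1 => /usr/lib/epics/lib/linux-x86_64-debug/libautosave.so.5.7.1 (0x00007f48d14df000)
"""

print(find_conflicting_sonames(sample_ldd))
# prints {'libasyn.so': ['4.31', '4.33']}
```

In practice the text would come from running ldd on the IOC binary; an empty result means no library was found in two versions.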
Then I rebuilt the IOC application. Now I can see a warning about the conflict, which I did not pay attention to during the initial build:
/usr/bin/g++ -D_GNU_SOURCE -D_DEFAULT_SOURCE -D_FORTIFY_SOURCE=2 -D_X86_64_ -DUNIX -Dlinux -O3 -g -Wall -fstack-protector-strong -Wformat -Werror=format-security -mtune=generic
-m64 -I. -I../O.Common -I. -I. -I.. -I../../../include/compiler/gcc -I../../../include/os/Linux -I../../../include -I/usr/lib/epics/include/compiler/gcc -I/usr/lib/epics/include/os/Linux -I/usr/lib/epics/include -c ../fe-stageMain.cpp
/usr/bin/g++ -o fe-stage -L/usr/lib/epics/lib/linux-x86_64 -Wl,-rpath,/usr/lib/epics/lib/linux-x86_64 -Wl,-z,relro -Wl,--as-needed -rdynamic -m64 fe-stage_registerRecordDeviceDriver.o
fe-stageMain.o -lpmacAsynMotorPort -lpmacAsynMotor -lpmacAsynIPPort -lpmacAsynCoord -lmotor -lasyn -lautosave -lstream -lcalc -lbusy -ldevIocStats -lcaPutLog -ldbRecStd -ldbCore -lca -lCom -lpcre
/usr/bin/ld: warning: libasyn.so.4.31, needed by /usr/lib/epics/lib/linux-x86_64/libpmacAsynMotor.so, may conflict with libasyn.so.4.33
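Since this warning is easy to miss in a long build log, one option is to scan the captured build output for it and fail the build explicitly. The following is only a sketch, assuming the warning wording shown above; `find_linker_conflicts` is a hypothetical helper, not something provided by the EPICS build system or by ld.

```python
import re

# Matches the ld warning format seen in this post:
#   "warning: <lib>, needed by <path>, may conflict with <lib>"
_CONFLICT = re.compile(
    r"warning: ([^\s,]+), needed by ([^\s,]+), may conflict with ([^\s,]+)"
)


def find_linker_conflicts(build_log):
    """Return a list of (needed_lib, needed_by, conflicts_with) tuples."""
    return [match.groups() for match in _CONFLICT.finditer(build_log)]


# Sample line copied from the build output in this post.
sample_log = (
    "/usr/bin/ld: warning: libasyn.so.4.31, needed by "
    "/usr/lib/epics/lib/linux-x86_64/libpmacAsynMotor.so, "
    "may conflict with libasyn.so.4.33\n"
)

conflicts = find_linker_conflicts(sample_log)
for needed, needed_by, other in conflicts:
    print(f"{needed_by} needs {needed}, but {other} is also linked")
```

A wrapper script could pipe the output of make through such a function and exit with a non-zero status whenever the returned list is not empty.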
So the problem is caused by mismatched Debian package versions after the server upgrade. Two different versions of asyn (or of any other support module) linked into one IOC application are surely a recipe for trouble. My question is: should the linker “ld” have aborted the build instead of just issuing a warning? Is there any way for the EPICS build system to avoid this kind of problem? I am using “GNU ld (GNU Binutils for Debian) 2.25”, “gcc (Debian 4.9.2-10+deb8u2) 4.9.2”, and base-3.15.3-13.
Thanks,
Yong