EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: Mclennan PM600 motor controller
From: Torsten Bögershausen <[email protected]>
To: Ron Sluiter <[email protected]>, Mark Rivers <[email protected]>
Cc: "[email protected]" <[email protected]>, Peter Linardakis <[email protected]>
Date: Tue, 17 Dec 2013 07:21:35 +0100


On 12/16/13 10:51 PM, Ron Sluiter wrote:
Hello Mark,

I was able to recreate a problem (I presume it is "the" problem) along the same lines as the one
Torsten points to. More specifically, motor_init() in drvPM304.cc makes this call when
it tests for the presence of the device,

send_recv_mess(card_index, "1OA;", buff);

send_recv_mess() then calls epicsStrtok_r() with a pointer to the above "1OA;" string.  This call to
epicsStrtok_r() is, I believe, what is causing the crash Peter is experiencing.

I suggest that both Torsten's patch to send_mess() and the following patch to send_recv_mess()
are needed:

Index: drvPM304.cc
===================================================================
--- drvPM304.cc (revision 16681)
+++ drvPM304.cc (working copy)
@@ -484,10 +484,12 @@

     cntrl = (struct PM304controller *) motor_state[card]->DevicePrivate;

+    strcpy(temp, out);
+
     /* Device support can send us multiple commands separated with ';'
      * characters.  The PM304 cannot handle more than 1 command on a line
      * so send them separately */
-    for (p = epicsStrtok_r((char *) out, ";", &tok_save);
+    for (p = epicsStrtok_r((char *) temp, ";", &tok_save);
                 ((p != NULL) && (strlen(p) != 0));
                 p = epicsStrtok_r(NULL, ";", &tok_save)) {
         Debug(2, "send_recv_mess: sending message to card %d, message=%s\n", card, p);
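
A note on the strcpy() in the hunk above: the hunk does not show how temp is declared, so the following is only a sketch that assumes temp is a fixed-size local buffer and checks the length before copying. The value of BUFF_SIZE here and the helper name for_each_command are hypothetical, and POSIX strtok_r() stands in for epicsStrtok_r(), which the patches above call with the same three arguments.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUFF_SIZE 100   /* hypothetical size; the driver defines its own */

/* Copy `out` into a local writable buffer and visit each ';'-separated
 * command.  Returns the number of commands seen, or -1 if `out` does not
 * fit in the buffer.  `visit` may be NULL if only the count is wanted. */
static int for_each_command(const char *out, void (*visit)(const char *cmd))
{
    char temp[BUFF_SIZE];
    char *p, *tok_save;
    int count = 0;

    assert(out != NULL);

    if (strlen(out) >= sizeof(temp))
        return -1;              /* refuse rather than overflow temp */
    strcpy(temp, out);          /* safe: length was checked above */

    /* Tokenize the copy, never the caller's const string. */
    for (p = strtok_r(temp, ";", &tok_save);
         (p != NULL) && (strlen(p) != 0);
         p = strtok_r(NULL, ";", &tok_save)) {
        if (visit != NULL)
            visit(p);
        count++;
    }
    return count;
}
```

Called as for_each_command("1OA;", NULL), this walks the single command "1OA" without writing to the string literal that motor_init() passes in.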

Of course,
How could I miss this :-(
And when fixing this location, we could drop the cast into "char *" here as well:
+    for (p = epicsStrtok_r(temp, ";", &tok_save);

/Torsten
Hope this helps,
Ron

On 12/16/2013 8:04 AM, Mark Rivers wrote:
Peter,

I suspect Torsten has found the problem.  "com" was declared "const char *", but it is being cast to "char *" when passed to epicsStrtok_r, which modifies the string.  Some compilers put const data into memory blocks that are read-only, and this could cause your crash.  I have definitely seen this with the Visual Studio compilers: (incorrect) code that ran OK on Linux would crash on Windows.

Mark
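
To make the point about epicsStrtok_r modifying its argument concrete, here is a minimal sketch that tokenizes a private heap copy and counts how many bytes the tokenizer overwrote. It uses POSIX strtok_r() as a stand-in for epicsStrtok_r() (an assumption; the two are called the same way in this thread), and the function name bytes_modified_by_tokenizing is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Tokenize a private copy of `com0` on ';' and report how many bytes
 * strtok_r overwrote in that copy.  Returns -1 if the copy cannot be
 * allocated.  The caller's string is never written to. */
static int bytes_modified_by_tokenizing(const char *com0)
{
    size_t len, i;
    char *com, *p, *tok_save;
    int changed = 0;

    assert(com0 != NULL);

    len = strlen(com0);
    com = malloc(len + 1);
    if (com == NULL)
        return -1;
    memcpy(com, com0, len + 1);     /* writable copy, including the NUL */

    for (p = strtok_r(com, ";", &tok_save);
         p != NULL;
         p = strtok_r(NULL, ";", &tok_save)) {
        /* Nothing to do: only the side effect on `com` matters here. */
    }

    /* Every delimiter that ended a token has been replaced by '\0'. */
    for (i = 0; i < len; i++)
        if (com[i] != com0[i])
            changed++;

    free(com);
    return changed;
}
```

For the "1OA;" command from motor_init() this reports one modified byte (the ';' is replaced by a terminator). That write is what faults when the string lives in read-only memory and the tokenizer is handed the original instead of a copy.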

________________________________________
From: Torsten Bögershausen [[email protected]]
Sent: Monday, December 16, 2013 2:51 AM
To: Peter Linardakis; Mark Rivers; [email protected]; [email protected]
Subject: Re: Mclennan PM600 motor controller

Peter, does the following help:


diff --git a/motorApp/MclennanSrc/drvPM304.cc b/motorApp/MclennanSrc/drvPM304.cc
index e703082..3b15757 100644
--- a/motorApp/MclennanSrc/drvPM304.cc
+++ b/motorApp/MclennanSrc/drvPM304.cc
@@ -352,8 +352,9 @@ STATIC int set_status(int card, int signal)
   /* ring buffer                                       */
   /* send_mess()                                       */
   /*****************************************************/
-STATIC RTN_STATUS send_mess(int card, const char *com, char *name)
+STATIC RTN_STATUS send_mess(int card, const char *com0, char *name)
   {
+    char *com = NULL;
       char *p, *tok_save;
       char response[BUFF_SIZE];
       struct PM304controller *cntrl;
@@ -367,12 +368,13 @@ STATIC RTN_STATUS send_mess(int card, const char *com, char *name)
       return(ERROR);
       }

+    com = strdup(com0);
       cntrl = (struct PM304controller *) motor_state[card]->DevicePrivate;

       /* Device support can send us multiple commands separated with ';'
        * characters.  The PM304 cannot handle more than 1 command on a line
        * so send them separately */
-    for (p = epicsStrtok_r((char *) com, ";", &tok_save);
+    for (p = epicsStrtok_r(com, ";", &tok_save);
                   ((p != NULL) && (strlen(p) != 0));
                   p = epicsStrtok_r(NULL, ";", &tok_save)) {
           Debug(2, "send_mess: sending message to card %d, message=%s\n", card, p);
@@ -381,6 +383,7 @@ STATIC RTN_STATUS send_mess(int card, const char *com, char *name)
           Debug(2, "send_mess: card %d, response=%s\n", card, response);
       }

+    free(com);
       return(OK);
   }



On 12/13/13 3:54 AM, Peter Linardakis wrote:
Hi Mark

By including asynSetTraceMask("test-se1-1",0,255) in st.cmd, between boot and seg fault we get:

       ...
       2013/12/13 01:56:21.420 test-se1-1 asynManager::queueLockPort locking port
       2013/12/13 01:56:21.423 test-se1-1 asynManager::queueLockPort created queueLockPortPvt=0x289718
       2013/12/13 01:56:21.426 test-se1-1 asynManager::queueLockPort created queueLockPortPvt=0x289718, event=0x28ee98, mutex=0x28ef10
       2013/12/13 01:56:21.428 test-se1-1 asynManager::queueLockPort taking mutex 0x28ef10
       2013/12/13 01:56:21.430 test-se1-1 asynManager::queueLockPort queueing request
       2013/12/13 01:56:21.431 test-se1-1 addr -1 queueRequest priority 0 not lockHolder
       2013/12/13 01:56:21.432 asynManager::portThread port=test-se1-1 callback
       2013/12/13 01:56:21.434 test-se1-1 asynManager::queueLockPortCallback signaling begin event
       2013/12/13 01:56:21.435 test-se1-1 asynManager::queueLockPortCallback waiting for mutex from queueUnlockPort
       2013/12/13 01:56:21.436 test-se1-1 asynManager::queueLockPort waiting for event
       2013/12/13 01:56:21.437 test-se1-1 asynManager::queueLockPort got event from callback
       2013/12/13 01:56:21.438 test-se1-1 flush
       2013/12/13 01:56:21.439 172.16.0.108:5300 flush
       2013/12/13 01:56:21.440 asynOctetSyncIO flush
       2013/12/13 01:56:21.441 test-se1-1 queueUnlockPort
       2013/12/13 01:56:21.442 test-se1-1 asynManager::queueUnlockPort waiting for event
       2013/12/13 01:56:21.443 test-se1-1 queueUnlockPort unlock mutex 0x28ef10 complete.
       2013/12/13 01:56:21.445 test-se1-1 asynManager::queueLockPort locking port
       2013/12/13 01:56:21.445 test-se1-1 asynManager::queueLockPort taking mutex 0x28ef10
       2013/12/13 01:56:21.446 test-se1-1 asynManager::queueLockPort queueing request
       2013/12/13 01:56:21.447 test-se1-1 addr -1 queueRequest priority 0 not lockHolder
       2013/12/13 01:56:21.448 asynManager::portThread port=test-se1-1 callback
       2013/12/13 01:56:21.449 test-se1-1 asynManager::queueLockPortCallback signaling begin event
       2013/12/13 01:56:21.450 test-se1-1 asynManager::queueLockPortCallback waiting for mutex from queueUnlockPort
       2013/12/13 01:56:21.451 test-se1-1 asynManager::queueLockPort waiting for event
       2013/12/13 01:56:21.452 test-se1-1 asynManager::queueLockPort got event from callback
       2013/12/13 01:56:21.453 172.16.0.108:5300 read.
       2013/12/13 01:56:21.455 test-se1-1 queueUnlockPort
       2013/12/13 01:56:21.456 test-se1-1 asynManager::queueUnlockPort waiting for event
       2013/12/13 01:56:21.458 test-se1-1 queueUnlockPort unlock mutex 0x28ef10 complete.

and the gdb backtrace is:

       ...
       Core was generated by `../../bin/linux-arm/pitest st.cmd'.
       Program terminated with signal 11, Segmentation fault.
       #0  0xb6d649d4 in epicsStrtok_r () from /opt/epics/base/lib/linux-arm/libCom.so
       (gdb) backtrace
       #0  0xb6d649d4 in epicsStrtok_r () from /opt/epics/base/lib/linux-arm/libCom.so
       #1  0xb6eb4bb0 in send_recv_mess(int, char const*, char*) () from /opt/epics/modules/motorR6-8/lib/linux-arm/libMclennan.so
       #2  0xb6eb5218 in motor_init() () from /opt/epics/modules/motorR6-8/lib/linux-arm/libMclennan.so
       #3  0xb6eb49dc in PM304_init(void*) () from /opt/epics/modules/motorR6-8/lib/linux-arm/libMclennan.so
       #4  0xb6de26d0 in dbInitDevSup () from /opt/epics/base/lib/linux-arm/libdbStaticIoc.so
       #5  0xb6e7af8c in iocBuild () from /opt/epics/base/lib/linux-arm/libmiscIoc.so
       #6  0xb6e7b52c in iocInit () from /opt/epics/base/lib/linux-arm/libmiscIoc.so
       #7  0xb6d5ef78 in iocshBody () from /opt/epics/base/lib/linux-arm/libCom.so
       #8  0x0000bcc4 in main ()

As far as the dbior command goes, I assume you meant commenting out the database that contains the motor record?  If that's the case, then it seg faults before I get to a prompt, as it normally does, and the backtrace is the same as above.

Regards
Peter

-----Original Message-----
From: Mark Rivers [mailto:[email protected]]
Sent: Friday, 13 December 2013 11:14 AM
To: Peter Linardakis; [email protected]; [email protected]
Subject: RE: Mclennan PM600 motor controller

Hi Peter,

I had this nagging suspicion that I should indeed be basing my record on motor.db, but was confused by the motorR6-8 README file saying that serial devices needed asyn4-2.  Live and learn I guess.

Your confusion is understandable.  There are actually 2 layers in the motor software that use asyn.

1) The interface between the motor driver and message based interfaces like RS-232, GPIB, and TCP/IP.  The motor drivers always use asyn for this layer, even the old Model 1 drivers.

2) The interface between motor record device support and the motor driver.  Only Model 2 and Model 3 drivers use asyn in this layer.

I changed the PM304Config(0, "test-se1-1", 1) line from PM304Config(1, "test-se1-1", 1), since I assumed from the "#C0 S0" syntax that if I only have one card, then it must be card 0.  In this case, the IOC seg faults immediately after boot.

The correct command is card 0, as you did.

These are the commands in my startup script:
##############################
# PM304 driver setup parameters:
#     (1) maximum # of controllers,
#     (2) motor task polling rate (min=1Hz, max=60Hz)
PM304Setup(1, 10)
# PM304 driver configuration parameters:
#     (1) controller
#     (2) asyn port
#     (3) MAX axes
# Example:
#   PM304Config(0, "serial1", 1)
PM304Config(0, "serial9", 1)
##############################

So now we need to figure out why it is segfaulting for you.

Your startup script has these lines:

drvAsynIPPortConfigure("test-se1-1", "172.16.0.108:5300")
# Add these lines for asynTrace debugging
asynSetTraceIOMask("test-se1-1",0,2)
asynSetTraceMask("test-se1-1",0,9)

The last line is turning on ASYN_TRACEIO_DRIVER for the TCP driver.  So you should see messages for every write and read operation to the device.  Do you see any such I/O before it crashes?  You could change the last line to:

asynSetTraceMask("test-se1-1",0,255)

to turn on all possible messages.
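
As an illustration of how the numeric trace masks above break down into individual flags, the sketch below decodes a mask into flag names. The bit values are reproduced from asyn's asynDriver.h as best recalled and should be checked against the asyn release in use; the helper name describe_trace_mask is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Trace mask bits, assumed to match asynDriver.h (verify for your release). */
#define ASYN_TRACE_ERROR     0x0001
#define ASYN_TRACEIO_DEVICE  0x0002
#define ASYN_TRACEIO_FILTER  0x0004
#define ASYN_TRACEIO_DRIVER  0x0008
#define ASYN_TRACE_FLOW      0x0010

/* Write the names of the known bits set in `mask` into `buf`, joined by
 * '|'.  Unknown bits are ignored; output stops early if `buf` is full. */
static void describe_trace_mask(int mask, char *buf, size_t size)
{
    static const struct { int bit; const char *name; } bits[] = {
        { ASYN_TRACE_ERROR,    "ASYN_TRACE_ERROR"    },
        { ASYN_TRACEIO_DEVICE, "ASYN_TRACEIO_DEVICE" },
        { ASYN_TRACEIO_FILTER, "ASYN_TRACEIO_FILTER" },
        { ASYN_TRACEIO_DRIVER, "ASYN_TRACEIO_DRIVER" },
        { ASYN_TRACE_FLOW,     "ASYN_TRACE_FLOW"     },
    };
    size_t i, used = 0;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';

    for (i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        if (mask & bits[i].bit) {
            int n = snprintf(buf + used, size - used, "%s%s",
                             used ? "|" : "", bits[i].name);
            if (n < 0 || (size_t)n >= size - used)
                return;         /* output truncated; stop here */
            used += (size_t)n;
        }
    }
}
```

With these values, a mask of 9 decodes to ASYN_TRACE_ERROR|ASYN_TRACEIO_DRIVER, while 255 sets every bit in the low byte and so enables all five flags listed here.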

See if you get any messages before it seg faults.

You should also run gdb to figure out where it is crashing.  Here's how to do that:

- Enable core dumps.  With the csh this is done with

limit core 1000000

With bash it is

ulimit -c 1000000

Now run your application so it segfaults.  You will get a core file, core.XXXXX, where XXXXX is a number.

Now run gdb on your application with that core file:

gdb PATH_TO_YOUR_APPLICATION core.XXXXX

When you get the gdb prompt type the command

backtrace

That should show what function was executing when it crashed.

Here is something else to do.  Load your application, but comment out the line to load the motor database.  At the IOC prompt type this command

dbior "drvPM304",10

That should give you a report like the following:

Driver: drvPM304
      PM304 controller 0, id: Mclennan Servo Supplies Ltd  PM304  V6.17


Mark






