TangoMonitor: not able to acquire serialization

Hello everyone,
I've encountered an issue with the following exception, produced by TangoMonitor: "Not able to acquire serialization".

I have a DeviceServer written in C++ which connects to a Basler camera and captures images (not Lima, a custom SOLARIS solution). It therefore uses both the Pylon and Tango libraries. There are two more important pieces of information. First, in the OnImageGrabbed Pylon callback, the Device pushes ChangeEvents for a few attributes and one DataReadyEvent.
Secondly, there's a command called reconnect, polled every 5 seconds. It checks whether the camera is attached. If it is, the command does nothing. Otherwise it performs a reconnection using Pylon functions. I believe the above error occurs while the reconnect command is executing.

This error happens rarely, maybe once every two or three weeks, but it's annoying since one has to kill the DeviceServer process. Killing the server from Astor does not work.

Tango version: 9.2.5

I've done a little research and my assumption is that it's connected with threads and a deadlock somewhere in my code. However, this Device does not use threads explicitly. Also, I've read that pushing events manually from code involves taking some thread locks.

What further steps would you suggest for investigating this issue? Or maybe someone already knows the problem and, possibly, the solution.

Regards
Hi,

What I would suggest is to run the device server using a debug version of the Tango C++ library and, the next time it happens, to force a core dump by killing the device server with a signal that generates a core file.

For instance, you can use:
kill -3 <your_pid>

or
kill -11 <your_pid>

Then we will be able to understand better which thread is holding the Tango Monitor and maybe why it is not releasing it.
Could it be that there are some other locks involved with the pylon library?
If another lock is taken both in your callback and in the reconnect command, it might be necessary to release this lock before pushing the events.
If there is such a lock and it supports a timeout parameter, it would be good to use that timeout to prevent deadlocks too.
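
The two suggestions above could look roughly like the sketch below. It does not use the Tango or Pylon APIs: camera_lock, push_event, Frame, on_image_grabbed and reconnect are hypothetical stand-ins, and a std::timed_mutex is only assumed to play the role of whatever lock the callback and the command share. The callback copies what it needs while holding the lock, releases it, and only then pushes events; both sides give up after a timeout instead of blocking forever.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical lock shared by the grab callback and the reconnect command.
std::timed_mutex camera_lock;

// Hypothetical stand-in for the event-pushing calls made by the device.
std::vector<std::string> pushed_events;
void push_event(const std::string& name) { pushed_events.push_back(name); }

struct Frame { int width; int height; };
Frame latest_frame{0, 0};

// Grab callback: update shared state under the lock, release it,
// and only then push events.
bool on_image_grabbed(const Frame& f, std::chrono::milliseconds timeout) {
    assert(f.width >= 0 && f.height >= 0);
    Frame copy{0, 0};
    {
        // This constructor calls try_lock_for(timeout) on the mutex.
        std::unique_lock<std::timed_mutex> guard(camera_lock, timeout);
        if (!guard.owns_lock()) {
            return false;  // give up instead of waiting forever
        }
        latest_frame = f;
        copy = latest_frame;
    }  // camera_lock is released here, before any event is pushed
    push_event("width=" + std::to_string(copy.width));
    push_event("data_ready");
    return true;
}

// Polled command: also bounded by a timeout on the shared lock.
bool reconnect(std::chrono::milliseconds timeout) {
    std::unique_lock<std::timed_mutex> guard(camera_lock, timeout);
    if (!guard.owns_lock()) {
        return false;  // report failure; the next poll will retry
    }
    // The camera re-attachment would go here.
    return true;
}
```

With this shape, a failed acquisition shows up as a false return value that can be logged, rather than as a thread stuck while holding another lock.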

Hoping this helps,
Reynald
Rosenberg's Law: Software is easy to make, except when you want it to do something new.
Corollary: The only software that's worth making is software that does something new.
Seems like I may know the solution to your problem. But first of all, tell me: how long does it take for the attribute to return its value? I mean the attribute which throws this "not able to acquire serialization" exception.

The problem is that your camera might take more than 3000 ms to answer, which may cause the Tango Monitor to hit its own timeout. Increasing the timeout on this device (set_timeout_millis) won't solve the problem, as the Tango Monitor still keeps its native 3000 ms value.

In my case, modifying the device algorithm solved the timeout problem: the device started to return the value in less than 3000 ms.

This "not able to acquire serialization" exception was thrown randomly when it took about 5 seconds to return the attribute's value (I had the timeout set to 10 seconds). So the device itself wasn't throwing a device timeout but this Tango Monitor exception (>3000 ms).
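
To illustrate the mechanism I mean, here is a small simulation. It is not Tango code: device_monitor and second_request_succeeds are hypothetical names, and a std::timed_mutex is only assumed to behave like the serialization monitor. One request holds the monitor while it waits for slow hardware; a second request gives up if it cannot get the monitor within the monitor timeout.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

using std::chrono::milliseconds;

// Hypothetical stand-in for the per-device serialization monitor.
std::timed_mutex device_monitor;

// Returns true if a second request manages to acquire the monitor within
// monitor_timeout while a first request holds it for hold_time.
bool second_request_succeeds(milliseconds hold_time, milliseconds monitor_timeout) {
    assert(hold_time.count() >= 0 && monitor_timeout.count() >= 0);
    std::atomic<bool> holder_ready{false};

    // First request: takes the monitor, then waits on slow hardware.
    std::thread slow_request([&] {
        std::lock_guard<std::timed_mutex> guard(device_monitor);
        holder_ready = true;
        std::this_thread::sleep_for(hold_time);
    });

    // Make sure the first request really owns the monitor before competing.
    while (!holder_ready) {
        std::this_thread::yield();
    }

    // Second request: waits at most monitor_timeout for the monitor.
    bool acquired = device_monitor.try_lock_for(monitor_timeout);
    if (acquired) {
        device_monitor.unlock();
    }
    slow_request.join();
    return acquired;
}
```

Calling second_request_succeeds(milliseconds(5000), milliseconds(3000)) would mimic my case: the second caller times out on the monitor even though the client timeout is longer.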
Regards,
Jagoda
Hi Reynald and Jagoda
First of all, thank you for your replies!

I'll try to get a core file and inspect it with debug information. After that I'll let you know the results.

To be specific, I don't see attribute reads causing the problem. It's more likely a polled command, but I think the mechanism is the same. So you suggest that this command may be executing for too long (over 3 seconds)? I'll try to investigate that further.

Thank you for ideas!
 