Monitor timeout due to random lags in State command

Hi everyone!

I would like to share with you our problem. For the moment I will explain it on a very specific scenario that we have here at Alba. We will work on preparing an isolated case, but maybe meanwhile some expert could already see an error on our side or a bug.

So let's start…

The problem was discovered when integrating Keithley in the continuous scan.

The simplified Tango architecture is as follows (all written in Python):

Keithley device emits data ready events when it is involved in an acquisition.

On the client side (Sardana controller) we are listening to those events and in the event callback we execute a command on that device in order to obtain data. Apart of that, the acquisition action (a dedicated thread) executes periodically State command on that device in order to check if the acquisition have finished. All of that is done using a single DeviceProxy.

This causes random and sporadic timeouts when acquiring the monitor on the push_data_ready call.

DevFailed[
DevError[
    desc = Not able to acquire serialization (dev, class or process) monitor
  origin = TangoMonitor::get_monitor
  reason = API_CommandTimedOut
severity = ERR]

We have run the Keithley DS with v5 Tango logging.

We were able to reproduce the problem three times (two of them produce the similar traces) and all of them points to the State command which locks the monitor for more than 3 seconds.

NORMAL EXECUTION OF STATE COMMAND:

1464270693 [140091939538688] In get_monitor() rpastor/keithely2010/01, thread = 6, ctr = 0
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor DeviceImpl::state (attribute) arrived
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor SubDevDiag::get_associated_device() entering … 
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor SubDevDiag::get_associated_device() found : No associated device name!
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor SubDevDiag::set_associated_device() entering … 
1464270693 [140091939538688] INFO rpastor/keithely2010/01 2016-05-26 15:51:33.541613: In rpastor/keithely2010/01::always_executed_hook()
1464270693 [140091939538688] DEBUG rpastor/keithely2010/01 Entering in state_machine ((Keithley2010(rpastor/keithely2010/01),))
1464270693 [140091939538688] DEBUG rpastor/keithely2010/01 Exiting state_machine
1464270693 [140091939538688] INFO rpastor/keithely2010/01 2016-05-26 15:51:33.541784: After state_machine rpastor/keithely2010/01::always_executed_hook()
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 6, ctr = 1
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor owner_thread !!
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor Attribute::fire_change_event() entring …
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 2, thread = 6
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 6, ctr = 1
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor owner_thread !!
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor Attribute::fire_change_event() entring …
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 2, thread = 6
1464270693 [140091939538688] INFO rpastor/keithely2010/01 2016-05-26 15:51:33.541945: Out rpastor/keithely2010/01::always_executed_hook()
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 1, thread = 6
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor Signalling !
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor SubDevDiag::set_associated_device() entering … 
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor Leaving DeviceImpl::state (attribute)

CASE 1 (coincides with emitting Heartbeat event) OCCURENCE 1:

1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 6, ctr = 0
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor DeviceImpl::state (attribute) arrived
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor SubDevDiag::get_associated_device() entering … 
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor SubDevDiag::get_associated_device() found : No associated device name!
1464270693 [140091931145984] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 15, ctr = 1
1464270693 [140091939538688] DEBUG dserver/Keithley/rpastor SubDevDiag::set_associated_device() entering … 
1464270693 [140091931145984] DEBUG dserver/Keithley/rpastor Thread 15: waiting !!
1464270694 [140091947931392] DEBUG dserver/Keithley/rpastor ———-> Time = 462270694,977220 Sending event heartbeat
1464270694 [140091947931392] DEBUG dserver/Keithley/rpastor ZmqEventSupplier::push_heartbeat_event(): delta time since last heartbeat 9
1464270694 [140091947931392] DEBUG dserver/Keithley/rpastor ZmqEventSupplier::push_heartbeat_event(): detected heartbeat event for tango://pc255.cells.es:10000/dserver/keithley/rpastor.heartbeat
1464270694 [140091947931392] DEBUG dserver/Keithley/rpastor ZmqEventSupplier::push_heartbeat_event(): delta _time 9
1464270694 [140091947931392] DEBUG dserver/Keithley/rpastor ZmqEventSupplier::push_heartbeat_event(): nb_event = 1
1464270694 [140091947931392] DEBUG dserver/Keithley/rpastor Event heartbeat, next wake_up at 462270703,977598
1464270694 [140091947931392] DEBUG dserver/Keithley/rpastor Sub device property storage, next wake_up at 462272410,474407
1464270694 [140091947931392] DEBUG dserver/Keithley/rpastor Sleep for : 9000
1464270696 [140091922753280] DEBUG dserver/Keithley/rpastor DeviceImpl::ping arrived
1464270696 [140091922753280] DEBUG dserver/Keithley/rpastor Leaving DeviceImpl::ping
1464270696 [140091922753280] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 8, ctr = 1
1464270696 [140091922753280] DEBUG dserver/Keithley/rpastor Thread 8: waiting !!
1464270696 [140091931145984] DEBUG dserver/Keithley/rpastor TIME OUT for thread 15
DevFailed[
DevError[
    desc = Not able to acquire serialization (dev, class or process) monitor
  origin = TangoMonitor::get_monitor
  reason = API_CommandTimedOut
severity = ERR]
]
1464270696 [140091939538688] INFO rpastor/keithely2010/01 2016-05-26 15:51:36.779552: In rpastor/keithely2010/01::always_executed_hook()
1464270696 [140091939538688] DEBUG rpastor/keithely2010/01 Entering in state_machine ((Keithley2010(rpastor/keithely2010/01),))
1464270696 [140091939538688] DEBUG rpastor/keithely2010/01 Exiting state_machine
1464270696 [140091939538688] INFO rpastor/keithely2010/01 2016-05-26 15:51:36.779722: After state_machine rpastor/keithely2010/01::always_executed_hook()
1464270696 [140091939538688] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 6, ctr = 1
1464270696 [140091939538688] DEBUG dserver/Keithley/rpastor owner_thread !!
1464270696 [140091939538688] DEBUG dserver/Keithley/rpastor Attribute::fire_change_event() entring …
1464270696 [140091939538688] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 2, thread = 6
1464270696 [140091939538688] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 6, ctr = 1
1464270696 [140091939538688] DEBUG dserver/Keithley/rpastor owner_thread !!
1464270696 [140091939538688] DEBUG dserver/Keithley/rpastor Attribute::fire_change_event() entring …
1464270696 [140091939538688] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 2, thread = 6
1464270696 [140091939538688] INFO rpastor/keithely2010/01 2016-05-26 15:51:36.780422: Out rpastor/keithely2010/01::always_executed_hook()
1464270696 [140091939538688] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 1, thread = 6
1464270696 [140091939538688] DEBUG dserver/Keithley/rpastor Signalling !
1464270696 [140091939538688] DEBUG dserver/Keithley/rpastor SubDevDiag::set_associated_device() entering … 
1464270696 [140091939538688] DEBUG dserver/Keithley/rpastor Leaving DeviceImpl::state (attribute)

CASE 1 (coincides with emitting Heartbeat event) OCCURRENCE 2:

1464274066 [139868710455040] In get_monitor() rpastor/keithely2010/01, thread = 6, ctr = 0
1464274066 [139868710455040] DEBUG dserver/Keithley/rpastor DeviceImpl::state (attribute) arrived
1464274066 [139868710455040] DEBUG dserver/Keithley/rpastor SubDevDiag::get_associated_device() entering … 
1464274066 [139868710455040] DEBUG dserver/Keithley/rpastor SubDevDiag::get_associated_device() found : No associated device name!
1464274066 [139868710455040] DEBUG dserver/Keithley/rpastor SubDevDiag::set_associated_device() entering … 
1464274066 [139868710455040] INFO rpastor/keithely2010/01 2016-05-26 16:47:46.932663: In rpastor/keithely2010/01::always_executed_hook()
1464274066 [139868710455040] DEBUG rpastor/keithely2010/01 Entering in state_machine ((Keithley2010(rpastor/keithely2010/01),))
1464274066 [139868710455040] DEBUG rpastor/keithely2010/01 Exiting state_machine
1464274066 [139868710455040] INFO rpastor/keithely2010/01 2016-05-26 16:47:46.932833: After state_machine rpastor/keithely2010/01::always_executed_hook()
1464274066 [139868710455040] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 6, ctr = 1
1464274066 [139868710455040] DEBUG dserver/Keithley/rpastor owner_thread !!
1464274066 [139868361455360] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 8, ctr = 2
1464274066 [139868361455360] DEBUG dserver/Keithley/rpastor Thread 8: waiting !!
1464274069 [139868718847744] DEBUG dserver/Keithley/rpastor ———-> Time = 462274069,533521 Sending event heartbeat
1464274069 [139868718847744] DEBUG dserver/Keithley/rpastor ZmqEventSupplier::push_heartbeat_event(): delta time since last heartbeat 25
1464274069 [139868718847744] DEBUG dserver/Keithley/rpastor ZmqEventSupplier::push_heartbeat_event(): detected heartbeat event for tango://pc255.cells.es:10000/dserver/keithley/rpastor.heartbeat
1464274069 [139868718847744] DEBUG dserver/Keithley/rpastor ZmqEventSupplier::push_heartbeat_event(): delta _time 25
1464274069 [139868718847744] DEBUG dserver/Keithley/rpastor ZmqEventSupplier::push_heartbeat_event(): nb_event = 2
1464274069 [139868718847744] DEBUG dserver/Keithley/rpastor Event heartbeat, next wake_up at 462274078,534326
1464274069 [139868718847744] DEBUG dserver/Keithley/rpastor Sub device property storage, next wake_up at 462275844,004265
1464274069 [139868718847744] DEBUG dserver/Keithley/rpastor Sleep for : 9000
1464274069 [139868344669952] DEBUG dserver/Keithley/rpastor DeviceImpl::ping arrived
1464274069 [139868344669952] DEBUG dserver/Keithley/rpastor Leaving DeviceImpl::ping
1464274069 [139868344669952] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 10, ctr = 2
1464274069 [139868344669952] DEBUG dserver/Keithley/rpastor Thread 10: waiting !!
1464274070 [139868361455360] DEBUG dserver/Keithley/rpastor TIME OUT for thread 8
DevFailed[
DevError[
    desc = Not able to acquire serialization (dev, class or process) monitor
  origin = TangoMonitor::get_monitor
  reason = API_CommandTimedOut
severity = ERR]
]
1464274070 [139868710455040] DEBUG dserver/Keithley/rpastor Attribute::fire_change_event() entring …
1464274070 [139868710455040] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 2, thread = 6
1464274070 [139868710455040] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 6, ctr = 1
1464274070 [139868710455040] DEBUG dserver/Keithley/rpastor owner_thread !!
1464274070 [139868710455040] DEBUG dserver/Keithley/rpastor Attribute::fire_change_event() entring …
1464274070 [139868710455040] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 2, thread = 6
1464274070 [139868710455040] INFO rpastor/keithely2010/01 2016-05-26 16:47:50.147497: Out rpastor/keithely2010/01::always_executed_hook()
1464274070 [139868710455040] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 1, thread = 6
1464274070 [139868710455040] DEBUG dserver/Keithley/rpastor Signalling !
1464274070 [139868710455040] DEBUG dserver/Keithley/rpastor SubDevDiag::set_associated_device() entering … 
1464274070 [139868710455040] DEBUG dserver/Keithley/rpastor Leaving DeviceImpl::state (attribute)

CASE 2 (no Hearbeat emit):

1464272713 [140721105397504] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 16, ctr = 2
1464272713 [140721105397504] DEBUG dserver/Keithley/rpastor Thread 16: waiting !!
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor DeviceImpl::ping arrived
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor Leaving DeviceImpl::ping
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 8, ctr = 2
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor Thread 8: waiting !!
1464272716 [140721105397504] DEBUG dserver/Keithley/rpastor TIME OUT for thread 16
DevFailed[
DevError[
    desc = Not able to acquire serialization (dev, class or process) monitor
  origin = TangoMonitor::get_monitor
  reason = API_CommandTimedOut
severity = ERR]
]
1464272716 [140721113790208] DEBUG dserver/Keithley/rpastor Attribute::fire_change_event() entring …
1464272716 [140721113790208] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 2, thread = 6
1464272716 [140721113790208] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 6, ctr = 1
1464272716 [140721113790208] DEBUG dserver/Keithley/rpastor owner_thread !!
1464272716 [140721113790208] DEBUG dserver/Keithley/rpastor Attribute::fire_change_event() entring …
1464272716 [140721113790208] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 2, thread = 6
1464272716 [140721113790208] INFO rpastor/keithely2010/01 2016-05-26 16:25:16.446286: Out rpastor/keithely2010/01::always_executed_hook()
1464272716 [140721113790208] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 1, thread = 6
1464272716 [140721113790208] DEBUG dserver/Keithley/rpastor Signalling !
1464272716 [140721113790208] DEBUG dserver/Keithley/rpastor SubDevDiag::set_associated_device() entering … 
1464272716 [140721113790208] DEBUG dserver/Keithley/rpastor Leaving DeviceImpl::state (attribute)
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor DeviceImpl::state (attribute) arrived
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor SubDevDiag::get_associated_device() entering … 
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor SubDevDiag::get_associated_device() found : No associated device name!
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor SubDevDiag::set_associated_device() entering … 
1464272716 [140721097004800] INFO rpastor/keithely2010/01 2016-05-26 16:25:16.446446: In rpastor/keithely2010/01::always_executed_hook()
1464272716 [140721097004800] DEBUG rpastor/keithely2010/01 Entering in state_machine ((Keithley2010(rpastor/keithely2010/01),))
1464272716 [140721097004800] DEBUG rpastor/keithely2010/01 Exiting state_machine
1464272716 [140721097004800] INFO rpastor/keithely2010/01 2016-05-26 16:25:16.446530: After state_machine rpastor/keithely2010/01::always_executed_hook()
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 8, ctr = 1
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor owner_thread !!
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor Attribute::fire_change_event() entring …
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 2, thread = 8
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor In get_monitor() rpastor/keithely2010/01, thread = 8, ctr = 1
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor owner_thread !!
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor Attribute::fire_change_event() entring …
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 2, thread = 8
1464272716 [140721097004800] INFO rpastor/keithely2010/01 2016-05-26 16:25:16.446616: Out rpastor/keithely2010/01::always_executed_hook()
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor In rel_monitor() rpastor/keithely2010/01, ctr = 1, thread = 8
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor Signalling !
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor SubDevDiag::set_associated_device() entering … 
1464272716 [140721097004800] DEBUG dserver/Keithley/rpastor Leaving DeviceImpl::state (attribute)

We are using:
Tango 8.1.2 (with patches applied)
PyTango 8.1.1

If no one comments we will proceed in preparing an isolated scenario.

Many thanks in advance for any suggestions!

Hi Zibi,

Your case seems interesting. What could help even further is if you could generate a log file for your problem.
To do so in Jive select your DS admin device (dserver/xxx/yyy) and select the logging entry.
For the Logging level, set DEBUG, for the Logging target, set file and for the Logging RFT, set 20000.
Then restart your DS. In /tmp/tango-dserver/DS/instance/dserver_xxx_yyy you should now have a log file generated.
In fact the logging system will manage two log files in a round robbin way. The file size will be 20000 kB each.
Once you get your error, stop your DS and send us these files.

Ideally, if you could do this using Tango 9….

Cheers

Emmanuel
Hi Manu,

Thanks for the hint. Please see attached the log files, for both the admin device and the keithley device.
I see the logs a little bit different than the ones from my initial message. This may be due to the fact that now we have done the tests with Tango 9.1.0 and PyTango 9.1.0. Apart of that I attach you the output of the server around the time of the "Not able to acquire serialization" error (so you could find it easier in the log files…)


1464684463 [140515282249472] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 8, ctr = 0
1464684463 [140515282249472] DEBUG dserver/Keithley/monitor_serialization DeviceImpl::state (attribute) arrived
1464684463 [140515265464064] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 20, ctr = 1
1464684463 [140515265464064] DEBUG dserver/Keithley/monitor_serialization Thread 20: waiting !!
1464684463 [140515282249472] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::get_associated_device() entering … 
1464684463 [140515282249472] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::get_associated_device() found : No associated device name!
1464684463 [140515282249472] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::set_associated_device() entering … 
1464684464 [140515500812032] DEBUG dserver/Keithley/monitor_serialization ———-> Time = 462684464,455918 Sending event heartbeat
1464684464 [140515500812032] DEBUG dserver/Keithley/monitor_serialization ZmqEventSupplier::push_heartbeat_event(): delta time since last heartbeat 300
1464684464 [140515500812032] DEBUG dserver/Keithley/monitor_serialization ZmqEventSupplier::push_heartbeat_event(): detected heartbeat event for tango://controls04.cells.es:10000/d
server/keithley/monitor_serialization.heartbeat
1464684464 [140515500812032] DEBUG dserver/Keithley/monitor_serialization ZmqEventSupplier::push_heartbeat_event(): delta _time 300
1464684464 [140515500812032] DEBUG dserver/Keithley/monitor_serialization ZmqEventSupplier::push_heartbeat_event(): nb_event = 2
1464684464 [140515500812032] DEBUG dserver/Keithley/monitor_serialization Event heartbeat, next wake_up at 462684473,456568
1464684464 [140515500812032] DEBUG dserver/Keithley/monitor_serialization Sub device property storage, next wake_up at 462685964,631057
1464684464 [140515500812032] DEBUG dserver/Keithley/monitor_serialization Sleep for : 9000
1464684466 [140515273856768] DEBUG dserver/Keithley/monitor_serialization DeviceImpl::ping arrived
1464684466 [140515273856768] DEBUG dserver/Keithley/monitor_serialization Leaving DeviceImpl::ping
1464684466 [140515273856768] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 9, ctr = 1
1464684466 [140515273856768] DEBUG dserver/Keithley/monitor_serialization Thread 9: waiting !!
1464684466 [140515265464064] DEBUG dserver/Keithley/monitor_serialization TIME OUT for thread 20
 ———————-Event generated lost: DevFailed[
DevError[
    desc = Not able to acquire serialization (dev, class or process) monitor
  origin = TangoMonitor::get_monitor
  reason = API_CommandTimedOut
severity = ERR]
]
1464684466 [140515282249472] INFO test/rpastor/1 In test/rpastor/1::always_executed_hook()
1464684466 [140515282249472] DEBUG test/rpastor/1 Entering in state_machine ((Keithley2010(test/rpastor/1),))
1464684466 [140515282249472] DEBUG test/rpastor/1 Exiting state_machine
1464684466 [140515282249472] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 8, ctr = 1
1464684466 [140515282249472] DEBUG dserver/Keithley/monitor_serialization owner_thread !!
1464684466 [140515282249472] DEBUG dserver/Keithley/monitor_serialization Attribute::fire_change_event() entering …
1464684466 [140515282249472] DEBUG dserver/Keithley/monitor_serialization In rel_monitor() test/rpastor/1, ctr = 2, thread = 8
1464684466 [140515282249472] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 8, ctr = 1
1464684466 [140515282249472] DEBUG dserver/Keithley/monitor_serialization owner_thread !!
1464684466 [140515282249472] DEBUG dserver/Keithley/monitor_serialization Attribute::fire_change_event() entering …
1464684466 [140515282249472] DEBUG dserver/Keithley/monitor_serialization In rel_monitor() test/rpastor/1, ctr = 2, thread = 8
1464684466 [140515282249472] DEBUG dserver/Keithley/monitor_serialization In rel_monitor() test/rpastor/1, ctr = 1, thread = 8
1464684466 [140515282249472] DEBUG dserver/Keithley/monitor_serialization Signalling !
1464684466 [140515282249472] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::set_associated_device() entering … 
1464684466 [140515282249472] DEBUG dserver/Keithley/monitor_serialization Leaving DeviceImpl::state (attribute)

Please let me know if you need more information, or you want us to reproduce it in a reduced setup.

Cheers,
Zibi

Hi Zibi,

If I tell you where to add some more debug info, could you re-compile Tango C++ lib and re-run your test?

Manu
Hi Manu,

Sure, either post them here, or attach a patch file.

Thanks for you help!
Zibi
Re,

Well, if you could add in the file server/blackbox.cpp in the method insert_corba_attr(xx) a cout5 log before and after the mutex
lock/unlock at the method beginning and at the method end. Something like


cout5 << "Before taking lock" << endl;

Please, also check that in the Keithley class, the log message in the always_executed_hook() method is the first line of code
in this method.
This will not solve the problem (may be some more changes will be required later on) but this should help us to determine precisely in which method the thread executing the state() request is waiting.

Then, re-run your test with the modified lib and send me the log files.

Thank's for your help

Manu
Re,

Oups sorry, it's not cout5 but cout4.

Cheers

Manu
Hi Manu,

Again please find attached the log files, for both the admin device and the keithley device.

The following lines were added:
  • Before taking lock
  • After taking lock
  • Before releasing lock
  • After releasing lock

Also, the self.info_stream('In %s::always_executed_hook()' % self.get_name()) is the first line in the always_executed_hook method of the Keithley class.

And this is the output of the server around the error:


464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 6, ctr = 0
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization DeviceImpl::state (attribute) arrived
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::get_associated_device() entering … 
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::get_associated_device() found : No associated device name!
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::set_associated_device() entering … 
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization Before taking lock
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization After taking lock
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization Before releasing lock
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization After releasing lock
1464708083 [140488084489984] INFO test/rpastor/1 In test/rpastor/1::always_executed_hook()
1464708083 [140488084489984] DEBUG test/rpastor/1 Entering in state_machine ((Keithley2010(test/rpastor/1),))
1464708083 [140488084489984] DEBUG test/rpastor/1 Exiting state_machine
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 6, ctr = 1
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization owner_thread !!
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization Attribute::fire_change_event() entering …
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization In rel_monitor() test/rpastor/1, ctr = 2, thread = 6
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 6, ctr = 1
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization owner_thread !!
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization Attribute::fire_change_event() entering …
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization In rel_monitor() test/rpastor/1, ctr = 2, thread = 6
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization In rel_monitor() test/rpastor/1, ctr = 1, thread = 6
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization Signalling !
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::set_associated_device() entering … 
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization Leaving DeviceImpl::state (attribute)
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 6, ctr = 0
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization DeviceImpl::state (attribute) arrived
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::get_associated_device() entering … 
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::get_associated_device() found : No associated device name!
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::set_associated_device() entering … 
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization Before taking lock
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization After taking lock
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization Before releasing lock
1464708083 [140488084489984] DEBUG dserver/Keithley/monitor_serialization After releasing lock
56
1464708083 [140488067704576] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 14, ctr = 1
1464708083 [140488067704576] DEBUG dserver/Keithley/monitor_serialization Thread 14: waiting !!
1464708086 [140488076097280] DEBUG dserver/Keithley/monitor_serialization DeviceImpl::ping arrived
1464708086 [140488076097280] DEBUG dserver/Keithley/monitor_serialization Leaving DeviceImpl::ping
1464708086 [140488076097280] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 7, ctr = 1
1464708086 [140488076097280] DEBUG dserver/Keithley/monitor_serialization Thread 7: waiting !!
1464708087 [140488067704576] DEBUG dserver/Keithley/monitor_serialization TIME OUT for thread 14
 ———————-Event generated lost: DevFailed[
DevError[
    desc = Not able to acquire serialization (dev, class or process) monitor
  origin = TangoMonitor::get_monitor
  reason = API_CommandTimedOut
severity = ERR]
]
1464708087 [140488084489984] INFO test/rpastor/1 In test/rpastor/1::always_executed_hook()
1464708087 [140488084489984] DEBUG test/rpastor/1 Entering in state_machine ((Keithley2010(test/rpastor/1),))
1464708087 [140488084489984] DEBUG test/rpastor/1 Exiting state_machine
1464708087 [140488084489984] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 6, ctr = 1
1464708087 [140488084489984] DEBUG dserver/Keithley/monitor_serialization owner_thread !!
1464708087 [140488084489984] DEBUG dserver/Keithley/monitor_serialization Attribute::fire_change_event() entering …
1464708087 [140488084489984] DEBUG dserver/Keithley/monitor_serialization In rel_monitor() test/rpastor/1, ctr = 2, thread = 6
1464708087 [140488084489984] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 6, ctr = 1
1464708087 [140488084489984] DEBUG dserver/Keithley/monitor_serialization owner_thread !!
1464708087 [140488084489984] DEBUG dserver/Keithley/monitor_serialization Attribute::fire_change_event() entering …
1464708087 [140488084489984] DEBUG dserver/Keithley/monitor_serialization In rel_monitor() test/rpastor/1, ctr = 2, thread = 6
1464708087 [140488084489984] DEBUG dserver/Keithley/monitor_serialization In rel_monitor() test/rpastor/1, ctr = 1, thread = 6
1464708087 [140488084489984] DEBUG dserver/Keithley/monitor_serialization Signalling !
1464708087 [140488084489984] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::set_associated_device() entering … 
1464708087 [140488084489984] DEBUG dserver/Keithley/monitor_serialization Leaving DeviceImpl::state (attribute)
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization DeviceImpl::state (attribute) arrived
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::get_associated_device() entering … 
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::get_associated_device() found : No associated device name!
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::set_associated_device() entering … 
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization Before taking lock
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization After taking lock
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization Before releasing lock
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization After releasing lock
1464708087 [140488076097280] INFO test/rpastor/1 In test/rpastor/1::always_executed_hook()
1464708087 [140488076097280] DEBUG test/rpastor/1 Entering in state_machine ((Keithley2010(test/rpastor/1),))
1464708087 [140488076097280] DEBUG test/rpastor/1 Exiting state_machine
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 7, ctr = 1
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization owner_thread !!
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization Attribute::fire_change_event() entering …
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization In rel_monitor() test/rpastor/1, ctr = 2, thread = 7
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization In get_monitor() test/rpastor/1, thread = 7, ctr = 1
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization owner_thread !!
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization Attribute::fire_change_event() entering …
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization In rel_monitor() test/rpastor/1, ctr = 2, thread = 7
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization In rel_monitor() test/rpastor/1, ctr = 1, thread = 7
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization Signalling !
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization SubDevDiag::set_associated_device() entering … 
1464708087 [140488076097280] DEBUG dserver/Keithley/monitor_serialization Leaving DeviceImpl::state (attribute)

If you need anything else, just let me know.

Cheers,
Zibi
Hi Zibi,

The mystery is still there… We see code executed correctly in the C++ library but then the thread stops….

Could you send us your DS code?
Do you also confirm that some client is requiring the device state every 10 mS (or closed to 10 mS)?

Thank's in advance

Manu
Hi Manu,

Yes, the acquisition thread (in the client) is verifying the State every 10 mS.

My colleague will shortly attach the DS code.

Cheers,
Zibi
 
Register or login to create to post a reply.