Monitor timeout due to random lags in State command
---
Hi Manu,

First of all, thanks for your help. I attach the DS code as you requested; it is a little bit messy, but I hope it will be useful (device.py). In case you need the whole module, I also attach it as a tar.gz (keithley.tar.gz). If you require anything else, let us know. Again, many thanks.

Roger
---
Hi again Manu,

We have written some simple scripts (server MonitorLockSerializationServer.py and client MonitorLockSerializationClient) to reproduce the problem.

The server has three attributes: one that generates data_ready events, one that generates change events instead, and a sleep_time variable that, guess what, we use to execute sleeps (quite unexpected). Both event-generating attributes start pushing when written: their write method starts a thread that generates events, waiting between each one. To stop the event generation we have implemented a StopThread command that kills 'em all. Finally, but not less important, we have implemented a command that sleeps (CommandSleep). The client simply receives a parameter and, depending on it, starts one of the event generators on the server, subscribes to that attribute and, finally, enters a loop that executes a command_inout on the device (the CommandSleep). A sketch of both scripts is shown below.

To build the setup:

Start the server. Open a console and define the device as follows:

$ tango_admin --add-server MonitorLockSerializationServer/LockTest MonitorLockSerialization test/monitor_lock/1

Once the device has been defined, start it from the directory where its file is located:

$ python MonitorLockSerializationServer.py LockTest -v4

Start the client. From another console on a host using the same Tango database, run the script from its directory:

$ ./MonitorLockSerializationClient ChangeEvent

to test the setup with change events, or

$ ./MonitorLockSerializationClient DataReadyEvent

to test it with data_ready events instead.

If we run the setup with the ChangeEvent parameter, no error happens and the system works as expected. If we run it with the DataReadyEvent parameter, after some loops (10 exactly, since CommandSleep executes 10 times faster than the data_ready pushes) the client crashes with a timeout on CommandSleep and the server raises the "Not able to acquire serialization monitor" error. Playing with the client's sleep times we can force the problem to happen on the first loop (e.g. by increasing the sleep time of CommandSleep), but as it is now the error happens quite fast either way.

Cheers,
Roger
---
Hi,

Thanks for your minimal test case. I have reproduced the problem here and will look into it ASAP.

Cheers,
Manu
---
Re,

After analysing with Tiago what happens, we found the cause of the problem. It is a PyTango bug related to Python GIL management: change events and data_ready events are not handled the same way in PyTango. The push path for change_event takes care of the GIL, but the one for data_ready_event does not, and this is the reason for the problem. Tiago will take care of it.

Have a nice evening,
Tiago + Manu
---
Thanks to both! We will wait for the fix in PyTango, then.
---
Hi Zibi,

Would you be so kind as to file a bug report on the PyTango GitHub? Thanks in advance.

Tiago