No heartbeat error on Event Subscription

I have experienced issues with ZMQ event filtering.
The subscriber and publisher strings should match exactly the one used to start DataBaseds.
In our case the filtering strings are IP addresses, given either as a name ("localhost") or in aaa.bbb.ccc.dd dotted notation. For ZMQ these are different strings.
Very good but not the problem obviously.

I have discussed with the experts and there could be an issue with the TANGO_HOST you use. If the TANGO_HOST for the client and the server are not exactly the same then you won't receive events either. They must be the same text, i.e. both must be set to the same one of these possible values: PC5-HP:20000, PC5-HP.ncra.tifr.res.in:20000 or 192.168.118.210:20000. If you are starting the server or the client from an IDE (Eclipse/Netbeans/IntelliJ), make sure the environment variable for TANGO_HOST is identical.

The reason for this is the event system uses TANGO_HOST for zmq to filter out events. This means if the TANGO_HOST is not the same then the events will be filtered by the server and never be sent to the client.
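To illustrate why the text must match exactly, here is a minimal Python sketch that mimics ZeroMQ subscription filtering, which is a byte-exact prefix match. It does not use TANGO or ZeroMQ itself; the helper names are hypothetical, and the topic layout (tango://host:port/device.event) is an assumption based on the event names quoted further down in this thread.

```python
# Sketch only: mimics ZeroMQ SUB-socket filtering, which is a byte-exact
# prefix match. The topic layout is an assumption, not the TANGO source.


def event_topic(tango_host: str, device: str, event: str) -> bytes:
    """Build a hypothetical event topic from a TANGO_HOST value."""
    return f"tango://{tango_host}/{device}.{event}".encode()


def zmq_would_deliver(subscription: bytes, published: bytes) -> bool:
    """A message passes the filter only if it starts with the subscribed prefix."""
    return published.startswith(subscription)


DEVICE = "dserver/jdeviceforevent/jdevt1"

# What the server publishes, assuming it uses the fully qualified name.
server_topic = event_topic("PC5-HP.ncra.tifr.res.in:20000", DEVICE, "heartbeat")

# Three possible client-side TANGO_HOST values for the same machine.
same_host = event_topic("PC5-HP.ncra.tifr.res.in:20000", DEVICE, "heartbeat")
short_host = event_topic("PC5-HP:20000", DEVICE, "heartbeat")
ip_host = event_topic("192.168.118.210:20000", DEVICE, "heartbeat")

for label, topic in [("same", same_host), ("short", short_host), ("ip", ip_host)]:
    print(label, zmq_would_deliver(topic, server_topic))
```

Only the identical string passes the filter, even though all three values point at the same machine.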

If you are already in the above case, i.e. TANGO_HOST identical, then the next step is to analyse the network traffic. The best way to do this is with wireshark, but you need to run your client on a different machine. Can you do this, i.e. install wireshark (https://www.wireshark.org) and run the client on a different machine? Once this is done you need to dump the traffic between the two machines while running the client. This will tell us exactly what is being sent and hopefully why it is not working. The dump needs to be done on both machines. Let us know once you have wireshark installed and the client running on another machine. We can then help you use it to get the log.

Andy
Hi Vatsal,

can you send me the exact settings for your TANGO_HOST for the Database server, your JDeviceEventTest device server and your client.

I noticed a strange behavior on my linux system. If I set the TANGO_HOST for both the device server and client to 127.0.0.1 then events don't work. If I set them to the ip name of my pc then events work. The TANGO_HOST for my database server is set to 127.0.0.1.

I wonder if this is not your case.

Andy

Dear Andy,

I have installed wireshark on two different machines. I have also modified my device server and client code to print the value of the TANGO_HOST environment variable.
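For readers who want to do the same check, a minimal Python sketch of printing TANGO_HOST might look like the following. The original server and client in this thread are Java, so this is an equivalent illustration rather than the code that was attached; the function name is hypothetical.

```python
import os
from typing import Mapping


def describe_tango_host(env: Mapping[str, str] = os.environ) -> str:
    """Report TANGO_HOST exactly as the process sees it."""
    value = env.get("TANGO_HOST")
    if value is None:
        return "TANGO_HOST is not set"
    # repr() makes stray whitespace and letter case easy to spot.
    return f"TANGO_HOST={value!r}"


print(describe_tango_host())
```

Running this in the same environment (shell or IDE run configuration) as the server and the client shows whether the two processes really see the same text.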

Consider the system on which the device server is running as System A and the system on which the client is running as System B. My tango database device server is also running on System A. Details of both systems are as follows:

System A
    Hostname - PC5-HP
    IP Address - 192.168.118.210
    TANGO_HOST - PC5-HP.ncra.tifr.res.in

System B
    Hostname - Monali-PC
    IP Address - 192.168.118.207
    TANGO_HOST - PC5-HP.ncra.tifr.res.in

I've also added the following lines to the hosts file of both systems:

192.168.118.210		PC5-HP		PC5-HP.ncra.tifr.res.in
192.168.118.207		Monali-PC	Monali-PC.ncra.tifr.res.in

From both the machines I'm able to successfully ping
    Monali-PC
    Monali-PC.ncra.tifr.res.in
    PC5-HP
    PC5-HP.ncra.tifr.res.in
    localhost

I've attached the snapshots related to the ping statistics.

I'm able to successfully open the ATKPanel from Jive for the JTangoDevice Server on both machines. I've attached snapshots of Jive and ATKPanel from both machines.

I have also attached the wireshark dumps from both machines and the client and server logs. I hope they help you in understanding the issue.

Again attaching the slightly modified device server and client code.

Regards,
Vatsal Trivedi
Hi Vatsal,

After some analysis of the wireshark dump trace gathered from Monali and the server log, it seems to be a case problem.
For the heartbeat event, on the client side, the event system waits for event with name tango://pc5-hp:20000/dserver/jdeviceforevent/jdevt1.heartbeat while the server is sending tango://PC5-HP:20000/dserver/jdeviceforevent/jdevt1.heartbeat

It seems that there is a case problem.
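A small Python sketch of this comparison, using the two event names quoted above (the helper functions are hypothetical and only illustrate the diagnosis):

```python
def differs_only_by_case(expected: str, received: str) -> bool:
    """True when the strings are unequal but match once case is ignored."""
    return expected != received and expected.casefold() == received.casefold()


def first_difference(a: str, b: str) -> int:
    """Index of the first differing character, or -1 if none is found."""
    for index, (left, right) in enumerate(zip(a, b)):
        if left != right:
            return index
    return -1


client_waits_for = "tango://pc5-hp:20000/dserver/jdeviceforevent/jdevt1.heartbeat"
server_sends = "tango://PC5-HP:20000/dserver/jdeviceforevent/jdevt1.heartbeat"

print(differs_only_by_case(client_waits_for, server_sends))
print(first_difference(client_waits_for, server_sends))
```

The two names differ only in the host part, which is enough for an exact-match filter to drop the event.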

Could you simply try the same test (server on pc5-hp and client on Monali) with the TANGO_HOST environment variable set simply to pc5-hp:20000 on both sides?

Anyway, I have the feeling that a lowercase is missing on the server side!

Hoping this helps

Emmanuel
Hi Vatsal,

any news? Did you try with lowercase?

We have looked more into this issue and found that some cases are not handled well by TANGO events, e.g. when DNS is not configured and/or on a laptop with a dynamic IP number and name provided by roaming wifi. This will be fixed in the future. For now lowercase seems to be your answer.

I have a question for you. Can you tell me how you managed to get the TRACE output from the Java device server or send me your logback.xml file.

Kind regards

Andy

Dear Andy,

Sorry for not responding promptly. I tried setting the TANGO_HOST variable to "pc5-hp:20000" on both machines, but it didn't resolve the issue.

I have attached the zip files containing relevant snapshots and log files from both the systems.

Andy wrote:
"I have a question for you. Can you tell me how you managed to get the TRACE output from the Java device server or send me your logback.xml file."

To generate the log from the device server, I specify the logging level (-v6) as a program argument along with the instance name. Then I redirect the console output to a file (this can easily be done in Eclipse and IntelliJ). I don't use the logback.xml file or the "logging_target" available in the TANGO Logging service.

Regards,
Vatsal Trivedi
Dear Vatsal,

thank you for your reply and the answer about the logging. We appreciate your patience in this affair. You have indeed found a problem in the implementation which we will resolve for good in TANGO 9.2 but for now we need some more of your patience and help to make events work for you !

The problem is clear to us now. The reason is the use of lower and upper case letters in the host names and the fact that the current implementation of TANGO events does not use the TANGO_HOST to construct the full device name which is then used to filter events. Am I being clear or should I explain this in more detail?

Anyway what we need you to do is to change the host name from upper to lower case. Apparently this is not so easy in Windows. I suggest you do this in the hosts file and the computer name. Then reboot. Then restart the database device server. Check that the name has changed with "ipconfig /all" AND check that the database server returns the server name in lowercase by executing the DbGetCSDbServerList command on the sys/database/2 device. Send us the output if you are not sure.
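As a quick complement to "ipconfig /all", a Python sketch like the one below can confirm that the name the operating system reports contains no upper-case letters. It relies only on the standard library; the helper name is hypothetical, and the name returned by socket.gethostname() is not guaranteed to be the same one the database server returns, so the DbGetCSDbServerList check is still needed.

```python
import socket


def is_all_lowercase(name: str) -> bool:
    """True when the name contains no upper-case letters."""
    return name == name.lower()


# Host name as reported by the operating system to this process.
hostname = socket.gethostname()
status = "OK" if is_all_lowercase(hostname) else "contains upper-case letters"
print(hostname, status)
```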

Then you can start the device server and device name. To be safe you can use TANGO_HOST=pc5-hp:20000 and the fully qualified name. But apparently the TANGO_HOST is not used by the Java implementation.

Keep us informed.

Kind regards

Andy + Manu

Dear Andy,

As per your suggestion I have changed the hostname of both machines. The hostname of the server machine is now "stango" and that of the client machine is "ctango".

It is working now. I don't get any issues on the client end.

Still, I'll try using different types of events and will confirm. If this is true, then the fact that the hostname of the system must be in lowercase in order for events to work is quite strange.

Hope you would be able to find some workaround to it.

If I encounter any issue, I'll keep you posted.

I am grateful to all the members of the community who helped me in resolving the issue.

Regards,
Vatsal
Dear Vatsal,

this is excellent news! Thank you for your patience and perseverance. As mentioned in my previous post we have identified the problem in the code and we will fix this (probably in V9.2). You were unfortunately the first person to encounter this bug, but it has helped us solve it, so neither you nor we have spent our time in vain!

I hope you encounter fewer problems with TANGO in the future!

Andy
 