Out Of Memory - ZeroMQ - PyTango - Windows

Hello,

Last year I joined an astronomy project named ExTrA (led by IPAG CNRS) to contribute to the final validation and a few improvements to the device control layer. It is based on Tango as the main framework, the device servers (DS) being basically telescopes, cameras, spectrometer, mounts, domes, etc. The DS are distributed over 9 computers, a mix of Linux and Windows. All have been implemented in Python, relying on PyTango.

Now here is the issue. Several DS fail with what looks like a memory leak after approximately 48 hours of execution. The error message comes from the ZeroMQ library, more precisely from a message queue mechanism (please see the attachment). The most important point is that these DS fail only when they run on a Windows system; they run correctly on Linux.

Another very important point is that the ExTrA project has been running Tango 8.1.2 for years, and no upgrade was planned until now.

So, does this error message sound familiar to anyone?

Thanks for any tip/help !
Stephane
Hi Stephane,

A memory leak was fixed recently in the DbDatum destructor.
The leak only affected Tango applications running on Windows and built with an MSVC compiler version <= 10 (Visual Studio 10).

You can get details about the patch on this Pull Request:
https://github.com/tango-controls/cppTango/pull/488

The fix is available in cppTango 9.3.3, which was pre-released last week.

So, if your code is compiled with an old MSVC++ version and you deal a lot with Tango properties, this might be the root cause of the memory leak you are observing.

Kind regards,
Reynald
Rosenberg's Law: Software is easy to make, except when you want it to do something new.
Corollary: The only software that's worth making is software that does something new.
By the way, the error message coming from zeromq does not necessarily mean the memory leak is due to zeromq code.
It simply means that a method in zeromq code tried to allocate some memory and got an error because there was no free memory left on the computer.
Hi Reynald,

Thank you very much for your answer. Our DS are all written in Python and based on PyTango. I guess your assumption still holds if PyTango itself somehow relies on MSVC 9, am I correct? If so, could I try to rebuild it with the patch?

By the way, yes you're absolutely right about ZeroMQ, the message doesn't mean that the issue comes from this lib !!

Thanks again
Stephane
grizzli
Our DS are all written in Python and based on PyTango. I guess your assumption still holds if PyTango itself somehow relies on MSVC 9, am I correct?

PyTango uses the Tango C++ library, so you could be affected by this bug if your version of PyTango relies on a Tango C++ library built with an old MSVC version.
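
One quick way to get a hint about the compiler involved is to ask the Python interpreter which compiler it was built with, since binary extension modules on Windows are usually built with the same MSVC version. This is only a sketch under assumptions: `parse_msc_version` and `is_in_affected_range` are hypothetical helper names, and the interpreter's compiler is a hint, not proof of how the Tango C++ library shipped with your PyTango was built. The threshold 1600 is the MSC number reported by Visual Studio 2010 (MSVC 10).

```python
import platform
import re


def parse_msc_version(compiler_string):
    """Return the MSC version number (e.g. 1500) or None if not an MSVC build."""
    match = re.search(r"MSC v\.(\d+)", compiler_string)
    return int(match.group(1)) if match else None


def is_in_affected_range(msc_version):
    """True for MSVC <= 10, i.e. MSC v.1600 (Visual Studio 2010) or older."""
    return msc_version is not None and msc_version <= 1600


if __name__ == "__main__":
    # On Windows this looks like "MSC v.1500 64 bit (AMD64)".
    compiler = platform.python_compiler()
    msc = parse_msc_version(compiler)
    print("Interpreter compiler: %s" % compiler)
    print("MSVC <= 10 build: %s" % is_in_affected_range(msc))
```

Running this with the same interpreter that runs the device servers on a Windows host shows whether that interpreter falls in the MSVC <= 10 range; on Linux the check simply reports False.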
I got it. Thanks !
Hi Reynald,

Given the situation above, we are wondering whether we should upgrade to Tango 9 rather than patch Tango 8.1.2. May I ask you two more questions:

- Is there anything tricky about upgrading? I was thinking about the database: does it require a migration process, for instance?

- Is it possible to upgrade only some of our hosts? Say, could we keep the Linux hosts running Tango 8.1.2 and migrate the Windows hosts to Tango 9?

I had a look around but did not find answers to these.

Many thanks again
Stephane
grizzli
- Is there anything tricky about upgrading? I was thinking about the database: does it require a migration process, for instance?

Yes, the database schema evolved so you will have to upgrade it.
In the source distribution (9.2.5a), in the README, there is a section "UPDATING FROM A TANGO 8 DATABASE".
Here is a copy of what it says:

Tango 9.1 requires an update of the database. This is one update of
the stored procedure (release 1.11), some new commands in the list of allowed
commands for the Database class (for Tango Access Control system) and
some new tables related to the new pipe feature.

To update your database, follow these instructions:

a - Stop your control system and your database server(s)

b - Backup your database (Recommended, not mandatory)

c - Cd to the <install_dir>/share/tango/db directory

d - Run the update script:
mysql -u[user] -p[password] < ./update_db8.sql

e - Restart your database server(s)
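
Steps b and d above can be scripted. The following is a minimal Python sketch, not an official Tango tool: the helper names, the database name `tango` (the usual default, but check yours) and the backup file name are assumptions. It only wraps the standard `mysqldump` and `mysql` command-line clients.

```python
import subprocess


def build_backup_command(user, password, database="tango"):
    """Command for step b: dump the database (assumed to be named 'tango').

    --routines also dumps stored procedures, which mysqldump skips by default.
    """
    return ["mysqldump", "--routines", "-u" + user, "-p" + password, database]


def build_update_command(user, password):
    """Command for step d: the SQL script is fed to mysql on stdin."""
    return ["mysql", "-u" + user, "-p" + password]


def backup_and_update(user, password, script="update_db8.sql",
                      backup_file="tango_backup.sql"):
    # Step b: write the dump to a file before touching the schema.
    with open(backup_file, "wb") as out:
        subprocess.check_call(build_backup_command(user, password), stdout=out)
    # Step d: equivalent to `mysql -u[user] -p[password] < ./update_db8.sql`.
    with open(script, "rb") as sql:
        subprocess.check_call(build_update_command(user, password), stdin=sql)
```

Run `backup_and_update` from the `<install_dir>/share/tango/db` directory, with the control system stopped as described in step a.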


grizzli
- Is it possible to upgrade only some of our hosts? Say, could we keep the Linux hosts running Tango 8.1.2 and migrate the Windows hosts to Tango 9?

Yes, this is possible and I would recommend it even during a full migration. You can migrate parts of your system progressively.
Even on the same host, you can have device servers running with Tango 8 and device servers running with Tango 9, and they can talk with each other.
The first element to migrate is the TangoDatabase server (with the corresponding Tango Database schema update I mentioned above).
Hi,

Please be aware that you might encounter some difficulties if you run the update_db8.sql script from the distribution against a recent MySQL version.
I just tried it on a VM with MySQL 5.7.

I got an error:
ERROR 1067 (42000) at line 7: Invalid default value for 'accessed'

If you are using a recent version of MySQL, the work-around is to update this update_db8.sql file by replacing:
"timestamp NOT NULL," strings with "timestamp NOT NULL default '2000-01-01 00:00:00',"

On Linux, you can do that with this bash command:
sed -i "s/timestamp NOT NULL,/timestamp NOT NULL default '2000-01-01 00:00:00',/g" update_db8.sql
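
Where sed is not available (on a Windows host, for example), the same substitution can be done with a few lines of Python. This is a sketch equivalent to the sed command above; the function names are hypothetical.

```python
OLD = "timestamp NOT NULL,"
NEW = "timestamp NOT NULL default '2000-01-01 00:00:00',"


def patch_timestamp_defaults(sql_text):
    """Apply the same substitution as the sed command to the script contents."""
    return sql_text.replace(OLD, NEW)


def patch_file(path="update_db8.sql"):
    """Rewrite the script in place (copy it first to keep the original)."""
    with open(path, "r") as handle:
        original = handle.read()
    with open(path, "w") as handle:
        handle.write(patch_timestamp_defaults(original))
```

Like the sed command, the replacement is safe to run twice: once a default has been added, the line no longer matches the search string.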