Problems with Tango DB as a service in GitLab.com CI

Hi,

In the Sardana project we are refactoring our GitLab CI pipelines and plan to use Tango DB as a service. We have stumbled across the following issue:

When we try to start a Tango DS (Sardana Pool or MacroServer) in a Docker container in GitLab CI, the server startup process initially stays idle for exactly 30 s. Startup then proceeds, but most of the time the server is not usable - it looks like it does not correctly fetch the list of devices that belong to the server from the Tango DB.

Both the mysql and Tango Databaseds containers run as services. We tried both the tango-controls and SKA images, but the behavior is the same. At first glance these services are healthy, e.g. we can execute queries or define a DS in the Tango DB.

Do you know what could cause:
- a 30 s idle time (timeout?) on server startup
- the DS fetching an incomplete list of devices from the Tango DB on server startup

Or are you aware of any GitLab.com hosted project that uses a Tango DB service in its GitLab CI?

Just for reference, here you can find our GitLab CI file: https://gitlab.com/sardana-org/sardana/-/blob/b74f1e1983ca7e69899b5d1e6215b1ca0ba81457/.gitlab-ci.yml

Thanks in advance for your help!
Zibi
Hi Zibi,

zreszela wrote:
Do you know what could cause:
- a 30 s idle time (timeout?) on server startup

The 30 s idle time might come from the following line, which waits with a timeout of 30 seconds for the MySQL server to be ready: https://gitlab.com/tango-controls/docker/tango-db/-/blob/main/Dockerfile#L40
In your use case, that means waiting for the port gitlab-runner-tango-mysql:3306 to become available.

Could it be that the mysql Docker container is taking more than 30 seconds to start up the MySQL server?
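
To make the idea of "waiting for a port with a 30 second timeout" concrete, here is a minimal sketch in Python. It is not the mechanism used in the linked Dockerfile, just an illustration of the same kind of wait; the function name `wait_for_port` and its parameters are made up for this example.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            # Cap each attempt by the time that is left, so the overall timeout holds.
            # Note: name resolution of `host` is not bounded by this socket timeout.
            with socket.create_connection((host, port), timeout=remaining):
                return True
        except OSError:
            # Port not open yet (or host unreachable): pause briefly and retry.
            time.sleep(min(interval, max(0.0, deadline - time.monotonic())))
```

Called as `wait_for_port("gitlab-runner-tango-mysql", 3306)`, it would return False if MySQL needed more than 30 seconds to open its port, which is the situation asked about above.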

Cheers,
Reynald
Thanks Reynald for your fast reply and for the hint!

Unfortunately, I don't think this is the problem.

My test fixtures configure the Tango DB before starting the server, e.g. they successfully call tango.Database.add_device().
Also, the Database dserver proxy responds to ping in 60 ms. So I believe that the Tango DB is already up.

In the subsequent steps the fixture tries to start the server and it logs:

MainThread INFO 2022-11-09 12:18:59,220 TaurusRootLogger: Started at 2022-11-09 12:18:29.157415
MainThread INFO 2022-11-09 12:18:59,220 TaurusRootLogger: Log is being stored in /tmp/tango-mambauser/MacroServer/unittest1_1/log.txt

Note the 12:18:29 and 12:18:59 timestamps, which are almost exactly 30 s apart.
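
For anyone who wants to check the gap, a small sketch that parses the two timestamps copied from the log lines above (the variable names are only for illustration):

```python
from datetime import datetime

# Values copied from the two log lines above.
started_at = datetime.strptime("2022-11-09 12:18:29.157415", "%Y-%m-%d %H:%M:%S.%f")
first_log = datetime.strptime("2022-11-09 12:18:59,220", "%Y-%m-%d %H:%M:%S,%f")

# Time between the recorded start and the first line written to the log.
gap_seconds = (first_log - started_at).total_seconds()
print(f"gap between start and first log line: {gap_seconds:.3f} s")
# prints: gap between start and first log line: 30.063 s
```

So the idle period is about 30.06 s, which looks much more like a fixed timeout than like slow startup work.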

Could it be that the tango.Util constructor or tango.Util.server_init() somehow hangs for these 30 s?
Hi,

I just wanted to give you an update on this.

We tried to run it on a local GitLab instance, and everything works well - thanks Benjamin!

In the end, it may not be the fault of the Tango DB container. I could start the servers correctly. Well, they start in 50 s while they should start in approx. 5 s - there is always this initial 30 s idle period. Then, during the execution of the tests, I see that the servers randomly go idle for different periods of time, but always multiples of 10 s: sometimes 10 s, sometimes 20 s. Then we get `API_EventTimeout`.

I will keep you posted if I need more help from the Tango experts :)

Thanks!
In the end it turned out to be a GitLab CI/Runner issue. The issue eventually became reproducible on our local GitLab instances when using VM Runners. The requests to get the FQDN of the host where the Tango DataBaseds runs were taking 10 s. We worked around it by using the IP address in the TANGO_HOST environment variable.
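
In case it helps someone diagnose the same thing, here is a rough sketch of how the slow lookup could be timed and how an IP-based TANGO_HOST value could be built. It uses only the Python standard library; `socket.getfqdn` is just a stand-in for whatever lookup Tango performs internally, and the helper names `timed` and `tango_host_with_ip` are made up for this example.

```python
import socket
import time


def timed(func, *args):
    """Call func(*args) and return (result, elapsed_seconds)."""
    start = time.monotonic()
    result = func(*args)
    return result, time.monotonic() - start


def tango_host_with_ip(tango_host: str) -> str:
    """Turn 'name:port' into 'ip:port' so clients do not need to resolve the name."""
    name, _, port = tango_host.rpartition(":")
    if not name or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {tango_host!r}")
    # gethostbyname returns an IPv4 address; an address passed in is returned unchanged.
    return f"{socket.gethostbyname(name)}:{port}"
```

For example, `timed(socket.getfqdn, "tango-db")` shows how long the FQDN lookup takes on a given runner, and `tango_host_with_ip("tango-db:10000")` gives a value that can be exported as TANGO_HOST before the servers start. The service name `tango-db` and port 10000 are assumptions here, not values taken from our pipeline.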

You can find more details in https://gitlab.com/sardana-org/sardana/-/merge_requests/1831.

Many thanks to Benjamin from MAX IV for his help!
 