Scalability

Dear all,

I have a question regarding the general consensus on designing a Tango Controls topology that makes the control system scalable. I am aware (correct me if I am wrong) that it is possible to have many Tango hosts on a network (with a dedicated server for each host). I am also aware that one could use any one of these host nodes to pass messages to the Starter devices on all other hosts if needed (as if having central control). That should make hosting very scalable, since we could add more hosts on the fly if needed.

However, I see that all these hosts would still be connecting to a single central Tango device database. This is not a problem as far as storing device configuration information is concerned. However, I was wondering if anybody could suggest a good topology for high-rate data ingestion from all these hosts (actual science data from, say, hundreds of FPGA boards) that is being streamed upstream from all the devices in the network. A single MySQL archive could not possibly handle all the traffic. What are the possible options?

Any help or comment is appreciated. Thanks.
Dr Andrea DeMarco, BSc (Hons) (Melita), MSc (Melita), DPhil (UEA)
Lecturer | Researcher
Department of Physics
Institute of Space Sciences and Astronomy

Room 220, Maths and Physics Building
University of Malta, Msida MSD2080, MALTA
drea
Dear all,

I have a question regarding the general consensus on designing a Tango Controls topology that makes the control system scalable. I am aware (correct me if I am wrong) that it is possible to have many Tango hosts on a network (with a dedicated server for each host).

Dear Andrea,

The terminology you are using is quite confusing because a TANGO HOST in the TANGO world represents a host where there is a Tango database server running. By extension, a TANGO HOST represents the whole control system defined in this Tango database (all the Tango devices and Tango servers defined in this Tango database). All the Tango devices defined for a specific TANGO HOST can be deployed on many different hosts.
It is possible to define several TANGO HOSTS in the same network, and in this case it is even possible for devices belonging to one TANGO HOST (control system) to communicate with devices belonging to another TANGO HOST (control system).

drea
I am also aware that one could use any one of these host nodes to pass messages to the Starter devices on all other hosts if needed (as if having central control). That should make hosting very scalable, since we could add more hosts on the fly if needed.

Typically you will have one Starter device per host (computer node). The goal of the Starter device is to simplify the administration of the Tango servers. You can then use a graphical user interface such as Astor to get an overview of the deployed Tango servers belonging to your control system, grouped by host (computer).
The Starter devices are here to help you to remotely start/stop TANGO servers and to monitor the status of the Tango servers (running/stopped) on the different computers.
You can add new hosts on the fly if needed, there is no problem.

drea
However, I see that all these hosts would still be connecting to a single central Tango device database. This is not a problem as far as storing device configuration information.

They will connect to the TANGO HOST where the MySQL Tango Database is running, which is used to store the configuration of your control system (definition of Tango devices, Tango servers, …).
The Tango database is used mainly when a Tango device server is starting, as well as when a client tries to connect to a Tango device. Among other things, it is used to store the information needed to establish the connection between Tango clients and Tango devices.
Once the communication is established, the database is not used, unless there is a need to change something in the Tango configuration.

drea
However I was wondering if anybody can suggest a good topology for high-rate data ingestion from all these hosts (actual science data from say, hundreds of FPGA boards) that is being upstreamed from all the devices in the network. A single MySQL archive could not possibly handle all the traffic. What are the possible options?

Could you please be a bit more precise on what you would like to achieve?
What do you want to do with the data from the hundreds of FPGAs?
As you wrote it is science data, I guess you would like to store them permanently?
If yes, do you want to store raw data, or will you apply some processing and compression before storing the data?
How much data does it represent?
How much data per second/per FPGA?
What kind of data is it? Images? Spectrum data? Scalar?

I hope this helps a bit and that I didn't make things more confusing.
Rosenberg's Law: Software is easy to make, except when you want it to do something new.
Corollary: The only software that's worth making is software that does something new.
Dear Reynald,

Thank you so much for the detailed reply - you made a lot of things clear, and apologies for the confusing terminology.

1) The definition of a TANGO HOST is now more clear - it's the computer node running the entire TANGO control system. What I was in fact aiming for is to have multiple HOSTS on the network, with possibly devices from one TANGO HOST communicating with devices on another TANGO HOST. The reason of course is that we would not like a single computer managing the entire global system.

2) Astor looks great for being able to remotely control any of the TANGO HOST nodes on the network, without having to be physically present at each of the nodes.

3) The MySQL TANGO database is probably sufficient to store the control system definition. What I am more worried about is the archiving part. Perhaps there are ways to interface the TANGO archiving mechanism with some form of MySQL cluster / cloud stack in order to have high-throughput archiving of raw data. I expect the network to be a 40G Ethernet setup, with quite high data rates coming from the devices. A total data rate of 4 Tb/s generated by all the various stations will not be uncommon. Of course this is not "control data", but pure upstream observation data, say spectrum data from radio telescopes. Ideally this data finds its way into some form of datastore. I was wondering if there are recommended ways of interfacing traditional high-throughput cloud storage systems with TANGO devices. We can assume there will not be any added computation/compression at this stage, only direct storage. Storage is ideally permanent (unless explicitly removed by higher-level management processes).

Thank you so much for your time and explanation.
drea
Dear Reynald,

1) The definition of a TANGO HOST is now more clear - it's the computer node running the entire TANGO control system.

Let me try to be even more precise… The TANGO HOST is defined by a host name and a port number. This information is used to locate the Tango Database server, which runs on this host and listens on this port.
The Tango Database server is contacted by a Tango server every time that Tango server starts, in order to get its configuration from the Tango Database (names of the devices managed by this server, configuration of these devices, …).
The database server also stores in the Tango Database the information necessary to establish communication with a given Tango device (a CORBA IOR in the current implementation).
So when a client tries to connect to a Tango device, it first contacts the database server, which returns the information on how to contact this Tango device. The client then contacts the device directly.
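
The lookup sequence described above can be illustrated with a toy model in plain Python. This is only a sketch: `ToyDatabase`, `ToyClient` and the endpoint string are hypothetical stand-ins, not the Tango API, and the real database hands back a CORBA IOR rather than a plain string.

```python
class ToyDatabase:
    """Stand-in for the Tango database server (hypothetical, not the Tango API)."""

    def __init__(self):
        self._exported = {}
        self.requests = 0

    def export_device(self, name, endpoint):
        # A device server registers how its device can be reached when it starts.
        self._exported[name] = endpoint

    def import_device(self, name):
        # A client asks where a device lives.
        self.requests += 1
        return self._exported[name]


class ToyClient:
    def __init__(self, database):
        self._database = database
        self._endpoints = {}

    def read(self, name, network):
        # Contact the database only on the first access to a device.
        if name not in self._endpoints:
            self._endpoints[name] = self._database.import_device(name)
        # Every later exchange goes straight to the device.
        return network[self._endpoints[name]]()


database = ToyDatabase()
database.export_device("my/device/name", "node7:45678")
network = {"node7:45678": lambda: 42.0}  # endpoint -> device read function

client = ToyClient(database)
values = [client.read("my/device/name", network) for _ in range(1000)]
print(database.requests)  # 1: the database was consulted once for 1000 reads
```

The point of the model is that the database sits on the connection path, not on the data path.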

Every TANGO device belongs to a specific TANGO control system.
A TANGO control system is composed (among other things) of the following elements:
  • The Tango Database
  • The database server running on the host and listening on the port defined in the TANGO_HOST environment variable
  • The TANGO device servers which can run on many different hosts
  • The TANGO devices, which are exported by the TANGO device servers, and which are basically objects (as in object-oriented programming languages) on which you can invoke commands and read and/or write attributes.
To refer to a specific TANGO control system, we usually refer to it with its TANGO HOST because this is the minimum information needed to get access to the whole TANGO configuration and to be able to connect to the TANGO devices belonging to this control system.

drea
What I was in fact aiming for is to have multiple HOSTS on the network, with possibly devices from one TANGO HOST communicating with devices on another TANGO HOST. The reason of course is that we would not like a single computer managing the entire global system.

If your system is really big, you can split it into multiple control systems and use several TANGO HOSTS. In that case, it is indeed possible for devices belonging to one TANGO HOST to communicate with devices from another TANGO HOST, using the FQDN (Fully Qualified Device Name).
For instance if you have a device named my/device/name from TANGO_HOST my_tango_host:12345, you can communicate with it using the following name: tango://my_tango_host:12345/my/device/name.
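
As a small illustration of how such a name is put together, here is a minimal Python sketch. The helper `full_device_name` is hypothetical (it is not part of Tango); it only assembles the string from a `host:port` TANGO_HOST value and a three-part device name.

```python
def full_device_name(tango_host: str, device_name: str) -> str:
    """Build tango://host:port/domain/family/member from its two parts."""
    host, sep, port = tango_host.partition(":")
    if not (host and sep and port.isdigit()):
        raise ValueError(f"expected 'host:port', got {tango_host!r}")
    parts = device_name.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'domain/family/member', got {device_name!r}")
    return f"tango://{host}:{port}/{device_name}"


name = full_device_name("my_tango_host:12345", "my/device/name")
print(name)  # tango://my_tango_host:12345/my/device/name
```

With PyTango installed, a name of this form can be passed to `tango.DeviceProxy` to reach a device in another control system.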

drea
3) The MySQL TANGO database is probably sufficient to store the control system definition. What I am more worried about is the archiving part. Perhaps there are ways to interface the TANGO archiving mechanism with some form of MySQL cluster / cloud stack in order to have high-throughput archiving of raw data. I expect the network to be a 40G Ethernet setup, with quite high data rates coming from the devices. A total data rate of 4 Tb/s generated by all the various stations will not be uncommon. Of course this is not "control data", but pure upstream observation data, say spectrum data from radio telescopes. Ideally this data finds its way into some form of datastore. I was wondering if there are recommended ways of interfacing traditional high-throughput cloud storage systems with TANGO devices. We can assume there will not be any added computation/compression at this stage, only direct storage. Storage is ideally permanent (unless explicitly removed by higher-level management processes).

The Tango archiving mechanism (Tango historical Database), as of today, is not able and was not designed to cope with your requirements.
We are currently working on improving it so that it can store data in Apache Cassandra, but this might still not fit your needs.
Usually, Cassandra is used with a replication factor >= 3, so this would multiply by at least 3 the network traffic.
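
To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It uses the 4 Tb/s and 40G figures mentioned earlier in the thread and assumes, as a simplification, that every replica write crosses the network once and that links can be fully used. The function names are only illustrative.

```python
import math


def replicated_traffic_gbps(ingest_gbps, replication_factor):
    # Each stored byte is written once per replica.
    return ingest_gbps * replication_factor


def min_links(traffic_gbps, link_gbps):
    # Smallest number of links able to carry the traffic at 100 % utilisation.
    return math.ceil(traffic_gbps / link_gbps)


ingest = 4000  # 4 Tb/s expressed in Gb/s
writes = replicated_traffic_gbps(ingest, 3)

print(writes)                  # 12000 Gb/s of write traffic
print(min_links(ingest, 40))   # 100 links without replication
print(min_links(writes, 40))   # 300 links with a replication factor of 3
```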

Nothing prevents you from using Tango for the control part and developing a Tango device server that manages the streaming part to a specific datastore, using the technology you want.

Maybe someone around here has already encountered similar needs and can give you some advice?

drea
Thank you so much for your time and explanation.

Happy to help :)
Dear Reynald,

Thanks again for the very clear explanation. The FQDN point you make clearly solves the problem I wanted to tackle, and I think so does the possibility of coding a device server to custom stream data from devices into some sort of cluster/stack storage database system.

That way I would also be able to poll values off the sensors/devices at a rate of my choosing, either by reading attributes directly from the sensors, or by reading values from the circular buffer that Tango provides for each of the devices. I could set different periods (or event triggers) to read these attribute values and push a stream to a custom database system.

Things are looking good! :) Thanks so much for your help.
Hi,

I think Reynald has answered all the questions very well, and from your comments you have understood his answers well. Nonetheless I wanted to add some information from my side, which goes in the same direction.

The topology of a TANGO system is primarily point-to-point. This means each device (the addressable object in a TANGO system) has its own connection point on the network, which clients connect to. Each client connection has its own thread in the device server (the process in which the device is served). The link from device server to client has access to the full bandwidth of the network connection (tested up to 10 Gbit/s). Special care has been taken to use high-performance binary protocols (omniORB and ZeroMQ) and to write efficient code that avoids copying data as much as possible. To increase throughput in the system one can add more network connections (by adding more hosts or NICs), add more devices, and add more clients. This will continue to scale as long as you can increase the overall physical network bandwidth and continue to split the traffic over multiple devices and clients. As an example, imagine 100 high-speed devices (FPGAs or similar) which need to send their data via TANGO. By defining 100 devices, each with its own network connection (i.e. host), and 100 clients, each dealing with one device, one could theoretically achieve 100 x the bandwidth, where the bandwidth is the network bandwidth available to each connection.

One hits a bandwidth limit only when sending all traffic through a single point (device or client). The limit is then the bandwidth of that connection. In practice this can be avoided by building hierarchies of devices in which the data is reduced at each level to a volume acceptable for the next connection. The level of information and control will scale with the level of hierarchy. For the above example one could have 10 sub-groups of 10 devices each. Some processing on the 10 high-level devices would then allow each of them to pass only a tenth of the information to the next level, and so on.
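
The effect of such a hierarchy can be estimated with a short Python sketch. The per-device rate of 10 Gb/s is an assumption chosen for illustration, and `hierarchy` is a hypothetical helper, not a Tango function. It returns, for each level, the number of aggregating devices and the traffic arriving at each of them.

```python
import math


def hierarchy(n_sources, source_gbps, fan_in, reduce_by):
    """Return [(devices_at_level, inbound_gbps_per_device), ...] from the bottom up."""
    levels = []
    nodes, out_rate = n_sources, source_gbps
    while nodes > 1:
        parents = math.ceil(nodes / fan_in)
        inbound = out_rate * min(fan_in, nodes)  # traffic arriving at one parent
        levels.append((parents, inbound))
        # Each parent forwards only 1/reduce_by of what it receives.
        nodes, out_rate = parents, inbound / reduce_by
    return levels


flat_inbound = 100 * 10            # one client reading all 100 devices directly
levels = hierarchy(100, 10, 10, 10)

print(flat_inbound)  # 1000 Gb/s through a single point
print(levels)        # [(10, 100), (1, 100.0)]
```

Under these assumptions no connection point has to absorb more than 100 Gb/s, compared with 1000 Gb/s when a single client reads every device directly.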

As Reynald explained, the TANGO_HOST database is only used for static information and does not intervene in the dynamic exchange of data. One can have multiple static databases if necessary. The archive database is currently being upgraded to achieve much higher performance. The highest performance is still obtained with a live system streaming data from the devices to higher levels of devices or clients.

We would be happy to help you work out the details if you have a concrete use case.

Kind regards

Andy
Hi Andy,

Thanks for the very informative post on bandwidth issues. I am sure this will be useful in the proposal I am designing, as these are the kinds of questions that will have to be tackled. The archive database seems to operate at various speeds with two limits, one for the historical database and one for the temporary database. Although these limits are very realistic in practice, I was wondering if it is possible to have different and perhaps even more stringent polling rates for attribute values and, if so, whether I would require a custom database outside of TANGO to store high-rate historical data uploaded from a Tango device.

I have described this problem in another thread.

I am envisaging a rather large database, in fact a large cluster such as MySQL Cluster, where storage is distributed across different servers, although possibly visible as a single database through something like Apache Mesos. The requirement is to have frequent updates stored historically, or flushed clean by the administrator. In any case, we would not like a "temporary" archive, but rather a permanent one.

If there are any thoughts on this, it would be great to know.

Thanks again for all your help :)
For very high-speed needs I think you will need a database of the kind you are proposing. I see no problem streaming data at very high rates to a database which can store it at the right speed. You can send data using events at the frequency you require. Today there is an arbitrary limit in the TANGO library on the highest frequency of events, at 200 Hz. You can send events at higher frequencies by either (1) pushing events from your own code, or (2) changing this arbitrary limit in the code and recompiling the TANGO library. The limit on the frequency will then depend on your hardware (CPU and network).

TANGO has different event types for sending CHANGE, PERIODIC, or ARCHIVE events. You could use any of these to set up a channel of high-speed events. A client feeding the database could then subscribe only to these events and store them in a database capable of storing huge quantities of data arriving at high speed.
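
A minimal sketch of such a subscriber in Python is shown below. `BatchingSink` and `write_batch` are hypothetical names introduced here; `write_batch` stands for whatever bulk-insert call the chosen datastore offers. The `subscribe` helper assumes PyTango is installed and uses its `DeviceProxy.subscribe_event` call with `EventType.ARCHIVE_EVENT`; the demonstration at the bottom uses fake events, so it runs without a control system.

```python
import threading
from types import SimpleNamespace


class BatchingSink:
    """Event callback that buffers values and hands full batches to write_batch."""

    def __init__(self, write_batch, batch_size=1000):
        self._write_batch = write_batch
        self._batch_size = batch_size
        self._buffer = []
        self._lock = threading.Lock()  # events may arrive on another thread

    def push_event(self, event):
        if event.err:  # skip error events in this sketch
            return
        with self._lock:
            self._buffer.append((event.attr_name, event.attr_value.value))
            full = len(self._buffer) >= self._batch_size
        if full:
            self.flush()

    def flush(self):
        with self._lock:
            batch, self._buffer = self._buffer, []
        if batch:
            self._write_batch(batch)


def subscribe(device_name, attr_name, sink):
    import tango  # PyTango, only needed against a real control system

    proxy = tango.DeviceProxy(device_name)
    event_id = proxy.subscribe_event(attr_name, tango.EventType.ARCHIVE_EVENT, sink)
    return proxy, event_id


# Demonstration with fake events standing in for Tango event data.
stored = []
sink = BatchingSink(stored.append, batch_size=2)
for value in (1.0, 2.0, 3.0):
    fake = SimpleNamespace(err=False, attr_name="spectrum",
                           attr_value=SimpleNamespace(value=value))
    sink.push_event(fake)
sink.flush()
print(stored)
```

Batching is used here because most datastores accept bulk writes far more efficiently than one insert per event; the batch size is a tuning parameter, not a recommendation.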

We have not tested the full performance of a NoSQL solution like Cassandra, but we have heard it can reach very high performance. A clustered MySQL database of the kind you are thinking of seems like a good option too. For time-dependent data (which is the case for control systems) it is advisable to use some kind of segmentation mechanism which "freezes" the data and informs the database engine that this data will not change anymore.
 