Size of existing TANGO projects

Hello Community members,

We are working on one of the SKA Elements, the Telescope Manager (TM). Our initial estimates are as follows:

Total TANGO devices
  • TM Mid - 706
  • TM Low - 991
We seek your inputs on existing deployments of TANGO projects and their scale. We are looking for information such as the number of Tango devices (or device servers), archive events/sec, and possibly the total number of monitoring points.

These inputs will help us to do a comparative study with various existing TANGO deployments.

Any documentation, suggestions, and directions are welcome. We would also like to hear your learnings on deployment aspects.

Regards,
Apurva Patkar
Hi Apurva,

at the ESRF our largest Tango database has 17224 devices and 3003 servers exported. I do not have the number of archived events/sec but I am sure someone can give you this too.

A database of this size shows no signs of overload, and you can comfortably go up to it. Going beyond it is probably not a problem either, but we have no experience with that. We could think about what the limitations would be, taking into account that Tango is very highly distributed. The main limitation today would be the database in cases where a large number of clients (re)start often and there is a large number of memorised attributes. These create traffic on the database.

Do you have an idea what the maximum size of your database could be?

Cheers

Andy
Andy wrote: "I do not have the number of archived events/sec but I am sure someone can give you this too."

On a typical day, the set of HDB++ subscribers in our accelerator control system currently receives about 95 archive events/second on average in total.
On machine dedicated time days (maintenance and test days), this can rise to about 110 events/sec on average.
We are currently archiving 8228 attributes from the accelerator control system in HDB++.

Cheers,
Reynald
Rosenberg's Law: Software is easy to make, except when you want it to do something new.
Corollary: The only software that's worth making is software that does something new.
Hello, a couple of figures of the Trieste plants:

FERMI (free electron laser)
- accelerator + laser systems : 525 devices, ~8500 attributes logged on hdb++, 33 archive_events/s
- photon transport system : 1130 devices, ~800 attributes logged on hdb++, 22 archive_events/s

ELETTRA (synchrotron)
- storage ring + booster : 3185 devices, 2263 attributes logged on hdb++, 138 archive_events/s




Some more comments. As noted above the database server is a critical element in a Tango system. For this reason we have implemented timing statistics on the most common calls made to the database server. You can access them via the attributes of the database server. The attribute Timing_info gives a summary in text of the average, minimum, maximum and number of calls. This will give you an idea if the database is overloaded.
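
As a minimal sketch of how a client could fetch and tabulate these statistics, the snippet below reads Timing_info with PyTango and parses the text rows. The device name sys/database/2 is taken from the example output in this post and will differ on other systems; the helper names read_timing_info and parse_timing_info are hypothetical, and the parser assumes each data row has five whitespace-separated fields (command, average, minimum, maximum, calls).

```python
def read_timing_info(device_name="sys/database/2"):
    """Return the Timing_info text lines from a Tango database device.

    ASSUMPTION: device_name is only an example; use your own database device.
    """
    import tango  # PyTango, imported lazily so the parser works without it

    proxy = tango.DeviceProxy(device_name)
    return list(proxy.read_attribute("Timing_info").value)


def parse_timing_info(lines):
    """Parse Timing_info lines into {command: (average, minimum, maximum, calls)}.

    Times are in milliseconds. Lines that are not data rows (title, blank
    lines, column header) are skipped.
    """
    stats = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # title line or blank separator
        name, *numbers = fields
        try:
            average, minimum, maximum = (float(x) for x in numbers[:3])
            calls = int(numbers[3])
        except ValueError:
            continue  # column header row
        stats[name] = (average, minimum, maximum, calls)
    return stats


if __name__ == "__main__":
    sample = [
        "TANGO Database Timing info on host orion",
        " ",
        "command\taverage\tminimum\tmaximum\tcalls",
        "DbExportEvent\t 3.724\t 2.891\t 5.552\t188",
    ]
    for command, row in parse_timing_info(sample).items():
        print(command, row)
```

On a live system, parse_timing_info(read_timing_info()) would give the same dictionary for the real attribute, which can then be sorted by average time or by number of calls.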

Here is the output for one of the database servers in the ESRF accelerator control system:

Attribute: sys/database/2/Timing_info
Duration: 3 msec
measure date: 02/07/2018 13:30:28 + 477ms
quality: VALID
dim x: 32
Read length: 32
Read [0]	TANGO Database Timing info on host orion
Read [1]	 
Read [2]	command	average	minimum	maximum	calls
Read [3]	 
Read [4]	DbDeleteDeviceProperty	 1.003	 0.301	404.974	2693
Read [5]	DbExportDevice	11.186	 5.686	2526.654	44796
Read [6]	DbExportEvent	 3.724	 2.891	 5.552	188
Read [7]	DbGetClassPropertyList	 0.290	 0.045	24.460	1394
Read [8]	DbGetDataForServerCache	332.583	 0.675	13623.464	7907
Read [9]	DbGetDeviceAttributeProperty	 0.000	 0.000	 0.000	0
Read [10]	DbGetDeviceAttributeProperty2	109.088	 0.070	696.427	339805
Read [11]	DbGetDeviceClassList	 6.600	 0.036	206.892	44392
Read [12]	DbGetDeviceDomainList	17.207	 0.094	32.035	231
Read [13]	DbGetDeviceExportedList	12.950	 0.096	24.607	18
Read [14]	DbGetDeviceFamilyList	12.427	 0.076	30.402	381
Read [15]	DbGetDeviceMemberList	 8.437	 0.059	125.692	7021
Read [16]	DbGetDevicePipeProperty	 0.366	 0.113	 0.750	114
Read [17]	DbGetDeviceProperty	 3.720	 0.042	1928.366	8023849
Read [18]	DbGetDevicePropertyList	 0.700	 0.084	276.078	3050
Read [19]	DbGetHostList	 0.000	 0.000	 0.000	0
Read [20]	DbGetHostServerList	14.858	 0.048	2494.923	11776
Read [21]	DbGetServerList	20.439	 0.096	46.923	1363
Read [22]	DbImportDevice	 0.637	 0.190	8328.040	159810718
Read [23]	DbImportEvent	 1.323	 0.046	 4.238	962
Read [24]	DbInfo	54.164	 1.336	96.809	73
Read [25]	DbMySqlSelect	19.588	 0.076	943.388	1489
Read [26]	DbPutClassProperty	13.176	 3.912	649.171	22547
Read [27]	DbPutDeviceAttributeProperty	 0.000	 0.000	 0.000	0
Read [28]	DbPutDeviceAttributeProperty2	 4.708	 0.154	1954.627	80418814
Read [29]	DbPutDevicePipeProperty	 0.000	 0.000	 0.000	0
Read [30]	DbPutDeviceProperty	 8.693	 0.975	1736.301	4127381
Read [31]	DbUnExportServer	88.445	36.241	3442.457	4559

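These are the stats after 6 weeks of operating the accelerator. Times are in milliseconds. As you can see, most calls are to DbImportDevice and DbPutDeviceAttributeProperty2, which works out to roughly 44/s and 22/s respectively if averaged over the full six weeks. Most average times are in the range of a few ms to a few tens of ms; the slowest on average are DbGetDataForServerCache (about 333 ms) and DbGetDeviceAttributeProperty2 (about 109 ms).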
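
For anyone who wants to turn the call counters into rates, here is a small sketch of the arithmetic. It assumes the counters cover exactly six weeks (42 days) of continuous uptime, which is an assumption on my part; a different window changes the rates proportionally. The function name calls_per_second is hypothetical.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600


def calls_per_second(calls, weeks=6.0):
    """Average call rate, ASSUMING the counter spans `weeks` of uptime."""
    return calls / (weeks * SECONDS_PER_WEEK)


# Call counts copied from the Timing_info output above.
counts = {
    "DbImportDevice": 159810718,
    "DbPutDeviceAttributeProperty2": 80418814,
    "DbGetDeviceProperty": 8023849,
}

for command, calls in counts.items():
    print(f"{command}: {calls_per_second(calls):.1f} calls/s")
```

Under the six-week assumption this gives about 44 calls/s for DbImportDevice and about 22 calls/s for DbPutDeviceAttributeProperty2.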

We have a second database server which acts as a backup and takes some of the load.

Hope this helps.

Andy

Hi Everyone,

I thank you all for sharing TANGO deployment information pertaining to your project. These are useful inputs to understand the scale of various TANGO projects.

Andy wrote: "Do you have an idea what the maximum size of your database could be?"
We have very rough estimates of the monitoring points, their archiving cadence and the number of TANGO devices. We don't have sufficient information yet to determine the size of the TANGO database. The design is evolving and we will have the figures going forward.

Andy, your tip on the number of clients (re)starting, number of memorized attributes, and understanding the database calls is helpful.

Regards,
Apurva Patkar
As a comparison, at Diamond we had ~250,000 attributes (EPICS PVs) in the archiver and a total of about 2 million PVs overall. This went to a single node archiving device that was capable of archiving several thousand events/sec as well as simultaneously serving out the read requests. (Resilience was just a second identical system).

There is a talk giving details of other EPICS sites here: https://indico.esss.lu.se/event/507/session/2/contribution/6/material/slides/0.pptx. This is about the EPICS archive appliance, which is a later generation than what we used at Diamond. The maximum ingest bragging rights seem to belong to BNL (20k events/sec), but this probably involves a number of nodes - I suspect the upper limit for a single node is around 5-10k events/sec.

The EPICS archive appliance focusses on being easy to deploy (hence the name) and on read performance, because these were the problem areas with earlier systems.
Nick Rees
What data types are the EPICS PVs? It would be interesting to know in order to compare. An attribute can be a scalar, spectrum or image. In some cases we archive images with thousands of values.

Andy
EPICS V3 (the traditional version) data types are scalars and arrays of primitive types (an image is stored in an array). EPICS V7 (the latest version) supports arbitrarily structured types. The archiver supports arrays but in practice it holds mostly scalars.
Nick Rees
In order to compare apples with apples and pears with pears, I asked Pascal to calculate the number of control points in the ESRF accelerator control system. A control point is an attribute or a command of a device. A R/W attribute is counted as 2 attributes. The calculation is done by a tool which contacts every device and only takes into account devices that are active, i.e. that reply on the network. The results for the ESRF accelerator are:

Hosts = 316
Device Servers = 2882
Devices = 16899
Attributes = 189165
Commands = 243658

Control Points = Attributes + Commands = 535744

All of these are served by one Tango database server with one backup server.

This gives you an idea of how big the system is in EPICS PVs but it is still slightly underestimated because a single attribute can be an array and Pipes are not counted.
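
To make the counting rule concrete, here is a short worked sketch. It assumes that the Attributes figure above counts each attribute once and that the gap between the plain sum and the quoted total comes entirely from R/W attributes being counted twice; neither point is stated explicitly, so treat the derived split as illustrative. The function name control_points is hypothetical.

```python
# Figures quoted in the post above.
ATTRIBUTES = 189165
COMMANDS = 243658
CONTROL_POINTS = 535744


def control_points(read_only_attrs, read_write_attrs, commands):
    """Count control points, with each R/W attribute counted twice."""
    return read_only_attrs + 2 * read_write_attrs + commands


# ASSUMPTION: the difference between the quoted total and the plain sum
# is the number of R/W attributes (each contributes one extra point).
plain_sum = ATTRIBUTES + COMMANDS
implied_rw = CONTROL_POINTS - plain_sum
implied_ro = ATTRIBUTES - implied_rw

print("plain sum of attributes and commands:", plain_sum)
print("implied R/W attributes:", implied_rw)
print("implied read-only attributes:", implied_ro)
print("control points:", control_points(implied_ro, implied_rw, COMMANDS))
```

Under these assumptions the plain sum is 432823, which implies about 102921 R/W attributes and reproduces the quoted total of 535744.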

Andy