HDB++ event subscriber "storing error" with Cassandra

Hello,

This is my first post to the forum, on behalf of a few of us at Max IV working with HDB++ and Cassandra. I realise managing a Cassandra DB is not trivial, and we are on (the bottom of) the learning curve. We have a cluster set up (on VMs) and have configured a number of event subscribers, which have been running for over a year. The issue is that some of them report "storing error" for all of their attributes. I am not sure where this error state is determined; I think it may come from this method in the Cassandra library:

https://github.com/tango-controls-hdbpp/libhdbpp-cassandra/blob/master/src/LibHdb%2B%2BCassandra.cpp#L468

I guess it is saying something bad about the underlying health of the cluster, but we don't know what or how to remedy it. Restarting the subscribers a number of times can resolve it sometimes.

Any suggestions welcome, thanks a lot

Paul (MaxIV KITS controls)
Hi Paul,

There is a bug in the latest version of libhdb++cassandra library… and you are sadly affected by it…
We should have created an issue about it, sorry.
We discovered the bug very recently and the main libhdb++cassandra library developer was away this week…
He is coming back on Monday and this bug should be fixed quickly.
From what I remember, the latest version wrongly reports "storing error" when attributes have an invalid quality factor or are sending exceptions, and nothing is stored in that case, which is of course wrong.

Could you please create an issue with a screenshot of some of the unexpected storing errors you are seeing on https://github.com/tango-controls-hdbpp/libhdbpp-cassandra/issues ?
Then we can see whether you are getting the same kind of errors as we are (and an issue will be created to track this problem too).

I encourage you to downgrade the libhdb++cassandra version you are using for the EventSubscribers.
At the ESRF, we are currently running libhdb++cassandra v0.11.0 (libhdb++cassandra.so.7.1.0) for the EventSubscribers and libhdb++cassandra v0.12.0 (libhdb++cassandra.so.7.2.0) for the ConfigurationManager, because a bug affecting the ConfigurationManager was fixed in v0.12.0.

I hope it helps.
Kind regards,
Reynald
Rosenberg's Law: Software is easy to make, except when you want it to do something new.
Corollary: The only software that's worth making is software that does something new.
Hi,

Thanks a lot for your fast reply. Apologies, I realise now what I wrote above was not very accurate. We are actually using a very old version of libhdb++cassandra. We pulled it into our repo back in 2016, from this commit:

https://github.com/tango-controls-hdbpp/libhdbpp-cassandra/tree/3939d32130393ceb1ea2e81cdc6cf9b86f461da4

The two people who set all this up are no longer with us.

In our version, I still believe it is this method that fails:

https://github.com/tango-controls-hdbpp/libhdbpp-cassandra/blob/3939d32130393ceb1ea2e81cdc6cf9b86f461da4/src/LibHdb%2B%2BCassandra.cpp#L1132

I suppose we should update to the versions you suggest and see what happens. However, my feeling is still that this is a problem with the database itself: sometimes all attributes of a subscriber show "storing error", yet we know they are OK (never invalid, etc.).

Anyway, we should do the update before speculating further.

Best regards,

Paul
Hi Paul,

The best is indeed to update.
In recent versions, storing errors come with more details about what went wrong.
Maybe you're having issues with your Cassandra cluster (nodes not available)…
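
One quick way to rule out unavailable nodes is to look at the output of `nodetool status` on one of the Cassandra hosts: each node line starts with a two-letter code, where the first letter is U (up) or D (down) and the second is the state (Normal, Leaving, Joining, Moving). Below is a minimal Python sketch that picks the down nodes out of that output. The function name `down_nodes` and the sample text are made up for illustration (the addresses, loads and host IDs are not from any real cluster), so treat it as a starting point rather than a finished tool.

```python
import re

# A node line in `nodetool status` starts with a two-letter code:
# first letter Up/Down, second letter Normal/Leaving/Joining/Moving.
_NODE_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")


def down_nodes(nodetool_status_output):
    """Return the addresses of all nodes reported as Down."""
    down = []
    for line in nodetool_status_output.splitlines():
        match = _NODE_LINE.match(line.strip())
        if match and match.group(1).startswith("D"):
            down.append(match.group(2))
    return down


# Hypothetical sample output, for illustration only.
SAMPLE = """\
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns  Host ID  Rack
UN  10.0.0.1   1.2 GiB    256     ?     aaaa     rack1
DN  10.0.0.2   1.1 GiB    256     ?     bbbb     rack1
"""

if __name__ == "__main__":
    print(down_nodes(SAMPLE))
```

In practice you would feed it the real text, for example by capturing the command output with `subprocess.run(["nodetool", "status"], capture_output=True, text=True).stdout`, and compare the times when nodes show as down with the times the subscribers report "storing error".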

Best regards,
Reynald
 