Tango: notifd process memory requirements
This HowTo explains the surprisingly large value that the VIRT column of the "top" utility shows for the notifd process.
Tango uses the CORBA notification service for its event system. The implementation of the notification service currently in use is called omniNotify. In practice, this is a daemon process called notifd. If you want to receive events from Tango device attributes, one notifd process has to run on each crate where Tango device servers are running. This notifd process is highly multi-threaded.
Max number of threads in a multi-threaded process
On a 32-bit computer, the process address space is 4 Gbytes. On Linux, 1 Gbyte is reserved for kernel space. Therefore, the user address space is 3 Gbytes. Each time a thread is created, some memory has to be reserved in this address space for the thread stack. The thread stack is used for two kinds of things:
- To store the CPU registers when the thread jumps into a function/method
- To store the function/method local variables.
The size reserved for each thread stack is one of the process resource limits. On an Ubuntu computer, ulimit -a reports:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 20
file size (blocks, -f) unlimited
pending signals (-i) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) unlimited
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
The interesting parts are ulimit -s and ulimit -v. The ulimit -v value (unlimited in this case) tells us that a process can use all the virtual memory available (the 3 Gbytes). The ulimit -s value tells us that each time a thread is created, the system reserves 8 Mbytes of memory for the thread stack. The maximum number of threads that the process can create is therefore roughly equal to (ulimit -v / ulimit -s). In our example, this is (3 * 1024) / 8 = 384.
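The estimate above can also be computed from inside a program. The following is only a sketch: the function names stack_limit_kb and max_threads_estimate are hypothetical, and the 3 Gbytes user address space is the 32-bit Linux assumption made in this HowTo, passed in by the caller. It relies on getrlimit(RLIMIT_STACK), which returns the same soft limit that ulimit -s displays.

```c
#include <assert.h>
#include <sys/resource.h>

/* Soft stack limit in kbytes (the value shown by "ulimit -s").
 * Returns -1 when the limit is unlimited or cannot be read. */
long stack_limit_kb(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return -1;
    return (long)(rl.rlim_cur / 1024);
}

/* Upper bound on the number of threads: address space / stack size.
 * Both arguments are in kbytes and are supplied by the caller. */
long max_threads_estimate(long user_space_kb, long stack_kb)
{
    assert(user_space_kb > 0 && stack_kb > 0);
    return user_space_kb / stack_kb;
}
```

With a 3 Gbytes user address space (3 * 1024 * 1024 kbytes) and a stack size of 8192 kbytes, max_threads_estimate returns 384, the figure computed above.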
This is confirmed by this small program, which creates threads until it gets an error:
/* compile with: gcc -o thread-limit thread-limit.c -lpthread */
/* originally from: http://www.volano.com/linuxnotes.html */
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_THREADS 10000

int i; /* index of the thread being created, also read by the threads */

/* Thread body: print where its stack lives, then sleep for one hour. */
void *run(void *arg) {
    char c; /* local variable, so its address lies in this thread's stack */
    (void) arg;
    if (i < 10)
        printf("Address of c = %u KB\n",
               (unsigned int) ((uintptr_t) &c / 1024));
    sleep(60 * 60);
    return NULL;
}

int main(void) {
    int rc = 0;
    pthread_t thread[MAX_THREADS];
    printf("Creating threads ...\n");
    for (i = 0; i < MAX_THREADS && rc == 0; i++) {
        rc = pthread_create(&thread[i], NULL, run, NULL);
        if (rc == 0) {
            pthread_detach(thread[i]);
            if ((i + 1) % 100 == 0)
                printf("%i threads so far ...\n", i + 1);
        } else {
            /* pthread_create returns the error code, it does not set errno */
            errno = rc;
            perror("pthread_create: ");
            printf("Failed with return code %i creating thread %i.\n",
                   rc, i + 1);
        }
    }
    exit(0);
}
Again, on the same Ubuntu computer, the output of this program is:
taurel@pcantares:~/tmp$ ./thread-limit
Creating threads ...
Address of c = 3012488 KB
Address of c = 3004292 KB
Address of c = 2996096 KB
100 threads so far ...
200 threads so far ...
300 threads so far ...
pthread_create: : Cannot allocate memory
Failed with return code 12 creating thread 383.
The system returned an error when the program tried to create thread number 383. This is consistent with the theoretical computation (384). The small difference comes from the part of the address space already used by the program itself (code, libraries, main thread stack).
If we reduce the thread stack size to 2 Mbytes, the computation tells us that the maximum number of threads will be (3 * 1024) / 2 = 1536. In practice, we have:
taurel@pcantares:~/tmp$ ulimit -s 2048
taurel@pcantares:~/tmp$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 20
file size (blocks, -f) unlimited
pending signals (-i) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) unlimited
real-time priority (-r) 0
stack size (kbytes, -s) 2048
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
taurel@pcantares:~/tmp$ ./thread-limit
Creating threads ...
Address of c = 3013040 KB
Address of c = 3010988 KB
Address of c = 3008936 KB
Address of c = 3006884 KB
100 threads so far ...
200 threads so far ...
300 threads so far ...
400 threads so far ...
500 threads so far ...
600 threads so far ...
700 threads so far ...
800 threads so far ...
900 threads so far ...
1000 threads so far ...
1100 threads so far ...
1200 threads so far ...
1300 threads so far ...
1400 threads so far ...
1500 threads so far ...
pthread_create: : Cannot allocate memory
Failed with return code 12 creating thread 1532.
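ulimit -s changes the default stack size for every thread of the processes started from that shell. A program can also choose the stack size thread by thread with the POSIX thread attributes. The sketch below illustrates this under stated assumptions: start_thread_with_stack and mark_done are hypothetical names used for illustration only, and they are not part of the test program above or of notifd.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker used for illustration: stores 1 through its argument. */
void *mark_done(void *arg)
{
    *(int *)arg = 1;
    return NULL;
}

/* Start a thread with a stack of stack_bytes, whatever ulimit -s says.
 * Returns 0 on success or the pthread error code. */
int start_thread_with_stack(pthread_t *tid, size_t stack_bytes,
                            void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc;

    assert(tid != NULL && fn != NULL);

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Rejected with an error if stack_bytes is below PTHREAD_STACK_MIN. */
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(tid, &attr, fn, arg);

    pthread_attr_destroy(&attr);
    return rc;
}
```

A caller would pass, for example, 2 * 1024 * 1024 as stack_bytes to get the 2 Mbytes stack used in the experiment above, then join or detach the thread as usual.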
Process virtual memory
We now go back to the default thread stack size (8 Mbytes) and slightly modify our test program so that it sleeps after creating 300 threads. Then we run top on the process:
taurel@pcantares:~/tmp$ ps -ef | grep thread-limit
taurel 29495 29335 0 09:10 pts/4 00:00:00 ./thread-limit
taurel 29817 29798 0 09:11 pts/17 00:00:00 grep thread-limit
taurel@pcantares:~/tmp$ top -p 29495
top - 09:11:36 up 10 days, 1:06, 18 users, load average: 0.14, 0.14, 0.12
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 0.0%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.3%hi, 0.0%si, 0.0%st
Mem: 1554420k total, 1248916k used, 305504k free, 248212k buffers
Swap: 1502036k total, 0k used, 1502036k free, 509464k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29495 taurel 25 0 2402m 1744 400 S 0.0 0.1 0:00.01 thread-limit
The process virtual memory is 2402 Mbytes, which is roughly 300 * 8. This means that the amount of memory displayed in this column is approximately equal to the number of threads in the process multiplied by the thread stack size. If we reduce the stack size to 2 Mbytes, top now displays:
taurel@pcantares:~/tmp$ top -p 30154
top - 09:21:45 up 10 days, 1:16, 18 users, load average: 0.15, 0.14, 0.11
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.7%us, 0.3%sy, 0.0%ni, 97.3%id, 0.0%wa, 0.0%hi, 0.7%si, 0.0%st
Mem: 1554420k total, 1247056k used, 307364k free, 248212k buffers
Swap: 1502036k total, 0k used, 1502036k free, 508744k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
30154 taurel 25 0 602m 1744 400 S 0.0 0.1 0:00.01 thread-limit
The VIRT column is now 602 Mbytes (300 * 2). You can also notice that in both cases, the RES column shows the same value (1744 Kbytes).
The VIRT column shows the amount of memory reserved in the process address space, not the memory actually used by the process. Only if every thread created local data needing around 2 Mbytes of memory would the process really need 602 Mbytes of memory.
Notifd process top command VIRT column
The notifd process is highly multi-threaded. On average, it creates 15 to 16 threads per event channel. Each device server process running on a crate has its own event channel. If you have 10 device servers running on a crate, you will have 10 event channels and therefore a number of threads close to 150. If the thread stack size is 10 Mbytes, the top command VIRT column will show about 1500 Mbytes (150 * 10).
Let's take an example from an ESRF crate used in the ESRF machine control system. The thread stack size is 10 Mbytes, the process virtual memory is unlimited (therefore 3 Gbytes) and the notifd process has 282 threads. The notifd process VIRT column should then be around 2820 Mbytes:
dserver@l-pinj-2:~ >ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
pending signals (-i) 1024
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 32511
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
dserver@l-pinj-2:~ >ps -efL | grep notifd | grep -v grep | wc -l
282
top - 09:47:41 up 5 days, 12:17, 1 user, load average: 0.14, 0.15, 0.16
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.7% us, 1.5% sy, 0.0% ni, 92.8% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 2058696k total, 915596k used, 1143100k free, 143160k buffers
Swap: 4096564k total, 0k used, 4096564k free, 495660k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2811 dserver 20 0 2901m 45m 7012 S 0.7 2.3 59:20.25 notifd
We have 2901 Mbytes, which is close to the theoretical value (2820).
As a test, we are currently running a notifd process started with a thread stack size of 2 Mbytes. This process has 14 event channels and 199 threads. The top VIRT column shows 450 Mbytes:
top - 09:52:00 up 13 days, 19:32, 2 users, load average: 0.19, 0.59, 0.62
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 27.7% us, 3.6% sy, 0.0% ni, 68.1% id, 0.7% wa, 0.0% hi, 0.0% si
Mem: 515556k total, 490616k used, 24940k free, 14088k buffers
Swap: 0k total, 0k used, 0k free, 302836k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
526 dserver 19 0 450m 32m 7044 S 1.3 6.5 11:27.49 notifd
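The estimates used in this section are plain multiplications, written out below as a small sketch. The function names virt_estimate_mb and notifd_threads_estimate are hypothetical, and the result is only the stack reservation part of VIRT, so the value reported by top will be somewhat higher.

```c
#include <assert.h>

/* Rough VIRT estimate in Mbytes: one full stack reserved per thread. */
long virt_estimate_mb(long threads, long stack_mb)
{
    assert(threads >= 0 && stack_mb > 0);
    return threads * stack_mb;
}

/* Rough notifd thread count: threads_per_channel for each event channel. */
long notifd_threads_estimate(long event_channels, long threads_per_channel)
{
    assert(event_channels >= 0 && threads_per_channel > 0);
    return event_channels * threads_per_channel;
}
```

For the ESRF crate, virt_estimate_mb(282, 10) gives 2820 Mbytes against 2901 Mbytes observed. For the test process with 2 Mbytes stacks, virt_estimate_mb(199, 2) gives 398 Mbytes against 450 Mbytes observed.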
Conclusion
The VIRT column shows an amount of memory that is potentially used but not actually used, because most of the time the threads created by a process do not need their whole stack.
To avoid a heart attack when looking at the top command output on a crate where a notifd process is running, you can start it with a smaller stack size (2 Mbytes seems OK but still needs confirmation).
Reducing the thread stack size will also allow you to start more device server processes on a crate. With a 10 Mbytes thread stack size, you are able to start 19 device server processes on the crate ((3 * 1024) / 10 / 16). With a 2 Mbytes thread stack size, you will be able to start 96 device server processes on the crate ((3 * 1024) / 2 / 16).
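The capacity computation above can be written as a one-line function. This is a sketch under the assumptions of this HowTo (3 Gbytes of user address space and about 16 notifd threads per event channel); the function name max_device_servers is hypothetical.

```c
#include <assert.h>

/* Number of device servers (one event channel each) whose notifd threads
 * fit in the user address space, given the thread stack size. */
long max_device_servers(long user_space_mb, long stack_mb,
                        long threads_per_channel)
{
    assert(user_space_mb > 0 && stack_mb > 0 && threads_per_channel > 0);
    return user_space_mb / (stack_mb * threads_per_channel);
}
```

max_device_servers(3 * 1024, 10, 16) returns 19 and max_device_servers(3 * 1024, 2, 16) returns 96, the two figures given in the conclusion.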

