Monday, September 5, 2016

Linux limits 101 - Ulimit

Resources aren't infinite, and that's old news, right? We usually worry about disk space and memory utilization, but these are far from the only resources on a Linux system you should keep an eye on. A few months ago I wrote an entry about cgroups and mentioned they are a way to limit/assign resources to processes; this time we'll look at a different kind of restriction that can also cause some production pain.

Ulimit - Users Limits


User limits are restrictions enforced on processes spawned from your shell, placed there to keep users somewhat under control. For every resource tracked by the kernel there are two limits: a soft limit and a hard limit. While the hard limit cannot be raised by an unprivileged process, the soft limit can be raised up to the hard limit if necessary (more details here).

We can see the user limits with the bash builtin command ulimit; for example, the soft limits with ulimit -a:
juan@test:~$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 11664
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 11664
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
juan@test:~$
and we can see the hard limits with -aH:
juan@test:~$ ulimit -aH
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 11664
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 11664
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
juan@test:~$
There are two different situations here:
  • Some resources, like "open files" and "core file size", have a soft limit lower than the hard limit, which means the process itself can raise it if necessary. 
  • Other resources, like "max memory size" and "max user processes", have soft and hard limits with the same value, so the user can only decrease them.
These limits are inherited by child processes after a fork call and preserved across an execve call.

Using the ulimit command you can also update the values. For example, if we believe the maximum number of open files per process (1024) is not enough, we can go ahead and raise the soft limit with -S, like this:
juan@test:~$ ulimit -n
1024
juan@test:~$ ulimit -S -n 2048
juan@test:~$ ulimit -n
2048
juan@test:~$
now the process (our shell in this case) can open up to 2048 files. If we spawn a new process from this shell we'll see the limit was inherited:
juan@test:~$ /bin/bash
juan@test:~$ ulimit -n
2048
juan@test:~$ exit
exit
juan@test:~$
Using -H we can decrease the hard limit (or increase it, if the process is privileged) for a particular resource, but be careful: an unprivileged process can't increase it back!
juan@test:~$ ulimit -H -n 1027
juan@test:~$ ulimit -Hn
1027
juan@test:~$ ulimit -H -n 1028
-bash: ulimit: open files: cannot modify limit: Operation not permitted
juan@test:~$
at this point we've decreased the hard limit from 4096 to 1027, so this particular process won't be able to open more than 1027 files.
All the changes we've made to the soft and hard limits only last as long as the shell that made them; close that shell and open a new one, and the default limits come back into play. So how the heck do we make them persistent?

File /etc/security/limits.conf


This is the file used by the pam_limits module to enforce ulimit limits on all user sessions on the system. Just reading the comments in the file will teach you its syntax; for more, check here. I could easily change the default ulimits for user juan by adding, for example:
juan               soft           nofile             2048
this would raise the soft limit on the number of files a process can open. The change takes effect for the next session, not the current one.
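For reference, a slightly fuller (hypothetical) fragment showing the four columns the file expects; the values and the @developers group are illustrative, not recommendations:

```
# <domain>    <type>   <item>    <value>
juan          soft     nofile    2048
juan          hard     nofile    8192
# a group can be targeted with @group, and * matches every user:
@developers   soft     nproc     4096
```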

C examples


Just for the sake of fun, I wrote a small C program that tries to open 2048 files and aborts if it doesn't succeed. The first version, open_files.c, is here:
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#define SIZE 2048

int main()
{
        int open_files[SIZE];
        int i,keep_it;

        for(i=0;i<SIZE;i++)
        {
                printf("Opening file number %d:\n",i);
                open_files[i]=open("/etc/passwd",O_RDONLY);
                keep_it=errno;//we save errno before doing anything else
                if(open_files[i] == -1)
                {
                        printf("%s\n",strerror(keep_it));//we print the system error that corresponds to errno
                        return open_files[i];
                }
                printf("Opened file number %d, assigned FD=%d:\n",i,open_files[i]);
        }
        printf("%d files have been opened.\n",SIZE);

        return 0;
}
if you compile and run it you should see something like:
juan@test:~/ulimit$ ./open_files
Opening file number 0:
Opened file number 0, assigned FD=3:
Opening file number 1:
Opened file number 1, assigned FD=4:
Opening file number 2:
Opened file number 2, assigned FD=5:
Opening file number 3:
Opened file number 3, assigned FD=6:
Opening file number 4:
Opened file number 4, assigned FD=7:
...
Opening file number 1018:
Opened file number 1018, assigned FD=1021:
Opening file number 1019:
Opened file number 1019, assigned FD=1022:
Opening file number 1020:
Opened file number 1020, assigned FD=1023:
Opening file number 1021:
Too many open files
juan@test:~/ulimit$
a few things to take away from the previous run:
  • The first file descriptor returned by the open syscall is 3. Why is that? :D Exactly, because FD 0 is STDIN, FD 1 is STDOUT and FD 2 is STDERR, so the first available file descriptor for a new process is 3.
  • As soon as the process tries to open file number 1021 (which would need FD 1024), open returns -1 and sets errno to EMFILE, "Too many open files": all 1024 descriptors allowed by the soft limit are already in use.
How could we address this? Well, the easiest way would be to change the soft limit before running the program, but that would allow all newly spawned processes to open 2048 files, and we might not want that side effect. So let's change the soft limit inside the C program:
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#define SIZE 2048
#define JUMP 100

int main()
{
        int open_files[SIZE];
        int i,keep_it,aux;
        struct rlimit old, new;

        for(i=0;i<SIZE;i++)
        {
                printf("Opening file number %d:\n",i);
                open_files[i]=open("/etc/passwd",O_RDONLY);
                keep_it=errno;//we save errno before doing anything else
                if(open_files[i] == -1)
                {
                        if(keep_it == EMFILE)//Too many open files
                        {
                                printf("%s\n",strerror(keep_it));//we print the system error that corresponds to errno
                                printf("Increasing NOFILE in %d\n",JUMP);
                                getrlimit(RLIMIT_NOFILE,&old);
                                printf("Current soft limit %d, current hard limit %d\n",(int)old.rlim_cur,(int)old.rlim_max);
                                new.rlim_max=old.rlim_max;
                                new.rlim_cur=old.rlim_cur+JUMP;
                                aux=setrlimit(RLIMIT_NOFILE,&new);
                                keep_it=errno;
                                if(aux==0)
                                {
                                        i=i-1;//reduce i in 1 to "move back" the loop one cycle.
                                }
                                else
                                {
                                        printf("Couldn't raise the soft limit: %s\n",strerror(keep_it));
                                        return -1;
                                }
                        }
                        else
                        {//some different error
                                return -1;
                        }
                }
                else
                {
                        printf("Opened file number %d, assigned FD=%d:\n",i,open_files[i]);
                }
        }
        printf("%d files have been opened.\n",SIZE);

        return 0;
}

The example gets the current soft and hard limits using the getrlimit syscall and then updates the soft limit with setrlimit. Two rlimit structures, old and new, were added to the code in order to update the limit. The update is done by adding JUMP (100 here) to the current soft limit. The rest of the code is pretty much the same :D.

If we run the new code we'll see something like:
juan@test:~/ulimit$ ./open_files_increase_soft
Opening file number 0:
Opened file number 0, assigned FD=3:
Opening file number 1:
Opened file number 1, assigned FD=4:
Opening file number 2:
Opened file number 2, assigned FD=5:
Opening file number 3:
Opened file number 3, assigned FD=6:
Opening file number 4:
Opened file number 4, assigned FD=7:
...
Opening file number 1019:
Opened file number 1019, assigned FD=1022:
Opening file number 1020:
Opened file number 1020, assigned FD=1023:
Opening file number 1021:
Too many open files
Increasing NOFILE in 100
Current soft limit 1024, current hard limit 4096
Opening file number 1021:
Opened file number 1021, assigned FD=1024:
Opening file number 1022:
Opened file number 1022, assigned FD=1025:
...
Opened file number 2043, assigned FD=2046:
Opening file number 2044:
Opened file number 2044, assigned FD=2047:
Opening file number 2045:
Opened file number 2045, assigned FD=2048:
Opening file number 2046:
Opened file number 2046, assigned FD=2049:
Opening file number 2047:
Opened file number 2047, assigned FD=2050:
2048 files have been opened.
juan@test:~/ulimit$
now the process was able to open 2048 files by raising its soft limit on demand, 100 at a time.

Wrapping up 


So whenever you work with production systems you need to be aware of these limits, unless of course you enjoy getting paged randomly haha. I've seen production systems become unresponsive after hitting these limits. Bear in mind that "open files" really means file descriptors, so this limit also applies to network connections, not just files! On top of that, if the application doesn't catch the error and surface it in its logs, the problem can be pretty hard to spot...
