Re: Linux kernels DoSable by file-max limit

From: Andrea Arcangeli (andreaat_private)
Date: Wed Jul 10 2002 - 14:07:41 PDT


    On Sun, Jul 07, 2002 at 10:54:44PM +0200, Paul Starzetz wrote:
    > Hi,
    > 
    > the recently mentioned problem in BSD kernels concerning the global 
    > limit of open files seems to be present in the Linux-kernel too. However 
    > as mentioned in the advisory about the BSD specific problem the Linux 
    > kernel keeps some additional file slots reserved for the root user. This 
    > code can be found in the fs/file_table.c source file (2.4.18):
    > 
    > struct file * get_empty_filp(void)
    > {
    >    static int old_max = 0;
    >    struct file * f;
    > 
    >    file_list_lock();
    >    if (files_stat.nr_free_files > NR_RESERVED_FILES) {
    >    used_one:
    >        f = list_entry(free_list.next, struct file, f_list);
    > 
    > [...]
    > 
    >    /*
    >     * Use a reserved one if we're the superuser
    >     */
    > [*]  if (files_stat.nr_free_files && !current->euid)
    >        goto used_one;
    > 
    > 
    > Grepping the source code (2.4.18) reveals that the limit is pretty low:
    > 
    > ./include/linux/fs.h:#define NR_RESERVED_FILES 10 /* reserved for root */
    
    well, that's not really a security feature in the first place, I mean
    there's nothing to exploit, it's more a hack to improve the chances of
    keeping a usable machine as root after you hit file-max, but it's not
    guaranteed to work at all, regardless of malicious or non-malicious
    workloads. Linux never enforces that nr_free_files stays at a level >=
    NR_RESERVED_FILES, it just tries to do that lazily, so it's not
    guaranteed you will have any nr_free_files when you happen to need them.
    
    For example if you only keep opening files since boot and you never
    execute a single close() or exit() syscall, you will never get any
    nr_free_files available, so no matter who you are (root or not), you
    will never pass this test "if (files_stat.nr_free_files && !current->euid)"
    because nr_free_files will always be zero.
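    
    A rough userspace sketch of that argument, not the kernel code: the
    struct and the model_* functions are hypothetical names, and the
    "allocate a new object while below file-max" branch is an assumption,
    since the quoted excerpt elides that part of get_empty_filp().

```c
#include <assert.h>
#include <stdbool.h>

#define NR_RESERVED_FILES 10  /* value quoted from include/linux/fs.h (2.4.18) */

/* Toy stand-in for files_stat; the field names follow the quoted code. */
struct files_stat_model {
    int nr_files;       /* struct file objects allocated so far */
    int nr_free_files;  /* objects parked on the free list */
    int max_files;      /* the file-max limit */
};

/* Returns true if an open would get a struct file. euid == 0 means root. */
bool model_get_empty_filp(struct files_stat_model *fs, unsigned euid)
{
    /* Reuse a free object only while more than the reserve is free. */
    if (fs->nr_free_files > NR_RESERVED_FILES) {
        fs->nr_free_files--;
        return true;
    }
    /* Root may dip into the reserve, but only if something is free at all. */
    if (fs->nr_free_files > 0 && euid == 0) {
        fs->nr_free_files--;
        return true;
    }
    /* Assumed branch: allocate a new object while below file-max. */
    if (fs->nr_files < fs->max_files) {
        fs->nr_files++;
        return true;
    }
    return false;  /* file-max reached and nothing usable on the free list */
}

/* close()/exit(): the object is parked on the free list, never freed. */
void model_fput(struct files_stat_model *fs)
{
    assert(fs->nr_free_files < fs->nr_files);
    fs->nr_free_files++;
}
```

    With max_files = 100, once an unprivileged user has opened 100 files
    and nobody has closed anything, model_get_empty_filp(&fs, 0) fails
    for root as well, because nr_free_files is still zero. After a single
    model_fput() root succeeds, while the unprivileged caller still fails.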
    
    Furthermore that part of the vfs file allocation management needs a
    rewrite (hope it will happen in 2.5) and the file-max should go away
    like the inode-max went away in 2.3. At the moment released files
    have no way to be freed dynamically, and that's not good. There
    should be a proper slab cache and fput should kmem_cache_free,
    instead of putting the file into the unshrinkable
    "file_table.c::free_list". But this is more a linux-kernel topic...
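    
    To illustrate the difference between the two release policies, here
    is a small userspace sketch. It is only a model under stated
    assumptions: malloc()/free() stand in for the slab allocator, and
    struct file_obj, struct file_pool and the pool_* functions are
    hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct file_obj {
    struct file_obj *next;  /* free-list link */
    char payload[120];      /* stand-in for the real struct file contents */
};

struct file_pool {
    struct file_obj *free_list;  /* parked objects, never given back */
    size_t allocated;            /* objects currently held from malloc() */
};

/* Take an object: reuse a parked one if possible, else allocate. */
struct file_obj *pool_get(struct file_pool *p)
{
    struct file_obj *f = p->free_list;
    if (f) {
        p->free_list = f->next;
        return f;
    }
    f = malloc(sizeof(*f));
    if (f)
        p->allocated++;
    return f;
}

/* Current behaviour: park the object on the unshrinkable free list. */
void pool_put_parked(struct file_pool *p, struct file_obj *f)
{
    f->next = p->free_list;
    p->free_list = f;
}

/* Proposed behaviour: hand the memory straight back to the allocator. */
void pool_put_freed(struct file_pool *p, struct file_obj *f)
{
    assert(p->allocated > 0);
    p->allocated--;
    free(f);
}
```

    After a spike of 1000 opens followed by 1000 closes, the pool using
    pool_put_parked() still holds all 1000 objects, while the one using
    pool_put_freed() holds none. The first case is the pinned memory
    that the file-max limit exists to bound.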
    
    Once we make it possible to shrink the released files, the file-max
    limit can go away (we need it now, or we could pin all the ram into
    this unshrinkable "free_list"). Then if you allocate all the ram into
    files you will run the machine oom at some point. That moves the DoS
    issue elsewhere: into the memory management area, where it becomes a
    generic problem, not specific to file allocations anymore. After you
    hit the oom point, even if you could allocate the file from a
    root-file-reserved-pool, you still may not be able to allocate the
    dentry and the inode.
    
    Anyway, regardless of the possible memory management oom DoSes (when
    running out of ram), removing the file-max is a good thing because it
    makes linux much more usable: if you need lots of files in a temporary
    spike of load, you won't be left with a huge leak of files hanging
    around, the vm will shrink them as you need more ram later. And if
    you hit oom, it's very likely (though not guaranteed, also considering
    the different algorithms to handle oom conditions, some deadlock
    prone, some not) that the offending task will be killed too, making
    any malicious attack much less reproducible than now.
    
    > [..]
    > Exploitability to get uid=0 has not been confirmed yet but seems possible.
    
    If that's the case it's a userspace bug in the suid apps that you're
    executing, certainly it's not a kernel issue.
    
    Andrea
    


