Low-cost hooks, multiple modules, per-task data

From: David Wheeler (dwheelerat_private)
Date: Wed Apr 18 2001 - 15:02:42 PDT

    Hi, I'm really glad to see this subject being discussed, and I
    hope that a very useful security framework results.
    After reading the emails so far, I have a few comments. Take them
    all with large pounds of salt, but hopefully they'll be useful.
    
    My comments:
    
    * Having a _very_ small overhead at each hook is important, for several
      reasons.  It increases the likelihood that the framework will be
      incorporated into the kernel, & the likelihood that distributors will enable
      it by default.  Also, it increases the number of hooks people are
      likely to accept -- if unused hooks are "free" (from a performance
      perspective), then most people won't mind adding a lot of hooks!
    
      The LTT micro-benchmarks don't look so great to me.  They look like a
      lot of overhead, frankly, which may make any hooks hard to accept.
    
      The "hook" call (whatever it looks like) could be implemented in a way
      that does things efficiently, as long as it appears simple
      to the caller.  It could support an #ifdef to completely disable it
      (mollifying those who "don't want it at all").
      Other possibilities include embedding the "module" code -- many other
      modules can be built into the kernel, so why not the security modules?
    
      I think it'd be wiser to in-line this stuff, using code modification.
      This could be done by patching on request, or by inserting NOPs
      and replacing them with code when the hook is active.
    
      The result would have NO or NEARLY NO performance overhead
      for the case where there's no hooked function (a common case even when
      a security module is loaded), and no overhead beyond the call itself
      when there is one.
    
      This isn't insane, as long as the programmer's model is kept simple.
      Clearly inserting a security module is a rarely-invoked operation, while
      invoking the (potentially empty) hook will be very common.
      So, let's optimize the common case.
      Indeed, one VERY nice thing about using code modification to implement
      hooks is that it makes unused hooks a ZERO-COST operation
      (or NEAR zero-cost, if NOP replacement is used).
    
      This isn't self-modifying code in the "nasty" sense -- it's merely
      an optimization.  The source code invoking the hook could look quite normal:
        operation1();
        hook(...);
        operation2();
      The resulting assembly might look like:
        op1.1
        op1.2
        op1.3
        nop    ; replace these with a jsr when the hook is active.
        nop
        nop
        op2.1
        op2.2
    
      Since self-patching is CPU-architecture-specific, there might be a
      "generic" implementation (to aid those porting Linux to a new or
      less-used CPU type) and a "highly optimized" implementation for hooks.
      Early implementation could use the "NOP trick", an old trick that's
      easier to implement and almost zero cost.  Then try to see if the
      "patch and redirect" is worth the extra cost -- which means benchmark it.
    
      So, here are the advantages of having hooks that, if unused, have
      zero performance cost (or at least VERY close to it):
      1. You're more likely to gain acceptance from those who worry about
         performance.
      2. You're more likely to have a more flexible system, because you'll
         be willing to add more hooks.
      3. You'll save time arguing over whether or not a hook is "really necessary";
         if only one obscure module needs it, no one else needs to pay a
         performance penalty for the hook.
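
      The portable "generic" implementation suggested above could be
      sketched roughly as follows.  This is only an illustration under
      stated assumptions: the names SECURITY_HOOKS_ENABLED, SEC_HOOK,
      sec_hook_open, sec_register_open, do_open and deny_secret are all
      hypothetical and do not come from any existing kernel interface.
      An unused hook costs one pointer load and one branch, and setting
      the switch to 0 compiles every hook away.  The NOP-patching
      variant is architecture-specific and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical compile-time switch: set to 0 to remove every hook. */
#ifndef SECURITY_HOOKS_ENABLED
#define SECURITY_HOOKS_ENABLED 1
#endif

/* A hook returns 0 to permit the operation, nonzero to deny it. */
typedef int (*sec_hook_fn)(const char *path);

/* Hypothetical hook slot; NULL means no module has claimed it. */
static sec_hook_fn sec_hook_open = NULL;

#if SECURITY_HOOKS_ENABLED
/* Unused hook: one pointer load and one branch, no function call. */
#define SEC_HOOK(slot, arg) ((slot) != NULL ? (slot)(arg) : 0)
#else
/* Hooks disabled at build time: the call site becomes the constant 0. */
#define SEC_HOOK(slot, arg) ((void)(arg), 0)
#endif

/* Rarely invoked: a module installs (or, with NULL, removes) its handler. */
void sec_register_open(sec_hook_fn fn)
{
    sec_hook_open = fn;
}

/* Example caller, standing in for operation1(); hook(...); operation2(); */
int do_open(const char *path)
{
    assert(path != NULL);
    if (SEC_HOOK(sec_hook_open, path) != 0)
        return -1;      /* denied by the security module */
    return 0;           /* the operation proceeds */
}

/* Example module handler: deny any path that starts with "/secret". */
int deny_secret(const char *path)
{
    return strncmp(path, "/secret", 7) == 0 ? 1 : 0;
}
```

      The caller's view stays simple: do_open() never needs to know
      whether a module is loaded, so the macro body could later be
      swapped for a patched-in call without touching any call site.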
    
    * I think allowing multiple security modules is needed.  If nothing else,
      it would support breaking security models into smaller composable pieces.
      Also, you might want to implement IDS/audit systems separately from the
      security model, and have them invoked in turn (so they can be
      implemented/maintained separately).
    
      This could be _IMPLEMENTED_ as requiring all nonempty hooks to invoke a
      single "multiplexing module" that then determined how all other modules would
      be called.  I think PAM is a reasonable model here -- usually you'd
      want _all_ things to permit, else deny a request, but perhaps you'd want
      one module's "permit" to override.  I can even see the possibility of
      having different kinds of "multiplexing modules", giving you different
      ways of combining security modules if you need it.
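
      As a rough illustration of one such "multiplexing module", the
      sketch below combines verdicts so that every ordinary module must
      permit a request, while a module registered with an override flag
      can grant it outright.  It is only loosely inspired by PAM and is
      not PAM's actual algorithm; every name in it (mux_register,
      mux_check, the example modules) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum verdict { PERMIT = 0, DENY = 1 };

typedef enum verdict (*module_fn)(int request);

struct sec_module {
    module_fn check;
    int override;   /* nonzero: this module's PERMIT overrides all others */
};

#define MAX_MODULES 8
static struct sec_module modules[MAX_MODULES];
static size_t module_count = 0;

/* Add a module to the multiplexer; returns 0 on success, -1 if full. */
int mux_register(module_fn check, int override)
{
    assert(check != NULL);
    if (module_count == MAX_MODULES)
        return -1;
    modules[module_count].check = check;
    modules[module_count].override = override;
    module_count++;
    return 0;
}

/* The single entry point that a nonempty hook would invoke. */
enum verdict mux_check(int request)
{
    enum verdict result = PERMIT;
    for (size_t i = 0; i < module_count; i++) {
        enum verdict v = modules[i].check(request);
        if (modules[i].override) {
            if (v == PERMIT)
                return PERMIT;  /* one module's permit overrides */
            /* An override module's DENY is treated as "no opinion". */
        } else if (v == DENY) {
            result = DENY;      /* keep going: an override may still grant */
        }
    }
    return result;
}

/* Example modules, purely illustrative. */
enum verdict even_only(int request)      { return request % 2 == 0 ? PERMIT : DENY; }
enum verdict small_only(int request)     { return request < 100 ? PERMIT : DENY; }
enum verdict admin_override(int request) { return request == 999 ? PERMIT : DENY; }
```

      A different multiplexing module could implement another policy
      (for example, first verdict wins) behind the same mux_check()
      entry point, which is the flexibility described above.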
    
    * There needs to be "portable" support for state in various in-memory
      data structures.  The obvious one is task_struct (in Linux 2.2, defined in
      /usr/src/linux/include/linux/sched.h), so that a module can store additional
      security state data about each process.  One approach would be a
      standardized "pointer to data".  Another approach -- though I don't
      know if it'd be too hard to do -- would be to support a startup-time
      parameter of the number of bytes to add to a process that can be used
      by security modules.  Security modules would then need to request
      allocation from these storage areas ("please give me a block of 8 bytes
      in the task area") and they'd get the offset for their data.
      On-disk state also needs to be handled, e.g., with "extended attributes".
    
      Yes, modules _CAN_ store state in themselves, but it'd be more painful
      to store them separately from the data structures they logically belong to.
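
      The second approach (a per-task area sized at startup, from which
      modules request blocks and receive offsets) might look something
      like the user-space sketch below.  struct task is a hypothetical
      stand-in for task_struct, and task_area_init, task_area_reserve
      and task_security are invented names.  The sketch assumes all
      reservations happen after the area is sized and that offsets are
      only used on tasks created after that point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for task_struct with a security area attached. */
struct task {
    int pid;
    unsigned char *security;    /* task_area_size bytes, zero-filled */
};

static size_t task_area_size = 0;   /* from the startup-time parameter */
static size_t task_area_used = 0;   /* bytes handed out so far */

/* Called once at startup with the configured number of bytes. */
void task_area_init(size_t bytes)
{
    task_area_size = bytes;
    task_area_used = 0;
}

/* "Please give me a block of n bytes in the task area."
 * Returns the block's offset, or -1 if the area is exhausted. */
long task_area_reserve(size_t n)
{
    size_t align = _Alignof(max_align_t);   /* safe for any object type */
    size_t start = (task_area_used + align - 1) / align * align;

    if (n == 0 || start > task_area_size || n > task_area_size - start)
        return -1;
    task_area_used = start + n;
    return (long)start;
}

/* Create a task whose security area starts out zeroed. */
struct task *task_create(int pid)
{
    struct task *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->pid = pid;
    t->security = calloc(1, task_area_size ? task_area_size : 1);
    if (t->security == NULL) {
        free(t);
        return NULL;
    }
    return t;
}

void task_destroy(struct task *t)
{
    free(t->security);
    free(t);
}

/* Translate a reserved offset into a pointer for one particular task. */
void *task_security(struct task *t, long offset)
{
    assert(offset >= 0 && (size_t)offset < task_area_size);
    return t->security + offset;
}
```

      A module would call task_area_reserve() once when it loads, keep
      the returned offset, and use task_security(task, offset) whenever
      it needs its own state for a given process.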
    
    * What hooks should be included?
    
      I think a good starting list for hooks is the ones in existing projects,
      esp. Security-Enhanced Linux (Flask) and LIDS.  I very much agree with
      the earlier poster who suggested inserting hooks at different branch
      points, so that much more is known before invoking a hook.
      One disadvantage is that sometimes you don't need all that info, which
      means that you might have to intercept a large number of hooks to
      do one simple thing.
    
      However, I think inserting hooks for each system call is also important.
      It's likely that SOME module won't be well-supported by "existing" hooks,
      but adding hooks at the system call level gives enough room to cover the
      "surprise" cases.
    
      Of course, if hooks are free, then it'll be easier
      to convince people that it's okay to add many different hooks.
      Which goes back to my first point: the cost of hooks is going to be a
      key influencer of the rest of the design, so you may as well make them
      free or nearly so.  Then, you're more free to be flexible about the rest.
    
    * Could process-specific security modules be supported in a similar way?
    
      I'd love to enable processes to implement "sandboxes" on
      other processes.  ptrace() _almost_, though not quite, makes it possible
      (it has trouble with racing processes).  See "Subterfugue" for more on
      this.  Janus makes it possible -- but it has to implement its own
      security module, and it's not clear how well it'd work with others.
      It'd be nice if the same mechanisms could be used for process-specific
      security models, but it's not clear to me how to do that easily
      without impacting other goals.  Still, if there's a way, I'd love
      to see it.  Otherwise, this is just another item on a "dream list."
    
    * More needs to be done about auditing.
    
      But I agree that would be outside of this group.
      Currently, it's not easy to do things like
      "don't allow this operation if I can't audit it"  -- printk just
      doesn't cut it.  It'd be nice to have something better than printk.
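
      To make the "don't allow this operation if I can't audit it" idea
      concrete, here is a small fail-closed sketch.  It assumes an audit
      sink that reports whether a record was actually stored; the names
      audit_set_sink, audited_call and bounded_sink are hypothetical,
      and the tiny fixed-size log only simulates a sink that can fail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical audit sink: returns 0 only when the record was stored. */
typedef int (*audit_sink_fn)(const char *record);

static audit_sink_fn audit_sink = NULL;

void audit_set_sink(audit_sink_fn fn)
{
    audit_sink = fn;
}

/* Run op only after its audit record has been stored.
 * Returns op's result (assumed non-negative), or -1 when refused. */
int audited_call(const char *what, int (*op)(void))
{
    assert(what != NULL && op != NULL);
    if (audit_sink == NULL || audit_sink(what) != 0)
        return -1;      /* cannot audit, so refuse the operation */
    return op();
}

/* Example sink: a tiny in-memory log that fails once it is full. */
#define AUDIT_CAPACITY 2
static const char *audit_log[AUDIT_CAPACITY];
static int audit_len = 0;

int bounded_sink(const char *record)
{
    if (audit_len == AUDIT_CAPACITY)
        return -1;      /* log full: report the failure to the caller */
    audit_log[audit_len++] = record;
    return 0;
}

/* Example operation, purely illustrative. */
int sample_op(void)
{
    return 42;
}
```

      The point of the sketch is the ordering: the record is written
      and confirmed before the operation runs, so a full or missing
      audit sink blocks the operation instead of silently losing the
      record.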
    
    
    Anyway, I hope my kibitzes have some value.
    
    
    


