Low-cost hooks, multiple modules, per-task data

dwheelerat_private

Hi, I'm really glad to see this subject being discussed, and I
hope that a very useful security framework results.
After reading the emails so far, I have a few comments. Take them
all with large pounds of salt, but hopefully they'll be useful.

My comments:

* Having a _very_ small overhead at each hook is important, for several
  reasons.  It increases the likelihood that the framework will be
  incorporated into the kernel, & the likelihood that distributors will enable
  it by default.  Also, it increases the number of hooks people are
  likely to accept -- if unused hooks are "free" (from a performance
  perspective), then most people won't mind adding a lot of hooks!

  The LTT micro-benchmarks don't look so great to me.  They look like a
  lot of overhead, frankly, which may make any hooks hard to accept.

  The "hook" call (whatever it looks like) could be implemented in a way
  that does things efficiently, as long as it appears simple
  to the caller.  It could support an #ifdef to completely disable it
  (mollifying those who "don't want it at all").
  Other possibilities include embedding the "module" code -- many other
  modules can be compiled in, so why not the security modules?

  I think it'd be wiser to in-line this stuff, using code modification.
  This could be done by patching on request, or by inserting NOPs
  and replacing them with code when the hook is active.

  The result would have NO or NEARLY NO performance overhead
  for the case where there's no hooked function (a common case even when
  a security module is loaded), and no more overhead than an ordinary
  call when there is.

  This isn't insane, as long as the programmer's model is kept simple.
  Clearly inserting a security module is a rarely-invoked operation, while
  invoking the (potentially empty) hook will be very common.
  So, let's optimize the common case.
  Indeed, one VERY nice thing about using code modification to implement
  hooks is that it makes unused hooks a ZERO-COST operation
  (or NEAR zero-cost, if NOP replacement is used).

  This isn't self-modifying code in the "nasty" sense -- it's merely
  an optimization.  The source code invoking the hook could look quite normal:
    operation1();
    hook(...);
    operation2();
  The resulting assembly might look like:
    op1.1
    op1.2
    op1.3
    nop    ; replace these with a jsr when the hook is active.
    nop
    nop
    op2.1
    op2.2

  Since self-patching is CPU-architecture-specific, there might be a
  "generic" implementation (to aid those porting Linux to a new or
  less-used CPU type) and a "highly optimized" implementation for hooks.
  Early implementation could use the "NOP trick", an old trick that's
  easier to implement and almost zero cost.  Then try to see if the
  "patch and redirect" is worth the extra cost -- which means benchmark it.
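
  As a rough illustration of the "generic" (non-patching) variant, here is
  a minimal sketch in portable C.  Every name in it (SECURITY_HOOKS,
  security_hook_fn, hook(), register_hook(), file_open_hook) is hypothetical
  and chosen only for this example; an optimized per-architecture version
  would patch NOPs instead of testing a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build switch: compile with -DSECURITY_HOOKS=0 to
 * remove every hook call site entirely. */
#ifndef SECURITY_HOOKS
#define SECURITY_HOOKS 1
#endif

/* A hook returns 0 to permit the operation, nonzero to deny it. */
typedef int (*security_hook_fn)(int arg);

/* NULL means "no module has claimed this hook". */
static security_hook_fn file_open_hook = NULL;

#if SECURITY_HOOKS
/* Generic implementation: one load and one branch when the hook is unused. */
#define hook(fn, arg) ((fn) != NULL ? (fn)(arg) : 0)
#else
/* Disabled: the call site always evaluates to 0 (permit). */
#define hook(fn, arg) ((void)(arg), 0)
#endif

/* Rarely invoked: install (or, with fn == NULL, remove) a handler. */
static void register_hook(security_hook_fn *slot, security_hook_fn fn)
{
    assert(slot != NULL);
    *slot = fn;
}

/* Example handler with a made-up policy: deny odd arguments. */
static int deny_odd(int arg)
{
    return (arg % 2) != 0;
}

/* The caller's view stays simple: operation1(); hook(...); operation2(); */
static int guarded_operation(int arg)
{
    if (hook(file_open_hook, arg) != 0)
        return -1;      /* denied by the security module */
    return arg * 2;     /* stand-in for the real operation */
}
```

  The point of the macro is that call sites never change: swapping the
  pointer test for a patched NOP sequence would only touch the definition
  of hook().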

  So, here are the advantages of having hooks that, if unused, have
  zero performance cost (or at least VERY close to it):
  1. You're more likely to gain acceptance from those who worry about
     performance.
  2. You're more likely to have a more flexible system, because you'll
     be willing to add more hooks.
  3. You'll save time arguing over whether or not a hook is "really necessary";
     if only one obscure module needs it, no one else needs to pay a
     performance penalty for the hook.

* I think allowing multiple security modules is needed.  If nothing else,
  it would support breaking security models into smaller composable pieces.
  Also, you might want to implement IDS/audit systems separately from the
  security model, and have them invoked in turn (so they can be
  implemented/maintained separately).

  This could be _IMPLEMENTED_ by requiring all nonempty hooks to invoke a
  single "multiplexing module" that then determines how all other modules are
  called.  I think PAM is a reasonable model here -- usually you'd
  want _all_ modules to permit a request, else deny it, but perhaps you'd want
  one module's "permit" to override.  I can even see the possibility of
  having different kinds of "multiplexing modules", giving you different
  ways of combining security modules if you need it.
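
  To make the multiplexing idea concrete, here is a minimal sketch in C.
  It is loosely modeled on PAM's "requisite" and "sufficient" control
  flags; the types and names (sec_module, mux_check, the example checks)
  are hypothetical, and the policies are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum decision { PERMIT = 0, DENY = 1 };

/* Control flags, loosely following PAM (illustrative names only). */
enum control {
    REQUISITE,   /* a deny here ends evaluation with DENY */
    SUFFICIENT   /* a permit here ends evaluation with PERMIT;
                    a deny here is ignored */
};

struct sec_module {
    enum control control;
    enum decision (*check)(int request);
};

/* The multiplexer: the one function every non-empty hook would call.
 * Modules are consulted in order; if no module ends evaluation early,
 * the request is permitted. */
static enum decision mux_check(const struct sec_module *mods, size_t n,
                               int request)
{
    assert(mods != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        enum decision d = mods[i].check(request);
        if (mods[i].control == REQUISITE && d == DENY)
            return DENY;
        if (mods[i].control == SUFFICIENT && d == PERMIT)
            return PERMIT;   /* override: later modules are skipped */
    }
    return PERMIT;
}

/* Made-up example policies. */
static enum decision deny_negative(int r) { return r < 0 ? DENY : PERMIT; }
static enum decision deny_large(int r)    { return r > 100 ? DENY : PERMIT; }
static enum decision permit_magic(int r)  { return r == 500 ? PERMIT : DENY; }
```

  A different "multiplexing module" would simply be a different
  mux_check() with its own combining rule (for example, majority vote).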

* There needs to be "portable" support for state in various in-memory
  data structures.  The obvious one is task_struct (in Linux 2.2, defined in
  /usr/src/linux/include/linux/sched.h), so that a module can store additional
  security state data about each process.  One approach would be a
  standardized "pointer to data".  Another approach -- though I don't
  know if it'd be too hard to do -- would be to support a startup-time
  parameter of the number of bytes to add to a process that can be used
  by security modules.  Security modules would then need to request
  allocation from these storage areas ("please give me a block of 8 bytes
  in the task area") and they'd get the offset for their data.
  On-disk state also needs to be handled, e.g., with "extended attributes".

  Yes, modules _CAN_ store state in themselves, but it'd be more painful
  to store them separately from the data structures they logically belong to.
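
  The "give me a block of 8 bytes in the task area" approach could be
  sketched roughly as follows.  This is a user-space illustration only:
  struct task stands in for task_struct, and TASK_SECURITY_BYTES and the
  task_security_*() functions are hypothetical names, not existing
  kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical startup-time parameter: bytes reserved per task. */
#define TASK_SECURITY_BYTES 32

/* Stand-in for task_struct with the reserved area appended. */
struct task {
    int pid;
    unsigned char security[TASK_SECURITY_BYTES];
};

/* Bump pointer shared by all modules; blocks are never freed here. */
static size_t next_free;

/* Reserve `size` bytes in every task's security area.
 * Returns the block's offset, or -1 if the area is exhausted. */
static long task_security_alloc(size_t size)
{
    if (size == 0 || size > TASK_SECURITY_BYTES - next_free)
        return -1;
    long offset = (long)next_free;
    next_free += size;
    return offset;
}

/* Copy a module's data into its block within one particular task. */
static void task_security_write(struct task *t, long off,
                                const void *src, size_t len)
{
    assert(off >= 0 && (size_t)off + len <= TASK_SECURITY_BYTES);
    memcpy(t->security + off, src, len);
}

/* Copy a module's data back out of its block. */
static void task_security_read(const struct task *t, long off,
                               void *dst, size_t len)
{
    assert(off >= 0 && (size_t)off + len <= TASK_SECURITY_BYTES);
    memcpy(dst, t->security + off, len);
}
```

  Each module would call task_security_alloc() once at load time and keep
  the returned offset; the data itself then lives with the task it
  describes.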

* What hooks should be included?

  I think a good starting list for hooks is the ones in existing projects,
  esp. Security-Enhanced Linux (Flask) and LIDS.  I very much agree with
  the earlier poster who suggested inserting hooks at different branch
  points, so that much more is known before invoking a hook.
  One disadvantage is that sometimes you don't need all that info, which
  means that you might have to intercept a large number of hooks to
  do one simple thing.

  However, I think inserting hooks for each system call is also important.
  It's likely that SOME module won't be well-supported by "existing" hooks,
  but adding hooks at the system call level gives enough room to cover the
  "surprise" cases.

  Of course, if hooks are free, then it'll be easier
  to convince people that it's okay to add many different hooks.
  Which goes back to my first point: the cost of hooks is going to be a
  key influencer of the rest of the design, so you may as well make them
  free or nearly so.  Then, you're more free to be flexible about the rest.

* Could process-specific security modules be supported in a similar way?

  I'd love to enable processes to implement "sandboxes" on
  other processes.  ptrace() _almost_, though not quite, makes this possible
  (it has trouble with racing processes).  See "subterfugue" for more on
  this.  Janus makes it possible -- but it has to implement its own
  security module, and it's not clear how well it'd work with others.
  It'd be nice if the same mechanisms could be used for process-specific
  security models, but it's not clear to me how to do that easily
  without impacting other goals.  Still, if there's a way, I'd love
  to see it.  Otherwise, this is just another item on a "dream list."

* More needs to be done about auditing.

  But I agree that would be outside of this group.
  Currently, it's not easy to do things like
  "don't allow this operation if I can't audit it"  -- printk just
  doesn't cut it.  It'd be nice to have something better than printk.
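
  What "don't allow this operation if I can't audit it" might look like,
  as a minimal fail-closed sketch in C.  The audit_log structure and the
  function names are hypothetical; a real facility would need persistent
  storage and locking, which this ignores.

```c
#include <assert.h>
#include <stddef.h>

#define AUDIT_SLOTS 4   /* deliberately tiny so exhaustion is easy to see */

struct audit_log {
    int    events[AUDIT_SLOTS];
    size_t used;
};

/* Returns 0 if the event was recorded, -1 if the log is full.
 * Unlike printk, the caller learns whether the record was kept. */
static int audit_record(struct audit_log *log, int event)
{
    assert(log != NULL);
    if (log->used == AUDIT_SLOTS)
        return -1;
    log->events[log->used++] = event;
    return 0;
}

/* Fail closed: refuse the operation when it cannot be audited. */
static int audited_operation(struct audit_log *log, int event)
{
    if (audit_record(log, event) != 0)
        return -1;   /* denied: no audit record could be written */
    return 0;        /* the real operation would run here */
}
```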

Anyway, I hope my kibitzes have some value.

_______________________________________________
linux-security-module mailing list
linux-security-moduleat_private
http://mail.wirex.com/mailman/listinfo/linux-security-module