Re: How to do lockless stacking (was Re: LSM Stacker)

From: David A. Wheeler (dwheeler@private)
Date: Tue Dec 21 2004 - 08:15:42 PST

Thanks for the many comments -- please let me respond
with a single reply.

Serge E. Hallyn wrote:
> thanks for your comments.  Your lockless solution was
> definately nifty.  I didn't switch it out lightly...
> It was tempting to leave it this way :)  but I really didn't
> like tossing out memory like that, at least not until I made
> sure there was no other way.  And I couldn't think up a
> clean solution to "wait a while".

It's a trade.  I expected memory from module unloading to
be no big deal, in fact, I expect production systems to
NEVER unload an LSM module, and experimental/development
machines to have enough memory that a few K are irrelevant.
So, I wrote for that case.  If you expect a different
usage model, you'll do things differently, though I
suspect that usage model is still true.

> Once I get some other preliminaries worked out (hopefully
> enabling the "correct" behavior of stacker loading selinux +
> capabilities) I plan to do some performance results
> comparing the original lockless approach vs my rcu approach
> to see what the impact really is.  For the moment, I felt
> that a simple, clean RCU implementation would be most likely
> to be acceptable to the kernel community.  Using the common
> APIs also makes it easier for others to read my code and
> spot my bugs.

Performance testing, to actually determine impact?
Horrors!  Don't do that, it shows up the rest of us :-).

Kidding aside, that sounds like a wonderful approach.
In fact, if the simple approach has sufficient performance,
I don't see the need to use my more complex approach.

Things appear to have changed since 2002.  When I was
advocating the utility of a stacker, the main complaint
I received was "stacking will be too slow to ever be useful."
Thus, I intentionally pulled out various optimization
tricks (while staying in C) to create an implementation
that I expected to NOT be slow, just to counter those objections.

If the main issue now is simplicity of implementation,
instead of absolute maximum performance, then it sounds
like you're taking exactly the right approach.

 > If the impact turns out to be significant, I'll certainly
 > have to rethink my approach.

Sure.  Think of my locking approach, then, as a backup plan
that might help you. If the RCU approach has sufficient
performance, then great, please use it!  If it's not enough,
then feel free to switch to the implementation approach that
I drafted up if it might help.  Ideally, your interface
shouldn't depend on the approach (other than unloading
the module), so you could always change it later.

> I am also very interested in comments about the locking on
> kernel object security fields in lsm-chain.patch.  I would
> like to use RCU there, but this seems to require more work
> from the security modules themselves.  I prefer to present
> them with a simple "security_set_value()" type of API as the
> current patch does (at the cost of using a rwlock).

I wouldn't be surprised if many stackable LSM modules rarely
or never wrote kernel object security fields.
Many of the kinds of modules I was interested in
didn't require any special security field state at all.
E.G.: forbidding creation of
files under certain "dangerous" conditions, detecting states
that are likely intermediate steps in an attack, etc.
So that _might_ be okay.

One approach might be to implement both the simple API
"security_set_value()", and support an API for those who
can support other locking approaches (if they need the
performance).... this might simply be documenting how to
"do it yourself" if you need to.

Greg KH said:
 >I think that using a well defined, and debugged, and documented data
 >structure (like RCU) offsets any custom type of lock and other fun
 >tricks.  In short, try using RCU first.  And if, over time, it turns out
 >that the downside to RCU is too great, then work on customizing it.
 >What's that good quote about optimizing too early... :)

It may surprise you, but I completely agree with you.
As I noted above, back in 2002 the comments I generally
got about stacking was that "no one will ever use it,
it will necessarily be too slow".  So,
I worked on a heavily optimized version to overcome
those objections.

I haven't been following this closely since I handed it off,
and I presumed that the objections hadn't changed
when I saw the recent stacker discussion. It looks like
I was wrong -- it looks like the discussion has
switched to emphasizing simplicity of implementation
instead of demanding absolute maximum performance.
Suits me fine!  I'm delighted, in fact!

A private email asked about my lockless approach:
 >This still requires the use of barriers for safety, right? ... you
 >have to explicitly insert wmb() calls as appropriate to ensure proper
 >ordering of memory accesses.  Otherwise, they can be reordered
 >transparently by the CPU.

That's correct.  Take a look at:
add_module_entry() and deactivate_entry() call wmb().
But they only needed to be invoked when you CHANGE
the list of stacked modules, which I expect to be
a very rare event (in many cases it would NEVER happen
once booting has completed).  The normal flow of invoking
each LSM module in turn, and computing the final result,
does not require a lock if you do things this way.
At least I _believe_ that's true... it wouldn't
be the _first_ mistake I've ever made... :-)...
but I did stew on the issue, and think the idea works.

--- David A. Wheeler

This archive was generated by hypermail 2.1.3 : Tue Dec 21 2004 - 08:25:27 PST