Serge E. Hallyn wrote:
> Hi,
>
> Please take a look at the version at www.sf.net/projects/lsm-stacker. It
> uses rcu to protect the list while walking it, but drops the lock before
> calling each hook. The reference count on the struct module_entry protects
> the entry while the hook is being called.

Hi - thanks for 'cc'ing me about locking approaches in stacker modules. Feel free to look at my old stacker code: http://www.dwheeler.com/misc/stacker.c. In my old stacker module I took a completely different approach to locking, which you might find useful in your stackers. Since others might also be interested, I'm "cc"ing this message to linux-security-module.

In my old stacker module, there's never a need for a lock while calling a hook at all; you only need a lock when adding or deactivating an LSM module on the list of stacked modules. Basically, I used a lockless approach instead of the locking-based approaches you guys are discussing. Feel free to borrow its ideas. Let me outline its basic approach here, in the hopes you'll find it useful. Perhaps you'll like it, and if you do, please steal it into your stackers!

Background: I wanted to make sure that the stacker would be useful in _production_ environments, and I think it's critical that the stacker give production-level performance to meet that goal. I was _very_ concerned about the overhead of locking; locking could easily cause serious performance problems. A lot of locking/unlocking could dwarf all other work, since most hooks are pretty trivial. My old stacker module normally NEVER holds a lock merely for checking the set of stacked modules (which may happen 1000s of times a second). Locks are only required when adding and deactivating modules, which maybe happen 2-5 times per 6-month period. Basically, I optimize the common case, and accept slow operations for the uncommon case (big surprise). I did throw in a compile-time option to add more traditional locks, but they're off by default & not needed. Obviously whatever locks the kernel held before are unchanged by this, so depending on what you manipulate, you may still need to lock other kernel structures.

This magic requires that overwrites of aligned address pointers be atomic; I did some checking, and this is true on all architectures Linux runs on (and Linux would fail where this is not true). This assumption is documented in Paul "Rusty" Russell's "Unreliable Guide to Locking", which is part of the Linux kernel documentation; it notes specifically that Alan Cox already assumes atomic pointer writes for Linux kernel code. James Morris and Greg KH also confirmed that they believe this is a valid assumption in a Linux kernel module.

This speed-up does have a cost: it means that removing LSM modules becomes a two-step operation, not a one-step operation. You have to deactivate a module, wait for a while, and THEN, if you care about 1K-overheads, remove the module (I recommend skipping the second step). You can also reactivate a deactivated entry. The problem is "wait a while" - you need to ensure that all threads currently running in a module have left, and there isn't any _guaranteed_ way to do that. But this is more academic cleanliness than real; if your threads aren't leaving a given LSM module quickly, your LSM module is probably badly hosed. The simple solution is to just not unload these modules; they'll typically be a K or 2 (yawn). A rough sketch of this deactivate-then-maybe-free sequence is just below.
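Just to make that concrete, here's a rough, untested sketch of the two steps. The type and function names are purely illustrative (they are NOT lifted from my stacker.c), and it assumes the caller is the only writer, i.e. it holds whatever lock you use to serialize changes to the list:

#include <linux/slab.h>

/* Illustrative node type; readers follow "next" with no lock at all. */
struct stacked_entry {
        struct stacked_entry *next;
        /* ... module name, hook operations, etc. ... */
};

/*
 * Step 1: deactivate.  The only visible change is ONE aligned pointer
 * store into the preceding "next" pointer (or into the list head, if
 * the victim is first), so lockless readers see either the old list or
 * the new list, never a torn pointer.  Do NOT touch victim->next:
 * threads already inside the victim can still follow it and finish
 * their walk normally.
 */
static void example_deactivate(struct stacked_entry **preceding,
                               struct stacked_entry *victim)
{
        *preceding = victim->next;      /* victim itself is left alone */
}

/*
 * Step 2 (optional): only after "a while", once you're confident no
 * thread can still be running the victim's hooks, free it.  In
 * production I'd skip this step entirely and just leak the K or 2.
 */
static void example_remove(struct stacked_entry *victim)
{
        kfree(victim);
}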
Another mitigation is the reality that threads will eventually make some progress, so if you're desperate for that K - say you're doing development and don't want to reboot - waiting a few minutes should be plenty. In a production environment, don't bother; just leak the K. In my mind this extra difficulty in removing modules is very worthwhile; indeed, you could argue that you shouldn't be ABLE to remove an LSM module! And removing a lock entirely is a good thing; I think this trade (higher performance for normal operations in exchange for a two-step removal process) is worth it.

This is pretty straightforward to achieve. Basically, just maintain a singly linked list of the currently stacked modules (you can do it with doubly-linked lists given certain conditions, but I didn't need that, and it's easier to explain with singly linked lists). To add something, set up its node (that records info about the module being stacked), and as the last step, link it into the list. To deactivate something, change its predecessor's "next" pointer, but DO NOT MODIFY THE DEACTIVATED NODE'S NEXT POINTER -- in fact, don't touch the node at all. That way, threads that are in the middle of executing the deactivated node's actions can complete and move on normally, blissfully unaware that the next time they walk the list, that module won't be on the list and thus won't get called.

Here's my main housekeeping structure:

struct module_entry;
struct module_entry {
        struct module_entry *next;           /* MUST BE FIRST for alignment */
        struct module_entry *inactive_next;  /* USE THIS for inactive list */
        char *module_name;
        struct security_operations module_operations;
};

Invoking modules is easy; the key code is:

#define RETURN_ERROR_IF_ANY_ERROR(FUNCTION_TO_CALL) do { \
        int final_result = 0; \
        int result; \
        struct module_entry *m; \
        LOCK_STACKER_FOR_READING; \
        for (m = stacked_modules; m; m = m->next) { \
                result = m->module_operations.FUNCTION_TO_CALL; \
                if (result && !final_result) { \
                        final_result = result; \
                        if (short_circuit_restrictive) break; \
                } \
        } \
        UNLOCK_STACKER_FOR_READING; \
        return final_result; } while (0)

where LOCK_STACKER_FOR_READING and UNLOCK_STACKER_FOR_READING are normally do-nothings.

To add a module, first grab a "modifying stacker list lock", locate where you want to add it, set up its information node (i.e. its name, operations, and "next" pointer), and finally write its address onto its "preceding" pointer. Voila - it's getting invoked. (A rough C sketch of this add path is in the P.S. below.) To deactivate it, just do:

        preceding pointer = to_be_deactivated.next

(DO NOT modify to_be_deactivated.next, since some threads may still be running that module's operations.) When my stacker started, it normally started with a one-element list (containing the capability module).

Anyway, I recommend that you consider the lockless approach described here. RCU is nice, but no locking approach can be NEARLY as efficient as a lockless approach, if you design your system to achieve lockless operation. This approach gives you lockless-on-reading, which is a desirable property since reads are FAR more common than writes/changes of the list of stacked modules.

--- David A. Wheeler
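P.S. In case a concrete skeleton helps, here's a rough, untested sketch of the add path against the module_entry structure above. The lock and function names are invented for this message (they're not from my stacker.c), and the caller is assumed to have already located the insertion point and filled in the entry's name and operations:

#include <linux/spinlock.h>

/* Taken only by writers (add/deactivate); readers never touch it. */
static spinlock_t stacker_modify_lock = SPIN_LOCK_UNLOCKED;

/* The list head that RETURN_ERROR_IF_ANY_ERROR walks with no lock. */
static struct module_entry *stacked_modules;

/*
 * Add: finish setting up the entry while holding the modification lock,
 * then publish it with ONE aligned pointer store into the preceding
 * "next" pointer (or into stacked_modules itself, if the entry goes
 * first).  A lockless reader sees either the old list or the new one.
 */
static void stacker_add_entry(struct module_entry **preceding,
                              struct module_entry *new_entry)
{
        spin_lock(&stacker_modify_lock);
        new_entry->next = *preceding;   /* last bit of setup ... */
        /* (on weakly ordered CPUs an smp_wmb() here wouldn't hurt) */
        *preceding = new_entry;         /* ... one store makes it live */
        spin_unlock(&stacker_modify_lock);
}

Deactivation is the mirror image under the same lock - just *preceding = victim->next, leaving the victim itself untouched, as sketched earlier in this message.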