> * tvrtko.ursulin@private (tvrtko.ursulin@private) wrote:
>>I agree with David's point of treating writes (add/remove from chain) as
>>extremely rare. But I do not like the "leave module loaded" idea. In the
>>production environment, there must be a mechanism which enables updating
>>critical OS parts without a reboot. LSM per se does not offer that, so the
>>"stacker" module should remedy that.

I'm sorry, I guess my reply was unclear. Please let me explain
what I meant.

My stacker could CERTAINLY let you change critical OS parts
without a reboot! You could update critical OS parts without a reboot.
You could even reclaim unused memory from no-longer-used
kernel modules, without a reboot, so that you don't even
waste any memory with unused modules.

So, while there's an issue with my approach, it's NOT the case
that you have to leave the module loaded. The issue is
a little more subtle.. and hopefully that means it'd be
acceptable to you. The issue is that stackers that allow
the unloading of modules have the same issues that
unloading _any_ module does in the kernel, and that's what
I was trying to address.

A significant issue in the Linux kernel is that
there's a race condition if you unload a kernel module &
remove it from memory. Indeed, in the 2.5 series, many people
honestly recommended FORBIDDING the removal of ANY kernel modules
(once loaded, always loaded). Obviously that would be a
real loss of function, so unloading is still possible,
but the problems are real.

The problem is that the Linux kernel's design has a fundamental
race condition if you try to unload a module.
E.g.: imagine that a kernel thread is about to call a module,
or is currently executing a module's code. You then remove
the module & reclaim the module's memory. What will that thread do?
It'll probably cause a kernel panic.
This race condition in the Linux kernel can be removed, but
many perceive it as not worthwhile; the cures appear worse than
the disease.

Note that this has NOTHING to do with stackers per se;
it has to do with Linux's kernel module unloading mechanism.
This doesn't come up in the non-stacking case, because
once you load the "first" LSM module, it cannot be unloaded...
so you never hear about this without stackers.

The only thing special about unloading LSM modules,
as opposed to general unloading of kernel modules, is that
LSM modules are called a LOT, by LOTS of threads.
Once you stop using a sound card, the odds are good that
no one else is using the module. In contrast, there are
LOTS of threads that are constantly banging on the LSM modules.
So there's a greater risk of encountering the same-old-problem
if you don't handle unloading carefully.
Since a failure, though unlikely, would be catastrophic,
I thought it'd be better to break things down to eliminate the problem.

My solution was to split "unloading" into two operations:
* disable the stacked module. From here on, no one new would
  call the module. Threads currently in the module could
  finish running the module normally, and exit.
* remove the module from memory. THIS is where the race
  condition raises its ugly head. But the race can only
  occur if a thread is about to enter the module, or is in it.

If you want to be really conservative, as I mentioned,
you can even skip this latter step; you can always reload
another module without releasing the memory of the old module.
You could eliminate this race entirely by waiting until every
thread ran some checkpoint after the disabling, to show
that they'd made it through after the disabling.

In my stacker, if you were worried about the latter race,
you could just disable the old module, and load a new one.
At 3K/module, you could disable-and-load-a-new module 20 times
and only use up 60K. The kernel sneezes more than that :-).
And this is me being VERY conservative.
In fact, if you waited a few seconds, all the threads will
be out of the module.
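
A minimal user-space sketch of the two-step unload described above
(disable first, remove only once nobody is inside), written in Python
rather than kernel C. Every name here (`Stacker`, `StackedModule`,
`call_hooks`, `disable`, `unload`) is hypothetical and is not taken from
the actual stacker. A lock plus an in-flight counter stands in for
whatever the kernel code would really use, and dropping the module from
the chain stands in for freeing its memory.

```python
import threading


class StackedModule:
    """Hypothetical stand-in for one stacked security module."""

    def __init__(self, name, hook):
        self.name = name
        self.hook = hook        # callable(request) -> bool (True = allow)
        self.enabled = True     # cleared by Stacker.disable()
        self.in_flight = 0      # threads that entered and have not left yet
        self.unloaded = False


class Stacker:
    """Illustrative stacker: 'disable' and 'unload' are separate steps."""

    def __init__(self):
        self._lock = threading.Lock()
        self._idle = threading.Condition(self._lock)
        self._chain = []

    def register(self, module):
        with self._lock:
            self._chain.append(module)

    def call_hooks(self, request):
        # The enabled check and the in-flight increment happen under one
        # lock, so no thread can enter a module after disable() returns.
        with self._lock:
            active = [m for m in self._chain if m.enabled]
            for m in active:
                m.in_flight += 1
        try:
            results = [m.hook(request) for m in active]
        finally:
            # This is the "checkpoint": the thread shows it has left.
            with self._idle:
                for m in active:
                    m.in_flight -= 1
                self._idle.notify_all()
        return all(results)

    def disable(self, module):
        """Step 1: no new callers; threads already inside may finish."""
        with self._lock:
            module.enabled = False

    def unload(self, module, timeout=None):
        """Step 2: remove the module only once no thread is inside it."""
        with self._idle:
            if module.enabled:
                raise RuntimeError("disable the module before unloading it")
            if not self._idle.wait_for(lambda: module.in_flight == 0, timeout):
                return False    # still busy: leave it loaded (but disabled)
            self._chain.remove(module)
            module.hook = None  # stands in for reclaiming the memory
            module.unloaded = True
            return True
```

If `unload` times out, the module simply stays disabled and loaded,
which is the "really conservative" option above; the caller can retry
later or never.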
You could just disable the module, wait a few seconds, and then unload it.
The odds of a thread still being in the module are pretty low,
though it depends on the stacked module (if the stacked module
does a lot of file I/O, then obviously a thread could be there
a long time... if it only does in-memory stuff, it's pretty unlikely
to be there long).

There are other problems, of course: if you disable a module,
and reload an update of it, (1) what happens between those times,
and (2) is there state that needs to transfer from the old to the
new version? You might be able to solve (1) by having the "old" and
"new" versions loaded simultaneously, then disabling the old one
(and later unloading it). You can solve (2) trivially if there's
no state. But I don't see how a stacker could solve that problem
in general if state needs to transfer; I believe that has to be
the job of the modules themselves (to handle state transfers).
For many of the modules I was interested in
(limiting the creation of files to certain name patterns,
detecting likely temp file problems, etc.) these weren't
problems, because the module didn't need to store internal state.

--- David A. Wheeler

This archive was generated by hypermail 2.1.3 : Wed Jan 05 2005 - 07:52:42 PST