Hi,

On www.sf.net/projects/lsm-stacker there is a new stacker release.
Some performance results on x86 are attached. To summarize:

                         Plain              Stack
---------------------------------------------------------------
unixbench final score:   520.9              522.6
dbench avg 10 runs:      198   stdev 1      205   stdev 1
hackbench:               17    stdev 5      18    stdev 7
kernbench (sys time):    28.45 stdev .04    28.35 stdev .08

So hackbench is the only one which is not actually faster with
stacking.

Stacker now tries to symbol_get() the security_ops during
mod_reg_security, because unloading a security module is not safe.
You can disable a security module as before by doing

	echo modulename > /sys/stacker/unload

The symbol_get() is there because we can easily protect against
removal from the list of modules, but if the module operations can
also disappear on a whim, then we'll need to introduce a usage count,
or make very good use of the fact that module unloading seems to stop
all cpus.

In order to make stacker as fast as possible for the common case of
selinux+capabilities, security_get_value() is now a wrapper whose
fastpath assumes the result will always be the first element on the
chain. If that is not the case, then it calls __security_get_value()
to actually walk the chain. This seems to strike a balance: the fast
path is inlined, but the code does not grow much, as it would if the
whole of __security_get_value() were inlined.

The system also passed the LTP selinux testsuite.

I know I still need to work out details like deselecting
CONFIG_SECURITY_CAPABILITIES when selinux is enabled. However, does
this adequately address performance issues, and are the overall design
decisions sane?

thanks,
-serge

-- 
Serge Hallyn <serue@private>

  BYTE UNIX Benchmarks (Version 4.1.0)
  System -- Linux serge.austin.ibm.com 2.6.12-rc2 #1 Wed Apr 13 14:04:45 CDT 2005 i686 i686 i386 GNU/Linux
  Start Benchmark Run: Fri Apr 15 09:52:48 CDT 2005
   1 interactive users.
   09:52:48 up 4 min, 1 user, load average: 0.08, 0.11, 0.06
  lrwxrwxrwx 1 root root 4 Jan 5 11:56 /bin/sh -> bash
  /bin/sh: symbolic link to `bash'
  /dev/hda3 20161204 15817260 3319804 83% /
Dhrystone 2 using register variables     3611201.8 lps   (10.0 secs, 10 samples)
Double-Precision Whetstone                   802.3 MWIPS (10.4 secs, 10 samples)
System Call Overhead                     1095015.4 lps   (10.0 secs, 10 samples)
Pipe Throughput                           702560.0 lps   (10.0 secs, 10 samples)
Pipe-based Context Switching              152786.8 lps   (10.0 secs, 10 samples)
Process Creation                           11890.8 lps   (30.0 secs, 3 samples)
Execl Throughput                            2403.4 lps   (29.8 secs, 3 samples)
File Read 1024 bufsize 2000 maxblocks     791326.0 KBps  (30.0 secs, 3 samples)
File Write 1024 bufsize 2000 maxblocks    493043.0 KBps  (30.0 secs, 3 samples)
File Copy 1024 bufsize 2000 maxblocks     281269.0 KBps  (30.0 secs, 3 samples)
File Read 256 bufsize 500 maxblocks       318297.0 KBps  (30.0 secs, 3 samples)
File Write 256 bufsize 500 maxblocks      196312.0 KBps  (30.0 secs, 3 samples)
File Copy 256 bufsize 500 maxblocks       114647.0 KBps  (30.0 secs, 3 samples)
File Read 4096 bufsize 8000 maxblocks    1351684.0 KBps  (30.0 secs, 3 samples)
File Write 4096 bufsize 8000 maxblocks    427733.0 KBps  (30.0 secs, 3 samples)
File Copy 4096 bufsize 8000 maxblocks     333182.0 KBps  (30.0 secs, 3 samples)
Shell Scripts (1 concurrent)                2333.4 lpm   (60.0 secs, 3 samples)
Shell Scripts (8 concurrent)                 318.0 lpm   (60.0 secs, 3 samples)
Shell Scripts (16 concurrent)                159.0 lpm   (60.0 secs, 3 samples)
Arithmetic Test (type = short)            562871.6 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = int)              556462.6 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = long)             556522.1 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = float)            530795.0 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = double)           530983.1 lps   (10.0 secs, 3 samples)
Arithoh                                 13096172.8 lps   (10.0 secs, 3 samples)
C Compiler Throughput                        780.9 lpm   (60.0 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places           92229.7 lpm   (30.0 secs, 3 samples)
Recursion Test--Tower of Hanoi             86730.9 lps   (20.0 secs, 3 samples)

                     INDEX VALUES
TEST                                      BASELINE      RESULT     INDEX

Dhrystone 2 using register variables      116700.0   3611201.8     309.4
Double-Precision Whetstone                    55.0       802.3     145.9
Execl Throughput                              43.0      2403.4     558.9
File Copy 1024 bufsize 2000 maxblocks       3960.0    281269.0     710.3
File Copy 256 bufsize 500 maxblocks         1655.0    114647.0     692.7
File Copy 4096 bufsize 8000 maxblocks       5800.0    333182.0     574.5
Pipe Throughput                            12440.0    702560.0     564.8
Process Creation                             126.0     11890.8     943.7
Shell Scripts (8 concurrent)                   6.0       318.0     530.0
System Call Overhead                       15000.0   1095015.4     730.0
                                                               =========
     FINAL SCORE                                                   520.9


  BYTE UNIX Benchmarks (Version 4.1.0)
  System -- Linux serge.austin.ibm.com 2.6.12-rc2-stack #2 Wed Apr 13 16:47:38 CDT 2005 i686 i686 i386 GNU/Linux
  Start Benchmark Run: Thu Apr 14 14:30:10 CDT 2005
   1 interactive users.
   14:30:10 up 8 min, 1 user, load average: 111.37, 484.28, 251.80
  lrwxrwxrwx 1 root root 4 Jan 5 11:56 /bin/sh -> bash
  /bin/sh: symbolic link to `bash'
  /dev/hda3 20161204 15878484 3258580 83% /
Dhrystone 2 using register variables     3563782.9 lps   (10.0 secs, 10 samples)
Double-Precision Whetstone                   801.8 MWIPS (10.4 secs, 10 samples)
System Call Overhead                     1084261.2 lps   (10.0 secs, 10 samples)
Pipe Throughput                           666514.3 lps   (10.0 secs, 10 samples)
Pipe-based Context Switching              146425.5 lps   (10.0 secs, 10 samples)
Process Creation                           11820.7 lps   (30.0 secs, 3 samples)
Execl Throughput                            2518.4 lps   (29.8 secs, 3 samples)
File Read 1024 bufsize 2000 maxblocks     780638.0 KBps  (30.0 secs, 3 samples)
File Write 1024 bufsize 2000 maxblocks    496655.0 KBps  (30.0 secs, 3 samples)
File Copy 1024 bufsize 2000 maxblocks     280480.0 KBps  (30.0 secs, 3 samples)
File Read 256 bufsize 500 maxblocks       294455.0 KBps  (30.0 secs, 3 samples)
File Write 256 bufsize 500 maxblocks      186746.0 KBps  (30.0 secs, 3 samples)
File Copy 256 bufsize 500 maxblocks       109086.0 KBps  (30.0 secs, 3 samples)
File Read 4096 bufsize 8000 maxblocks    1350343.0 KBps  (30.0 secs, 3 samples)
File Write 4096 bufsize 8000 maxblocks    479821.0 KBps  (30.0 secs, 3 samples)
File Copy 4096 bufsize 8000 maxblocks     369086.0 KBps  (30.0 secs, 3 samples)
Shell Scripts (1 concurrent)                2380.9 lpm   (60.0 secs, 3 samples)
Shell Scripts (8 concurrent)                 323.7 lpm   (60.0 secs, 3 samples)
Shell Scripts (16 concurrent)                163.0 lpm   (60.0 secs, 3 samples)
Arithmetic Test (type = short)            562824.3 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = int)              556465.5 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = long)             556443.7 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = float)            530474.5 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = double)           530719.7 lps   (10.0 secs, 3 samples)
Arithoh                                 13026111.8 lps   (10.0 secs, 3 samples)
C Compiler Throughput                        799.3 lpm   (60.0 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places           97581.1 lpm   (30.0 secs, 3 samples)
Recursion Test--Tower of Hanoi             86501.1 lps   (20.0 secs, 3 samples)

                     INDEX VALUES
TEST                                      BASELINE      RESULT     INDEX

Dhrystone 2 using register variables      116700.0   3563782.9     305.4
Double-Precision Whetstone                    55.0       801.8     145.8
Execl Throughput                              43.0      2518.4     585.7
File Copy 1024 bufsize 2000 maxblocks       3960.0    280480.0     708.3
File Copy 256 bufsize 500 maxblocks         1655.0    109086.0     659.1
File Copy 4096 bufsize 8000 maxblocks       5800.0    369086.0     636.4
Pipe Throughput                            12440.0    666514.3     535.8
Process Creation                             126.0     11820.7     938.2
Shell Scripts (8 concurrent)                   6.0       323.7     539.5
System Call Overhead                       15000.0   1084261.2     722.8
                                                               =========
     FINAL SCORE                                                   522.6


199.135
199.419
198.793
197.881
199.753
197.514
198.788
195.684
197.963
199.099

207.87
205.668
203.238
204.763
205.482
203.463
204.168
204.98
204.39
207.253


1 real 4m53.340s user 4m12.992s sys 0m28.394s
2 real 4m42.507s user 4m13.479s sys 0m28.435s
3 real 4m43.986s user 4m13.462s sys 0m28.495s
4 real 4m42.890s user 4m13.410s sys 0m28.482s

  real 4m49.405s user 4m12.739s sys 0m28.464s
2 real 4m42.661s user 4m13.325s sys 0m28.254s
3 real 4m43.414s user 4m13.423s sys 0m28.365s
4 real 4m42.300s user 4m13.426s sys 0m28.315s
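
For anyone who wants to see the shape of the security_get_value()
fastpath described in the message without reading the patch, here is
a rough userspace sketch. It is only an illustration under assumptions:
the struct layout, the owner_id field, and the names get_value() and
slow_get_value() are hypothetical and are not taken from the stacker
code. The idea shown is just "check the first element inline, otherwise
call an out-of-line function that walks the chain".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one entry on a per-object security chain.
 * Field names and layout are assumptions, not the stacker structures. */
struct sec_entry {
	int owner_id;            /* which security module owns this entry */
	void *value;             /* that module's per-object data */
	struct sec_entry *next;  /* next entry on the chain, or NULL */
};

/* Slow path: walk the whole chain. Kept as a separate function so that
 * callers of the inline wrapper below stay small. */
static void *slow_get_value(const struct sec_entry *head, int owner_id)
{
	const struct sec_entry *e;

	for (e = head; e != NULL; e = e->next)
		if (e->owner_id == owner_id)
			return e->value;
	return NULL;
}

/* Fast path: assume the wanted entry is the first one on the chain and
 * only fall back to the chain walk when that guess is wrong. */
static inline void *get_value(const struct sec_entry *head, int owner_id)
{
	if (head != NULL && head->owner_id == owner_id)
		return head->value;
	return slow_get_value(head, owner_id);
}

/* Small demonstration with a two-entry chain; returns 0 on success. */
static int demo(void)
{
	int first_data = 1, second_data = 2;
	struct sec_entry second = { 2, &second_data, NULL };
	struct sec_entry first = { 1, &first_data, &second };

	assert(get_value(&first, 1) == &first_data);   /* hit on fast path */
	assert(get_value(&first, 2) == &second_data);  /* found by chain walk */
	assert(get_value(&first, 3) == NULL);          /* not on the chain */
	return 0;
}
```

In this model only the single comparison against the head is duplicated
at each call site; the loop exists once, which is the size tradeoff the
message describes.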