Re: 16-way stacker performance

From: serue@private
Date: Wed Jun 01 2005 - 14:33:02 PDT


Quoting Valdis.Kletnieks@private (Valdis.Kletnieks@private):
> On Wed, 01 Jun 2005 15:39:01 CDT, serue@private said:
> > REAIM Workload
> > Times are in seconds - Child times from tms.cstime and tms.cutime
> > 
> > nostack
> > 
> > Num     Parent   Child   Child  Jobs per   Jobs/min/  Std_dev  Std_dev  JTI
> > Forked  Time     SysTime UTime   Minute     Child      Time     Percent 
> > 1       0.07     0.03    0.03    87428.57   87428.57   0.00     0.00     100
> > 3       0.17     0.27    0.23    108000.00  36000.00   0.00     0.00     99 
> > 5       0.20     0.49    0.46    153000.00  30600.00   0.01     6.19     93 
> > 7       0.29     0.92    0.97    147724.14  21103.45   0.01     4.04     95 
> > 9       0.36     1.40    1.75    153000.00  17000.00   0.01     2.70     97 
> > Max Jobs per Minute 153000.00
> > 
> > stack:
> > 
> > REAIM Workload
> > Times are in seconds - Child times from tms.cstime and tms.cutime
> > 
> > Num     Parent   Child   Child  Jobs per   Jobs/min/  Std_dev  Std_dev  JTI
> > Forked  Time     SysTime UTime   Minute     Child      Time     Percent 
> > 1       0.07     0.04    0.03    87428.57   87428.57   0.00     0.00     100
> > 3       0.18     0.30    0.22    102000.00  34000.00   nan      nan      -2147483648 
> > 5       0.21     0.52    0.50    145714.29  29142.86   0.00     1.98     98 
> > 7       0.30     1.06    0.99    142800.00  20400.00   0.01     3.01     96 
> > 9       0.36     1.59    1.55    153000.00  17000.00   0.01     2.85     97 
> > Max Jobs per Minute 153000.00
> 
> How many significant figures does this actually provide?

I don't know offhand.

> I'm *very* leery of the
> fact that for 1, 9, and max, jobs/minute was identical (additionally, the values
> for 3 and 7 seem to point at only 2-3 significant digits).  It's *very* hard to
> make any real conclusions about 2-3% performance hits (which is what the other

The tbench numbers are actually better for stacker, while the others are
within the standard deviation.  Haven't computed 95% CI.

The bigger problem I was trying to hint at was that these were virtual
CPUs.  So while lock contention would show up in the results, I suspect
there are memory effects (i.e. NUMA) which could be masked or confused.  Not
by stacker, since we're not really locking :)  But we're trying to get
together a new set of machines.  (argh)

> numbers hint at) when you don't have enough significant digits in the number to
> accurately measure a 2% difference.....

To make matters worse, here is a separate run I made under the
stacking kernel:

Num     Parent   Child   Child  Jobs per   Jobs/min/  Std_dev  Std_dev  JTI
Forked  Time     SysTime UTime   Minute     Child      Time     Percent 
1       0.07     0.03    0.03    87428.57   87428.57   0.00     0.00     100  
3       0.18     0.31    0.21    102000.00  34000.00   0.00     2.72     97   
5       0.20     0.48    0.48    153000.00  30600.00   0.00     2.50     97   
7       0.28     0.94    0.94    153000.00  21857.14   0.01     2.66     97   
9       0.36     1.61    1.54    153000.00  17000.00   0.01     3.69     96   
Max Jobs per Minute 153000.00
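
On the significant-figures question: the Jobs per Minute column in all
three tables is reproduced by 6120 * children / parent time (for example
6120 * 5 / 0.20 = 153000.00).  That constant is inferred from the printed
numbers, not taken from the REAIM source, so treat it as an assumption.
Under that assumption, a rough sketch of how far one 0.01 s step in the
printed parent time moves the result:

```python
# Assumption: jobs/min = JOBS_FACTOR * children / parent_time, where
# JOBS_FACTOR = 6120 is inferred from the tables (87428.57 * 0.07 ~= 6120),
# not read from the REAIM source.
JOBS_FACTOR = 6120.0
TICK = 0.01  # parent time is printed to 0.01 s


def jobs_per_minute(children, parent_time):
    """Recompute the 'Jobs per Minute' column from the printed inputs."""
    return JOBS_FACTOR * children / parent_time


def tick_sensitivity(children, parent_time):
    """Relative drop in jobs/min if parent_time were one tick longer."""
    base = jobs_per_minute(children, parent_time)
    slower = jobs_per_minute(children, parent_time + TICK)
    return (base - slower) / base


# (children, parent_time) pairs copied from the nostack table
rows = [(1, 0.07), (3, 0.17), (5, 0.20), (7, 0.29), (9, 0.36)]

for n, t in rows:
    print(f"{n:2d}  {jobs_per_minute(n, t):10.2f}  "
          f"{100 * tick_sensitivity(n, t):5.1f}%")
```

Even at 9 children, one tick of parent time shifts the result by about
2.7%, so identical 153000.00 values and a 2% difference are both within
what the timer resolution alone can produce.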

I guess while waiting for the other machines to come up, I'll try to run a
more comprehensive set of tests - maybe 50 dbench, tbench, and kernbench
iterations and 20 reaims.
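
For the 95% CI on those runs, something along these lines should do.  It
is only a sketch: the function name and the small table of Student-t
critical values are my own choices, and it only covers the sample sizes
mentioned here (2, 20 and 50 runs, plus a few small ones).

```python
import math
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom.
# Only a handful of entries; 19 and 49 match the planned 20 and 50 runs.
T_CRIT_95 = {1: 12.706, 2: 4.303, 4: 2.776, 9: 2.262, 19: 2.093, 49: 2.010}


def ci95(samples):
    """Return (low, high) of a 95% confidence interval for the mean."""
    n = len(samples)
    df = n - 1
    if df not in T_CRIT_95:
        raise ValueError(f"no t value tabulated for {df} degrees of freedom")
    mean = statistics.mean(samples)
    # Standard error of the mean, using the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(n)
    half_width = T_CRIT_95[df] * sem
    return mean - half_width, mean + half_width


# The two stacker runs at 7 children from this mail (jobs per minute).
stack_7_children = [142800.00, 153000.00]
low, high = ci95(stack_7_children)
print(f"95% CI: {low:.1f} .. {high:.1f}")
```

With only two runs the interval is far too wide to say anything, which is
the reason for wanting 20 to 50 iterations per benchmark.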

thanks,
-serge


