[logs] Re: syslogd / some analysis

From: Marcus J. Ranum (mjrat_private)
Date: Tue Jan 29 2002 - 14:26:00 PST

  • Next message: Marcus J. Ranum: "Re: [logs] Re: syslogd / some analysis"

    Note: if you want the fun stuff, skip to the bottom...
    William D. Colburn (aka Schlake) wrote:
    >Since the beginning of the year (not counting today) it has sysloged a
    >total of 10394522 messages from about 100 different machines via UDP
    >(average 4 per second). 
    That's good data. 4/second should be well within the bounds of a '286. :)
    I'm doing some roughies on an OpenBSD box right now to see how long
    syslogging a million records takes, and whether there are any overruns
    or dropouts...  Right now it appears to be eating a lot of system time - from
    looking at it, I'm guessing nobody's ever put syslogd against gprof to see
    where it spends its time....
    Hmmm... Some initial results: on an OpenBSD machine with 800Mhz CPU
    and a bunch of memory syslogging one million records in a while loop 
    using the default UNIX domain socket results in a log file containing
    468677 records - 46% or close to 1/2 the number sent.  Drop-outs occurred
    pretty much immediately:
    Jan 29 16:03:19 hussar hammer[17461]: 138
    Jan 29 16:03:19 hussar hammer[17461]: 139
    Jan 29 16:03:19 hussar hammer[17461]: 1896
    Jan 29 16:03:19 hussar hammer[17461]: 1897
    140 records into the session I lost 1700 or so messages. Perhaps this is
    some kind of undocumented "compression" technique?   ;-)
    The client side was also a software piggie:
     98.55s real    25.03s user    24.08s system
    Ok, I gproffed syslogd just for the hell of it...
      %   cumulative   self              self     total
     time   seconds   seconds    calls  ms/call  ms/call  name
     15.0       5.50     5.50                             __thread_sys_open [1]
     12.3      10.03     4.52                             __thread_sys_select [5]
     12.2      14.51     4.48                             _writev [6]
     10.7      18.43     3.93                             _memcpy [7]
     10.6      22.33     3.89                             _read [8]
      6.2      24.62     2.29                             mcount (67)
    Looks like weirdness hidden in writev; I'm not going to dig
    any further. Looks like a lot of my expensive CPU cycles are
    going to waste, though.. ;)
    My client took 98 seconds to send a million records, which is to say
    10,000+ records/second. So we know that syslog (on my box, with
    my O/S) can't handle 10,000+ records/second. On the client, the
    usage profile looks a lot more sane:
      %   cumulative   self              self     total
     time   seconds   seconds    calls  ms/call  ms/call  name
     16.0       8.85     8.85                             _sendto [8]
     15.4      17.39     8.54 27419012     0.00     0.00  ___qdivrem [9]
    I whipped a UDP syslog client that sends to localhost over the
    syslog port, which meant I had to fire up "syslogd -u" - and repeated
    my million record test. The results there were much more interesting!
    It used a lot less CPU time and completed quite quickly. :) Of course
    the messages file only contained 4604 messages - that is  .4%
    presumably some output queue was getting overrun and messages
    were silently discarded - my sendto(...) code never returned an error,
    by the way...
    syslogd is decidedly not OK for loads in the 10,000 messages/second range on
            my machine
    Marcus J. Ranum          Chief Technology Officer, NFR Security, Inc.
    Work:                           http://www.nfr.com
    Personal:                      http://www.ranum.com
    To unsubscribe, e-mail: loganalysis-unsubscribeat_private
    For additional commands, e-mail: loganalysis-helpat_private

    This archive was generated by hypermail 2b30 : Tue Jan 29 2002 - 14:27:11 PST