Re: [logs] Syslog payload format

From: Balazs Scheidler (bazsiat_private)
Date: Mon Dec 30 2002 - 08:06:45 PST


    Hi,
    
    On Mon, Dec 30, 2002 at 11:46:31PM +1100, Darren Reed wrote:
    > In some mail from Balazs Scheidler, sie said:
    > > 
    > > On Sun, Dec 22, 2002 at 03:10:37PM +1100, Darren Reed wrote:
    > > > In some mail from Balazs Scheidler, sie said:
    > > > > 
    > > > > xnewsyslog(LOG_DAEMON|LOG_DEBUG, "debug: %(user) %(tty) from %(host)",
    > > > > 	   "marcus", tty, where);
    > > > [...]
    > > > > That is, a constant event description comes first without _any_ variable
    > > > > data, semicolon and a list of tag/value pairs. This makes the message easy
    > > > > to read by humans (the description itself is not a tag), and the event is
    > > > > still easily parseable. So the above xnewsyslog() call would become
    > > > > something like this:
    > > > > 
    > > > > xnewsyslog(LOG_DAEMON | LOG_INFO, "User logged in; %(user), %(tty), %(host)",
    > > > > 	"marcus", tty, where);
    > > > 
    > > > I just realised there's a "problem" with both of these messages, and
    > > > this API if something like XML is going to be used.
    > > > 
    > > > The problem is these formats suggest that a message is going to be
    > > > logged in a manner that is similar to the formatting string and it
    > > > is not.
    > > > 
    > > > The above message would be logged, at best, like:
    > > > 
    > > > <event user="marcus" tty="ttyp6" host="ranum.com">User logged in; , ,</event>
    > > 
    > > no, it would be logged as:
    > > 
    > > <event user="marcus" tty="ttyp6" host="ranum.com">User logged in</event>
    > 
    > And how do you arrive at that ?
    
    By using the ';' as a separator in the format string. No macros would be
    allowed before the ';'. Maybe this separation could be made stricter by
    using two separate arguments:
    
    xnewsyslog(LOG_DAEMON | LOG_INFO, 
               "User logged in", 
               "%(user)s %(tty)s %(host)s",
               "marcus", "ttyp6", host);
               
    Note the extension in the formatting part: the '%' sequence now ends with
    a printf-style format specifier, so it is easy to add conversion to the
    output:
    
    xnewsyslog(LOG_DAEMON | LOG_INFO,
    	   "Formatting example",
    	   "%(seq)d %(error)s",
    	   seq, strerror(errno));
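
    A minimal sketch of how the two-argument form might be rendered into the
    XML shown earlier. The name xformat_event(), the output buffer arguments
    and the return convention are assumptions made for illustration, not part
    of the proposal; only the 's' and 'd' conversions are handled, and values
    are not XML-escaped.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: render one event into buf as XML.
 * tagfmt is a list of "%(name)s" or "%(name)d" items; text between
 * items is ignored. Returns 0 on success, -1 on a malformed format
 * or when buf is too small. Values are NOT XML-escaped here. */
static int xformat_event(char *buf, size_t size, const char *desc,
                         const char *tagfmt, ...)
{
    va_list ap;
    size_t used;
    int n, rc = 0;
    const char *p = tagfmt;

    assert(buf != NULL && desc != NULL && tagfmt != NULL);

    n = snprintf(buf, size, "<event");
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    va_start(ap, tagfmt);
    while ((p = strstr(p, "%(")) != NULL) {
        const char *name = p + 2;
        const char *close = strchr(name, ')');
        int namelen;
        char conv;

        if (close == NULL || close == name) { rc = -1; break; }
        namelen = (int)(close - name);
        conv = close[1];            /* conversion character after ')' */

        if (conv == 's')
            n = snprintf(buf + used, size - used, " %.*s=\"%s\"",
                         namelen, name, va_arg(ap, const char *));
        else if (conv == 'd')
            n = snprintf(buf + used, size - used, " %.*s=\"%d\"",
                         namelen, name, va_arg(ap, int));
        else { rc = -1; break; }    /* missing or unsupported conversion */

        if (n < 0 || (size_t)n >= size - used) { rc = -1; break; }
        used += (size_t)n;
        p = close + 2;              /* continue after the conversion char */
    }
    va_end(ap);
    if (rc != 0)
        return -1;

    /* The constant description becomes the element content. */
    n = snprintf(buf + used, size - used, ">%s</event>", desc);
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    return 0;
}
```

    Because the description is a separate argument, nothing has to be
    scrubbed from it: the tag list alone decides which attributes appear.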
    
    The scheme could be extended by associating the type info with the tag
    itself, e.g.:
    
    xdefinetag("seq", LOG_TAG_INTEGER);
    
    and later the 'seq' tag would implicitly format the argument as an integer.
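
    One way such a tag registry could look, as a rough sketch. Only
    xdefinetag() and LOG_TAG_INTEGER come from the text above; the int return
    value, LOG_TAG_STRING, LOG_TAG_UNKNOWN, the fixed-size table and the
    lookup function xtagtype() are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

/* LOG_TAG_INTEGER is from the proposal; the other two are assumed. */
enum log_tag_type { LOG_TAG_UNKNOWN = 0, LOG_TAG_STRING, LOG_TAG_INTEGER };

#define MAX_TAGS 32

static struct {
    const char *name;           /* stored, not copied: must outlive the table */
    enum log_tag_type type;
} tag_table[MAX_TAGS];
static int tag_count;

/* Register a tag's type, or change the type of an already known tag.
 * Returns 0 on success, -1 if the table is full. */
static int xdefinetag(const char *name, enum log_tag_type type)
{
    int i;

    assert(name != NULL);
    for (i = 0; i < tag_count; i++) {
        if (strcmp(tag_table[i].name, name) == 0) {
            tag_table[i].type = type;
            return 0;
        }
    }
    if (tag_count == MAX_TAGS)
        return -1;
    tag_table[tag_count].name = name;
    tag_table[tag_count].type = type;
    tag_count++;
    return 0;
}

/* Hypothetical lookup used by the formatter to pick a conversion
 * when the format string gives none. */
static enum log_tag_type xtagtype(const char *name)
{
    int i;

    assert(name != NULL);
    for (i = 0; i < tag_count; i++)
        if (strcmp(tag_table[i].name, name) == 0)
            return tag_table[i].type;
    return LOG_TAG_UNKNOWN;
}
```

    A formatter could then accept a bare %(seq) and consult xtagtype("seq")
    to decide whether to fetch an int or a string from the argument list.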
    
    > 
    > Is some amount of arbitrary text following the %(macro) scrubbed from
    > the text message that gets logged ?
    > 
    > I know you as a human can look at the input string and arrive at what
    > you want the output to be, but are you going to write a logging interface
    > that's intelligent enough to do that ?  How then do I over-ride it when
    > I find your interface is removing information that I *want* there ?
    > 
    > Let me rewrite the xnewsyslog() like this:
    > 
    > xnewsyslog(LOG_DAEMON | LOG_INFO,
    >            "User %(user) logged in. On %(tty) from %(host) on %(date)",
    >            "marcus", tty, where);
    > 
    > Are you going to build an English parser into xnewsyslog() so it
    > knows what it can and cannot remove ?  What about for German, French,
    > Japanese and other languages ?  Is there some other magical way to
    > get from the above to what you think it means, algorithmically ?
    
    My original intention was to clearly mark the separation between human
    readable description and variable part. In my original suggestion this was
    the ';' though it may not have been emphasized enough.
    
    > > which could be represented in a non-XML format for human processing. The
    > > problem with difficult APIs is that they will simply be ignored by
    > > programmers. If the API is simple enough, the benefits it provides will
    > > outweigh the laziness of programmers.
    > > 
    > > So my suggestion is this:
    > > 1) provide a clean API for sending tagged messages
    > > 2) provide a not-so-clean but easier to use interface based on the first
    > 
    > I think you need a (3) as well:
    > 
    > 3) provide a replacement for the current syslog(3) API that produces
    >    tagged messages.
    
    I think 2) and 3) are the same. A completely syslog(3)-compatible function
    could not generate tags, as there is no tagging in its arguments.
    
    syslog compatibility would be provided by using the syslog message as
    specified by the format string+args as a simple message without tags.
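
    A sketch of what that compatibility path might look like: the printf-style
    message is expanded as syslog(3) would expand it and becomes the whole
    event text, with no tags. The name xcompat_format(), the output buffer and
    the facpri attribute on the element are assumptions made for illustration;
    the message is not XML-escaped.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical compatibility shim: same argument shape as syslog(3)
 * (priority, format, args), but the result is written to buf as an
 * untagged event. Returns 0 on success, -1 if something does not fit. */
static int xcompat_format(char *buf, size_t size, int facpri,
                          const char *fmt, ...)
{
    char msg[512];
    va_list ap;
    int n;

    assert(buf != NULL && fmt != NULL);

    va_start(ap, fmt);
    n = vsnprintf(msg, sizeof msg, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= sizeof msg)
        return -1;

    /* Legacy callers often end messages with '\n'; cut at the first
     * newline so the event stays on a single line. */
    msg[strcspn(msg, "\n")] = '\0';

    n = snprintf(buf, size, "<event facpri=\"%d\">%s</event>", facpri, msg);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

    All variable data ends up inside the message text here, which is exactly
    why such a call cannot carry tags.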
    
    Maybe prior to trying to agree on the API to use, we should discuss what a
    message should consist of (apart from the obvious things like time stamp
    and program identification).
    
    I think the following is needed:
    * type of event
    
      we should not be limited in what kind of events to log, thus this needs to
      be a string. I would not put too much effort in standardizing the event
      type as each possible daemon has its own set of events (the number of
      possible events is huge).
      
      I think the event type field should simply be the human readable
      description of the event, _or_ a reference to a human readable description
      of the event. This should not contain variable data.
      
      Log analysis programs do not really interpret this (IMHO); they only need
      the uniqueness of this field. Maybe a separate 'message category' tag
      should be defined which correlates identical events between multiple
      programs.
      
      Examples:
      
        type of event       category
        'User logged in'    'auth.login'
        'Packet DROP'       'router.ip.forward.drop'
      
      Then we can standardize on 'category' and leave 'type of event' as free as
      possible.
    
    * variable number of name/value pairs
    
      These are simply XML attributes or XML tags associated with messages. 
      'name' should be standardized as much as possible, but private extensions
      must be possible. All variable information should be put into this space.
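
    The two parts listed above could be modelled roughly as follows. All
    names (struct xevent, xevent_add(), xevent_get(), xevent_in_category())
    are hypothetical, and treating the dotted category as a hierarchy is an
    assumption suggested by names like 'router.ip.forward.drop', not something
    stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define XEVENT_MAX_PAIRS 16

struct xpair { const char *name; const char *value; };

struct xevent {
    const char *type;       /* constant, human readable description */
    const char *category;   /* standardized name, e.g. "auth.login" */
    struct xpair pairs[XEVENT_MAX_PAIRS];   /* all variable data */
    size_t npairs;
};

/* Append a name/value pair; returns 0, or -1 when the event is full. */
static int xevent_add(struct xevent *ev, const char *name, const char *value)
{
    assert(ev != NULL && name != NULL && value != NULL);
    if (ev->npairs == XEVENT_MAX_PAIRS)
        return -1;
    ev->pairs[ev->npairs].name = name;
    ev->pairs[ev->npairs].value = value;
    ev->npairs++;
    return 0;
}

/* Look up a value by name; NULL if the event has no such pair. */
static const char *xevent_get(const struct xevent *ev, const char *name)
{
    size_t i;

    for (i = 0; i < ev->npairs; i++)
        if (strcmp(ev->pairs[i].name, name) == 0)
            return ev->pairs[i].value;
    return NULL;
}

/* True if the category equals prefix or lies below it in the dotted
 * hierarchy: "router.ip" matches "router.ip.forward.drop", while
 * "router.i" does not. */
static int xevent_in_category(const struct xevent *ev, const char *prefix)
{
    size_t n = strlen(prefix);

    if (strncmp(ev->category, prefix, n) != 0)
        return 0;
    return ev->category[n] == '\0' || ev->category[n] == '.';
}
```

    An analyser written this way never has to parse the 'type of event'
    string; it selects on category and reads values by name.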
      
    Examples of log messages:
    
    <event facpri="191" cat="auth.login" user="marcus" tty="ttyp6" host="ranum.com">User logged in</event>
    <event facpri="191" cat="router.ip.forward.drop" srcip="1.1.1.1" dstip="2.2.2.2" proto="tcp" srcport="1025" dstport="6667">Packet DROP</event>
    
    (other XML and non-XML representations are possible)
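
    As one illustration of a non-XML representation, the same event could be
    written as the constant description, a ';', and then name=value pairs.
    This particular layout and the function xrender_plain() are assumptions
    made for the sketch; values containing spaces are simply rejected rather
    than quoted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical renderer: "<description>; name=value name=value ...".
 * Returns 0 on success, -1 if buf is too small or a name/value would
 * make the line ambiguous to parse. */
static int xrender_plain(char *buf, size_t size, const char *desc,
                         const char *const names[],
                         const char *const values[], size_t n)
{
    size_t used, i;
    int w;

    assert(buf != NULL && desc != NULL);

    w = snprintf(buf, size, "%s;", desc);
    if (w < 0 || (size_t)w >= size)
        return -1;
    used = (size_t)w;

    for (i = 0; i < n; i++) {
        /* No quoting in this sketch: refuse names with ' ' or '='
         * and values with ' '. */
        if (strpbrk(names[i], " =") != NULL || strchr(values[i], ' ') != NULL)
            return -1;
        w = snprintf(buf + used, size - used, " %s=%s", names[i], values[i]);
        if (w < 0 || (size_t)w >= size - used)
            return -1;
        used += (size_t)w;
    }
    return 0;
}
```

    The description stays free of variable data in this form too, so a human
    reads the text before the ';' and a parser reads what follows it.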
    
    The calls to generate these kinds of events:
    
    xnewsyslog(LOG_DAEMON | LOG_INFO,
               "User logged in",
               "%(cat)s %(user)s %(tty)s %(host)s",
               "auth.login", "marcus", "ttyp6", "ranum.com");
    
    xnewsyslog(LOG_DAEMON | LOG_INFO, 
    	   "Packet DROP",
    	   "%(cat)s %(srcip)s %(dstip)s %(proto)s %(srcport)d %(dstport)d",
    	   "router.ip.forward.drop", "1.1.1.1", "2.2.2.2", "tcp", 1025, 6667);
    
    What do you think?
    
    -- 
    Bazsi
    PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
    _______________________________________________
    LogAnalysis mailing list
    LogAnalysisat_private
    http://lists.shmoo.com/mailman/listinfo/loganalysis
    


