RE: [logs] Syslog payload format

From: Frank O'Dwyer (fodat_private)
Date: Wed Dec 18 2002 - 16:48:10 PST

    Marcus J. Ranum wrote:
    > RULE #1:
    >         - An event record is an arbitrary collection of tagged values
    > RULE #2:
    >         - Some tags are "known" public tags, some tags are application,
    >         site, or event specific - cannot know them all in advance
    >         therefore there must be some flexibility in letting app
    >         designers pick their own
    > RULE #3:
    >         - the API for tossing event records should be EASIER to use than
    >         the current syslog API and should inherently discourage
    >         implementation of buffer overruns
    > RULE #4:
    >         - defaults should be used wherever possible
    
    Works for me, as does the sketch of the API and format you suggest.
    
    I can see a few issues with it, but I think they are pretty minor: nitpicks,
    really. This is good stuff IMO. I'd be happy to knock up a Java equivalent
    of the API if that is of any use (there's already a Java logging framework,
    though I'm not sure how this would fit with it).
    
    On to the nitpicks:
    
    (1) You can be sure that application programmers are going to do both of the
    following types of calls:
    
           eventlog_addvalue(EVENTLOG_TEXT, "memory remaining < 10M!");
           eventlog_addvalue(EVENTLOG_TRADESUMMARY,
               "<trade><stock>MSFT</stock><blah>blah</blah></trade>");
    
    Of course, you need to apply some translation to the metacharacters on
    output, so these come out the back in the form of:
    
           <TEXT>memory remaining &lt; 10M!</TEXT>
           <TRADESUMMARY>&lt;trade&gt;&lt;stock&gt;MSFT&lt;/stock&gt; ... etc.</TRADESUMMARY>
    
    So both basically work, since you can escape the metacharacters and that's
    fine.
    
    However, since the second case also happens to be well-formed XML, it would
    be acceptable and useful to leave it alone and turn escaping off for that
    case.
    
    If you don't do that, then it is clunky at best, and potentially
    impossible, for an analyzer to intelligently unescape the metacharacters
    and restore the embedded XML if it wants to do anything interesting with
    those details (like find unusual trades, or even just translate them to
    HTML or something like that).
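
    As a rough sketch of the choice in (1), here is one way an output routine
    could escape by default and pass a value through untouched only when the
    caller asks for it and the value really is well-formed XML. The names
    format_value, is_well_formed and allow_xml are hypothetical and are not
    part of Marcus's proposed API.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as a single XML element."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


def format_value(tag: str, value: str, allow_xml: bool = False) -> str:
    """Render one tagged value (hypothetical helper, not the proposed API)."""
    if allow_xml and is_well_formed(value):
        body = value          # leave embedded XML intact for the analyzer
    else:
        body = escape(value)  # &, < and > become &amp;, &lt; and &gt;
    return f"<{tag}>{body}</{tag}>"


print(format_value("TEXT", "memory remaining < 10M!"))
print(format_value("TRADESUMMARY",
                   "<trade><stock>MSFT</stock></trade>", allow_xml=True))
```

    Falling back to escaping when the value does not parse means a caller
    cannot break the record just by setting the flag on malformed input.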
    
    (2) Needs some support for binary crap (thinking of Windows event log here,
    but you'll also get applications wanting it). Base64 I guess?
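
    For (2), a minimal sketch of carrying binary data as Base64 inside a
    tagged value. The enc="base64" attribute and the BLOB tag are assumptions
    made for illustration; nothing in the proposed format defines them.

```python
import base64


def encode_binary(tag: str, data: bytes) -> str:
    """Wrap raw bytes in a tagged value using Base64 text."""
    payload = base64.b64encode(data).decode("ascii")
    # The enc attribute is a hypothetical marker telling the analyzer to decode.
    return f'<{tag} enc="base64">{payload}</{tag}>'


def decode_binary(payload: str) -> bytes:
    """Recover the original bytes from the Base64 text of a tagged value."""
    return base64.b64decode(payload)


record = encode_binary("BLOB", b"\x00\x01\xff")
print(record)
```

    The Base64 alphabet contains no XML metacharacters, so the payload needs
    no further escaping.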
    
    (3) Needs some I18N story. Allowing the encoding to be specified would
    permit Unicode, which is probably enough I18N for this purpose, but I'm not
    an expert. It would be annoying to have to transmit an XML encoding
    declaration on every message though, especially if it is the same every
    time. So you either pick one implicit encoding that the protocol always
    uses, or you allow a really small number of encodings (maybe just 2) and
    put out a short identifier to say which one this message uses.
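
    To illustrate the second option in (3), here is a sketch in which each
    message starts with a one-character encoding identifier instead of a full
    XML encoding declaration. The identifiers "U" and "L" and the choice of
    UTF-8 and ISO-8859-1 are purely illustrative assumptions.

```python
# Hypothetical table of one-character encoding identifiers.
ENCODINGS = {"U": "utf-8", "L": "iso-8859-1"}


def pack(message: str, enc_id: str = "U") -> bytes:
    """Prefix the encoded message with its one-byte encoding identifier."""
    return enc_id.encode("ascii") + message.encode(ENCODINGS[enc_id])


def unpack(wire: bytes) -> str:
    """Read the identifier byte, then decode the rest accordingly."""
    enc_id = wire[:1].decode("ascii")
    return wire[1:].decode(ENCODINGS[enc_id])


wire = pack("<TEXT>caf\u00e9 closed</TEXT>")
print(unpack(wire))
```

    The overhead is one byte per message, against several dozen for a full
    declaration line.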
    
    (4) You're going to get name clashes on the app-specific tags if this
    catches on. That may or may not be an issue. If it is, XML does provide a
    namespace mechanism to deal with such things, which is probably heavier than
    you want for this, but it would be good to be able to use available parsers.
    Maybe app-specific tags could be something like <progname:tag>, either
    explicitly so or implicitly (smaller), and then have a register of prognames
    if you're really worried about clashes.
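
    A small sketch of the progname-prefix idea in (4): well-known tags stay
    bare, and anything else gets qualified with a registered progname. The
    tag names, the prognames and the qualify function are all made up for
    illustration.

```python
# Hypothetical well-known public tags and register of prognames.
PUBLIC_TAGS = {"TEXT", "TIME", "SOURCE"}
REGISTERED_PROGNAMES = {"sendmail", "tradeapp"}


def qualify(progname: str, tag: str) -> str:
    """Return the tag name to emit: bare if public, else progname:tag."""
    if tag in PUBLIC_TAGS:
        return tag
    if progname not in REGISTERED_PROGNAMES:
        raise ValueError(f"unregistered progname: {progname}")
    return f"{progname}:{tag}"


print(qualify("tradeapp", "TRADESUMMARY"))
print(qualify("tradeapp", "TEXT"))
```

    Note that a namespace-aware XML parser would also need the prefix declared
    with an xmlns attribute somewhere in the record, which is part of the
    extra weight mentioned above.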
    
    (5) Probably needs some way to retrofit all of this to the syslog
    protocol(s) (without length limit, and without UDP), so that any
    infrastructure that understands that can be glued in. I guess items such as
    timestamp and source could be put into the existing slots for those, with
    the remaining tagged text stuffed into the message.
    
    If that's done then some means for an analyzer to know that a particular
    message is structured is also needed. Maybe it's enough to just test that
    the messages "look tagged", or are well-formed, but it would probably be
    more efficient if there was some clue kludged into the syslog header. There
    are other ways round it too - like if it's from a particular host and/or
    progname, maybe you know or can be configured to know that it is using the
    tagged format.
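
    The detection options in (5) could be combined roughly as follows: check a
    configured list of sources first, then fall back to testing whether the
    message text looks tagged and parses cleanly. The function name, the
    configuration table and the wrapping <event> element are assumptions made
    for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration: (host, progname) pairs known to use the format.
KNOWN_TAGGED_SOURCES = {("loghost1", "tradeapp")}


def is_structured(host: str, progname: str, msg: str) -> bool:
    """Guess whether a syslog message body carries tagged values."""
    if (host, progname) in KNOWN_TAGGED_SOURCES:
        return True
    text = msg.strip()
    # Cheap "looks tagged" test before paying for a parse.
    if not (text.startswith("<") and text.endswith(">")):
        return False
    try:
        # A record is a sequence of tagged values, so wrap it in one root.
        ET.fromstring(f"<event>{text}</event>")
        return True
    except ET.ParseError:
        return False


print(is_structured("web1", "httpd", "<TEXT>disk full</TEXT><PRI>3</PRI>"))
print(is_structured("web1", "httpd", "connection from 10.0.0.1"))
```

    A marker in the syslog header would avoid the parse entirely, which is the
    efficiency point made above.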
    
    <snip API & format - see Marcus's original post>
    
    > >So the first question is what does an event consist of? Things like
    > >generation time, event id, priority, source, human readable message,
    > >forwarding trail, maybe some application-specific payload, what else? It's
    > >easy to come up with a long shopping list, but what are the basics that
    > >people could agree on, and which are optional and which are mandatory?
    >
    > You want to make the number of options small and the core stuff
    > should ALL be defaulted if not provided.
    
    Presumably you would have a reasonably large number of well-known tags that
    are optional though? In other words, a fixed core that is mandatory, and a
    set of standard options like EVENTLOG_PROTO that are only there when
    relevant?
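
    As a sketch of "a fixed core that is defaulted, plus optional well-known
    tags", the following builds an event where the caller supplies only the
    text and anything unusual. The particular core fields chosen here (TIME,
    SOURCE, PID, PRIORITY) are an assumption, not an agreed list; PROTO stands
    in for an optional tag such as EVENTLOG_PROTO.

```python
import os
import socket
import time


def new_event(text: str, **optional: str) -> dict:
    """Build an event record with defaulted core fields (hypothetical set)."""
    event = {
        "TIME": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "SOURCE": socket.gethostname(),
        "PID": str(os.getpid()),
        "PRIORITY": "info",
        "TEXT": text,
    }
    # Optional tags are present only when supplied; they may also
    # override a defaulted core field.
    event.update(optional)
    return event


event = new_event("connection refused", PROTO="tcp")
print(sorted(event))
```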
    
    I think it would be sensible to have some kind of register of well-known
    optional tags, to avoid unnecessary proliferation and duplication of
    application-specific ones.
    
    It may also make sense to have profiles, so that different implementations
    of the same thing could log the basics in a largely uniform way ("these are
    the standard tags we log when implementing a foo server"). That could be
    something like "logging considerations" in an RFC-type document for example.
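
    One way to picture a profile is as a named set of tags that a given kind
    of server is expected to log, which an analyzer or test suite can check
    against. The profile name and tag names below are invented for
    illustration.

```python
# Hypothetical profile: the standard tags a "foo server" should log.
PROFILES = {
    "smtp-server": {"TEXT", "PEER", "MSGID"},
}


def missing_tags(profile: str, event: dict) -> list:
    """List the profile's tags that are absent from the event, sorted."""
    return sorted(PROFILES[profile] - set(event))


event = {"TEXT": "message accepted", "PEER": "10.0.0.1"}
print(missing_tags("smtp-server", event))
```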
    
    Cheers,
    Frank
    
    



    This archive was generated by hypermail 2b30 : Thu Dec 19 2002 - 19:38:32 PST