Re: [logs] Syslog payload format

From: Kyle R. Hofmann (krhat_private)
Date: Wed Dec 18 2002 - 10:09:46 PST

    On Tue, 17 Dec 2002 19:14:09 +0000, "Frank O'Dwyer" wrote:
    > So the first question is what does an event consist of? Things like
    > generation time, event id, priority, source, human readable message,
    > forwarding trail, maybe some application-specific payload, what else? It's
    > easy to come up with a long shopping list, but what are the basics that
    > people could agree on, and which are optional and which are mandatory?
    
    Very few things should be mandatory, and very many should be optional.  The
    minimum necessary information for a log message to be useful, I think, is a
    timestamp, some sort of source identifier (i.e., who sent this message), and
    some statement of purpose (i.e., what this message is about).  That's not
    very useful, but I'm not sure you could reasonably expect more.  (What about,
    for example, mark messages?  "As of midnight, I, your mail server, am alive.")
    
    With that in mind, I'm going to propose a list of fields.  I owe a debt to
    mjr's logging data map at http://www.ranum.com/logging/logging-data-map.html.
    
    DATE	RFC 3339 timestamp.
    FAC	Facility.  Freeform case-insensitive alphanumeric, e.g., "mail",
    	"ftp", "ids", "firewall", "local6".
    PRI	Priority.  A single digit number, 0-9.
    PRO	Protocol.  Freeform case-insensitive alphanumeric, e.g., "smtp",
    	"ftp", "dns", "http".
    MSGID	Numeric identifier for this log message.  Freeform numeric.
    MSG	Freeform ASCII printable description of the log message.
    	Discouraged.
    
    LOGHOST	The identifier of the host sending the log.  Freeform ASCII printable,
    	preferably a DNS name, but could also be something else, e.g., a
    	Netbios workstation name.
    LOGADDR	The address of LOGHOST.  Freeform ASCII printable, preferably an
    	IPv4 or IPv6 address (assuming they're meaningful).
    LOGNET	The network of LOGHOST.  Freeform ASCII printable.
    LOGORG	The organization to which LOGHOST belongs.  Freeform ASCII printable.
    LOGDIV	The division to which LOGHOST belongs.  Freeform ASCII printable.
    LOGSET	The administrative grouping ("set") of hosts to which LOGHOST
    	belongs.  Freeform ASCII printable.
    LOGGEO	The geographic location of the host sending the log.  Freeform ASCII
    	printable, preferably a latitude and longitude.
    LOGPROG	The program on LOGHOST that emits the log message.  Freeform ASCII
    	printable. (e.g., "postfix")
    LOGMOD	The module of LOGPROG sending the message.  Freeform ASCII printable.
    	(continuing the example of LOGPROG="postfix", e.g., LOGMOD="smtpd")
    LOGPID	A numeric identifier for LOGPROG.  Freeform numeric, preferably
    	something like a Unix PID.
    LOGUSER	The user running LOGPROG.  Freeform alphanumeric.  (e.g., "bob",
    	"root")
    LOGGRP	The group running LOGPROG.  Freeform alphanumeric.  (e.g., "wheel",
    	"Power Users")
    LOGCRED	The credentials of LOGUSER or LOGGRP.  Freeform ASCII printable.
    LOGUSERCRED	The credentials of LOGUSER.  Freeform ASCII printable.
    LOGGRPCRED	The credentials of LOGGRP.  Freeform ASCII printable.
    LOGTRN	The transaction that this message references.  Freeform ASCII
    	printable.
    LOGTID	An identifier for LOGTRN.  Freeform ASCII printable; preferably a
    	number or something like it.  Probably more useful than LOGTRN.
    LOGSTAT	The status of LOGTRN.  Freeform ASCII printable.
    LOGSID	An identifier for LOGSTAT.  Freeform ASCII printable, preferably
    	a number or something like it.
    LOGSUBJ	The subject that LOGPROG is sending a message about.  Freeform ASCII
    	printable.  E.g., "connection", "message", "packet", "request"
    	(esp. in conjunction with LOGOID), or "/var/log/messages",
    	"http://www.example.com/index.html".
    LOGSID	An identifier for LOGOBJ ("object id").  Freeform ASCII printable;
    	preferably a number or something like it.  (e.g., "391458",
    	"gBHJE9T42059")
    LOGEVT	The event that LOGPROG has noticed.  Freeform ASCII printable.
    LOGEID	An identifier for LOGEVT.  Freeform ASCII printable, preferably
    	a number or something like it.
    LOGACT	The action that LOGPROG is taking.  Freeform ASCII printable.
    	This should be a short string taken from a list of possible actions,
    	e.g., "created", "deleted", "fatal error", "file not found"
    LOGAID	An identifier for LOGACT.  Freeform ASCII printable, preferably
    	a number or something like it.  E.g., "404", "200".
    LOGOBJ	The object of LOGACT (as if LOGACT were a transitive verb).  Freeform
    	ASCII printable.
    LOGOID	An identifier for LOGOBJ, preferably a number or something like it.
    LOGAUX	Auxiliary information given by LOGPROG.  Freeform ASCII printable,
    	preferably something easy to parse.
    
    SRCHOST, SRCADDR, SRCNET, SRCORG, SRCDIV, SRCSET, SRCGEO, SRCPROG, SRCMOD,
    SRCPID, SRCUSER, SRCGRP, SRCCRED, SRCUSERCRED, SRCGRPCRED, SRCTRN, SRCTID,
    SRCSTAT, SRCSID, SRCEVT, SRCEID, SRCSUBJ, SRCSID, SRCACT, SRCAID, SRCOBJ,
    SRCOID, SRCAUX all have the same meanings as their corresponding LOG* fields,
    except they refer to the source of the transaction.  Similarly, DSTHOST,
    DSTADDR, DSTNET, DSTORG, DSTDIV, DSTSET, DSTGEO, DSTPROG, DSTMOD, DSTPID,
    DSTUSER, DSTGRP, DSTCRED, DSTUSERCRED, DSTGRPCRED, DSTTRN, DSTTID, DSTSTAT,
    DSTSID, DSTEVT, DSTEID, DSTSUBJ, DSTSID, DSTACT, DSTAID, DSTOBJ, DSTOID,
    DSTAUX refer to the destination of the transaction.
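
    As a rough illustration of how regular this naming scheme is, the full set
    of field names can be generated from three prefixes and one list of
    suffixes.  This is only a sketch in Python: the function names
    all_field_names and split_field are made up for the example, and the
    suffix SID is listed once even though it appears twice in the list above.

```python
# Suffixes shared by the LOG*, SRC*, and DST* groups, taken from the list above.
# SID appears twice in the proposal; it is listed once here.
SUFFIXES = [
    "HOST", "ADDR", "NET", "ORG", "DIV", "SET", "GEO", "PROG", "MOD",
    "PID", "USER", "GRP", "CRED", "USERCRED", "GRPCRED", "TRN", "TID",
    "STAT", "SID", "EVT", "EID", "SUBJ", "ACT", "AID", "OBJ", "OID", "AUX",
]
PREFIXES = ["LOG", "SRC", "DST"]

# Fields that describe the message itself and carry no prefix.
COMMON = ["DATE", "FAC", "PRI", "PRO", "MSGID", "MSG"]


def all_field_names():
    """Return every field name in the proposal (hypothetical helper)."""
    names = list(COMMON)
    for prefix in PREFIXES:
        names.extend(prefix + suffix for suffix in SUFFIXES)
    return names


def split_field(name):
    """Split 'DSTUSER' into ('DST', 'USER'); unprefixed names give (None, name)."""
    for prefix in PREFIXES:
        rest = name[len(prefix):]
        if name.startswith(prefix) and rest in SUFFIXES:
            return prefix, rest
    return None, name


print(len(all_field_names()))
print(split_field("DSTUSER"))
```

    A parser built this way only needs to know the suffix list once, and can
    treat LOG, SRC, and DST as three instances of the same record.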
    
    My model here is the English language; though it is awful for automated
    parsing, it is very flexible.  Many English sentences consist of a subject,
    a verb, a direct object, and an indirect object.  Here, LOG*, SRC*, and DST*
    are the three nouns; LOG* will usually be the subject.  I have almost
    duplicated that structure again with *SUBJ, *ACT or *EVT, and *OBJ being the
    subject, verb, and object of a sentence.  You don't have to strictly make
    sentences out of log messages; however, I've tried manually reformatting a
    few lines, and it seems to work well.
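
    To make the sentence analogy concrete, here is a small Python sketch that
    joins the *SUBJ, *ACT (or *EVT), and *OBJ fields of one group into a
    phrase.  The function name to_sentence and the sample event are
    assumptions made for the example; the field values are borrowed from the
    examples in the field list, and no particular wire format is implied.

```python
def to_sentence(event, role="LOG"):
    """Join subject, verb, and object fields of one role into a phrase.

    The verb is *ACT if present, otherwise *EVT.  Missing fields are skipped.
    """
    verb = event.get(role + "ACT") or event.get(role + "EVT")
    parts = [event.get(role + "SUBJ"), verb, event.get(role + "OBJ")]
    return " ".join(part for part in parts if part)


# Hypothetical event, represented as a plain dict of field name -> string.
event = {
    "LOGPROG": "httpd",
    "LOGSUBJ": "request",
    "LOGACT": "file not found",
    "LOGAID": "404",
    "LOGOBJ": "/index.html",
}

print(to_sentence(event))
```

    The output is not grammatical English, but it shows that the three fields
    carry enough structure to be read in order.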
    
    The analogy with grammar has made me wonder if a tree structure would be best,
    but I'm not sure how it should be set up.  If it only duplicated what I've
    proposed above, I don't think it would be worth the effort, and if it were
    much more flexible, it would require extra effort from the programmer.
    
    Some internationalization advocate is going to complain that I keep saying
    ASCII above.  I do that because everyone can read ASCII.  Perhaps a more
    flexible solution would be to declare syslog messages to be binary (they just
    happen to have lots of ASCII characters), and then put whatever you like in
    each field.
    
    I think DATE, FAC, PRI, LOGHOST, and LOGPROG should be the only mandatory
    fields.  Any sensible message should contain more, but I'm not sure what
    other sensible restrictions we can make.
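
    A receiver could enforce that short mandatory list with a check along
    these lines.  This is a Python sketch only: the helper missing_mandatory
    is hypothetical, and events are assumed to arrive as a dict of field name
    to string.

```python
# The five fields proposed as mandatory above.
MANDATORY = ("DATE", "FAC", "PRI", "LOGHOST", "LOGPROG")


def missing_mandatory(event):
    """Return the mandatory field names that are absent or empty in event."""
    return [
        name for name in MANDATORY
        if name not in event or event[name] in ("", None)
    ]


# Hypothetical mark message that forgot to say which program sent it.
event = {
    "DATE": "2002-12-18T10:09:46-08:00",
    "FAC": "mail",
    "PRI": "6",
    "LOGHOST": "mail.example.com",
}

print(missing_mandatory(event))
```

    Everything else stays optional, so a message that passes this check may
    still carry very little information.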
    
    > Once you know the content of the object/struct, you can then worry about
    > getting it from A to B, and safely tucked into some log or other.
    
    I'd like to make one comment on timestamps.  There should be two of them, one
    from the host that receives and stores the log message, and one from the
    program that creates it (or from the host that creates it).  This is because
    they correspond to two different things: one corresponds to the event (and
    should correspond to DATE above), and the other corresponds to the message
    (and should correspond to nothing above).
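
    One way to keep the two timestamps apart is for the receiving host to add
    its own field on arrival and leave DATE untouched.  The Python sketch
    below assumes a field named RECVDATE, which is not part of the list above
    and is invented purely for illustration.

```python
from datetime import datetime, timezone


def stamp_on_receipt(event, now=None):
    """Return a copy of event with the receiver's own timestamp added.

    DATE (the sender's event time) is left as is.  RECVDATE is a
    hypothetical field name for the time the message was received.
    """
    received = dict(event)
    if now is None:
        now = datetime.now(timezone.utc)
    # isoformat() on a timezone-aware datetime yields an RFC 3339 style string.
    received["RECVDATE"] = now.isoformat()
    return received


sent = {"DATE": "2002-12-18T10:09:46-08:00", "LOGHOST": "mail.example.com"}
arrival = datetime(2002, 12, 18, 18, 9, 50, tzinfo=timezone.utc)
stored = stamp_on_receipt(sent, now=arrival)

print(stored["DATE"])
print(stored["RECVDATE"])
```

    Comparing the two values afterwards gives a rough measure of transit
    delay or clock skew between sender and receiver.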
    
    -- 
    Kyle R. Hofmann <krhat_private>
    _______________________________________________
    LogAnalysis mailing list
    LogAnalysisat_private
    http://lists.shmoo.com/mailman/listinfo/loganalysis
    


