RE: [logs] Logging: World Domination

From: Solomon, Frank (sysfrankat_private)
Date: Fri Aug 23 2002 - 13:24:10 PDT

    Greetings:
    
    (Sorry this is so long)
    
    I've been reading the discussion of the question:  "to XML or not to XML"
    and I have to admit that I'm a bit confused.
    
    Would someone seriously consider only storing their logs in XML format?  How
    would that be useful?
    
    Our central logging system uses XML.  But, it stores the log entries in a
    real relational database where it can be queried efficiently, summarized,
    and used for reporting.  When it comes time to "archive off" old entries, we
    output those (the raw entries) as XML, compress them, burn them to CD and
    file them away.  If we need to bring that data back online, it's then
    trivial to load the XML back into the database, or into another temporary
    query database.  The beauty of using XML in this way is that one can easily
    bulk load the data into Access or SQL Server or Oracle or whatever without
    having to write or define custom scripts.  One could, I suppose, use grep or
    write a perl report against an XML file, but that would be a horrible
    experience (IMHO), since the overhead of reading the tags would come into
    play with each query or report.
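
    A minimal sketch of the archive-and-reload round trip described above,
    written in Python with SQLite standing in for the real database.  The
    "Syslog" wrapper element, the three-column table, and the helper names
    are assumptions made for illustration; the actual system relies on the
    database's own XML bulk-load facilities rather than hand-written code.

```python
import sqlite3
import xml.etree.ElementTree as ET

COLUMNS = ("LogSeqNo", "Priority", "MessageText")  # subset of the schema's fields

def make_db(rows=()):
    """Create an in-memory stand-in for the log database (assumption: SQLite)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE Syslog (LogSeqNo INTEGER, Priority TEXT, MessageText TEXT)")
    db.executemany("INSERT INTO Syslog VALUES (?, ?, ?)", rows)
    return db

def archive_to_xml(db):
    """Dump every row as a <row> element with one attribute per column."""
    root = ET.Element("Syslog")  # hypothetical wrapper element
    query = "SELECT LogSeqNo, Priority, MessageText FROM Syslog ORDER BY LogSeqNo"
    for values in db.execute(query):
        ET.SubElement(root, "row", {name: str(v) for name, v in zip(COLUMNS, values)})
    return ET.tostring(root, encoding="unicode")

def load_from_xml(xml_text, db):
    """Load archived <row> elements back into a database; return the row count."""
    rows = [
        (int(r.get("LogSeqNo")), r.get("Priority"), r.get("MessageText"))
        for r in ET.fromstring(xml_text).iter("row")
    ]
    db.executemany("INSERT INTO Syslog VALUES (?, ?, ?)", rows)
    return len(rows)

original = [
    (1, "INFO", "session opened for user root"),
    (2, "CRITICAL", "disk sda1 at 99% and rising"),
]
live = make_db(original)
archive = archive_to_xml(live)           # what would be compressed and burned to CD
restored = make_db()                     # e.g. a temporary query database
count = load_from_xml(archive, restored)
restored_rows = restored.execute("SELECT * FROM Syslog ORDER BY LogSeqNo").fetchall()
```

    The same archive text could be loaded into any other database by
    swapping out the connection, which is the portability argument made
    above.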
    
    Like the cartoon character Dilbert, it seems we always advocate building a
    database for any new project.
    
    I'm probably a step behind everyone else on this, because I don't attempt to
    parse each message and assign XML tags to the message content.  Once again,
    I depend on the database's text query ability to take care of that for me.
    Thus, my XML schema is extremely simple:
    
    <?xml version="1.0" encoding="utf-8" ?>
    <xs:schema id="Syslog" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
    	<xs:element name="row" sql:relation="Syslog">
    		<xs:complexType>
    			<xs:attribute name="LogSeqNo" type="xs:string"
    sql:field="LogSeqNo" sql:datatype="numeric(12)" />
    			<xs:attribute name="DateTimeLocal"
    type="xs:dateTime" sql:field="DateTimeLocal" sql:datatype="datetime" />
    			<xs:attribute name="DateTimeUTC" type="xs:dateTime"
    sql:field="DateTimeUTC" sql:datatype="datetime" />
    			<xs:attribute name="DateTimeRFC822Local"
    type="xs:string" sql:field="DateTimeRFC822Local" sql:datatype="char(31)" />
    			<xs:attribute name="DateTimeRFC822UTC"
    type="xs:string" sql:field="DateTimeRFC822UTC" sql:datatype="char(31)" />
    			<xs:attribute name="IPAddressSource"
    type="xs:string" sql:field="IPAddressSource" sql:datatype="char(15)" />
    			<xs:attribute name="HostnameSource" type="xs:string"
    sql:field="HostnameSource" sql:datatype="varchar(255)" />
    			<xs:attribute name="IPAddressDestination"
    type="xs:string" sql:field="IPAddressDestination" sql:datatype="char(15)" />
    			<xs:attribute name="Facility" type="xs:string"
    sql:field="Facility" sql:datatype="char(9)" />
    			<xs:attribute name="Priority" type="xs:string"
    sql:field="Priority" sql:datatype="char(9)" />
    			<xs:attribute name="MessageText" type="xs:string"
    sql:field="MessageText" sql:datatype="varchar(1024)" />
    		</xs:complexType>
    	</xs:element>
    </xs:schema>
    
    For those of you who don't speak Microsoft, this schema includes the mapping
    to a SQL Server database; that's what the sql namespace is for.
    
    My point is that the entire text of the Syslog message is contained in the
    "MessageText" field.  Once again, outputting from the database to the XML
    file automatically takes care of quoting special characters in the text,
    which eliminates one of the common problems with delimited files (as someone
    pointed out in an earlier portion of the discussion).  This also ensures
    that when the data is re-imported, it will come back into the database
    intact.  Note also that the type conversion is handled automatically by the
    database import facilities.  This all seems trivial, but having written a
    large number of control files for the bulk-loading of delimited data into
    various databases, I don't consider writing them fun; it's rare that they
    work (for me) the first time.
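
    To illustrate the quoting point, here is a small Python sketch (not
    the database's own export path) that writes the same message once as a
    naive comma-delimited record and once as an XML attribute.  The sample
    message and field values are made up for the example.

```python
import xml.etree.ElementTree as ET

message = 'login failed, user="bob" <tty1> & retrying'

# Naive comma-delimited record: the comma inside the message adds a bogus field.
delimited = ",".join(["42", "WARNING", message])
naive_fields = delimited.split(",")

# Same record as an XML <row>: the serializer escapes &, <, > and the quotes.
row = ET.Element("row", {"LogSeqNo": "42", "Priority": "WARNING", "MessageText": message})
xml_text = ET.tostring(row, encoding="unicode")

# Parsing the XML back yields the original message text unchanged.
recovered = ET.fromstring(xml_text).get("MessageText")
```

    The delimited record splits into four fields instead of three, while
    the XML version returns the message intact without any custom quoting
    rules.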
    
    I haven't had that trouble with XML, so that's why I decided to use it.
    
    Having a database that will output in XML is also really handy when it comes
    time to display the log entries on a web page.  With a few quick XSL style
    sheets one can present the XML formatted SQL query results in a respectable,
    if not particularly fancy way.  Color coding by the contents of the Priority
    field can make those "CRITICALS" really stand out from the sea of "INFO"
    messages.  Yes, there are many other ways of doing this, but there's
    something really elegant about the XSL way.
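
    The color-coding idea can be sketched as follows.  This is not an XSL
    style sheet: Python's standard library has no XSLT processor, so the
    same transformation (XML rows in, HTML table out) is written with
    ElementTree instead.  The color choices, the "Syslog" wrapper element,
    and the sample rows are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical priority-to-color mapping; the original post names no colors.
COLORS = {"CRITICAL": "red", "WARNING": "orange"}

def rows_to_html(xml_text):
    """Render <row> elements as an HTML table, coloring each row by Priority."""
    table = ET.Element("table")
    for row in ET.fromstring(xml_text).iter("row"):
        # Priority maps to char(9) in the schema, so strip any padding.
        priority = row.get("Priority", "").strip()
        style = "color: " + COLORS.get(priority, "black")
        tr = ET.SubElement(table, "tr", {"style": style})
        for name in ("DateTimeLocal", "HostnameSource", "Priority", "MessageText"):
            ET.SubElement(tr, "td").text = row.get(name, "")
    return ET.tostring(table, encoding="unicode", method="html")

sample = (
    '<Syslog>'
    '<row DateTimeLocal="2002-08-23T13:24:10" HostnameSource="web1" '
    'Priority="INFO" MessageText="session opened"/>'
    '<row DateTimeLocal="2002-08-23T13:24:11" HostnameSource="web1" '
    'Priority="CRITICAL" MessageText="disk full"/>'
    '</Syslog>'
)
page = rows_to_html(sample)
```

    An XSL version would express the same mapping declaratively, with a
    template matching each row and a choice on the Priority attribute.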
    
    Using XML between the logging client and the logging server when they are
    operating in a real-time mode seems problematic to me.  But, it might be
    useful if logs were accumulated on the client and needed to be batch loaded
    into a central database.  We try to avoid such batch loading for two reasons:
    First, leaving the logs on the server where they were generated defeats
    one of the primary purposes of centralizing them, viz., to prevent
    tampering in case of a compromise or loss in case of a hardware failure.
    Second, when the logs are not shipped in real-time, one loses the common
    time reference provided by the arrival time on the central log server,
    i.e., one ends up trusting the remote clock on the originating server
    for the time.  This
    leaves another opening for compromise, and makes it difficult to compare the
    occurrence times of events across servers.  We use NTP, but it isn't always
    reliable.
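
    A minimal Python sketch of the arrival-time argument: the central
    server records its own receive time as the trusted timestamp and
    treats the client-reported time only as a hint.  The function name and
    the five-second skew threshold are hypothetical, chosen for the
    example.

```python
from datetime import datetime, timedelta, timezone

def stamp_arrival(client_time_utc, arrival_time_utc, max_skew=timedelta(seconds=5)):
    """Return (trusted_time, skew, suspicious) for one received log entry.

    The trusted time is always the central server's arrival time; the
    client's clock is only compared against it to flag large disagreements.
    """
    skew = client_time_utc - arrival_time_utc
    suspicious = abs(skew) > max_skew
    return arrival_time_utc, skew, suspicious

# In a live server, arrival would be datetime.now(timezone.utc) at receipt.
arrival = datetime(2002, 8, 23, 20, 24, 10, tzinfo=timezone.utc)
in_sync = stamp_arrival(arrival - timedelta(seconds=1), arrival)
drifted = stamp_arrival(arrival - timedelta(hours=2), arrival)
```

    With batch loading there is no meaningful arrival time per entry, so
    this comparison is unavailable and only the client's clock remains.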
    
    I'll stop rambling now. . .
    
    Frank Solomon
    University of Kentucky
    http://www.franksolomon.net
    
    _______________________________________________
    LogAnalysis mailing list
    LogAnalysisat_private
    http://lists.shmoo.com/mailman/listinfo/loganalysis
    


