RE: [logs] Logging: World Domination

sysfrankat_private

Greetings:

(Sorry this is so long)

I've been reading the discussion of the question:  "to XML or not to XML"
and I have to admit that I'm a bit confused.

Would someone seriously consider only storing their logs in XML format?  How
would that be useful?

Our central logging system uses XML.  But, it stores the log entries in a
real relational database where it can be queried efficiently, summarized,
and used for reporting.  When it comes time to "archive off" old entries, we
output those (the raw entries) as XML, compress them, burn them to CD and
file them away.  If we need to bring that data back online, it's then
trivial to load the XML back into the database, or into another temporary
query database.  The beauty of using XML in this way is that one can easily
bulk load the data into Access or SQL Server or Oracle or whatever without
having to write or define custom scripts.  One could, I suppose, use grep or
write a Perl report against an XML file, but that would be a horrible
experience (IMHO), since the overhead of reading the tags would come into
play with each query or report.
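
A minimal sketch of that reload step, for anyone who wants to see the shape
of it.  It is not the bulk-load path described above: Python and SQLite stand
in for the real database, the <Syslog> wrapper element and the sample rows are
made up for illustration, and only a subset of the columns is used.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical archive excerpt: attribute-centric <row> elements inside an
# assumed <Syslog> wrapper element.
SAMPLE = """<Syslog>
  <row LogSeqNo="1" DateTimeUTC="2002-08-20T14:03:11" HostnameSource="web1"
       Facility="auth" Priority="CRITICAL"
       MessageText="login failed for &lt;root&gt; &amp; account locked"/>
  <row LogSeqNo="2" DateTimeUTC="2002-08-20T14:03:15" HostnameSource="web2"
       Facility="daemon" Priority="INFO" MessageText="service started"/>
</Syslog>
"""

# Subset of the schema's attributes, chosen to keep the example short.
COLUMNS = ["LogSeqNo", "DateTimeUTC", "HostnameSource",
           "Facility", "Priority", "MessageText"]


def load_rows(xml_text, conn):
    """Load every <row> element into a Syslog table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS Syslog ("
        "LogSeqNo INTEGER, DateTimeUTC TEXT, HostnameSource TEXT, "
        "Facility TEXT, Priority TEXT, MessageText TEXT)"
    )
    root = ET.fromstring(xml_text)
    # The parser has already unescaped &lt; &gt; &amp; in attribute values.
    rows = [tuple(row.get(col) for col in COLUMNS) for row in root.iter("row")]
    conn.executemany("INSERT INTO Syslog VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(load_rows(SAMPLE, conn), "rows loaded")
    for (text,) in conn.execute(
        "SELECT MessageText FROM Syslog WHERE Priority = 'CRITICAL'"
    ):
        print(text)
```

Once the rows are in a table, the queries and reports run against the
database rather than against the tags.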

Like the cartoon character Dilbert, it seems we always advocate building a
database for any new project.

I'm probably a step behind everyone else on this, because I don't attempt to
parse each message and assign XML tags to the message content.  Once again,
I depend on the database's text query ability to take care of that for me.
Thus, my XML schema is extremely simple:

<?xml version="1.0" encoding="utf-8" ?>
<xs:schema id="Syslog" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
	<xs:element name="row" sql:relation="Syslog">
		<xs:complexType>
			<xs:attribute name="LogSeqNo" type="xs:string"
sql:field="LogSeqNo" sql:datatype="numeric(12)" />
			<xs:attribute name="DateTimeLocal"
type="xs:dateTime" sql:field="DateTimeLocal" sql:datatype="datetime" />
			<xs:attribute name="DateTimeUTC" type="xs:dateTime"
sql:field="DateTimeUTC" sql:datatype="datetime" />
			<xs:attribute name="DateTimeRFC822Local"
type="xs:string" sql:field="DateTimeRFC822Local" sql:datatype="char(31)" />
			<xs:attribute name="DateTimeRFC822UTC"
type="xs:string" sql:field="DateTimeRFC822UTC" sql:datatype="char(31)" />
			<xs:attribute name="IPAddressSource"
type="xs:string" sql:field="IPAddressSource" sql:datatype="char(15)" />
			<xs:attribute name="HostnameSource" type="xs:string"
sql:field="HostnameSource" sql:datatype="varchar(255)" />
			<xs:attribute name="IPAddressDestination"
type="xs:string" sql:field="IPAddressDestination" sql:datatype="char(15)" />
			<xs:attribute name="Facility" type="xs:string"
sql:field="Facility" sql:datatype="char(9)" />
			<xs:attribute name="Priority" type="xs:string"
sql:field="Priority" sql:datatype="char(9)" />
			<xs:attribute name="MessageText" type="xs:string"
sql:field="MessageText" sql:datatype="varchar(1024)" />
		</xs:complexType>
	</xs:element>
</xs:schema>

For those of you who don't speak Microsoft, this schema includes the mapping
to a SQL Server database; that's what the sql namespace is all about.

My point is that the entire text of the Syslog message is contained in the
"MessageText" field.  Once again, outputting from the database to the XML
file automatically takes care of escaping special characters in the text,
which eliminates one of the common problems with delimited files (as someone
pointed out in an earlier portion of the discussion).  This also ensures
that when the data is re-imported, it will come back into the database
intact.  Note also that the type conversion is handled automatically by the
database import facilities.  This all seems trivial, but having written a
large number of control files for the bulk-loading of delimited data into
various databases, I don't consider writing them fun; it's rare that they
work (for me) the first time.

I haven't had that trouble with XML; so, that's why I decided to use it.
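
To illustrate the escaping point, here is a small round-trip sketch.  It
uses Python's standard ElementTree rather than the database's own XML
export, and the sample message is invented; the idea is only that the XML
writer escapes the awkward characters and the parser restores them.

```python
import xml.etree.ElementTree as ET


def row_to_xml(fields):
    """Serialize one log entry as an attribute-centric <row/> element."""
    row = ET.Element("row", {name: str(value) for name, value in fields.items()})
    # The serializer escapes &, <, > and double quotes in attribute values.
    return ET.tostring(row, encoding="unicode")


def xml_to_row(xml_text):
    """Parse a <row/> element back into a plain dict of its attributes."""
    return dict(ET.fromstring(xml_text).attrib)


if __name__ == "__main__":
    # Invented message containing characters that break naive delimited files.
    original = {
        "Priority": "INFO",
        "MessageText": 'su: "bob" <tty1> failed, a|b,c & d',
    }
    serialized = row_to_xml(original)
    print(serialized)
    print(xml_to_row(serialized) == original)
```

Commas, pipes and quotes inside the message need no special handling,
because the field boundaries come from the markup and not from a delimiter.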

Having a database that will output in XML is also really handy when it comes
time to display the log entries on a web page.  With a few quick XSL style
sheets one can present the XML formatted SQL query results in a respectable,
if not particularly fancy way.  Color coding by the contents of the Priority
field can make those "CRITICALS" really stand out from the sea of "INFO"
messages.  Yes, there are many other ways of doing this, but there's
something really elegant about the XSL way.
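
For readers without an XSL processor handy, the color-coding idea can be
sketched in plain Python.  This is explicitly not the XSL approach, just
"one of the many other ways"; the color values, the WARNING level and the
<Syslog> wrapper element are assumptions made for the example.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical priority-to-color mapping; anything unlisted stays white.
PRIORITY_COLORS = {"CRITICAL": "#ffb3b3", "WARNING": "#fff0b3"}


def rows_to_html(xml_text):
    """Render <row> elements as an HTML table, shaded by Priority."""
    root = ET.fromstring(xml_text)
    lines = ["<table>"]
    for row in root.iter("row"):
        # char(9) columns may come back space-padded, so strip first.
        priority = row.get("Priority", "").strip()
        color = PRIORITY_COLORS.get(priority.upper(), "#ffffff")
        lines.append(
            '<tr style="background:{}"><td>{}</td><td>{}</td></tr>'.format(
                color,
                html.escape(priority),
                html.escape(row.get("MessageText", "")),
            )
        )
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = (
        '<Syslog>'
        '<row Priority="CRITICAL" MessageText="disk &lt;sda&gt; failed"/>'
        '<row Priority="INFO" MessageText="service started"/>'
        '</Syslog>'
    )
    print(rows_to_html(sample))
```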

Using XML between the logging client and the logging server when they are
operating in a real-time mode seems problematic to me.  But, it might be
useful if logs were accumulated on the client and needed to be batch loaded
into a central database.  We try to avoid such batch loading for two reasons:
First, it defeats one of the primary purposes of centralizing the logs if
they remain on the server where they were generated, viz., to prevent
tampering in case of a compromise or loss in case of a hardware failure.
Second, when the logs are not shipped in real-time, one loses the standard
provided by the arrival time on the central log server, i.e., one ends up
trusting the remote clock on the originating server for the time.  This
leaves another opening for compromise, and makes it difficult to compare the
occurrence times of events across servers.  We use NTP, but it isn't always
reliable.
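
The arrival-time point can be sketched as follows.  This is an
illustration only, not our actual collector: the function name, the
ClockSkewSeconds field and the optional "claimed" timestamp are all
hypothetical.  The central server stamps each message with its own UTC
clock on receipt and treats the sender's timestamp, if any, as advisory.

```python
from datetime import datetime, timezone


def stamp_on_arrival(message_text, source_ip, claimed_utc=None, now=None):
    """Build a log record using the central server's clock as the reference.

    claimed_utc is the (untrusted) timestamp supplied by the sender, if any.
    now exists only so the example can be run with a fixed clock.
    """
    arrival = now or datetime.now(timezone.utc)
    skew = None
    if claimed_utc is not None:
        # Positive skew: the sender's clock is behind the central server,
        # or the message was delayed in transit.
        skew = (arrival - claimed_utc).total_seconds()
    return {
        "DateTimeUTC": arrival.isoformat(),
        "IPAddressSource": source_ip,
        "MessageText": message_text,
        "ClockSkewSeconds": skew,
    }


if __name__ == "__main__":
    record = stamp_on_arrival(
        "sshd: session opened",
        "192.0.2.10",
        claimed_utc=datetime(2002, 8, 20, 14, 0, 0, tzinfo=timezone.utc),
        now=datetime(2002, 8, 20, 14, 0, 5, tzinfo=timezone.utc),
    )
    print(record)
```

With batch loading there is no arrival stamp to compare against, so only
the sender's claimed time remains.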

I'll stop rambling now. . .

Frank Solomon
University of Kentucky
http://www.franksolomon.net

_______________________________________________
LogAnalysis mailing list
LogAnalysisat_private
http://lists.shmoo.com/mailman/listinfo/loganalysis