Greetings:

(Sorry this is so long)

I've been reading the discussion of the question: "to XML or not to XML" and I have to admit that I'm a bit confused. Would someone seriously consider only storing their logs in XML format? How would that be useful?

Our central logging system uses XML. But, it stores the log entries in a real relational database where it can be queried efficiently, summarized, and used for reporting. When it comes time to "archive off" old entries, we output those (the raw entries) as XML, compress them, burn them to CD and file them away. If we need to bring that data back online, it's then trivial to load the XML back into the database, or into another temporary query database. The beauty of using XML in this way is that one can easily bulk load the data into Access or SQL Server or Oracle or whatever without having to write or define custom scripts.

One could, I suppose, use grep or write a Perl report against an XML file, but that would be a horrible experience (IMHO), since the overhead of reading the tags would come into play with each query or report. Like the cartoon character Dilbert, it seems we always advocate building a database for any new project.

I'm probably a step behind everyone else on this, because I don't attempt to parse each message and assign XML tags to the message content. Once again, I depend on the database's text query ability to take care of that for me.
Thus, my XML schema is extremely simple:

<?xml version="1.0" encoding="utf-8" ?>
<xs:schema id="Syslog" xmlns=""
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
  <xs:element name="row" sql:relation="Syslog">
    <xs:complexType>
      <xs:attribute name="LogSeqNo" type="xs:string" sql:field="LogSeqNo" sql:datatype="numeric(12)" />
      <xs:attribute name="DateTimeLocal" type="xs:dateTime" sql:field="DateTimeLocal" sql:datatype="datetime" />
      <xs:attribute name="DateTimeUTC" type="xs:dateTime" sql:field="DateTimeUTC" sql:datatype="datetime" />
      <xs:attribute name="DateTimeRFC822Local" type="xs:string" sql:field="DateTimeRFC822Local" sql:datatype="char(31)" />
      <xs:attribute name="DateTimeRFC822UTC" type="xs:string" sql:field="DateTimeRFC822UTC" sql:datatype="char(31)" />
      <xs:attribute name="IPAddressSource" type="xs:string" sql:field="IPAddressSource" sql:datatype="char(15)" />
      <xs:attribute name="HostnameSource" type="xs:string" sql:field="HostnameSource" sql:datatype="varchar(255)" />
      <xs:attribute name="IPAddressDestination" type="xs:string" sql:field="IPAddressDestination" sql:datatype="char(15)" />
      <xs:attribute name="Facility" type="xs:string" sql:field="Facility" sql:datatype="char(9)" />
      <xs:attribute name="Priority" type="xs:string" sql:field="Priority" sql:datatype="char(9)" />
      <xs:attribute name="MessageText" type="xs:string" sql:field="MessageText" sql:datatype="varchar(1024)" />
    </xs:complexType>
  </xs:element>
</xs:schema>

For those of you who don't speak Microsoft, this schema includes the mapping to a SQL Server database; that's what the sql namespace is all about. My point is that the entire text of the Syslog message is contained in the "MessageText" field. Once again, outputting from the database to the XML file automatically takes care of quoting special characters in the text, which eliminates one of the common problems with delimited files (as someone pointed out in an earlier portion of the discussion).
This also ensures that when the data is re-imported, it will come back into the database intact. Note also that the type conversion is handled automatically by the database import facilities. This all seems trivial, but having written a large number of control files for the bulk-loading of delimited data into various databases, I don't consider writing them fun; it's rare that they work (for me) the first time. I haven't had that trouble with XML, so that's why I decided to use it.

Having a database that will output in XML is also really handy when it comes time to display the log entries on a web page. With a few quick XSL style sheets one can present the XML-formatted SQL query results in a respectable, if not particularly fancy, way. Color coding by the contents of the Priority field can make those "CRITICALS" really stand out from the sea of "INFO" messages. Yes, there are many other ways of doing this, but there's something really elegant about the XSL way.

Using XML between the logging client and the logging server when they are operating in a real-time mode seems problematic to me. But, it might be useful if logs were accumulated on the client and needed to be batch loaded into a central database. We try to avoid such batch loading for two reasons:

First, leaving the logs on the server where they were generated defeats one of the primary purposes of centralizing them, viz., preventing tampering in case of a compromise, or loss in case of a hardware failure.

Second, when the logs are not shipped in real-time, one loses the standard provided by the arrival time on the central log server, i.e., one ends up trusting the remote clock on the originating server for the time. This leaves another opening for compromise, and makes it difficult to compare the occurrence times of events across servers. We use NTP, but it isn't always reliable.

I'll stop rambling now. . .
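For anyone who wants to see the archive-and-reload round trip in miniature, here is a rough sketch. It assumes Python and an in-memory SQLite database as stand-ins for the real database and its XML bulk-load facility, and it uses only a subset of the schema's columns. The function names (make_db, export_xml, import_xml, round_trip), the <Syslog> wrapper element, and the sample host name and timestamp are all hypothetical; none of this is the SQL Server setup described above.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Subset of the attributes from the schema above (an assumption, to keep this short).
COLUMNS = ["LogSeqNo", "DateTimeUTC", "HostnameSource", "Priority", "MessageText"]


def make_db():
    """Create an in-memory SQLite table standing in for the Syslog table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE Syslog (LogSeqNo INTEGER, DateTimeUTC TEXT, "
        "HostnameSource TEXT, Priority TEXT, MessageText TEXT)"
    )
    return db


def export_xml(db):
    """Write every row as a <row> element with one attribute per column.

    ElementTree escapes &, <, > and quotes in attribute values, so the
    message text needs no hand-written quoting.
    """
    root = ET.Element("Syslog")  # hypothetical wrapper element
    query = "SELECT {} FROM Syslog ORDER BY LogSeqNo".format(", ".join(COLUMNS))
    for record in db.execute(query):
        attrs = {name: str(value) for name, value in zip(COLUMNS, record)}
        ET.SubElement(root, "row", attrs)
    return ET.tostring(root, encoding="unicode")


def import_xml(db, xml_text):
    """Load <row> elements back into the table.

    Values arrive as strings; SQLite's INTEGER column affinity converts
    LogSeqNo back to a number on insert.
    """
    for row in ET.fromstring(xml_text).iter("row"):
        db.execute(
            "INSERT INTO Syslog VALUES (?, ?, ?, ?, ?)",
            [row.get(name) for name in COLUMNS],
        )


def round_trip(messages):
    """Archive the given messages to XML, then reload them into a fresh database."""
    source = make_db()
    for seq, text in enumerate(messages, start=1):
        source.execute(
            "INSERT INTO Syslog VALUES (?, ?, ?, ?, ?)",
            (seq, "2002-08-23T20:35:41", "host1.example.com", "INFO", text),
        )
    archive = export_xml(source)
    restored = make_db()
    import_xml(restored, archive)
    rows = restored.execute(
        "SELECT LogSeqNo, MessageText FROM Syslog ORDER BY LogSeqNo"
    ).fetchall()
    return archive, rows
```

Calling round_trip with a message such as 'su: FAILED <root> & "quoted" on tty1' should hand back the same text from the restored table, while the archived XML carries it in escaped form. A real deployment would use the database's own XML export and bulk-load tools rather than a loop like this.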
Frank Solomon
University of Kentucky
http://www.franksolomon.net