Re: Re[2]: [logs] Logging: World Domination

From: Bennett Todd (betat_private)
Date: Thu Aug 22 2002 - 09:58:12 PDT


    2002-08-21-17:59:58 Chris Adams:
    > On Wednesday, August 21, 2002, at 12:05 , Greg Black wrote:
    > >| if you propose something like this and don't use XML, the first
    > >| question you're going to get will invariably be "why didn't you
    > >| use XML?"
    > >
    > >To which a reasonable answer is: "because it sucks."
    > 
    > Hyperbole is unlikely to prove very persuasive here.
    
    I can't speak for Greg's intent here, but I didn't take his comment
    as hyperbole, just a calm and reasonable statement of the well-known
    truth.
    
    He put it compactly. A more elaborate statement might be "XML is a
    very heavy-weight framework for constructing languages; while it may
    be valuable in certain contexts involving highly automated,
    distributed, and heterogeneous maintenance of gigantic corpora of
    structured text, both the XML specification itself and the tools
    available to help implement it are vastly too complex for many,
    perhaps most, of the jobs to which people try to apply it."
    
    Or, in short, XML sucks. At least for anything besides a somewhat
    cleaned-up replacement for SGML.
    
    > Not using XML means giving up everything from the existing
    > parsers and language support to the XML support many databases
    > are starting to have (given the value of a database for ad-hoc
    > queries, I'm inclined to say that's worth a little bloat to get
    > all of your logs into one).
    
    Ad-hoc queries are practical with various frameworks. XML-based
    databases are just the hairiest, slowest, and most fragile of them.
    Giving up existing parsers and language support is only a negative
    if there exists a securely-written, high-performance, portable XML
    parser toolkit. I've never heard of one. So it sounds like giving up
    XML support tools would be beneficial for this application.
    
    > The size issue becomes a lot less of a problem if you've designed your 
    > DTD properly (e.g. resisting the urge to be unnecessarily verbose - 
    > <event host="..." timestamp="1234567890"> instead of 
    > <event><ip_hostname>fqdn.example.com</ip_hostname><timestamp>Fri Feb 13 
    > 15:31:30 PST 2009</timestamp>) and are using compression.
    
    But even:
    
    	<event host="..." timestamp="1234567890">
    
    would seem to me to be less desirable than
    
    	1234567890 ...
    
    I sure know which I'd rather parse.
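    
    To make the comparison concrete, here is a minimal sketch in Python
    that parses both forms. It is an illustration only: the post elides
    everything after the timestamp, so the host field in the flat record,
    the self-closing form of the XML element, and the function names are
    all assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed sample records. The original post elides the rest of each line,
# so the host value here is purely illustrative.
XML_LINE = '<event host="fqdn.example.com" timestamp="1234567890"/>'
FLAT_LINE = "1234567890 fqdn.example.com"


def parse_xml_event(line):
    """Parse the attribute-style XML record with the standard-library parser."""
    elem = ET.fromstring(line)
    return int(elem.attrib["timestamp"]), elem.attrib["host"]


def parse_flat_event(line):
    """Parse the whitespace-separated record: timestamp first, then the rest."""
    timestamp, host = line.split(None, 1)
    return int(timestamp), host


print(parse_xml_event(XML_LINE))
print(parse_flat_event(FLAT_LINE))
```

    Both functions return the same (timestamp, host) pair; the flat form
    gets there with a single split and no parser library.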
    
    > The processing time concern is more of a problem but XML parsers have 
    > advanced considerably over the last few years. A well designed DTD 
    > should be surprisingly close to something like the typical Perl script 
    > which has to parse all of the slightly different variations of the same 
    > syslog message.
    
    The point of this discussion (assuming I've understood Tina's intent
    properly) is to do away with the slightly different variations, to
    produce a canonical structured format suitable for highly automated
    processing. I've yet to see anything XML would add to this that I
    would actually want added.
    
    > In both cases, neither would be a significant problem even now and 
    > Moore's law suggests this won't change for the worse.
    
    Extra complexity needs some justification. How will XML improve our
    position relative to a few fixed fields followed by
    hierarchically-assigned tokens? A whitespace-separated token list is
    sufficiently expressive for everything I've heard claimed that we
    want to do now; and the increased flexibility that XML offers would
    seem to me to be a negative for this job.
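    
    As a rough sketch of what "a few fixed fields followed by
    hierarchically-assigned tokens" could look like in practice, the
    Python below assumes two fixed fields (timestamp and host) and
    dot-separated token names. None of this was specified on the list:
    the field choice, the dot separator, the sample token names, and the
    helper names are hypothetical.

```python
from typing import NamedTuple, Tuple


class Event(NamedTuple):
    timestamp: int                       # assumed fixed field 1: epoch seconds
    host: str                            # assumed fixed field 2: originating host
    tokens: Tuple[Tuple[str, ...], ...]  # each token split into its hierarchy levels


def parse_event(line: str) -> Event:
    """Split a record into its fixed fields and a list of hierarchical tokens."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("record needs at least a timestamp and a host")
    timestamp, host, *rest = fields
    # Hypothetical convention: levels of the token hierarchy are joined by dots.
    return Event(int(timestamp), host, tuple(tuple(tok.split(".")) for tok in rest))


def has_prefix(event: Event, prefix: str) -> bool:
    """True if any token falls under the given branch of the hierarchy."""
    want = tuple(prefix.split("."))
    return any(tok[:len(want)] == want for tok in event.tokens)


event = parse_event("1234567890 fqdn.example.com auth.login.failure net.ssh")
print(event.timestamp, event.host, event.tokens)
print(has_prefix(event, "auth.login"))
```

    Matching on a whole hierarchy level rather than on a string prefix
    keeps "auth.log" from accidentally matching "auth.login".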
    
    Or, to put it succinctly, "XML sucks".
    
    -Bennett
    
    
    



