(Since people requested that this discussion continue on-list, I'm
retracting my earlier abjuration of this subthread.)

On Wed, Jun 05, 2002 at 03:36:26PM -0400, Marcus J. Ranum wrote:
> weaknesses of the technique. :) Anyhow - I know you didn't
> intend to hit a hot button but calling something "religious" is
> a hefty insult where I come from. ;)

Point taken.  $my_statement =~ s/religious/conditioned/g  :)

> You are correct. What you're saying is that with sufficiently
> energetic application of duct tape spit and baling wire you
> can build Notre Dame Cathedral. I'll grant you that. But I think
> that developing vastly superior syntax(es) and approaches for
> the kind of parsing needed for log analysis would take much less
> time than even bothering to understand the regexps and overcome
> their flaws. :)

Again, I think this all depends on where one's prior experience
lies.  For people unfamiliar with the ins and outs of regexes, but
familiar with parsing in general, this is probably true.

> I didn't even go into the question of parsing binary data, which
> would be darned useful for any advanced log parser...

That's actually a hairy beast of a different sort; I know of no log
parser today that makes parsing binary data easy.  That's yet
another argument in favor of starting this endeavor with a set of
grammars, since binary data can at least be represented in the
grammar as some sort of placeholder node while the details are
worked out.

> If the industry is going to make big strides in log parsing
> it's got to be sh&t simple to write new parse rules for new
> log messages as they appear.

During my brief tenure at Counterpane, I was in the group
responsible for producing new parse rules for a regex-based parse
implementation, so I'm _very_ aware of how important something like
this is, and how hard it is with a pure-regex implementation.  :)

> I know you're not. :) And I'm not trying to be deliberately a pain
> in the neck on this issue - I just fear that a lot of people who
> look at the log parsing problem immediately reach for duct tape,
> spit, and baling wire and start hammering nails without thinking
> the problem through. And - in this environment - perl regexps appear
> to be the most popular form of duct tape. ;) It's a shame to think
> that lots of people are going to burn lots of brain cycles
> re-implementing the same things that don't work very well when
> it'd be really straightforward for someone to blast a single
> bullet through the whole problem.

I guess I'm arguing the reverse--rather than waiting for someone to
figure out what that single bullet is, let's define the problem set
more explicitly, so that anyone who wants to come up with the single
bullet can do so confidently.  Basically: if we can come up with a
repository of sample messages, and a set of grammars that describes
all of those messages in an easily extensible way, then we can set
people loose on the problem of creating an engine to implement those
grammars, with a high degree of certainty that they're solving the
right problem.  Until we have the abstraction layer that those
grammars provide, however, any such engine implementations are just
shots in the dark.  In the interim, any engine that people want to
use for testing those grammars should be fine, so long as the
metagrammar is well-defined.

-- Sweth.

--
Sweth Chandramouli      Idiopathic Systems Consulting
svcat_private      http://www.idiopathic.net/
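
To make the grammar-first proposal above a little more concrete, here is a
minimal Python sketch of what "grammars plus a throwaway test engine" might
look like.  Everything in it is an assumption made for illustration: the node
types (Literal, Field, Placeholder), the compile_grammar and parse helpers,
the two grammars, and the sample messages are all invented, and none of them
come from this thread or from any existing tool.  The engine here happens to
translate each grammar into a regex, which is only one possible interim
choice; the point is that the grammar, not the engine, carries the
description of the message, and that binary data can sit behind a
placeholder node until its details are worked out.

```python
import re
from dataclasses import dataclass


# Hypothetical grammar node types; these names are illustrative only.
@dataclass(frozen=True)
class Literal:
    text: str       # fixed text that must appear verbatim


@dataclass(frozen=True)
class Field:
    name: str       # key under which the captured value is reported
    pattern: str    # regex fragment describing the value


@dataclass(frozen=True)
class Placeholder:
    name: str       # payload (e.g. binary data) with no defined structure yet


def compile_grammar(nodes):
    """Translate a grammar (a sequence of nodes) into one compiled regex."""
    parts = []
    for node in nodes:
        if isinstance(node, Literal):
            parts.append(re.escape(node.text))
        elif isinstance(node, Field):
            parts.append(f"(?P<{node.name}>{node.pattern})")
        elif isinstance(node, Placeholder):
            # Accept anything for now; refine once the details are worked out.
            parts.append(f"(?P<{node.name}>.*)")
        else:
            raise TypeError(f"unknown grammar node: {node!r}")
    return re.compile("".join(parts), re.DOTALL)


def parse(grammars, message):
    """Return (grammar_name, fields) for the first matching grammar, else None."""
    for name, nodes in grammars.items():
        match = compile_grammar(nodes).fullmatch(message)
        if match:
            return name, match.groupdict()
    return None


# Made-up grammars and sample messages, standing in for a shared repository.
GRAMMARS = {
    "login_failed": [
        Literal("login failed for user "),
        Field("user", r"\w+"),
        Literal(" from "),
        Field("host", r"[\w.]+"),
    ],
    "raw_dump": [
        Literal("dump "),
        Field("length", r"\d+"),
        Literal(": "),
        Placeholder("payload"),
    ],
}

SAMPLES = [
    "login failed for user alice from 10.0.0.5",
    "dump 4: \x00\x01\x02\x03",
    "some message nobody has described yet",
]

if __name__ == "__main__":
    # Report which grammar (if any) covers each sample message.
    for sample in SAMPLES:
        print(repr(sample), "->", parse(GRAMMARS, sample))
```

Running the samples through parse shows which messages the current grammars
cover and which still need one; swapping the regex translation for a
different engine would not require touching GRAMMARS or SAMPLES.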