[logs] Re: regex-less parsing of messages

From: Anton Chuvakin (anton@private)
Date: Thu Dec 08 2005 - 15:44:27 PST


All,

So, it looks like this discussion did generate some solutions and very
cool ideas!

Here is my summary with comments, of sorts.

1. If parsing/tokenizing is hard, wait for the XML standard to emerge.
And then things will be easy indeed. There is definite value in this
one... a humorous value :-)
2. Don't tokenize; there are good (?) tools to analyze logs without
it. (BTW, by tokenizing I meant not only 'splitting things', but also
naming/categorizing the resulting chunks.)
3. I am surprised that nobody picked up on the 'can we solve this
problem if you have a lot of similar log data to look at?' question.
Clustering and similar approaches seem, IMHO, "almost doable."
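
To make item 3 a bit more concrete, here is a rough sketch of grouping
similar log lines without regexes. It is not a real clustering
algorithm, only a crude stand-in: any token containing a digit is
assumed to be a variable field and is masked, and lines that share the
resulting template land in one group. The function names and the sample
lines are made up for illustration.

```python
from collections import defaultdict


def template_of(line):
    """Mask every whitespace-separated token that contains a digit.

    Assumption: digit-bearing tokens (addresses, ports, counters) are
    variable fields, and everything else is fixed message text.
    """
    return " ".join(
        "*" if any(ch.isdigit() for ch in tok) else tok
        for tok in line.split()
    )


def group_by_template(lines):
    """Group lines that reduce to the same masked template."""
    groups = defaultdict(list)
    for line in lines:
        groups[template_of(line)].append(line)
    return dict(groups)


# Made-up sample lines, not taken from any real device.
SAMPLE = [
    "accepted connection from 10.0.0.5 port 4312",
    "accepted connection from 10.0.0.9 port 5120",
    "user alice logged out after 35 seconds",
    "user bob logged out after 7 seconds",
]

if __name__ == "__main__":
    for template, members in group_by_template(SAMPLE).items():
        print(len(members), template)
```

Note that the two 'logged out' lines end up in different groups,
because user names contain no digits and so are not masked. That gap is
exactly where a lot of similar data and a proper clustering approach
would have to do the work.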

I also wanted to comment on analysis methods that do not rely on
tokenized logs.  I agree that they can solve some problems (mentioned
in this thread), but I suspect that they will hit a sturdy wall in
some others.  For example, I do not think that tracing an email
message thru logs from multiple diverse devices can be solved without
understanding each device log format. Similarly, rule-based
correlation approaches require knowledge of the nature of specific log
fields (such as source, destination, table name, etc.).
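
A small sketch of why the email-tracing case needs per-device format
knowledge: each device gets its own parser that maps its lines to
common named fields, and only then can events be joined on a message
ID. Both log formats, the device names, and the field names below are
hypothetical, invented for this example.

```python
def parse_gateway(line):
    # Hypothetical format: "<time> GW msgid=<id> from=<addr> to=<addr>"
    parts = line.split()
    fields = dict(p.split("=", 1) for p in parts if "=" in p)
    return {
        "device": "gateway",
        "time": parts[0],
        "msgid": fields["msgid"],
        "source": fields["from"],
        "destination": fields["to"],
    }


def parse_mailstore(line):
    # Hypothetical format: "<time>|<id>|<action>|<mailbox>"
    time, msgid, action, mailbox = line.split("|")
    return {
        "device": "mailstore",
        "time": time,
        "msgid": msgid,
        "action": action,
        "destination": mailbox,
    }


def trace(msgid, events):
    """Return all events for one message, ordered by timestamp.

    Assumes same-day HH:MM:SS timestamps, which sort correctly as text.
    """
    matching = [e for e in events if e["msgid"] == msgid]
    return sorted(matching, key=lambda e: e["time"])


# Made-up sample data for the two hypothetical devices.
GATEWAY_LOG = [
    "15:40:01 GW msgid=A1 from=x@example.com to=y@example.org",
    "15:40:05 GW msgid=B2 from=z@example.com to=y@example.org",
]
MAILSTORE_LOG = [
    "15:40:03|A1|delivered|y",
]

if __name__ == "__main__":
    events = [parse_gateway(l) for l in GATEWAY_LOG]
    events += [parse_mailstore(l) for l in MAILSTORE_LOG]
    for event in trace("A1", events):
        print(event["time"], event["device"], event["destination"])
```

Without the two parsers, nothing tells us that 'msgid=A1' in one log
and the second pipe-delimited column in the other name the same thing.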

And, finally, I wanted to address this one:

>I see it, we'll be stuck with "expert systems" for a while - the market for
>log analysis software is not that rich to justify the type of investments
>required to keep a couple of Ph.D's on your payroll.
Hmmm, what makes you say so? Some solutions that help with logs are
not exactly bargain priced, if you know what I mean :-) I do not think
that the market is small, considering that *everybody* has the 'log
problem' to some extent... And the problem can only become worse, thus
increasing the market and providing jobs for those Ph.D.s :-)

Next, I will launch something on the use of data mining for logs...
stand by :-)

Best,
--
Anton Chuvakin, Ph.D., GCIA, GCIH, GCFA
         http://www.chuvakin.org
    http://www.securitywarrior.com
_______________________________________________
LogAnalysis mailing list
LogAnalysis@private
http://lists.shmoo.com/mailman/listinfo/loganalysis


