[logs] Bayes - good or bad?

From: Mike Heisler (mgh4@private)
Date: Thu Feb 24 2005 - 12:38:07 PST


We've been thinking the same thing lately. We already have a central log
server, so feeding it a lot of data tagged with good/bad pattern matches would
be straightforward. Yes, to start with it's no better than pattern matching. But
over time, will it save work in maintaining the patterns and detecting new issues?

Normalizing the data might also help: making sure the input looks more similar
each time by adding fields. For example, if some syslog messages don't have a
year, put one in.
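
As a rough sketch of that kind of normalization (not something we run today),
the Python below rewrites a leading BSD-style syslog stamp such as
"Feb 24 12:38:07" into a form that carries a year. The function name
normalize, the output format, and passing the year in by hand are all
assumptions for illustration; month names are parsed with %b, so it also
assumes an English/C locale.

```python
import re
from datetime import datetime

# Classic BSD syslog timestamp at the start of a line, e.g. "Feb 24 12:38:07 ".
# It carries no year, which is the gap this sketch fills.
_TS = re.compile(r"^([A-Z][a-z]{2}) +(\d{1,2}) (\d{2}:\d{2}:\d{2}) ")


def normalize(line, year):
    """Rewrite a leading 'Mon DD HH:MM:SS' stamp as 'YYYY-MM-DD HH:MM:SS'.

    `year` is supplied by the caller (an assumption: e.g. the year the
    log server received the message). Lines without a recognizable stamp
    are returned unchanged.
    """
    m = _TS.match(line)
    if not m:
        return line
    mon, day, clock = m.groups()
    try:
        ts = datetime.strptime(f"{year} {mon} {day} {clock}",
                               "%Y %b %d %H:%M:%S")
    except ValueError:
        # Looked like a stamp but was not a valid date; leave it alone.
        return line
    return ts.strftime("%Y-%m-%d %H:%M:%S") + " " + line[m.end():]
```

Messages that arrive around New Year would need more care than a fixed year
gives them; the sketch ignores that.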

 > I've been playing with my reimplementation of Marcus Ranum's fnort, and it
 > seems that the only way to get good sensible results out of it is to have
 > good training data. As you can guess, the above is just another way of
 > saying that "it doesn't work" :-)
 >
 > If I separate log lines into good and bad (easy, huh...) and then feed
 > them line by line into Bayesian classifier (such as bogofilter) for
 > training, and then stuff an unknown sample into it, I only get the lines
 > equal to whatever was bad classified as bad. E.g. if 'ssh auth failed' was
 > in a 'known bad' sample, bogofilter will mark them as bad in the unknown
 > sample. In other words, the results are the same as with a simple pattern
 > matching.
 >
 > Any other experiences? Ideas? Comments?
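
For anyone who wants to repeat the experiment described in the quoted message
without bogofilter, here is a minimal word-level naive Bayes sketch in Python.
It is not bogofilter's algorithm; the class name NaiveBayes, the whitespace
tokenizer, and the add-one smoothing are assumptions chosen to keep it short.
Lines are trained one at a time as "good" or "bad", and an unknown line gets a
log-odds score where a value above zero leans bad.

```python
import math
from collections import Counter


def tokens(line):
    # Simplest possible tokenizer (an assumption): lowercase, split on spaces.
    return line.lower().split()


class NaiveBayes:
    """Toy two-class word-count classifier for log lines."""

    def __init__(self):
        self.counts = {"good": Counter(), "bad": Counter()}
        self.lines = {"good": 0, "bad": 0}

    def train(self, label, line):
        # label must be "good" or "bad"
        self.counts[label].update(tokens(line))
        self.lines[label] += 1

    def score(self, line):
        """Return the log-odds that the line is bad; > 0 leans bad."""
        vocab = len(set(self.counts["good"]) | set(self.counts["bad"]))
        # Prior from the number of training lines, smoothed by one.
        logodds = math.log((self.lines["bad"] + 1) / (self.lines["good"] + 1))
        for tok in tokens(line):
            p = {}
            for label in ("good", "bad"):
                total = sum(self.counts[label].values())
                # Add-one smoothing so unseen words never give zero.
                p[label] = (self.counts[label][tok] + 1) / (total + vocab + 1)
            logodds += math.log(p["bad"] / p["good"])
        return logodds


nb = NaiveBayes()
nb.train("good", "session opened for user alice")
nb.train("bad", "ssh auth failed for user root")
print(nb.score("ssh auth failed for user bob"))    # positive: leans bad
print(nb.score("session opened for user bob"))     # negative: leans good
```

Because the score is built from individual words, a line never seen verbatim
can still lean bad when it shares words with bad training lines. Whether that
is any more useful in practice than a pattern list is exactly the open
question here.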

-- 
Mike Heisler    mgh4@private     Systems & Operations
607-255-3058    Cell: 607-227-6791   Fax: 607-255-8521
CIT  723 Rhodes Hall  Cornell University  Ithaca, NY 14853
_______________________________________________
LogAnalysis mailing list
LogAnalysis@private
http://lists.shmoo.com/mailman/listinfo/loganalysis