On Thu, Nov 14, 2002 at 04:12:44PM +0000, Will Partain wrote:
> "Jon Stearley" <jrstearat_private> writes:
>
> > i'm using the teiresias algorithm
> > (http://cbcsrv.watson.ibm.com/Tspd.html) to classify log lines.
>
> (OK, I'm scared :-)
>
> Just an idle possibly-related thought: could any of the
> principles of Bayesian spam filtering (quite the rage in
> some circles...) be applied to logging?
>
> (Best to go googling if you want real info on Bayesian spam
> filtering, but the rough user-interface is: you train the
> filter on a big pile of spam, and then on a big pile of
> non-spam ('ham'); thereafter, it tells you whether messages
> look more like the one or t'other.)
>
> I'm guessing that a typical syslog message lacks enough info
> to play the Bayesian game. But I suppose you could feed it
> chunks of logs ({1,5,10} {secs,mins}) and it could at least
> eliminate the chunks that were entirely "uninteresting".

yes, it could be. but i've been more interested in the bioinformatics
these days given the amount of hubbub, effort, and advances there
lately. some of the sequencing algorithms etc are ideal candidates for
application to log analysis imho, so i've kept my focus there.

maybe someday i'll get paid for this work or go back to school - so i
can do a proper comparative study on various pattern-discovery/ai/etc
approaches in this domain. :) in the meantime it's kinda
just-for-fun...

-jon

_______________________________________________
LogAnalysis mailing list
LogAnalysisat_private
http://lists.shmoo.com/mailman/listinfo/loganalysis
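
For anyone who wants to poke at the chunk-filtering idea from the quoted message, here is a rough sketch of what it might look like. It is not something either poster wrote or tested. It is a plain naive Bayes word-count filter with add-one smoothing, trained on log chunks labelled "boring" or "interesting". The names (ChunkFilter, tokenize, the two labels) and the sample log lines are all made up for illustration, and splitting a real log into {1,5,10}-second or -minute chunks is left out.

```python
import math
import re
from collections import Counter

# Assumption: only alphabetic tokens are kept, so PIDs and timestamps
# do not make every chunk look unique.
TOKEN_RE = re.compile(r"[A-Za-z_]+")


def tokenize(chunk):
    """Split a chunk of log text into lowercased word tokens."""
    return [tok.lower() for tok in TOKEN_RE.findall(chunk)]


class ChunkFilter:
    """Hypothetical two-class naive Bayes filter over log chunks."""

    LABELS = ("boring", "interesting")

    def __init__(self):
        self.counts = {label: Counter() for label in self.LABELS}
        self.chunks = {label: 0 for label in self.LABELS}

    def train(self, label, chunk):
        """Add one labelled chunk to the training counts."""
        self.counts[label].update(tokenize(chunk))
        self.chunks[label] += 1

    def _log_prob(self, label, tokens, vocab_size):
        # Add-one (Laplace) smoothing for both the prior and the tokens.
        total_chunks = sum(self.chunks.values())
        prior = (self.chunks[label] + 1) / (total_chunks + len(self.LABELS))
        n_tokens = sum(self.counts[label].values())
        logp = math.log(prior)
        for tok in tokens:
            logp += math.log((self.counts[label][tok] + 1) / (n_tokens + vocab_size))
        return logp

    def log_odds(self, chunk):
        """Positive means 'looks more interesting', negative 'more boring'."""
        vocab = set(self.counts["boring"]) | set(self.counts["interesting"])
        vocab_size = len(vocab) + 1  # one extra slot for unseen tokens
        tokens = tokenize(chunk)
        return (self._log_prob("interesting", tokens, vocab_size)
                - self._log_prob("boring", tokens, vocab_size))


# Made-up training data, one chunk per string.
f = ChunkFilter()
f.train("boring", "sshd[123]: session opened for user bob")
f.train("boring", "cron[55]: job started")
f.train("interesting", "sshd[999]: failed password for invalid user root")
f.train("interesting", "kernel: segfault at address")

for chunk in ("sshd[1]: failed password for root", "cron[2]: job started"):
    verdict = "interesting" if f.log_odds(chunk) > 0 else "boring"
    print(verdict, "<-", chunk)
```

Chunks with a negative score would be the ones to drop as "uninteresting". Whether a single syslog line carries enough tokens for this to work is exactly the open question raised in the quoted message, which is why the sketch scores whole chunks.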