We've been thinking the same thing lately. We already have a central log server, so feeding it a lot of data with good/bad pattern matches would be straightforward. To start with, yes, it's no better than pattern matching. But over time, will it save work in maintaining the patterns and in detecting new issues?

Normalizing the data might also help: make the input more uniform each time by adding fields. For example, if some syslog messages don't have a year, put one in.

> I've been playing with my reimplementation of Marcus Ranum's fnort, and it
> seems that the only way to get good, sensible results out of it is to have
> good training data. As you can guess, the above is just another way of
> saying "it doesn't work" :-)
>
> If I separate log lines into good and bad (easy, huh...) and then feed
> them line by line into a Bayesian classifier (such as bogofilter) for
> training, and then stuff an unknown sample into it, I only get the lines
> equal to whatever was bad classified as bad. E.g., if 'ssh auth failed' was
> in a 'known bad' sample, bogofilter will mark it as bad in the unknown
> sample. In other words, the results are the same as with simple pattern
> matching.
>
> Any other experiences? Ideas? Comments?

--
Mike Heisler
mgh4@private
Systems & Operations
607-255-3058
Cell: 607-227-6791
Fax: 607-255-8521
CIT
723 Rhodes Hall
Cornell University
Ithaca, NY 14853

_______________________________________________
LogAnalysis mailing list
LogAnalysis@private
http://lists.shmoo.com/mailman/listinfo/loganalysis
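[Editor's note: a minimal sketch of the two ideas in the thread, with a word-level naive Bayes standing in for bogofilter. The regex, class names, sample log lines, and the hard-coded year are all invented for illustration; classic syslog timestamps really do omit the year, which is the normalization gap mentioned above.]

```python
import math
import re
from collections import Counter
from datetime import datetime

# Classic syslog timestamps ("Feb 24 16:56:57 host ...") carry no year;
# rewrite them to a full ISO date so every record is uniform.
SYSLOG_TS = re.compile(r'^([A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s')

def normalize(line, year=2005):
    """Prepend a year to year-less syslog timestamps (assumed year=2005)."""
    m = SYSLOG_TS.match(line)
    if not m:
        return line
    ts = datetime.strptime(m.group(1), '%b %d %H:%M:%S').replace(year=year)
    return ts.isoformat() + ' ' + line[m.end():]

class NaiveBayes:
    """Tiny word-level naive Bayes over log lines, one Counter per class."""
    def __init__(self):
        self.counts = {'good': Counter(), 'bad': Counter()}
        self.totals = {'good': 0, 'bad': 0}

    def train(self, label, line):
        words = normalize(line).lower().split()
        self.counts[label].update(words)
        self.totals[label] += len(words)

    def classify(self, line):
        words = normalize(line).lower().split()
        scores = {}
        for label in self.counts:
            # log-probability with add-one smoothing to tolerate unseen words
            vocab = len(self.counts[label]) + 1
            scores[label] = sum(
                math.log((self.counts[label][w] + 1) /
                         (self.totals[label] + vocab))
                for w in words)
        return max(scores, key=scores.get)

nb = NaiveBayes()
nb.train('good', 'Feb 24 16:01:02 host sshd[42]: Accepted publickey for mgh')
nb.train('bad',  'Feb 24 16:02:03 host sshd[43]: Failed password for root')
print(nb.classify('Feb 24 16:05:00 host sshd[44]: Failed password for admin'))
```

As the quoted post observes, a classifier trained this way mostly rediscovers the training patterns; the normalization step at least keeps timestamp noise from polluting the vocabulary.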
This archive was generated by hypermail 2.1.3 : Thu Feb 24 2005 - 16:56:57 PST