On Wed, Oct 30, 2002 at 07:27:12AM -0500, Chris Brenton wrote:
> On Tue, 2002-10-29 at 23:04, Marcus J. Ranum wrote:
> > Dale.Drewat_private wrote:
> > >You need to be able to look for
> > >"abnormal" patterns in log data
> >
> > I'd like to know how to do this. Any pointers?
> > ;-)
>
> I think Marcus himself has probably posted the best ideas along this
> thread, namely his whole concept of "stateful logging". Key in on what
> you know and understand to be normal, question everything else. The
> entries might in fact be "abnormal", or they could be false positives.
> Best way to tell is to have a clueful human on the back end sorting it
> out. From there it's just a matter of tweaking the system to reduce the
> false positive rate.

i've been tinkering with an ai-aided approach and will toss in my 2c.

i'm using the teiresias algorithm (http://cbcsrv.watson.ibm.com/Tspd.html)
to classify log lines. it does pretty well at this: teiresias calls the
patterns it finds "motifs", and i basically use it to write functional
equivalents of regular expressions for me. lines which don't match any
motif (which ones depends on the l,w,k inputs to the algorithm) are
"anomalous". the total numbers of unique words and unique motifs in a
given set of lines are easy to calculate and provide two additional
(high-level) metrics.

i'm currently working on how these metrics change over time, and on
calculating the time characteristics of the motifs (i.e. this motif occurs
at some rate and is bursty, constant, periodic, etc), aiming to cluster
them into "event" groups (i.e. these X lines usually appear together, at
the following rate, over the following duration, rigidly fixed or not,
etc). i expect that comparing all this over time and/or across (sets of)
hosts should yield some pretty useful anomaly analysis, we'll see...

current code is rough but works, help is welcome (lemme know if
interested)...
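
The classification idea above (lines that match a frequent pattern are normal, lines that match none are "anomalous", plus the two count metrics) can be sketched roughly as follows. This is only an illustration and does NOT implement teiresias: as a stand-in it masks runs of digits to get a crude template per line and treats any template shared by at least k lines as a motif. The names template, classify and metrics, the digit-masking rule, and the sample log lines are all assumptions made for this sketch; k is only loosely analogous to a minimum-support threshold.

```python
import re
from collections import Counter

# Hypothetical stand-in for motif discovery. This is NOT teiresias:
# it only masks runs of digits so that lines differing in numbers
# (pids, counters) collapse to the same template.
VARIABLE = re.compile(r"\d+")


def template(line):
    """Reduce a log line to a crude template by masking digit runs."""
    return VARIABLE.sub("<N>", line.strip())


def classify(lines, k=2):
    """Return (motifs, anomalous_lines).

    A template counts as a motif when at least k lines share it.
    Lines whose template is not a motif are reported as anomalous.
    """
    counts = Counter(template(line) for line in lines)
    motifs = {t for t, n in counts.items() if n >= k}
    anomalous = [line for line in lines if template(line) not in motifs]
    return motifs, anomalous


def metrics(lines, motifs):
    """The two high-level metrics: unique words and unique motifs."""
    words = {word for line in lines for word in line.split()}
    return {"unique_words": len(words), "unique_motifs": len(motifs)}


# Made-up sample lines, for illustration only.
sample = [
    "sshd[101]: session opened for user root",
    "sshd[202]: session opened for user root",
    "cron[303]: job 7 finished",
    "cron[404]: job 9 finished",
    "kernel: out of memory, killing process 555",
]

motifs, anomalous = classify(sample, k=2)
summary = metrics(sample, motifs)
print(sorted(motifs))
print(anomalous)
print(summary)
```

Raising k makes fewer templates qualify as motifs, so more lines fall out as anomalous. Tracking the two counts from metrics per time window or per host is one simple way to watch how they change over time.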
-- 
+--------------------------------------------------------------+
| Jon Stearley                  (505) 845-7571  (FAX 844-2067) |
| Compaq Federal LLC            High Performance Solutions     |
| Sandia National Laboratories  Scalable Systems Integration   |
+--------------------------------------------------------------+
_______________________________________________
LogAnalysis mailing list
LogAnalysisat_private
http://lists.shmoo.com/mailman/listinfo/loganalysis