Tina Bird wrote:
> Of course, I didn't have the sense to find >easy< problems to work on.

Oh, come on, that wouldn't be fun :)

> This is why whenever I'm looking at logs or doing any other repetitive task,
> I consciously try to observe what my brain is doing for pattern detection --

Well then - let's try to brainstorm this a little bit.

Novelty detection works well with multivariate time series. Discrete or
continuous, it doesn't matter, but usually the values need to have a
distance defined on them. USUALLY. There are categorical algorithms too,
but they do not work as well.

So - I know we've gone through this a billion times, Tina, but bear with
me - which type of data should we analyze for anomalies?

Let's try to agree on an example of data to look at, and to break it into
pieces that can be analyzed automatically. I have read thousands of papers
doing this, but they lacked a real understanding of what WE, as (more or
less) skilled humans, look for in logs.

What do you look for, esteemed colleagues? Which data do these logs
contain? What are the characteristics that let you move through logs at
light speed, and then make you stop all of a sudden and look closer?
Detecting interesting spots seems to me much more doable than
automatically detecting intrusions :)

But most important: how do we MAP these data to values suitable for these
algorithms? Mapping is everything, because if you choose the wrong
mapping, the algorithms are blind.

Stefano

_______________________________________________
LogAnalysis mailing list
LogAnalysis@private
http://lists.shmoo.com/mailman/listinfo/loganalysis