What experience, ideas, and interest do people have in making the computer learn what is and isn't "normal" in syslog output? That is, having the computer process and classify syslog output (or an arbitrary stream) and present it with a high signal-to-noise ratio?

I'm not talking about writing regexps. I'm talking about having the computer infer/learn the regexps (characterization information, regardless of its form) over some training period (i.e. ongoing), and then presenting the analysis with a high signal-to-noise ratio. It could then tie into some action/response mechanism, of which there are many to choose from, but my interests are mainly in the learning process.

My thinking/researching/hacking in this effort has ranged across simple statistics, LZ77, natural language modelling (WHIRL, SRILM, others), and sequencing algorithms (only TEIRESIAS so far). It's basically a statistics/signal-processing problem, IMHO. I don't get paid for this and haven't put enough personal time into it to make much progress, but it looks quite likely that I'll be able to spend more time on it in the coming year.

http://www.counterpane.com/log-analysis.html and the other loggy spots I know of are mostly "expert system" based (i.e. we enumerate the expert knowledge). I know this AI/buzzword/etc. approach is not particularly new, but it does appear to me to be unsolved - interesting, at least.

I'm basically polling for pointers, experience/advice, and collaborators.

Thanks!

-- 
+--------------------------------------------------------------+
| Jon Stearley                  (505) 845-7571 (FAX 844-2067)  |
| Compaq Federal LLC            High Performance Solutions     |
| Sandia National Laboratories  Scalable Systems Integration   |
+--------------------------------------------------------------+
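
As a rough illustration of the "simple statistics" end of the approaches listed in the post, a minimal sketch might look like the following. It is not the poster's code. It assumes (purely for illustration) that any token containing a digit is variable data, collapses each message to a template, counts templates over a training stream, and scores new messages by how surprising their template is. The names template, train, and surprisal, the masking rule, and the sample log lines are all hypothetical.

```python
import math
import re
from collections import Counter

# Assumption for illustration: any whitespace-delimited token that
# contains a digit (PIDs, IPs, ports) is variable data, not structure.
_VARIABLE = re.compile(r"\S*\d\S*")


def template(message):
    """Collapse variable-looking tokens so similar messages share one key."""
    return _VARIABLE.sub("<*>", message)


def train(messages):
    """Count how often each template appears in the training stream."""
    return Counter(template(m) for m in messages)


def surprisal(message, counts):
    """Bits of surprise for a message, using add-one smoothing so that
    never-seen templates get a small but nonzero probability."""
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for "never seen"
    p = (counts[template(message)] + 1) / (total + vocab)
    return -math.log2(p)


# Made-up sample lines, not real log data.
training = [
    "sshd[101]: Accepted password for alice from 10.0.0.5 port 4022",
    "sshd[102]: Accepted password for bob from 10.0.0.7 port 4100",
    "cron[55]: (root) CMD (run-parts /etc/cron.hourly)",
    "sshd[103]: Accepted password for alice from 10.0.0.5 port 4999",
]
counts = train(training)

familiar = "sshd[200]: Accepted password for alice from 10.0.0.5 port 5001"
novel = "kernel: Out of memory: Kill process 4242"

# Report messages in order of decreasing surprise.
for msg in sorted([familiar, novel], key=lambda m: -surprisal(m, counts)):
    print(f"{surprisal(msg, counts):5.2f} bits  {msg}")
```

Sorting by surprisal is one crude way to get the high signal-to-noise presentation asked for: routine messages sink to the bottom and unseen templates rise to the top. The digit-masking rule is exactly the kind of hand-written knowledge the post wants the computer to learn instead, so it should be read as a placeholder for that learning step.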
This archive was generated by hypermail 2b30 : Thu Dec 20 2001 - 12:38:52 PST