I completely agree about the need for a human interface -- it's going
to be a long, long time before we can escape, at a minimum, verifying
that the machines are flagging the right trends. Still can't beat
the human brain for pattern matching, alas...

On Fri, 21 Dec 2001 dgillettat_private wrote:

> I have three basic concerns with this approach:
>
> 1. A stealthy/patient attacker might be able to stay "below radar"
> while the system acclimates to his presence. i.e. Normal/routine may
> not equate to *authorized*.
>
> 2. Anent the recent thread about court admissibility, it is likely
> to become necessary to explain why such a system flagged some
> particular traffic. I haven't followed the field closely, but my
> impression has long been that reporting/reproducing the learned
> "reasoning" is a particularly thorny issue.
>
> 3. There remain persistent anecdotes to the effect that some
> automated British defence system, during the 1982 Falklands war,
> detected an incoming missile, identified it as an Exocet, and on that
> basis classified it as "friendly" -- even though it was rapidly
> closing on a British ship. I think there has to remain some human
> interface to the ruleset, so that for instance an administrator can
> revoke permissions previously granted to some traffic. I'm not sure
> how else to get such a learning system to converge on policy changes
> in an acceptable time.
>
> Dave Gillett
>
>
> On 20 Dec 2001, at 17:21, Tina Bird wrote:
>
> > Hi Jon --
> >
> > Just in case you haven't yet seen this (but you might have,
> > given the SRI address in your headers):
> >
> > http://www.sdl.sri.com/projects/emerald
> >
> > is the first thing I've found in this category, in the
> > current round of revisions of my log analysis notes...
> >
> > On Thu, 20 Dec 2001, Jon Stearley wrote:
> >
> > > What experience, thinking/dreaming, and interest do people have in
> > > making the computer learn what is and isn't "normal" in syslog output?
> > > ie- having the computer process/classify syslog output (or, an
> > > arbitrary stream) and present it in a high signal/noise ratio manner?
> > > I'm not talking about writing regexps, I'm talking about having the
> > > computer infer/learn the regexps (characterization information,
> > > regardless of its form) over some training period (ie- ongoing), and
> > > then presenting the analysis in a high signal/noise ratio manner. it
> > > could then tie into some action/response mechanism of which there are
> > > many to choose from, but my interests are mainly in the learning
> > > process.
> > >
> > > My thinking/researching/hacking has ranged from using simple
> > > statistics, LZ77, natural language modelling (WHIRL, SRILM, others),
> > > and sequencing algorithms (only TEIRESIAS yet) in this effort. It's
> > > basically a statistics/signal-processing problem imho. I don't get
> > > paid for this and haven't put sufficient personal time on it to make a
> > > huge amount of progress/success, but it looks quite likely that I'll
> > > be able to spend more time on it in the coming year.
> > >
> > > http://www.counterpane.com/log-analysis.html and other loggy spots I
> > > know of are mostly "expert system" based (ie- we enumerate the expert
> > > knowledge). I know this ai/buzzword/etc approach is not particularly
> > > new, but it does appear to me to be unsolved - interesting, at least.
> > >
> > > I'm basically polling for pointers, experience/advice, and
> > > collaborators. Thanks!
---------------------------------------------------------------------
To unsubscribe, e-mail: loganalysis-unsubscribeat_private
For additional commands, e-mail: loganalysis-helpat_private
This archive was generated by hypermail 2b30 : Fri Dec 21 2001 - 11:17:30 PST