I run into a variety of list members at conferences and such. Those of you who have seen me in person in the last six months have probably heard parts of my "why logging sucks" rant, and may have heard me threaten to start a couple of list discussions related to those issues.

I've been threatening to start consensus building (i.e. stating my claims on the list and watching those of you with strong opinions correct me) on the following issues:

1) What sort of state changes "should" applications and operating systems log in the first place? --> A standard for programmers

2) Given a particular operating system and/or system purpose (such as a UNIX mail server, or a Windows Domain Controller, or whatever), what are the (pick your favorite integer) 15 most frequently logged messages in the elusive "typical" environment? What do they mean? Do we have sample data?

3) Given a particular operating system and/or system purpose, what are (pick your favorite integer) 15 messages that pretty much always mean bad news: that the system has been compromised, that a catastrophic failure has happened, however we choose to define "bad news" for that "typical" environment? What >>is<< "bad news"? Do we have sample data?

4) If you're a new system administrator and you're just starting to integrate machines into a central logging infrastructure, where should you start?

5) What sort of situations do >>not<< create log data for default configurations of a particular operating system or application?

We spend a lot of energy worrying about what syslog server application to use, how to transport the data, how to archive it, but there are a lot of issues bigger even than getting the logs out of the damn originating applications and servers.

If we can reach any sort of consensus on these issues then we can actually build >>useful<< templates for swatch, logsurfer, and the other log parsing tools out there.
And we can work on tools that can find deviations from baseline numbers if we can come up with a guess for what set of messages define the baseline. It's hard to tell people to look for "weird things" in their log files when we've got absolutely no resources -- other than the logs themselves -- that help describe what normal things look like. Maybe it's because I live in California now, but the idea of a "quest for normal" really appeals to me ;-)

I suppose it's possible that a couple of the commercial log management systems -- NetForensics or Intellitactics -- may already have the answers to these questions, but I bet they don't have the visibility into the large number and types of networks that we have here.

Over the next couple of days, now that I've finally admitted to working on this in public, I will be documenting my first pass at answers to these questions, based on my own research and on the data in Counterpane's customer base (suitably sanitized, of course). Please rev up your engines for the discussion...and I'll warn the Log Analysis Webmistress about the sort of chaos we're likely to be creating.

cheers -- tbird

"Wine is strong, the King is stronger, women are strongest, but TRUTH conquers all."
----- Inscription in the Rosslyn Chapel (near Edinburgh, Scotland)

http://www.shmoo.com/~tbird
Log Analysis http://www.counterpane.com/log-analysis.html
VPN http://vpn.shmoo.com

_______________________________________________
LogAnalysis mailing list
LogAnalysisat_private
https://lists.shmoo.com/mailman/listinfo/loganalysis
This archive was generated by hypermail 2b30 : Mon Aug 19 2002 - 23:49:42 PDT