I run into a variety of list members at conferences and such. Those of
you who have seen me in person in the last six months have probably heard
parts of my "why logging sucks" rant, and may have heard me threaten to
start a couple of list discussions related to those issues. I've been
threatening to start consensus building (i.e. stating my claims on the
list and watching those of you with strong opinions correct me) on the
following issues:

1) What sort of state changes "should" applications and operating systems
log in the first place? --> A standard for programmers

2) Given a particular operating system and/or system purpose (such as a
UNIX mail server, or a Windows Domain Controller, or whatever), what are
the (pick your favorite integer) 15 most frequently logged messages in
the elusive "typical" environment? What do they mean? Do we have sample
data?

3) Given a particular operating system and/or system purpose, what are
(pick your favorite integer) 15 messages that pretty much always mean bad
news: that the system has been compromised, that a catastrophic failure
has happened, however we choose to define "bad news" for that "typical"
environment? What >>is<< "bad news"? Do we have sample data?

4) If you're a new system administrator and you're just starting to
integrate machines into a central logging infrastructure, where should you
start?

5) What sort of situations do >>not<< create log data for default
configurations of a particular operating system or application?

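
To make question 2 concrete, here is a minimal sketch of how the "most
frequently logged messages" could be counted. It is only an illustration
under stated assumptions: the lines are in traditional BSD syslog layout
("Mon DD HH:MM:SS host tag: message"), and the masking rules and the names
template() and top_messages() are hypothetical, not taken from any existing
tool.

```python
import re
from collections import Counter

# Hypothetical normalization rules: mask the fields that vary between
# otherwise identical messages so they collapse into one template.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),  # dotted-quad addresses
    (re.compile(r"\b\d+\b"), "<N>"),                       # PIDs, ports, counters
]


def template(message):
    """Replace variable fields in one message with placeholder tokens."""
    for pattern, token in _MASKS:
        message = pattern.sub(token, message)
    return message


def top_messages(lines, n=15):
    """Return the n most common message templates as (template, count) pairs."""
    counts = Counter()
    for line in lines:
        # Assumed layout: month, day, time, host, then "tag: message".
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue  # skip lines that do not fit the assumed layout
        counts[template(parts[4])] += 1
    return counts.most_common(n)
```

Feeding it a day of /var/log/messages (for example,
top_messages(open("/var/log/messages"))) would give a first rough answer
for one machine; the masking rules would almost certainly need tuning for
each application.
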
We spend a lot of energy worrying about what syslog server application to
use, how to transport the data, how to archive it, but there are a lot of
issues bigger even than getting the logs out of the damn originating
applications and servers. If we can reach any sort of consensus on
these issues then we can actually build >>useful<< templates for swatch,
logsurfer, and the other log parsing tools out there. And we can work on
tools that can find deviations from baseline numbers if we can come up
with a guess for what set of messages defines the baseline.
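
As a rough illustration of what "deviations from baseline numbers" could
mean in practice, the sketch below compares per-message counts from two
time windows of equal length. Everything in it is an assumption made for
the example: the function name deviations(), the 3x ratio, and the minimum
count of 5 are placeholders, not recommended values.

```python
def deviations(baseline, current, ratio=3.0, min_count=5):
    """Flag messages whose counts differ sharply between two windows.

    baseline and current map a message template to how often it was seen
    in windows of equal length. Returns a list of
    (template, baseline_count, current_count) tuples.
    """
    flagged = []
    for msg in sorted(set(baseline) | set(current)):
        base = baseline.get(msg, 0)
        now = current.get(msg, 0)
        if max(base, now) < min_count:
            continue  # too rare in both windows to say anything useful
        if base == 0 or now == 0:
            # Message appeared for the first time, or vanished entirely.
            flagged.append((msg, base, now))
        elif now / base >= ratio or base / now >= ratio:
            # Message rate rose or fell by at least the chosen factor.
            flagged.append((msg, base, now))
    return flagged
```

A message that vanishes is flagged just like one that spikes, since a
daemon that has stopped logging is often as interesting as one that has
started complaining.
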

It's hard to tell people to look for "weird things" in their log files
when we've got absolutely no resources -- other than the logs themselves
-- that help describe what normal things look like. Maybe it's
because I live in California now, but the idea of a "quest for normal"
really appeals to me ;-)

I suppose it's possible that a couple of the commercial log management
systems -- NetForensics or Intellitactics -- may already have the answers
to these questions, but I bet they don't have visibility into as many
networks, or as many kinds of networks, as we have here.
Over the next couple of days, now that I've finally admitted to working on
this in public, I will be documenting my first pass at answers to these
questions, based on my own research and on the data in Counterpane's
customer base (suitably sanitized, of course). Please rev up your engines
for the discussion...and I'll warn the Log Analysis Webmistress about the
sort of chaos we're likely to be creating.
cheers -- tbird
"Wine is strong, the King is stronger, women are strongest, but TRUTH
conquers all."
----- Inscription in the Rosslyn Chapel (near Edinburgh, Scotland)
http://www.shmoo.com/~tbird
Log Analysis http://www.counterpane.com/log-analysis.html
VPN http://vpn.shmoo.com