> -----Original Message-----
> From: Tina Bird [mailto:tbird@precision-guesswork.com]
> Sent: Tuesday, August 20, 2002 10:35 AM

>snip!<

> Look, everyone, presumably at least some of us have access to log data
> on a "live server." I posit that we'll learn really really interesting
> things by taking a day's or a week's worth of data and looking at the
> messages. We're not building a standard here; I'm after quick and
> dirty: guidance for someone just starting out.
>
> So if everyone gets their logs into a text format (for those of you who
> aren't on UNIX boxen) and does something like:
>
> cat /your/log/files* \
>   | sed -e "s/^... .. ..:..:.. $HOSTNAME //" -e "s/\[[0-9]*\]:/:/" \
>   | sort | uniq -c | sort -nr > uniq.sorted.freq
>
> we'll get actual observational data on what shows up on production
> machines.
>
> I'm not claiming it will be the same for everyone. I'm claiming it will
> teach us something, and that by providing that kind of "here's what
> shows up" view of things we'll make it easier for the newbies.
>
> My hope is that by providing this sort of information we'll make it easier
> for people to get up to speed on what is and is not typical >>for them<<.

I think perhaps we need two types of information:

1. A basic methodology for actually profiling boxen, based on the OS, the
   intended use, location within the network, etc. There are as many
   approaches to this as there are tools to collect the data, if not
   more. However, providing some guidelines or direction on what data is
   most useful to collect, and on some of the more widely accepted
   "useful" ways to look at that data, will go a long way.

2. A repository of sanitized profiles, where a specific type of
   configuration is described (e.g., web, DNS and mail servers located in
   a DMZ, all running on a single flavor of *nix), along with a snapshot
   of what has been determined to be "normal" activity for that
   configuration.

These two items will probably give people starting out an idea of how to
look at the data being collected within their own environments, and maybe
how to adapt existing profiles to those environments.

> > I'll posit the next straw man to torch: what would be useful is a
> > standardized methodology for how to turn two weeks of verbose logging
> > into a template against which to compare "normal", "abnormal" and
> > "catastrophic" at your particular site/application. This does assume
> > your first point of somewhat standardized logging being available on
> > all critical OSs and Apps.
>
> Remember that by "standardized logging" I'm not >>even<< worrying about
> log formats or severities --just message categories. Taking yet another
> step back.
>
> This strawman is certainly one of the goals.

Unless sufficient pressure is placed on vendors to provide a standard
format (beyond what is covered in BSD syslog, which is barely enough to
hang your hat on), you won't see it happen. Even in standards efforts such
as syslog-reliable and the Intrusion Detection Working Group (IDMEF/IDXP),
the formats themselves are more standardized, but the _meanings_ of
crucial fields can still vary widely from vendor to vendor. I don't know
where that pressure is going to come from or how effective it will be, but
I am not holding my breath.

This is one reason why, at least for the short to medium term, I think the
methodology or approach is the more important piece. Knowing what data to
look at, and what to look for (in general), will let people adapt the
process to whatever data they have available, regardless of format.
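As a rough illustration of the kind of quick-and-dirty baselining Tina is
describing (this is only a sketch -- the log paths, rotation names, hostname
pattern, thresholds and output file names are assumptions you would adjust
for your own syslog setup), something like the following could turn a week
of collapsed messages into a baseline and then flag anything in today's
logs that the baseline has never seen:

  #!/bin/sh
  # Sketch only: paths and rotation scheme below are assumptions --
  # adjust them for your own syslog layout.

  BASELINE=baseline.sorted.freq
  TODAY=today.sorted.freq

  # Collapse a week of rotated logs into "count  message-template" form,
  # the same idea as the pipeline quoted above.
  cat /var/log/messages.[1-7] \
    | sed -e "s/^... .. ..:..:.. $HOSTNAME //" -e "s/\[[0-9]*\]:/:/" \
    | sort | uniq -c | sort -nr > $BASELINE

  # Do the same for the most recent day.
  cat /var/log/messages \
    | sed -e "s/^... .. ..:..:.. $HOSTNAME //" -e "s/\[[0-9]*\]:/:/" \
    | sort | uniq -c | sort -nr > $TODAY

  # Strip the counts and report message types seen today that never
  # showed up during the baseline week.
  awk '{ $1 = ""; print }' $BASELINE | sort -u > baseline.types
  awk '{ $1 = ""; print }' $TODAY | sort -u > today.types
  comm -13 baseline.types today.types > never.seen.before

Whatever lands in never.seen.before isn't necessarily bad, but it is
exactly the "is this normal for us?" question that a newcomer needs a
cheap way to ask.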
Getting folks to actually look at and try to understand the data they are
logging, rather than just looking for one or two stock events that are
known issues, is really what is important.

--
J. Gregory Wright
Senior Software Engineer
AT&T Information Security Center
Cyber Defense Platform Development