2002-08-20-02:42:49 Tina Bird:

> 1) What sort of state changes "should" applications and operating systems
> log in the first place? --> A standard for programmers

Perhaps it's my Unix upbringing, but I think it's best to have adjustable logging levels; certainly, "alert", "info", and "debug" are reasonable. A program should log an alert (log a descriptive message at alert priority) when a human really urgently needs to take a look. "info" is good for logging routine behavior, when routine actions are something for which it's likely that some sites would want to do reporting or stats or whatever. And "debug" should dump enough details to help track down when you've mis-configured something --- when the gizmo is doing what you told it, rather than what you wanted it to do.

> 2) Given a particular operating system and/or system purpose (such as a
> UNIX mail server, or a Windows Domain Controller, or whatever), what are
> the (pick your favorite integer) 15 most frequently logged messages in
> the elusive "typical" environment? What do they mean? Do we have sample
> data?

Those vary wildly from application to application, and often vary at least a little from version to version. Mail servers will log lots of stuff related to handling email, in formats dependent on the version of MTA you're running. If you're running a server with packet filtering on, and you tell it to log rejected packets, in many environments that will dominate the logs.

> 3) Given a particular operating system and/or system purpose, what are
> (pick your favorite integer) 15 messages that pretty much always mean bad
> news: that the system has been compromised, that a catastrophic failure
> has happened, however we choose to define "bad news" for that "typical"
> environment? What >>is<< "bad news"? Do we have sample data?
I think for many sites, the best approach is to hand-craft, for each special-purpose server, a swatchrc with ignore lines for each normal routine message, and alert lines for anything that doesn't match the normal stuff. Combine that with daily reporting of summaries of the routine stuff, suitable for monitoring long-term trends for capacity planning, and some availability monitoring stuff to catch when the box keels over altogether and when it gets overloaded, and you've got a pretty decent grip on what's happening.

> 4) If you're a new system administrator and you're just starting to
> integrate machines into a central logging infrastructure, where should you
> start?

Pick a decent logging protocol. AFAIK, syslog-ng is currently about as good as we've got.

Build a logging box. Logging boxes like to have plenty of RAM for buffering, and they like to have fast disk subsystems. Remember, if you want to make use of the log data, you can't have the system anywhere near saturated; it's gotta have enough extra bandwidth for you to grep the logs while it continues to collect more. Thank goodness syslog-ng at least lets you log over TCP, avoiding the problem UDP-based syslog has of losing lots of messages when the system load goes up.

There are different schools on log centralization design; I personally favour treating log data as generic goo, and wedging it all into a horking big server (praise cthulhu disk is so cheap), then pulling whatever bits I deem interesting out of that; I find it comforting for forensic analysis. Do make sure your clocks are nicely synced (ntp for hard cases, clockspeed where you can use it), and log everything in UTC (nee GMT); dealing with timezones that stagger and lurch about whenever Congress is in session is a pain.

Other folks like to direct different grades of logs to different places: info to one place, alerts to another, and debugging goo never leaves the original servers.
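As an illustration of the "ignore the routine stuff, alert on everything else" approach described in the answer to question 3, here is a minimal Python sketch. It is not swatch and does not use swatchrc syntax; the patterns, the `triage` function, and the sample log lines are all made up for the example, and a real pattern list would have to be built by reading each server's own logs.

```python
import re
from collections import Counter

# Hypothetical "routine" patterns for one special-purpose box.
# These are illustrative only, not taken from any real configuration.
ROUTINE_PATTERNS = [
    re.compile(r"sendmail\[\d+\]: .* stat=Sent"),
    re.compile(r"CRON\[\d+\]: "),
]


def triage(lines, routine_patterns=ROUTINE_PATTERNS):
    """Split log lines into routine counts and alerts.

    A line matching any routine pattern is only counted (for a daily
    summary); a line matching none of them is returned as an alert.
    """
    routine_counts = Counter()
    alerts = []
    for line in lines:
        for pattern in routine_patterns:
            if pattern.search(line):
                routine_counts[pattern.pattern] += 1
                break
        else:
            # No routine pattern matched, so a human should look at it.
            alerts.append(line)
    return routine_counts, alerts


# Made-up sample lines in a classic syslog-like layout.
sample = [
    "Aug 20 02:40:01 mail CRON[4121]: (root) CMD (run-parts /etc/cron.hourly)",
    "Aug 20 02:41:13 mail sendmail[4130]: g7K2fD04130: to=<user@example.com>, stat=Sent (ok)",
    "Aug 20 02:41:50 mail kernel: Out of memory: Killed process 4099",
]

counts, alerts = triage(sample)
for line in alerts:
    print("ALERT:", line)
for pattern, n in counts.items():
    print(n, "routine line(s) matching", pattern)
```

The point of the design is that the alert list is defined by exclusion: nothing has to be known in advance about what "bad news" looks like, only about what is normal on that particular box. The per-pattern counts are the raw material for the daily summary.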
> 5) What sort of situations do >>not<< create log data for default
> configurations of a particular operating system or application?

I'm not sure what you mean by this question.

> It's hard to tell people to look for "weird things" in their log files
> when we've got absolutely no resources -- other than the logs themselves
> -- to provide that help describe what normal things look like.

I dunno, I don't expect enough regularity from one server's log file to another, from one platform to another, from one point in time to another, to have much optimism about universal "normal" logfiles.

> Maybe it's because I live in California now, but the idea of a
> "quest for normal" really appeals to me ;-)

Huh. I guess times have changed. 'Twas a time when that was more of a right-coast kinda goal, and the folks out on the left coast were chasing individuality :-).

-Bennett