Good morning, Bennett,

I think that some of the definitional difficulties we're having will be
made clearer by my simply jumping into the fray and filling out my
thoughts from last night, so I'm going to talk about issue #1, the sets
of changes that applications and operating systems "ought" to log.

On Tue, 20 Aug 2002, Bennett Todd wrote:

> 2002-08-20-02:42:49 Tina Bird:
> > 1) What sort of state changes "should" applications and operating systems
> > log in the first place? --> A standard for programmers
>
> Perhaps it's my Unix upbringing, but I think it's best to have
> adjustable logging levels; certainly, "alert", "info", and "debug"
> are reasonable. A program should log an alert (log a descriptive
> message at alert priority) when a human really urgently needs to
> take a look. "info" is good for logging routine behavior, when
> routine actions are something for which it's likely that some sites
> would want to do reporting or stats or whatever. And "debug" should
> dump enough details to help track down when you've mis-configured
> something --- when the gizmo is doing what you told it, rather than
> what you wanted it to do.
>

I'm not arguing with this, but this answer doesn't particularly address
the question I was trying to ask. This is more a set of directives on
how to prioritize the messages once they've been generated. But I
believe there's a set of generic-enough administrative and error
conditions that can be defined to provide guidelines for developers who
are curious about what they ought to do. Once that list exists,
developers and administrators can customize the severities to their own
environment.

So what's on the list? Here's my start for operating systems, with
notes and queries -- bearing in mind that one of the tasks at hand is to
generate these conditions on everyone's favorite systems so we can do
things like answer question #5 (what >doesn't< get logged) in an orderly
fashion:

- System startup: are there multiple run levels? If so, system should
  record which level is starting in some way that a human can make
  sense of

- System shutdown: are there multiple modes of shutdown? Does the
  system have any capacity to send "oh my god i'm going down" messages
  in the case of an emergency crash or power loss? Are there
  distinctions between normal and abnormal shutdowns that can be
  differentiated in the logs?

- File system full: including thresholds (default or user defined) --
  boy wouldn't it be nice if the logs "automagically" included the
  three (or however many) biggest culprits in terms of file size or
  space consumed by a directory or folder in an error message?

- Hardware failures: power supplies, network interfaces, etc. I am
  relatively uneducated about hardware diagnostics, other than Cisco
  gear...

- Logins: failed and successful; console, remote (what protocol if
  remote); anonymous account, unprivileged user account, privileged
  user account, including switches to other users (unprivileged,
  privileged) from user accounts

- Account creation: failed and successful; adding new user ID,
  assigning rights and privileges to new user, adding password to new
  user

- Account modification: failed and successful; assigning or removing
  rights and privileges, resetting password; privileged user or
  unprivileged user

- Account removal: failed and successful

- Account disabled: too many failed logins, account expired, etc.

- Password/security information copied: failed and successful

- System configuration change: failed and successful; including access
  control, network addressing, audit policy; who made change, what
  changed, from system kernel on out to user-level applications

- Operating system patch applied: who applied patch, what system
  components changed, source of patch (?)

- Network connections: failed and successful connection attempts;
  anonymous service, user-specific service, access to administrative
  tools or control connection; DNS zone transfers, etc.

- Audit logs: failed and successful attempts to modify or clear audit
  logs

- Object access: failed and successful attempts to read files, start
  or stop processes, etc. (understanding that most organizations will
  not need or want this level of detail)

*whew*

I'm sure I've left things out, and I'm sure this can be sorted into a
less intimidating list of message categories. But in addition to
worrying about format and how to handle the expected flow of data and
how to protect audit data traveling across networks, we need to worry
about what we expect to see. With regard to Marcus' post, this list
represents my (not-sufficiently-caffeinated) first stab at a set of
messages that could have standardized tokens across a variety of
operating system platforms.

In addition to these conditions, specific applications "should" record
explicit errors when they fail to start due to misconfiguration
(syslogd, anyone?), and messages when they receive incorrect or
unexpected input (yes, I know that in order to do this the programmer
has to manage to detect incorrect or unexpected input, the lack of
which is what creates buffer overflows in the first place, but since
this is tbird daydreaming I'm allowed).

What I'll do whilst everyone is discussing this list -- and fixing it
;-) -- is to start collecting samples of these messages from the data
I've got and the machines in my lab. And getting it on the Web site.

Go to! Bennett, does this clarify what I was getting at?

tbird

_______________________________________________
LogAnalysis mailing list
LogAnalysisat_private
https://lists.shmoo.com/mailman/listinfo/loganalysis
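As one rough illustration of what "standardized tokens" for the conditions listed in the message might look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration, not part of the post or of any existing standard: the token strings (`auth.login`, `fs.full`, and so on), the `key=value` layout, the default severities, and the function name `format_event`. It pairs each condition with a success/failure outcome and lets a site override the default severity, in line with the idea that administrators customize severities to their own environment.

```python
from enum import Enum


class Event(Enum):
    # Hypothetical token names, one per condition in the list above.
    SYSTEM_STARTUP = "sys.startup"
    SYSTEM_SHUTDOWN = "sys.shutdown"
    FILESYSTEM_FULL = "fs.full"
    HARDWARE_FAILURE = "hw.failure"
    LOGIN = "auth.login"
    ACCOUNT_CREATE = "acct.create"
    ACCOUNT_MODIFY = "acct.modify"
    ACCOUNT_REMOVE = "acct.remove"
    ACCOUNT_DISABLED = "acct.disabled"
    SECURITY_INFO_COPIED = "sec.copy"
    CONFIG_CHANGE = "cfg.change"
    PATCH_APPLIED = "os.patch"
    NETWORK_CONNECTION = "net.connect"
    AUDIT_LOG_CHANGE = "audit.modify"
    OBJECT_ACCESS = "obj.access"


# Assumed defaults using the "alert" / "info" / "debug" levels from the
# quoted text; anything not listed here falls back to "info".
DEFAULT_SEVERITY = {
    Event.FILESYSTEM_FULL: "alert",
    Event.HARDWARE_FAILURE: "alert",
    Event.AUDIT_LOG_CHANGE: "alert",
    Event.OBJECT_ACCESS: "debug",
}


def format_event(event, outcome, overrides=None, **details):
    """Build one log line: a fixed token, an outcome, a severity, details.

    `overrides` is an optional per-site mapping of Event -> severity.
    Detail values are written as-is, so this sketch assumes they contain
    no spaces.
    """
    if outcome not in ("success", "failure"):
        raise ValueError("outcome must be 'success' or 'failure'")
    severity = (overrides or {}).get(event, DEFAULT_SEVERITY.get(event, "info"))
    fields = [
        "event=" + event.value,
        "outcome=" + outcome,
        "severity=" + severity,
    ]
    # Sort the details so the same event always produces the same line.
    fields += ["{}={}".format(key, value) for key, value in sorted(details.items())]
    return " ".join(fields)


if __name__ == "__main__":
    print(format_event(Event.LOGIN, "failure", user="guest", source="console"))
    # A site that considers a full file system routine can downgrade it.
    print(format_event(Event.FILESYSTEM_FULL, "failure",
                       overrides={Event.FILESYSTEM_FULL: "info"},
                       mount="/var"))
```

Keeping the token separate from the severity is the point of the sketch: the list of conditions stays fixed across platforms, while the priority attached to each one remains a local decision.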
This archive was generated by hypermail 2b30 : Tue Aug 20 2002 - 07:08:41 PDT