this is exactly what i had in mind. the next step might be for us
to start trying to >classify< messages in this fashion. best i've
been able to come up with for figuring out, for instance, what
sendmail messages correspond to which severity is to set up 7
different selectors in my syslog.conf, and see what ends up where
(clearly for situations in which one doesn't have source code)...

i guess we need to also encourage developers to document this
stuff, or to include the severity of the message in the message,
the way cisco does.

On 16 Aug 2001, Hal Snyder wrote:

> Date: 16 Aug 2001 13:05:12 -0500
> From: Hal Snyder <halat_private>
> To: loganalysisat_private
> Subject: Re: [loganalysis] Logging standards and such
>
> Tina Bird <tbird@precision-guesswork.com> writes:
>
> > So maybe we could reach consensus on the categories of events that
> > fall into different syslog priorities, for a start?
>
> Is this what you meant? - or is it a waste of time :-)
>
> Priority = (facility, level)
>
> Facilities are hard - they are so open ended, and how finely you cut
> it depends on how much granularity your operation has in any area.
> No solution wins for everyone.
>
>
> Levels are much easier:
>
> For example take an ops point of view on the levels, classifying
> events on the basis of what action should be taken when they occur.
> Imagine flow of entries into a NOC and fielded by ops staff.
>
> Paraphrasing "man syslog" on my workstation:
>
> LOG_DEBUG    info useful to developers for debugging the app, not useful
>              during operations
>
> LOG_INFO     normal operational messages - may be harvested for reporting,
>              measuring throughput, etc - no action required
>
> LOG_NOTICE   events that are unusual but not error conditions -
>              might be summarized in an email to developers or admins
>              to spot potential problems - no immediate action required
>
> LOG_WARNING  warning messages - not an error, but indication that an
>              error will occur if action is not taken, e.g.
>              filesystem > 85% full - each item must be resolved within
>              a given time
>
> LOG_ERR      non-urgent failures - these should be relayed to
>              developers or admins; each item must be resolved within
>              a given time
>
> LOG_ALERT    should be corrected immediately - notify staff who can fix
>              the problem - example is loss of backup ISP connection
>
> LOG_CRIT     should be corrected immediately, but indicates failure in
>              a primary system - fix LOG_CRIT problems before LOG_ALERT
>              - example is loss of primary ISP connection
>
> LOG_EMERG    a "panic" condition - notify all tech staff on call?
>              (earthquake? tornado?) - affects multiple apps/servers/sites...
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: loganalysis-unsubscribeat_private
> For additional commands, e-mail: loganalysis-helpat_private
>

VPN:  http://kubarb.phsx.ukans.edu/~tbird/vpn.html
life: http://kubarb.phsx.ukans.edu/~tbird
work: http://www.counterpane.com
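For anyone who wants to experiment with the classification discussed in this thread, here is a minimal Python sketch. It is not from either poster. It assumes the conventional syslog encoding, where the numeric priority is facility * 8 + level and levels run from 0 (emerg) to 7 (debug). The ACTIONS table is only a hypothetical paraphrase of the ops actions proposed above, and cisco_level() assumes the familiar %FACILITY-SEVERITY-MNEMONIC tag style for messages that carry their own severity. All function and table names are made up for illustration.

```python
import re

# Level names in conventional syslog numeric order (0 = most severe).
LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# Hypothetical ops actions, paraphrased from the level descriptions in the thread.
ACTIONS = {
    "debug": "ignore during operations",
    "info": "harvest for reporting, no action",
    "notice": "summarize in email, no immediate action",
    "warning": "resolve within a given time",
    "err": "resolve within a given time",
    "alert": "fix immediately",
    "crit": "fix immediately, ahead of alerts",  # ordering as proposed in the thread
    "emerg": "notify all on-call staff",
}

# Assumed tag shape for messages that embed their severity, e.g. %LINK-3-UPDOWN.
CISCO_TAG = re.compile(r"%[A-Z0-9_]+-([0-7])-[A-Z0-9_]+")


def decode_priority(pri):
    """Split a numeric syslog priority into (facility number, level name)."""
    if not 0 <= pri <= 191:  # 24 facilities * 8 levels
        raise ValueError("priority out of range: %r" % (pri,))
    facility, level = divmod(pri, 8)
    return facility, LEVELS[level]


def action_for(pri):
    """Look up the suggested ops action for a numeric priority."""
    _, level = decode_priority(pri)
    return ACTIONS[level]


def cisco_level(message):
    """Return the level name embedded in a tagged message, or None if untagged."""
    match = CISCO_TAG.search(message)
    return LEVELS[int(match.group(1))] if match else None
```

Something like decode_priority(22) would give facility 2 at level "info", and cisco_level() returns None for messages without a tag, which is the case where the "one selector per level in syslog.conf" trick is still needed.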
This archive was generated by hypermail 2b30 : Thu Aug 16 2001 - 11:45:23 PDT