[loganalysis] smart alerting from log analysis

From: Nate Campi (nateat_private)
Date: Fri Aug 24 2001 - 15:01:57 PDT


    I feel this is related to the topic of log analysis - the smart handling
    and escalation of the alerts. In our shop we have a home-grown system we
    call "escalationbot" that collects alerts and sends them to our oncall
    Ops person. If that person doesn't cancel the alert within thirty
    minutes, it escalates to a secondary oncall person; after another
    thirty minutes, it sends mail to our whole team.
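
    To illustrate the escalation timing described above, here is a minimal
    sketch in Python (escalationbot itself is bash, so this is not its
    actual code). The names escalation_target, ESCALATION_STEP and the
    tier labels are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta

# Thirty minutes per tier, as in the description above.
ESCALATION_STEP = timedelta(minutes=30)

# Hypothetical tier labels, listed in escalation order.
TIERS = ["primary-oncall", "secondary-oncall", "whole-team"]


def escalation_target(raised_at, now, cancelled=False):
    """Return the tier to notify at `now`, or None if the alert was cancelled."""
    if cancelled:
        return None
    elapsed = now - raised_at
    # One tier per full thirty-minute step, clamped to the valid tier range.
    step = int(elapsed // ESCALATION_STEP)
    step = max(0, min(step, len(TIERS) - 1))
    return TIERS[step]


if __name__ == "__main__":
    raised = datetime(2001, 8, 24, 15, 0)
    for minutes in (0, 30, 60):
        print(minutes, escalation_target(raised, raised + timedelta(minutes=minutes)))
```

    Keeping the decision a pure function of the two timestamps means a
    cron-style job could call it on every run without holding any timer
    state of its own.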
    
    Part of what it does is automatically cancel an escalation if a
    BigBrother or Mon message comes in that reports that the problem is
    cleared up (BB "green" or Mon "UPALERT").
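
    The auto-cancel step could be sketched as follows, again in Python
    rather than bash. Everything here is an assumption made for
    illustration: the function names, the idea of keying open escalations
    by a host/service string, and above all the matching rule, which
    simply looks for the word "green" (BB) or "UPALERT" (Mon) in the
    subject. Real BigBrother and Mon mail may need different parsing.

```python
import re

# Assumption: a recovery notice can be recognised by these keywords in the
# subject line. Real BigBrother and Mon message formats may differ.
RECOVERY_PATTERNS = {
    "bb": re.compile(r"\bgreen\b", re.IGNORECASE),
    "mon": re.compile(r"\bUPALERT\b"),
}


def is_recovery(source, subject):
    """Return True if `subject` looks like an all-clear from `source`."""
    pattern = RECOVERY_PATTERNS.get(source)
    return bool(pattern and pattern.search(subject))


def handle_message(open_escalations, source, key, subject):
    """Process one incoming alert mail.

    `open_escalations` maps a hypothetical host/service key to the subject
    of the alert that opened the escalation. Returns True if this message
    cancelled an open escalation, False otherwise.
    """
    if is_recovery(source, subject):
        # Cancel only if an escalation is actually open for this key.
        return open_escalations.pop(key, None) is not None
    # Not a recovery: open an escalation unless one is already running.
    open_escalations.setdefault(key, subject)
    return False
```

    The caller would run handle_message on each mail before deciding
    whether anyone still needs to be paged for that key.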
    
    Escalationbot is written entirely in bash. It handles only email
    alerts and sends pages with qpage and a modem.
    
    What I'm wondering is how many other people have grown their own
    escalation system, and if this should start being shared on this list as
    well. The handling of the alerts is just as important as generating the
    alerts in the first place, IMHO.
    
    I'd actually be happy to look at other people's implementations, either to
    improve ours or ditch it for something better.
    
    Thoughts?
    -- 
    Nate Campi  (415) 276-8678  UNIX Ops, Terra Lycos - WiReD SF
    



    This archive was generated by hypermail 2b30 : Fri Aug 24 2001 - 17:08:07 PDT