Re: [logs] Logging: World Domination

From: Bennett Todd (betat_private)
Date: Tue Aug 20 2002 - 05:32:55 PDT


    2002-08-20-02:42:49 Tina Bird:
    > 1) What sort of state changes "should" applications and operating systems
    > log in the first place?  --> A standard for programmers
    
    Perhaps it's my Unix upbringing, but I think it's best to have
    adjustable logging levels; certainly, "alert", "info", and "debug"
    are reasonable. A program should log an alert (log a descriptive
    message at alert priority) when a human really urgently needs to
    take a look. "info" is good for logging routine behavior, when
    routine actions are something for which it's likely that some sites
    would want to do reporting or stats or whatever. And "debug" should
    dump enough details to help track down when you've mis-configured
    something --- when the gizmo is doing what you told it, rather than
    what you wanted it to do.
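
    A minimal sketch of adjustable levels, assuming Python's standard
    logging module. That module has no "alert" level, so the sketch maps
    "alert" onto CRITICAL; the names LEVELS, make_logger and would_log
    are made up for illustration.

```python
import logging

# Assumption: Python's logging module has no "alert" level, so this sketch
# maps "alert" onto logging.CRITICAL.
LEVELS = {
    "alert": logging.CRITICAL,
    "info": logging.INFO,
    "debug": logging.DEBUG,
}


def make_logger(name, level_name):
    """Return a logger that drops everything below the chosen level."""
    logger = logging.getLogger(name)
    logger.setLevel(LEVELS[level_name])
    return logger


def would_log(logger, level_name):
    """True if a message at level_name would be emitted by this logger."""
    return logger.isEnabledFor(LEVELS[level_name])
```

    With the level set to "info", alerts and routine messages get
    through and debug detail is dropped until someone turns it up.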
    
    > 2) Given a particular operating system and/or system purpose (such as a
    > UNIX mail server, or a Windows Domain Controller, or whatever), what are
    > the (pick your favorite integer) 15 most frequently logged messages in
    > the elusive "typical" environment?  What do they mean?  Do we have sample
    > data?
    
    Those vary wildly from application to application, and often vary at
    least a little from version to version. Mail servers will log lots
    of stuff related to handling email, in formats dependent on the
    version of MTA you're running. If you're running a server with
    packet filtering on, and you tell it to log rejected packets, in
    many environments that will dominate the logs.
    
    > 3) Given a particular operating system and/or system purpose, what are
    > (pick your favorite integer) 15 messages that pretty much always mean bad
    > news: that the system has been compromised, that a catastrophic failure
    > has happened, however we choose to define "bad news" for that "typical"
    > environment?  What >>is<< "bad news"?  Do we have sample data?
    
    I think for many sites, the best approach is to hand-craft, for each
    special-purpose server, a swatchrc with ignore lines for each normal
    routine message, and alert lines for anything that doesn't match the
    normal stuff. Combine that with daily reporting of summaries of the
    routine stuff, suitable for monitoring long-term trends for capacity
    planning, and some availability monitoring stuff to catch when the
    box keels over altogether and when it gets overloaded, and you've
    got a pretty decent grip on what's happening.
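
    The ignore-the-routine, alert-on-the-rest idea can be sketched in
    Python. This is not swatchrc syntax, just the same logic; the
    patterns and the function names classify and alerts are
    hypothetical, and a real ignore list has to be built per server.

```python
import re

# Hypothetical patterns for illustration; a real list is hand-built per server.
IGNORE_PATTERNS = [
    re.compile(r"sshd\[\d+\]: Accepted publickey for \w+"),
    re.compile(r"CRON\[\d+\]: \(\w+\) CMD "),
]


def classify(line, ignore_patterns=IGNORE_PATTERNS):
    """Return "ignore" for known-routine lines and "alert" for everything else."""
    for pattern in ignore_patterns:
        if pattern.search(line):
            return "ignore"
    return "alert"


def alerts(lines, ignore_patterns=IGNORE_PATTERNS):
    """Yield only the lines that matched none of the routine patterns."""
    for line in lines:
        if classify(line, ignore_patterns) == "alert":
            yield line
```

    The default is "alert", so a message nobody anticipated gets
    surfaced rather than silently dropped.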
    
    > 4) If you're a new system administrator and you're just starting to
    > integrate machines into a central logging infrastructure, where should you
    > start?
    
    Pick a decent logging protocol. AFAIK, syslog-ng is currently about
    as good as we've got. Build a logging box. Logging boxes like to
    have plenty of RAM for buffering, and they like to have fast disk
    subsystems. Remember, if you want to make use of the log data, you
    can't have the system anywhere near saturated; it's gotta have
    enough extra bandwidth for you to grep the logs while it continues
    to collect more. Thank goodness syslog-ng at least lets you log over
    TCP, avoiding the problem UDP-based syslog has of losing lots of
    messages when the system load goes up.
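
    On the sending side, logging over TCP might look like the sketch
    below, assuming Python's standard SysLogHandler. The function name
    tcp_syslog_logger is made up, the host and port are whatever your
    collector listens on, and newline framing is an assumption about
    what the receiver expects.

```python
import logging
import logging.handlers
import socket


def tcp_syslog_logger(name, host, port):
    """Return a logger that forwards records to a syslog collector over TCP."""
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        socktype=socket.SOCK_STREAM,  # TCP; the handler's default is UDP
    )
    # Assumption: the receiver splits the TCP stream on newlines, so each
    # record ends with "\n" and the default trailing NUL byte is turned off.
    handler.append_nul = False
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s\n"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

    The handler connects when it is created, so an unreachable
    collector shows up as an error at startup.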
    
    There are different schools on log centralization design; I
    personally favour treating log data as generic goo, and wedging it
    all into a horking big server (praise cthulhu disk is so cheap),
    then pulling whatever bits I deem interesting out of that; I find
    it comforting for forensic analysis. Do make sure your clocks are
    nicely synced (ntp for hard cases, clockspeed where you can use
    it), and log everything in UTC (née GMT); dealing with timezones
    that stagger and lurch about whenever Congress is in session is a
    pain.
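
    Stamping everything in UTC can be sketched as follows, assuming
    Python's standard logging and time modules. The helper names
    utc_formatter and utc_stamp are made up for illustration.

```python
import logging
import time


def utc_formatter():
    """Formatter whose %(asctime)s is rendered in UTC instead of local time."""
    formatter = logging.Formatter(
        fmt="%(asctime)sZ %(name)s %(levelname)s %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%S",
    )
    formatter.converter = time.gmtime  # the default is time.localtime
    return formatter


def utc_stamp(epoch_seconds):
    """Render a Unix timestamp as an ISO 8601 string in UTC."""
    return time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(epoch_seconds))
```

    Timestamps from different boxes then sort and compare directly,
    whatever local time each box thinks it is.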
    
    Other folks like to direct different grades of logs to different
    places: info to one place, alerts to another, and debugging goo
    that never leaves the original servers.
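
    A rough sketch of that per-grade routing, again assuming Python's
    logging module. ListHandler is an in-memory stand-in for a real
    destination, "alert" is mapped onto CRITICAL, and the names only
    and build_router are hypothetical.

```python
import logging


class ListHandler(logging.Handler):
    """Stand-in destination: collects messages in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def only(level):
    """Build a filter that passes records at exactly the given level."""
    return lambda record: record.levelno == level


def build_router(name):
    """Return (logger, destinations) with one destination per grade."""
    grades = {
        "alerts": logging.CRITICAL,   # would go to the alerting box
        "info": logging.INFO,         # would go to the reporting box
        "local_debug": logging.DEBUG, # stays on the original server
    }
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    destinations = {}
    for grade, level in grades.items():
        handler = ListHandler()
        handler.addFilter(only(level))
        logger.addHandler(handler)
        destinations[grade] = handler
    return logger, destinations
```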
    
    > 5) What sort of situations do >>not<< create log data for default
    > configurations of a particular operating system or application?
    
    I'm not sure what you mean by this question.
    
    > It's hard to tell people to look for "weird things" in their log files
    > when we've got absolutely no resources -- other than the logs themselves
    > -- to provide that help describe what normal things look like.
    
    I dunno, I don't expect enough regularity from one server's log file
    to another, from one platform to another, from one point in time to
    another, to have much optimism about universal "normal" logfiles.
    
    > Maybe it's because I live in California now, but the idea of a
    > "quest for normal" really appeals to me ;-)
    
    Huh. I guess times have changed. 'Twas a time when that was more of
    a right-coast kinda goal, and the folks out on the left coast were
    chasing individuality:-).
    
    -Bennett
    
    
    



