RE: [logs] What's normal?

From: Wright, Joseph G (Gregory), SOLCM (josephgwrightat_private)
Date: Tue Aug 20 2002 - 08:09:41 PDT


    > -----Original Message-----
    > From: Tina Bird [mailto:tbird@precision-guesswork.com]
    > Sent: Tuesday, August 20, 2002 10:35 AM
    
    >snip!<
    
    > Look, everyone, presumably at least some of us have access to log data
    > on a "live server."  I posit that we'll learn really really interesting
    > things by taking a day's or a week's worth of data and looking at the
    > messages.  We're not building a standard here, I'm after 
    > quick and dirty. Guidance for someone just starting out.
    > 
    > So if everyone gets their logs into a text format (for those 
    > of you who aren't on UNIX boxen) and does something like:
    > 
    > cat /your/log/files* \
    > |sed -e "s/^... ...........$HOSTNAME //" -e "s/\[[0-9]*\]:/:/" \
    > |sort |uniq -c |sort -nr > uniq.sorted.freq
    > 
    > we'll get actual observational data on what shows up on production
    > machines.
    > 
    > I'm not claiming it will be the same for everyone.  I'm claiming it
    > will teach us something, and that by providing that kind of "here's
    > what shows up" view of things we'll make it easier for the newbies.
    > 
    > My hope is that by providing this sort of information we'll make it
    > easier for people to get up to speed on what is and is not typical
    > >>for them<<.
    
    I think perhaps we need two types of information:
    
    1. A basic methodology for actually profiling boxen, based on the OS,
       the intended use, location within the network, etc. There are as
       many approaches to this as there are tools to collect the data, if
       not more. However, providing some guidelines or direction as to 
       what data is the most useful to collect and what are some of the 
       more universally accepted "useful" ways to look at that data will
       go a long way.
    
    2. A repository of sanitized profiles, where a specific type of
       configuration is described (e.g., web, DNS and mail servers located
       in a DMZ, all running on a single flavor of *nix), and a snapshot
       of what has been determined as "normal" activity for that config.
    
    These two items will probably give people who are starting out an idea
    of how to look at the data being collected within their own
    environments, and maybe how to adapt existing profiles to those
    environments.
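
    As a rough illustration of how a saved profile might be used, here is
    a minimal shell sketch. The function names (normalize, make_profile,
    new_messages) are made up for this example, and it assumes
    BSD-syslog-style lines of the form "Mon DD HH:MM:SS host tag[pid]: msg";
    other formats would need a different normalize step. It records the
    distinct messages seen during a baseline period, then reports messages
    from a later period that are not in that baseline.

```shell
# Hypothetical helpers for illustration; not part of any existing tool.

# normalize: stdin = raw syslog lines. Strips the leading timestamp and
# hostname and drops the [pid], so the same message logged by different
# processes collapses to one line.
normalize() {
    sed -E \
        -e 's/^[A-Z][a-z]{2} [ 0-9][0-9] [0-9]{2}:[0-9]{2}:[0-9]{2} [^ ]+ //' \
        -e 's/\[[0-9]+\]:/:/'
}

# make_profile: stdin = raw log lines from the baseline period.
# stdout = sorted list of distinct normalized messages (the "profile").
make_profile() {
    normalize | LC_ALL=C sort -u
}

# new_messages PROFILE: stdin = raw log lines from a later period.
# Prints normalized messages that do not appear in the PROFILE file.
new_messages() {
    normalize | LC_ALL=C sort -u | LC_ALL=C comm -13 "$1" -
}
```

    With hypothetical file names, "make_profile < two-weeks.log >
    baseline.txt" followed by "new_messages baseline.txt < today.log" would
    list only what has not been seen before. This ignores frequencies
    entirely, which is probably too crude for a real profile.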
    
    > > I'll posit the next straw man to torch: what would be useful is a
    > > standardized methodology for how to turn two weeks of verbose logging
    > > into a template against which to compare "normal", "abnormal" and
    > > "catastrophic" at your particular site/application. This does assume
    > > your first point of somewhat standardized logging being available on
    > > all critical OSs and Apps.
    > 
    > Remember that by "standardized logging" I'm not >>even<< worrying
    > about log formats or severities --just message categories.  Taking
    > yet another step back.
    > 
    > This strawman is certainly one of the goals.
    
    Unless sufficient pressure is placed on vendors to provide a standard
    format (beyond what is covered in BSD syslog, which is barely enough to
    hang your hat on), you won't see it happen. Even standards efforts
    such as syslog-reliable and the Intrusion Detection Working Group
    (IDMEF/IDXP) define more standardized formats, yet the _meanings_
    of crucial fields can still vary widely from vendor to vendor. I don't
    know where that pressure is going to come from or how effective it will
    be, but I am not holding my breath.
    
    This is one reason why, at least for the short to medium term, I think
    the methodology or approach matters more. Knowing what data to look at,
    and what to look for (in general), will allow people to adapt the
    process to the data they have available to them, regardless of format.
    Getting folks to actually look at and try to understand the data they
    are logging, rather than just looking for one or two stock events that
    are known issues, is really what is important.
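
    One way to get past the stock events is to look at whatever shows up
    least often. A minimal sketch, with a made-up function name
    (rare_messages) and assuming the timestamp and hostname have already
    been stripped from each line: it masks runs of digits so that messages
    differing only in a number fall into one category, then lists the
    categories seen at most a given number of times.

```shell
# Hypothetical helper for illustration.
# rare_messages MAX: stdin = log lines without timestamp/hostname.
# Prints "count<TAB>category" for categories seen at most MAX times,
# rarest first.
rare_messages() {
    # Drop [pid], then replace every run of digits with the letter N.
    sed -E -e 's/\[[0-9]+\]:/:/' -e 's/[0-9]+/N/g' \
    | LC_ALL=C sort | uniq -c \
    | awk -v max="$1" '$1 <= max {
          c = $1                      # save the count from uniq -c
          sub(/^ *[0-9]+ /, "")       # strip the count prefix from the line
          print c "\t" $0
      }' \
    | LC_ALL=C sort -n
}
```

    Masking every digit run is a blunt instrument (it also hides port
    numbers and error codes), so treat the output as a starting point for
    reading the logs, not as a verdict.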
    
    --
    J. Gregory Wright
    Senior Software Engineer
    AT&T Information Security Center
    Cyber Defense Platform Development
    _______________________________________________
    LogAnalysis mailing list
    LogAnalysisat_private
    https://lists.shmoo.com/mailman/listinfo/loganalysis
    



    This archive was generated by hypermail 2b30 : Tue Aug 20 2002 - 10:47:16 PDT