Re: [loganalysis] SIDS 0.20

From: Ryan Russell (ryanat_private)
Date: Tue Aug 21 2001 - 10:22:09 PDT


    On Mon, 20 Aug 2001, Tina Bird wrote:
    
    > Actually, I'd rather people sent replies with sample data to the list.
    > That way anyone who is working on any kind of parser can take advantage
    > of it.
    
    Fine by me... that's why I left it up to you.  Where it would live, I
    don't know.  SecurityFocus could certainly host some pages, but whoever is
    doing the maintenance work would have to go through me or another
    employee to get the pages posted as they are updated.  Ideally, you'd
    probably like to have a simple database to hold them and give people a way
    to submit them themselves.  I'm afraid we don't have any spare developer
    time at present to provide that function on our site.
    
    What I'm trying to make for each log type is a regex that splits it into
    different fields (which I stuff into a perl array).  I'd be willing to
    write a regex or two for each log type I get, and that would probably be
    useful for a number of applications.  I imagine we could make a simple DB
    entry or web page for each log type, that had a number of sample lines
    from the log, followed by some regexes (and sample output) and any other
    useful code snippets (and sample output.)
    
    For example (Roxen Customized):
    
    x.x.x.x - - [21/Jan/2001:00:00:34 -0800] "GET /topnews.html HTTP/1.0" 200
    4304 "-" "Lynx/2.8.1pre.9%20libwww-FM/2.14" "0" "1"
    
    x.x.x.x - - [21/Jan/2001:00:00:34 -0800] "GET /images/ads/BHW2KSF-a.gif
    HTTP/1.1" 200 16587 "http://www.securityfocus.com/frames/ad.html?group=home"
    "Mozilla/4.0%20(compatible;%20MSIE%204.01;%20Windows%2098)" "0" "0"
    
    # Concatenated so the single space before the protocol field survives
    # line wrapping.
    $splitter = '^(\S+) (\S+) (\S+) \[(\S+) (\S+)\] "(\S+) ([^ \?]+)[^ ]* '
              . '(\S+)" (\S+) (\S+)';
    while (<STDIN>)
    {
      @fields = /$splitter/o;
      print @fields,"\n";
    }
    
    Output:
    
    x.x.x.x--21/Jan/2001:00:00:34-0800GET/topnews.htmlHTTP/1.02004304
    x.x.x.x--21/Jan/2001:00:00:34-0800GET/images/ads/BHW2KSF-a.gifHTTP/1.120016587
    
    (Not the best example output, perhaps... there is no separator... could
    have a more obvious field separator on a web page).
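
    One way to get a visible separator is to join the captured fields
    before printing.  The following is only a sketch: the helper name
    format_fields and the choice of '|' are assumptions for illustration,
    and the sample line is the first one above, unwrapped.

```perl
use strict;
use warnings;

# Same pattern as in the message, split across two string literals so the
# space before the protocol field is kept.
my $splitter = '^(\S+) (\S+) (\S+) \[(\S+) (\S+)\] "(\S+) ([^ \?]+)[^ ]* '
             . '(\S+)" (\S+) (\S+)';

# Hypothetical helper: returns the captured fields joined with $sep,
# or nothing (undef in scalar context) when the line does not match.
sub format_fields {
    my ($line, $sep) = @_;
    my @fields = $line =~ /$splitter/;
    return unless @fields;
    return join($sep, @fields);
}

# First sample line from the message, unwrapped onto one logical line.
my $sample = 'x.x.x.x - - [21/Jan/2001:00:00:34 -0800] '
           . '"GET /topnews.html HTTP/1.0" 200 4304 "-" '
           . '"Lynx/2.8.1pre.9%20libwww-FM/2.14" "0" "1"';

print format_fields($sample, '|'), "\n";
# prints: x.x.x.x|-|-|21/Jan/2001:00:00:34|-0800|GET|/topnews.html|HTTP/1.0|200|4304
```

    Any character that cannot appear in a field would do as the separator;
    '|' is just easy to read.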
    
    So, my regex throws away some fields, strips off some variables (after ?
    in URL), and a few other things that meet my particular needs.
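
    For comparison, here is a sketch of a variant that keeps what the
    regex above discards.  It is not the pattern I actually use: the
    extra capture groups, the field names in @names, and the helper
    parse_line are all assumptions made for illustration.  It captures
    the query string separately and also picks up the two quoted fields
    after the byte count, labelled here as referer and agent.

```perl
use strict;
use warnings;

# Variant pattern (an assumption, not the original): the query string is
# captured rather than dropped, and the next two quoted fields are kept.
my $full = '^(\S+) (\S+) (\S+) \[(\S+) (\S+)\] "(\S+) ([^ ?]+)(?:\?(\S*))? '
         . '(\S+)" (\S+) (\S+) "([^"]*)" "([^"]*)"';

# Hypothetical field labels, one per capture group, in order.
my @names = qw(host ident user date zone method path query
               proto status bytes referer agent);

# Returns a hash reference keyed by @names, or nothing if no match.
sub parse_line {
    my ($line) = @_;
    my @values = $line =~ /$full/;
    return unless @values;
    my %rec;
    @rec{@names} = @values;   # query is undef when the URL has no '?'
    return \%rec;
}

# Second sample line from the message, unwrapped onto one logical line.
my $line2 = 'x.x.x.x - - [21/Jan/2001:00:00:34 -0800] '
          . '"GET /images/ads/BHW2KSF-a.gif HTTP/1.1" 200 16587 '
          . '"http://www.securityfocus.com/frames/ad.html?group=home" '
          . '"Mozilla/4.0%20(compatible;%20MSIE%204.01;%20Windows%2098)" '
          . '"0" "0"';

my $rec = parse_line($line2);
for my $name (qw(path query status referer)) {
    my $value = defined $rec->{$name} ? $rec->{$name} : '(none)';
    print "$name = $value\n";
}
```

    A hash keyed by field name is easier to document than positions in an
    array, which matters if the regexes are going to be shared.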
    
    I guess each regex or snippet would have to have some docs as well. :)
    
    					Ryan
    
    
    


