Clusters and the meaning of life

From: Andrew Rosen (asrdataat_private)
Date: Wed Dec 18 2002 - 22:49:51 PST

    Earlier this year, I took a case involving many
    terabytes of data and seriously considered a cluster
    implementation.  I had ~327 desktop images + 2 servers
    and everyone wanted to know what had been deleted from
    each file system.  Systems consisted of a
    heterogeneous menagerie of operating and file systems.
    
    The investigative environment was very dynamic:
    lawyers coming in with fresh keyword lists, selected
    images assigned higher and lower priorities, new
    systems coming in, and instructions to drop
    everything... It was quite difficult to anticipate
    which way the wind was going to blow next.
    
    I have found, in my limited experience, that this type
    of environment lends itself to chaos quite easily.  In
    a static environment in which you can say "This is the
    universe of data, I want to do x, y and z" a cluster
    could indeed probably reduce the time to perform the
    tasks by an order of magnitude.
    
    As a pragmatist, I ended up with six 1 GHz boxes
    running optimized 2.4.X kernels, each dedicated to a
    data set and predefined tasks (listing/reporting,
    hashing, searching, etc.).  This afforded me the
    flexibility I needed and allowed me to crank out quite
    a bit of work while retaining the ability to adapt to
    the client's changing priorities. 
    
    I think a cluster configured to chew on all the data
    at once would have been ill-suited to that environment,
    but if I had a secret lab deep in the base of a
    mountain, and an unlimited budget...
    
    --------------------------------------------------
    
    "Forty-two!" yelled Loonquawl. "Is that all you've got
    to show for seven and a half million years' work?"
    "I checked it very thoroughly," said the computer,
    "and that quite definitely is the answer. I think the
    problem, to be quite honest with you, is that you've
    never actually known what the question is."
    
    Douglas Adams
    ------------------------------------------------------
    
    Regards -
    
    Andrew Rosen
    ASR Data
    --- "Mark St.Laurent" <nmstlauat_private> wrote:
    > Interesting, Thinking off the top of my head,
    > 
    > I can see a Beowulf option come into play if a
    > facility (for example) performs forensics on Network
    > Intrusions, and there are terabytes of data to go
    > through.
    [snip]
    
    
    


