RE: malicious code

From: Steven M. Christey (coleyat_private)
Date: Tue Jan 28 2003 - 13:30:18 PST

    "Jason Coombs" <jasoncat_private> said:
    
    >My experience has been that if you simply ask yourself the question
    >"what should this program NOT do/allow?" and then go try to make it
    >do/allow whatever your answer is, often times you will find security
    >flaws immediately.
    >
    >Programmers are so busy focusing on coding to specs, adding features
    >that make software do things, that they usually don't consider
    >adequately the anti-spec: the list of things the software should never
    >do.
    
    A formal spec would be a very nice feature for *all* software -
    shouldn't every program explicitly tell you which ports it will be
    using, which parts of the file system it will be accessing, etc.?
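
    As an illustration only, such a declaration could be checked
    mechanically against what a program is observed to do.  The sketch
    below assumes a made-up manifest format (the DECLARED dictionary, the
    "exampleapp" paths, and the undeclared_activity helper are all
    hypothetical) and POSIX-style paths; it is not an existing standard.

```python
from pathlib import PurePosixPath

# Hypothetical declaration: the ports and directory trees a program
# claims it will use. The format and values are assumptions.
DECLARED = {
    "ports": {443},
    "paths": ["/var/lib/exampleapp", "/etc/exampleapp"],
}


def undeclared_activity(declared, observed_ports, observed_paths):
    """Return (ports, paths) that were observed but never declared."""
    bad_ports = sorted(set(observed_ports) - set(declared["ports"]))

    roots = [PurePosixPath(p) for p in declared["paths"]]
    bad_paths = []
    for p in observed_paths:
        path = PurePosixPath(p)
        # A path is acceptable if it is a declared root or lies beneath one.
        if not any(path == root or root in path.parents for root in roots):
            bad_paths.append(p)
    return bad_ports, bad_paths


if __name__ == "__main__":
    ports, paths = undeclared_activity(
        DECLARED,
        observed_ports=[443, 6667],
        observed_paths=["/etc/exampleapp/app.conf", "/etc/passwd"],
    )
    print("undeclared ports:", ports)
    print("undeclared paths:", paths)
```

    How the observed ports and paths are collected (tracing, logging,
    sandboxing) is left open here and would vary by platform.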
    
    However, in the case of large-scale systems - say, more than a single
    50K binary program - you may encounter major problems just trying to
    determine what the spec is supposed to be, before you can tell if
    there's anything wrong with the code.  If the system is under active
    development or its design/implementation contains trade secrets, then
    even high-level specs might not be available, and you won't
    necessarily know which capabilities are legitimate.  In this
    case, you may want to start by looking at portions of the code that
    interact with the operating system - file operations, networking,
    registry, processes, etc. - and work backwards to see which conditions
    enable that code.  But that won't find intentional "business logic"
    errors, which could be equally damaging from the consumer's
    perspective.
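
    A rough sketch of that first pass, assuming the binary stores API
    names as plain ASCII strings: pull out printable strings and bucket
    any that match a watch list of OS-facing calls.  The WATCHLIST
    contents and the helper names are illustrative assumptions, not a
    complete inventory, and working backwards from each hit still has to
    be done by hand or in a disassembler.

```python
import re

# Hypothetical watch list, grouped by the categories mentioned above.
# The specific API names are examples only.
WATCHLIST = {
    "file": [b"CreateFileA", b"WriteFile", b"DeleteFileA"],
    "network": [b"connect", b"send", b"WSAStartup"],
    "registry": [b"RegOpenKeyExA", b"RegSetValueExA"],
    "process": [b"CreateProcessA", b"WinExec"],
}


def printable_strings(data: bytes, min_len: int = 4):
    """Yield (offset, string) for each run of printable ASCII bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    for match in re.finditer(pattern, data):
        yield match.start(), match.group()


def os_touchpoints(data: bytes):
    """Map each category to the (offset, name) pairs found in the data."""
    hits = {}
    for offset, text in printable_strings(data):
        for category, names in WATCHLIST.items():
            if text in names:  # exact match against a watched name
                hits.setdefault(category, []).append((offset, text.decode()))
    return hits


if __name__ == "__main__":
    sample = b"\x00\x01CreateFileA\x00\x00connect\x00junk!\x00RegOpenKeyExA\x00"
    for category, found in os_touchpoints(sample).items():
        print(category, found)
```

    Each reported offset is only a starting point; the analyst still
    needs to locate the code that references it and determine which
    conditions lead there.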
    
    If the malicious party is smart, then they probably won't use the
    typical "code patterns" like using standard library functions with
    hard-coded arguments.  As Crispin Cowan said, embedding a
    vulnerability would be one method for introducing malicious code in a
    non-obvious fashion.  But that approach could require some sort of
    external trigger that may not provide the degree of control that a
    malicious party would want; it would depend on the party's goals.
    Binary obfuscation techniques like in-line compression and encryption,
    anti-debugging, and anti-disassembly would make analysis more
    resource-intensive, and the malicious party could claim that they are
    using these techniques to protect their intellectual property.
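
    One common way to at least locate compressed or encrypted regions
    before spending effort on them is a byte-entropy scan.  This is a
    general heuristic offered as a sketch, not something described in
    this thread; the window size and threshold below are arbitrary
    assumptions, and high entropy only suggests packing or encryption.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def high_entropy_windows(data: bytes, window: int = 256, threshold: float = 7.0):
    """Return start offsets of full windows whose entropy meets the threshold."""
    flagged = []
    for start in range(0, len(data), window):
        chunk = data[start:start + window]
        # Skip a trailing partial window; its entropy is not comparable.
        if len(chunk) == window and shannon_entropy(chunk) >= threshold:
            flagged.append(start)
    return flagged


if __name__ == "__main__":
    # Low-entropy padding followed by a block that uses every byte value once.
    data = b"A" * 256 + bytes(range(256))
    print(high_entropy_windows(data))
```

    Ordinary code and text usually score well below the maximum of 8
    bits per byte, so flagged windows are candidates for closer
    inspection rather than evidence of intent.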
    
    Note: I'm talking mostly from a static analysis perspective here.
    
    >There are only so many ways for bad code to get involved in
    >interprocess communication, and it's pretty easy to analyze all such
    >communication pathways in detail -- even without access to source
    >code.
    
    This still requires some knowledgeable analysis.  For example, Windows
    programs may dynamically load various functions from DLLs, using
    obfuscated names in the binary itself, and saving the address in a
    global variable.  This prevents the analyst from running a simple grep
    to look for, say, socket code.  If the system is broken into multiple
    libraries, and you want to perform a comprehensive static analysis,
    then you need to know how those libraries interact.
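
    To make the grep problem concrete, here is a small sketch that
    assumes the simplest possible obfuscation: the function name is
    XORed with a single byte before being stored.  That encoding is an
    assumption for illustration; real programs may use other schemes
    that this brute-force search would not catch.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a one-byte key."""
    return bytes(b ^ key for b in data)


def find_xor_encoded(data: bytes, names, keys=range(1, 256)):
    """Search data for each name under every single-byte XOR key.

    Returns a list of (name, key, offset) tuples.
    """
    hits = []
    for name in names:
        for key in keys:
            offset = data.find(xor_bytes(name, key))
            if offset != -1:
                hits.append((name.decode(), key, offset))
    return hits


if __name__ == "__main__":
    # Simulated binary: the name "connect" is stored XORed with 0x5A.
    binary = b"\x00" * 8 + xor_bytes(b"connect", 0x5A) + b"\x00" * 8

    print(b"connect" in binary)                  # the plain search misses it
    print(find_xor_encoded(binary, [b"connect"]))
```

    Even this trivial encoding defeats a plain string search, and
    recovering it requires the analyst to guess both the scheme and the
    names worth looking for.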
    
    And just focusing on IPC limits the scope of what you can find.  For
    example, sensitive information could be squirreled away in data fields
    that are not normally visible to the application or the user.
    
    >Many forensic analysts prefer to read assembly anyway; source code is
    >an illusion, only approximating what a program is capable of doing or
    >being forced to do.
    
    It would be very interesting to hear the experiences of other people
    who have tried to analyze binary-only code totalling more than, say,
    100K in size.  In my limited experience with such large-scale
    analysis, coming up with even a minimum level of confidence can be a
    daunting, resource-intensive task.
    
    Another issue comes in actually communicating to the
    consumer/management what you've looked for and how much remains
    uncertain.  Here, I've tried a needle-and-haystack analogy with some
    success: finding malicious code is like looking for a needle in a
    haystack, but hopefully you can state with some confidence whether the
    haystack contains any tractors or pitchforks.
    
    - Steve
    