Scripting Languages [Was: OT: Re: Secure popen]

From: Glynn Clements (glynn.clementsat_private)
Date: Thu Jun 21 2001 - 21:47:54 PDT

    ___cliff rayman___ wrote:
    
    > > b) has (reasonably) strong typing, and
    > 
    > i can see why this makes a program more efficient, but
    > not more secure.
    
    Typing provides a cross-check against oversights on the part of the
    programmer.
    
    Suppose that you have a data "pipeline" comprising a "chain" of
    functions:
    
    	result = fN(...f2(f1(input))...)
    
    where:
    
    	input : 'a
    	f1 : 'a -> 'b
    	f2 : 'b -> 'c
    	...
    	fN : 'y -> 'z
    	result : 'z
    
    Omitting any of the stages will quite possibly result in a type
    mismatch being detected by the compiler/interpreter.
    
    If, OTOH, the language only supports string values and functions of
    type "string -> string", you'll never get a type mismatch, so an
    omitted stage goes unnoticed.
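
    A minimal sketch of the cross-check, in Python for brevity. The
    names Raw, Validated, Escaped and expect are hypothetical, and
    Python only detects the mismatch at run time (via the explicit
    isinstance check), whereas a statically typed language would reject
    the call before the program ever ran:

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class Raw:
    """Untrusted input, exactly as received (the 'a of the pipeline)."""
    text: str


@dataclass(frozen=True)
class Validated:
    """Input which has passed validation (the 'b of the pipeline)."""
    text: str


@dataclass(frozen=True)
class Escaped:
    """Validated input which has been escaped for HTML output ('c)."""
    text: str


def expect(value, kind):
    """Stand-in for a static type check: reject a value of the wrong type."""
    if not isinstance(value, kind):
        raise TypeError(f"expected {kind.__name__}, got {type(value).__name__}")
    return value


def f1(value):
    """Raw -> Validated: refuse input containing a NUL character."""
    raw = expect(value, Raw)
    if "\0" in raw.text:
        raise ValueError("NUL character in input")
    return Validated(raw.text)


def f2(value):
    """Validated -> Escaped: escape HTML metacharacters."""
    checked = expect(value, Validated)
    return Escaped(html.escape(checked.text))


# The complete pipeline type-checks.
result = f2(f1(Raw("<b>")))

# Omitting f1 is caught, because a Raw is not a Validated.
try:
    f2(Raw("<b>"))
    skipped_stage_detected = False
except TypeError:
    skipped_stage_detected = True
```

    Had all three stages been plain "str -> str" functions, the call
    which skips f1 would have been accepted without complaint.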
    
    To take a specific example, exec() type functions require a list of
    strings as input. Either this list is constructed from its individual
    elements using well-defined operations (cons, list, concat), or (as is
    common with scripting languages) it results from splitting a single
    string which was made by gluing the elements together with string
    concatenation, often overlooking the possibility of the separator
    being contained within one of the arguments.
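
    The difference between the two constructions can be sketched as
    follows (Python; both function names are hypothetical, and a space
    is assumed as the separator):

```python
def argv_by_splitting(program, filename):
    """Glue the pieces into one string, then split it back apart."""
    command = program + " " + filename
    return command.split(" ")


def argv_as_list(program, filename):
    """Build the argument list directly from its elements."""
    return [program, filename]


# A filename which happens to contain the separator.
name = "my file.txt"

split_argv = argv_by_splitting("cat", name)  # three elements: the name was cut in two
list_argv = argv_as_list("cat", name)        # two elements: the name is intact
```

    With a filename containing no spaces, the two functions return the
    same list, which is why the first form tends to survive casual
    testing.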
    
    However, "data typing" is a secondary issue. The primary issue is that
    code and data should form distinct, non-overlapping classes of types,
    rather than there just being "strings" which are data or code or both,
    depending upon context or upon whether any punctuation characters are
    present.
    
    > > c) tends to be legible.
    > 
    > beauty is in the eye of the beholder.  perl is much more legible to
    > me than c++,
    
    Legibility isn't entirely subjective.
    
    To consider the specific case of Perl, the context-dependency issue
    (list vs scalar context) makes it harder to determine the semantics of
    a given expression (having to consider more information, i.e. context,
    can only make the task harder, not easier).
    
    The main issue with scripting languages tends to be "substitute"
    functions: not so much the visible, named functions (although those
    are bad enough) as the fact that data is often passed through
    substitution with no clear indication in the code. Metacharacters
    are a lot less clear than a verbose name, and context is less clear
    still.
    
    The fact that these functions are often near-identities (i.e. they
    /are/ identities if the string contains no metacharacters) means that
    their presence often doesn't become apparent with testing, either.
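
    As an illustration of a near-identity, here is a sketch using
    Python's string.Template as a stand-in for shell or Perl variable
    interpolation (the substitute function and the variables mapping
    are hypothetical):

```python
from string import Template

# Hypothetical environment used for substitution.
variables = {"HOME": "/home/alice"}


def substitute(text):
    """Expand $NAME references in text; anything else passes through."""
    return Template(text).safe_substitute(variables)


# Typical test input contains no metacharacters, so the function
# behaves as the identity and the test passes.
benign = substitute("hello world")

# Input containing a metacharacter is silently rewritten.
surprising = substitute("my $HOME page")
```

    A test suite which only ever feeds in strings like the first one
    gives no hint that substitute is in the pipeline at all.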
    
    > > Scripting languages such as Perl are useful for quick hacks, but
    > > security-wise, they truly suck. Scan the BugTraq archives for
    > > references to CGI programs; I would guess that around 90% of
    > > vulnerabilities are due to the above.
    > 
    > i don't think so.  the majority of the program crashes in this world
    > are related to C/C++ and its use of pointers.
    
    By far the most common C issue on BugTraq is the buffer overrun, which
    arises from the use of fixed-size buffers rather than from pointers as
    such. The problem is common, but it is nonetheless trivial to avoid,
    and easier still if you use string or vector classes.
    
    Easier still in CGI programs, where you can dynamically allocate
    everything and leave the OS to tidy up when the program terminates
    (which will happen shortly after the program starts, unlike daemons
    which need to run for extended periods, and so have to conserve
    memory).
    
    However, none of this has anything to do with the "scripting language"
    vs "conventional language" issue; it's a high-level vs low-level
    language issue. Fixed size buffers without built-in bounds checking
    are an artifact of C being a (low-level) system language rather than a
    (high-level) application language. More typical application languages
    don't have this issue; either they allocate the memory dynamically, or
    they generate an exception if you violate the bounds.
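
    Both behaviours can be seen in a short sketch (Python, chosen only
    as a convenient example of an application language; the store
    function is hypothetical):

```python
def store(buffer, index, value):
    """Write value at index; Python checks the index against the length."""
    buffer[index] = value


# A fixed-length list of 8 slots: writing one past the end raises an
# exception rather than overwriting whatever lies beyond the buffer.
fixed = [0] * 8
try:
    store(fixed, 8, 0x41)
    outcome = "written"
except IndexError:
    outcome = "rejected"

# A dynamically allocated buffer simply grows to fit the data.
growable = bytearray()
growable.extend(b"A" * 1000)
```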
    
    > it is very easy to
    > write secure perl programs.  lots of people, especially beginners
    > just happen to write CGI programs in perl and since they are not
    > yet capable programmers, they write insecure code.  beginners
    > don't write CGI programs in C++ because it is outside the capability
    > of beginners to do so.  a skilled programmer will write quality code
    > with either language.
    
    Of course, it's technically possible to write secure code in any
    language.
    
    The point which I'm making is that scripting languages have particular
    pitfalls, and avoiding them requires not only a thorough (as in
    "language lawyer") understanding of the language, but the ability to
    apply that understanding continually and without exception.
    
    To re-visit the original question:
    
    > I understand popen() is not very secure, because it uses the shell to
    > execute the command, but I don't know of a safe alternative
    
    The answer to which seems to be: implement something which is almost
    exactly like popen() but without involving /bin/sh (or any other
    script interpreter) in the pipeline. That way, the data (string) which
    the user supplies stays as data throughout, rather than being
    magically turned into code somewhere along the line.
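
    A minimal sketch of such a replacement, in Python rather than C for
    brevity. The function name read_from_command is hypothetical;
    subprocess.Popen with an argument list and shell=False starts the
    program directly, without passing the command through /bin/sh:

```python
import subprocess
import sys


def read_from_command(argv):
    """Run argv directly (no shell) and return its standard output as text."""
    with subprocess.Popen(
        argv, stdout=subprocess.PIPE, shell=False, text=True
    ) as proc:
        output, _ = proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(f"{argv[0]} exited with status {proc.returncode}")
    return output


# User-supplied data full of shell metacharacters.
hostile = "hello; echo injected $(id)"

# The child here is another Python process which just prints its first
# argument; any program could take its place.
child = [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile]
echoed = read_from_command(child)
```

    The child receives the string as a single argument, metacharacters
    and all; nothing along the way ever interprets it as a command.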
    
    BTW: Nobody seems to have (explicitly, at least) taken issue with the
    assertion that the insecurity stems from the use of "the shell" (which
    epitomises the concept of a "scripting" language; a "script" being a
    file of commands which are fed to the shell as if typed by the user).
    
    -- 
    Glynn Clements <glynn.clementsat_private>
    


