RE: safe strcpy()?

From: Daniel Reed (nat_private)
Date: Wed Jan 29 2003 - 14:04:58 PST


    On 2003-01-28T22:00-0600, Hall, Philip wrote:
    ) > Of course, the real way to build secure software is not
    ) > to use "safe" functions, but to check data validity :-)
    ) Hang on, that sounds akin to not having locks (safe functions) on your
    ) front door, but posting a guard (data validation) at the end of your
    ) drive way...hmmmmm I think I'll stick to my eXtreme Defensive
    ) Programming (XDP) and be paranoid about everything...unless you meant
    ) that by *adding* the data validity to the 'safe' functions to beef them
    ) up...?
    
    This discussion, while bringing up some interesting points, largely misses
    the point of what "safe" programming involves.
    
    For example, in one package I maintain I have to deal with converting
    printable strings into HTML entities. Due to protocol constraints, it's
    possible that the encoded string might be too large to send whole, so I
    split such strings at the entity boundary (see below).
    
    For each character in a printable string, I check whether it needs to be
    encoded or not. Once I have determined what entity will be sent in place of
    the original character, I check whether adding that entity to my buffer
    would push it past its limit. If so, I stop copying and send the buffer
    along before attempting to add the next character.
    
    This way, if the protocol I was dealing with limited strings to (let's say)
    22 characters, a string such as:
    	"1 < 3  -  That's the truth"
    might be split into:
    
    	 1234567890123456789012
    	"1 &lt; 3 &nbsp;- "
    	"&nbsp;That's the truth"
    
    instead of the less desirable:
    
    	 1234567890123456789012
    	"1 &lt; 3 &nbsp;- &nbsp"
    	";That's the truth"
    
    The former would decode into "1 < 3  - " + " That's the truth", and could be
    glued back into the original "1 < 3  -  That's the truth", whereas the latter
    would decode into "1 < 3  - &nbsp" + ";That's the truth" and be recombined
    into "1 < 3  - &nbsp;That's the truth". Whoops.
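    The splitting described above can be sketched roughly as follows. This is
    not the package's actual code: entity_for() and encode_chunk() are
    hypothetical names, and the encoding rules (escape '<', '>' and '&', and
    turn a space that follows another space into "&nbsp;") are assumptions
    inferred from the example strings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the text to send in place of input[i].
 * Assumed rules: '<', '>' and '&' become entities, and a space that
 * directly follows another space becomes "&nbsp;". Anything else is
 * returned unchanged via the caller-supplied two-byte scratch buffer. */
static const char *entity_for(const char *input, size_t i, char single[2])
{
    switch (input[i]) {
    case '<': return "&lt;";
    case '>': return "&gt;";
    case '&': return "&amp;";
    case ' ':
        if (i > 0 && input[i - 1] == ' ')
            return "&nbsp;";
        /* fall through */
    default:
        single[0] = input[i];
        single[1] = '\0';
        return single;
    }
}

/* Encode input starting at *pos into out (outsize bytes, including the
 * terminating NUL). Stops before any piece that would not fit whole,
 * advances *pos past the input consumed, and returns the chunk length.
 * A return of 0 with input remaining means outsize is too small for the
 * next piece; the caller must treat that as an error rather than loop. */
size_t encode_chunk(const char *input, size_t *pos, char *out, size_t outsize)
{
    size_t outpos = 0;
    char single[2];

    if (outsize == 0)
        return 0;
    while (input[*pos] != '\0') {
        const char *piece = entity_for(input, *pos, single);
        size_t len = strlen(piece);

        if (outpos + len + 1 > outsize)
            break;              /* would not fit whole: send buffer first */
        memcpy(out + outpos, piece, len);
        outpos += len;
        (*pos)++;
    }
    out[outpos] = '\0';
    return outpos;
}
```

    With a 23-byte buffer (22 characters plus the NUL), calling encode_chunk()
    repeatedly on the sample string until input[pos] is '\0' yields the two
    chunks of the first, desirable split.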
    
    Now, the code in question was originally written with a blind fear of buffer
    overflows clouding the original author's style, and worked something like:
    
    	if (input[i] == ' ') {
    		/* copies whatever fits, even if that is only part of the entity */
    		strncpy(output+outputpos, "&nbsp;", sizeof(output)-outputpos);
    		outputpos += sizeof("&nbsp;")-1;
    	}
    
    This would allow a space occurring near the end of "output" to be truncated
    into "&nbsp", as in the example above. The new code is similar to:
    
    	if (input[i] == ' ') {
    		/* copy the entity only if all of it, plus the NUL, fits */
    		if ((outputpos + sizeof("&nbsp;")) < sizeof(output)) {
    			strcpy(output+outputpos, "&nbsp;");
    			outputpos += sizeof("&nbsp;")-1;
    		} else
    			break;
    	}
    
    This allows the loop to break once the "output" buffer has become filled,
    for all intents and purposes, and will allow the procedure to empty "output"
    and start from where it left off (so the space wouldn't appear at all in the
    current line, and would instead appear whole in the next line).
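
    To make the difference between the two approaches concrete, here is a
    small side-by-side sketch. Both helper names are hypothetical, and
    append_truncating() is deliberately a bounds-safe variant of the old
    behaviour: it never overruns the buffer, yet it still produces the wrong
    output.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the old behaviour: copy as much of the
 * entity as fits, silently truncating it. Returns the bytes written. */
size_t append_truncating(char *out, size_t outsize, size_t outpos,
                         const char *entity)
{
    size_t room, len = strlen(entity);

    if (outpos + 1 >= outsize)
        return 0;
    room = outsize - outpos - 1;        /* leave space for the NUL */
    if (len > room)
        len = room;
    strncpy(out + outpos, entity, len); /* adds no NUL when it truncates */
    out[outpos + len] = '\0';
    return len;
}

/* Hypothetical helper mirroring the new behaviour: copy the entity only
 * if it fits whole; otherwise write nothing and return 0 so the caller
 * can send the buffer and retry with an empty one. */
size_t append_whole(char *out, size_t outsize, size_t outpos,
                    const char *entity)
{
    size_t len = strlen(entity);

    if (outpos + len + 1 > outsize)
        return 0;
    memcpy(out + outpos, entity, len + 1);  /* includes the NUL */
    return len;
}
```

    Starting from a 23-byte buffer that already holds the 17 characters
    "1 &lt; 3 &nbsp;- ", append_whole() refuses "&nbsp;" and leaves the buffer
    alone, while append_truncating() appends the useless fragment "&nbsp".
    Neither one overflows; only one is correct.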
    
    
    Security is indeed very important, and if more people made secure code-
    writing a priority, a lot of our lives would become much easier. However,
    there are no magic wands in programming:
    	Replacing strcpy()'s with strncpy()'s will not solve all problems,
    and may in fact introduce new ones. In the above example, strncpy() did not
    itself cause a problem, but its ignorant usage led to a misbehaviour.
    	Using manipulation routines that ensure the string is large enough
    to "hold" everything can lead to its own problems. A quick example: reading
    data from the network. All someone need do is feed your service a constant
    stream of characters, and eventually the program will fill all available
    memory trying to store the string. Again, the fault would lie with a
    programmer ignorantly feeding a network socket directly into a string (as
    I've seen provided in examples on this very list). In all of these cases,
    programmer failure is the common thread; there is no intrinsic flaw in the
    methods or implementations they are using.
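
    As an illustration of the network case, one way to keep memory bounded is
    to decide up front how long a line may be and refuse to store anything
    beyond that. The sketch below assumes line-oriented input arriving on a
    stdio stream; read_line_bounded() and the line_status values are
    hypothetical names, not part of any library.

```c
#include <stdio.h>
#include <stddef.h>

enum line_status { LINE_OK, LINE_TOO_LONG, LINE_EOF };

/* Read one '\n'-terminated line into buf (bufsize bytes including the
 * NUL). An over-long line is consumed up to its newline but only the
 * first bufsize-1 characters are stored, so memory use stays fixed no
 * matter how much the peer sends. */
enum line_status read_line_bounded(FILE *fp, char *buf, size_t bufsize)
{
    size_t len = 0;
    int c, too_long = 0;

    if (bufsize == 0)
        return LINE_TOO_LONG;
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 < bufsize)
            buf[len++] = (char)c;
        else
            too_long = 1;       /* keep draining, store nothing more */
    }
    buf[len] = '\0';
    if (c == EOF && len == 0 && !too_long)
        return LINE_EOF;
    return too_long ? LINE_TOO_LONG : LINE_OK;
}
```

    What to do on LINE_TOO_LONG is a policy decision for the caller: reject
    the request, or close the connection outright rather than keep draining
    it. The point is only that the limit is chosen by the programmer, not by
    whoever is on the other end of the socket.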
    
    -- 
    Daniel Reed <nat_private>
    Real computer scientists like having a computer on their desk, else how could they read their mail?
    naim FAQ: http://128.113.139.111/~n/naim/FAQ
    


