Re: strcpy versus strncpy

From: Mark Whitis (whitisat_private)
Date: Wed Mar 04 1998 - 13:48:03 PST

  • Next message: der Mouse: "Re: strcpy versus strncpy"

    On Tue, 3 Mar 1998, Chris L. Mason wrote:
    > What's wrong with using the following (I got the idea from some of
    > Stevens' code)?
    >
    > char *sstrcpy(char *dst, size_t n, const char *src) {
    >     if (strlen(src) > (n - 1)) {
    >         errno = ENOSPC;
    >         return NULL;
    >     }
    >
    >     strcpy(dst, src);
    >     dst[n - 1] = '\0';
    >
    >     return dst;
    > }
    
    Note: This is a long message which discusses strncpy() at length
    and includes a new, improved, lemon-scented version.
    
    As you point out, it is not multithread safe and there is no
    good reason this has to be the case.  But there is a more
    serious problem which will manifest itself if you are not
    extremely carefull to always check the errno and never
    use the result if errno is set.
    
    This, and at least one other code example posted here have
    a serious flaw:  in the event of a buffer overflow, they
    fail to terminate the string (at zero length).
    If the calling function ever forgets to check for an error
    before using the result (or uses the result even though
    it checked for an error), you will get a surprise: either
    you will get an old value or you will get a garbage value
    which may result in buffer overflows or segment faults elsewhere.
    Even if you do check the return code, it is easy for situations
    to creep in where the value is still used for some purposes later.
    
    Almost all string functions should poke a zero into the first
    byte of the return string as long as two conditions are met: the
    pointer is not null and the string length is greater than zero.
    I sometimes do this at the very beginning along with copy
    routines that always add a zero after each byte copied; this
    insures that the string will always be valid even if the
    function ends or if the value is looked at from another thread
    (although you should never rely on the value being valid
    unless you have taken precautions to insure against
    simultaneous access).
    
    Also, many string functions are much friendlier if they treat
    a NULL pointer the same as a null string for source values.
    In some cases, this friendly behavior might let something slip
    through but I think it is best to save the programmer from unneccessary
    busy work so they have time to solve the more serious problems.
    For example:
       safe_strncpy(tmpdir,getenv("TEMP"),sizeof(tmpdir));
    In those few cases where a null string is undesirable, you should
    check for it explictly since it could result from other causes anyway.
    
    A good string copy should:
      - check for zero length on destination string
        options:
           assert()
           return exception
           do nothing
      - check for null pointer on destination string
        options:
           assert()
           return exception
           do nothing
      - check for null pointer on source string
        options:
           assert()             (rarely desireable)
           return exception  (rarely desireable)
           return zero length string
      - check for overflow
        options:
           assert()
           return exception and truncated string, zero terminated string
           return exception and zero length string
           return exception and unmodified string (Bad!)
           return truncated, zero terminated string
           return zero length string
           overflow (Very bad)
    
       Filling options:
           - Always zero fill to end of buffer
              slow with large buffers
              prevents information leaks
              tends to encourage immediate detection of overflow bugs
           - Never fill beyond what is needed
             may be faster
             strncpy(dest,src,INT_MAX) == strcpy(dest,src)
           - Moving zero terminate
           - poke zero into last byte of buffer
             this is the easiest workaround without a wrapper or replacement
             function but generally the least desireable.
    
    Usually, I prefer any string function to treat NULL and "" identically
    for a source string; this tends to behave well in most real world
    situations and saves writing a lot of extra code; #ifdef PEDANTIC,
    you can return an exception or abort using assert() here.
    
    I really dislike the use of dynamicly allocated strings for
    mission critical applications.  They have generally caused much
    grief, although usually much more so when written by someone else.
    Memory leaks and fragmentation problems have no place in
    code which needs to be robust and secure.  In preforked servers,
    if there is any dynamic memory code I normally kill off each
    subprocess after some finite number of uses to protect against
    leaks. I also monitor the process size for leaks.  It
    will typically grow for some number of iterations and then
    roughly stabilize (approach but not necessarilly reach an asymptote)
    if there are no leaks; the healthy growth, as opposed
    to the cancerous growth of memory leaks, can come from various sources.
    Buffers may not all be allocated at the completion of initialization
    or even the first few iterations since some buffers are not
    allocated until they are first used and they may not all
    be used until all code paths have been traversed.  Functions
    which use realloc() to double the size of a buffer as needed
    (generally much better than alloc and free behavior) will take
    a while to reach their final sizes.
    
    The use of strncpy() itself is usually a sign of a bug unless
    it is immediately followed by buf[sizeof(buf)-1]=0.  In trivial
    programs, I usually do this manually.  In more serious programs
    I use a wrapper or replacement function.
    
    When using truncating functions, avoid situations where
    the string is truncated to different lengths in different
    places.  Usually, truncating to a sufficiently conservative
    length on input will suffice (should be less than used
    by your library functions).  I.E. if you truncate all
    input strings to 256 bytes or less and all your library
    functions are required to handle at least 1024 bytes then
    you should be okay provided you do not combine more than
    four strings (including prefixes), cummulative.  Be particularly
    wary of truncation before any deny known bad types of tests
    (which your system should not rely on anyway).
    Inconsistent truncation may result in the test failing to detect
    unwanted input.
    
    Don't forget to reduce length if you started at other than
    the beginning of the buffer (i.e. for some kind of concatenation
    operation).  Otherwise, copy routines which poke a zero into
    the end of the string, zero fill, or long source strings will
    result in a little surprise.
    
    In the example below, I give various compile time options.
    gracefully.  This function has been modified quite a bit
    since it was extensively tested.
    
    There is something to be said for having a fourth argument "options",
    since it could be handy to choose your options based on context.
    
    Given the number of options, I do suggest looking at the output
    of the C preprocessor with your choice of options for a sanity check.
    
    This function is generally suitable for replacing existing calls
    to strncpy() with safe_strncpy() by global search and replace.
    However, in a very small percentage of cases, you may find
    existing code which depends on the documented ideosyncracies
    of strncpy().  Usually, this is when strncpy() is called
    with a value which will deliberately prevent the trailing null
    from being written (either to write to a non null terminated
    string or to insert into the middle of a string).
    
    If you wish to incorporate the following function in GPLed code,
    send me email.
    
    #include <assert.h>
    #include <errno.h>
    /* safe_srncpy() is Copyright 1998 by Mark Whitis <whitisat_private> */
    /* All Rights reserved. */
    /* use it or distribute it (including for profit) at your own risk */
    /* but please don't GPL it.  Retain this notice.  Modified versions */
    /* should be clearly labeled as such. */
    char *safe_strncpy(char *dst, char *src, int size)
    {
       char *savedst;
    
       savedst = dst;
    
       /* Make sure destination buffer exists and has positive size */
       #ifdef SAFE_STRNCPY_DESTINATION_REQUIRED
          assert(dst);
          assert(size>0);
       #else
          if(!dst) {
             errno = EINVAL;
             return(dst);
          }
          if(size<1) {
             errno = EINVAL;
             return(dst);
          }
       #endif
    
       /* Do something sensible if we receive a NULL pointer */
       #ifdef SAFE_STRNCPY_ABORT_NOSRC
          assert(src);
       #else
          /* treat NULL and "", identically */
          if(!src) {
             dst[0]=0;
             #ifdef SAFE_STRNCPY_ERROR_NOSRC
                errno = EINVAL;
             #endif
             return(dst);
          }
       #endif
    
       size--;  /* leave room for trailing zero */
       while(*src && ( size-- > 0 ) ) {
          #ifdef SAFE_STRNCPY_MOVING_ZERO_TERM
             *(dst+1) = 0;
          #endif
          *dst++ = *src++;
       }
       *dst = 0;
    
       #ifdef SAFE_STRNCPY_ZERO_FILL
          while( size-- > 0 ) {
             *dst++ = 0;
          }
       #endif
    
       /* If we didn't copy the whole string, report error */
       if(*src) {
          errno=ENOSPC;
          #ifdef SAFE_STRNCPY_ALL_OR_NOTHING
             /* return zero length string instead of truncated string */
             /* on truncation error */
             savedst = 0;
          #endif
       }
       #ifdef SAFE_STRNCPY_TRUNC_ABORT
          assert(!*src);
       #endif
    
       return(dst);
    }
    
    
    ---------------------------------------------------------------------------
    ---  Mark Whitis <whitisat_private>     WWW:  http://www.dbd.com/~whitis/ ---
    ---  428-B Moseley Drive; Charlottesville, VA 22903                     ---
    ---------------------------------------------------------------------------
    



    This archive was generated by hypermail 2b30 : Fri Apr 13 2001 - 13:44:14 PDT