More details about gzip...

From: Michał Zalewski (lcamtufat_private)
Date: Sat Dec 27 1997 - 08:38:27 PST

    This is a multi-part message in MIME format.
    
    ------=_NextPart_000_0038_01BD12EE.3D282740
    Content-Type: text/plain;
            charset="iso-8859-2"
    Content-Transfer-Encoding: quoted-printable
    
    Here is an even more detailed report about gzip and its vulnerabilities.
    First, a short description of a typical .gz header (more details in
    RFC 1952). Here's a hex dump of the sample archive, Altered.gz, which
    was attached to my previous letter about gzip:
    
    offset | 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13
    -------|+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    value  | 1F 8B 08 00 00 00 00 00 00 03 95 00 00 00 00 00 00 00 00 00
    
    Offs | Size | Description
    -----+------+-------------
    00   | 2    | Magic number of a gzip archive (1F 8B)
    02   | 1    | Compression method, usually 08 (deflate)
    03   | 1    | Flags; bit 08 is set if original filename is stored
    04   | 4    | Modification time
    08   | 1    | Additional compression flags (depends on compr. method)
    09   | 1    | Operating system (00 - DOS, 03 - Unix)
    0A   | ?    | If original filename stored - ASCIZ string
    ??   | ?    | Compressed data blocks
    ??   | 4    | CRC-32 checksum
    ??   | 4    | Size of uncompressed file
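
    The layout above can be sketched in Python as follows. This is only
    an illustration, not gzip's own code: the function name and the
    returned dictionary are made up here, the sample bytes are the ones
    from the hex dump, and only the optional filename field is handled
    (RFC 1952 defines further optional fields, which this sketch rejects).

```python
import struct

# The 20 bytes shown in the hex dump of Altered.gz.
SAMPLE = bytes.fromhex("1f8b0800000000000003" "95000000000000000000")

FNAME = 0x08  # flag bit: original filename is stored (RFC 1952)


def parse_gzip_header(buf):
    """Parse the fixed 10-byte header and locate the compressed data.

    Hypothetical helper for illustration; it assumes that at most the
    filename field follows the fixed header.
    """
    if len(buf) < 10:
        raise ValueError("too short for a gzip header")
    # Little-endian: magic(2) method(1) flags(1) mtime(4) xfl(1) os(1)
    magic, method, flags, mtime, xfl, os_id = struct.unpack_from("<2sBBIBB", buf, 0)
    if magic != b"\x1f\x8b":
        raise ValueError("not a gzip archive")
    if flags & ~FNAME:
        raise ValueError("optional fields other than the filename are not handled")
    offset = 10
    name = None
    if flags & FNAME:
        end = buf.index(b"\x00", offset)  # ASCIZ: name ends at the first NUL
        name = buf[offset:end].decode("latin-1")
        offset = end + 1
    return {
        "method": method,
        "flags": flags,
        "mtime": mtime,
        "xfl": xfl,
        "os": os_id,
        "name": name,
        "data_offset": offset,
    }


header = parse_gzip_header(SAMPLE)
print(header)
```

    For the sample, data_offset comes out as 0x0A, which is where the
    first compressed block begins when no filename is stored.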
    
    
    The header of each compressed data block consists of one bit, which
    is set when the block is the last one, followed by two bits that
    describe the compression method:
    
    00 - uncompressed data
    01 - Huffman algorithm with fixed codes
    10 - Huffman with dynamic codes
    11 - (reserved)
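
    A minimal sketch of reading those three bits from the first byte of
    a block. Per RFC 1951 the bits are taken starting from the least
    significant one, so the values in the list above are the numeric
    value of the two-bit field. The function name is made up here; 0x95
    is the byte at offset 0A in the hex dump, and 0x03 is the first byte
    of an ordinary empty deflate stream.

```python
BTYPE_NAMES = {
    0: "uncompressed data",
    1: "Huffman algorithm with fixed codes",
    2: "Huffman with dynamic codes",
    3: "(reserved)",
}


def block_header(first_byte):
    """Split the first byte of a deflate block into (is_last, method)."""
    is_last = bool(first_byte & 1)   # bit 0: set on the last block
    method = (first_byte >> 1) & 3   # bits 1-2: compression method
    return is_last, method


for value in (0x03, 0x95):
    is_last, method = block_header(value)
    print(f"0x{value:02X}: last={is_last}, method={BTYPE_NAMES[method]}")
```

    So 0x03 starts a last block with fixed codes, while 0x95 starts a
    last block with dynamic codes, i.e. one that claims to carry a
    description of the compression tree.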
    
    After the block header comes important information that gzip uses
    to rebuild the compression tree, and that's the most sensitive part
    of the archive (more about the compression tree and deflate's Huffman
    coding can be found in RFC 1951). Now the most interesting part: any
    change to this information may cause undesirable effects. Usually
    gzip quits with a 'format violated' message, but sometimes an attempt
    to decompress ends with a segmentation fault, a crash, etc.
    E.g. when the compressed file is empty, changing the byte at offset
    0x0B + length of the original filename (or at offset 0x0A if there's
    no original filename inside your archive) to 0x95, 0xA5, 0xB5, or
    similar will cause a segmentation fault under Linux and MS-DOS (or
    other funny things when you're using Win95 :). The behaviour of gzip
    strongly depends on the archive contents, so I believe there's a way
    to exploit it. As proof I have attached to this letter two more,
    totally different examples: dos-gpf.gz, which should cause a
    General Protection Fault under DOS/Win95, and linux.gz, which is
    handled properly ("format violated" :) only under MS-DOS.
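
    The byte change described above can be reproduced with a short
    Python sketch. It builds an empty archive with Python's gzip module,
    overwrites the first compressed byte, and feeds the result to the
    decompressor in Python's standard library (not to the gzip binary
    discussed here), which should simply refuse such input with an
    exception. The helper names are made up, and the offset calculation
    assumes that at most the filename field is present.

```python
import gzip
import zlib


def data_offset(archive):
    """Offset of the first compressed byte (0x0A, or past the filename)."""
    if archive[3] & 0x08:                        # original filename stored
        return archive.index(b"\x00", 10) + 1    # skip the ASCIZ string
    return 10


def alter(archive, value):
    """Return a copy of the archive with the first compressed byte replaced."""
    out = bytearray(archive)
    out[data_offset(archive)] = value
    return bytes(out)


def try_decompress(archive):
    """Report whether the library decompressor accepts the archive."""
    try:
        gzip.decompress(archive)
    except (zlib.error, EOFError, OSError) as exc:
        return "rejected: " + type(exc).__name__
    return "ok"


original = gzip.compress(b"")                    # archive of an empty file
print("original:", try_decompress(original))
for value in (0x95, 0xA5, 0xB5):
    print(f"0x{value:02X}:", try_decompress(alter(original, value)))
```

    Anything tested this way should be run against a decompressor you
    are prepared to see crash; the sketch only covers the bookkeeping of
    finding and changing the byte.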
    
    But this vulnerability still hasn't been properly exploited. Any
    ideas? It's really hard work, but just imagine... the original gzip
    routines are, ehem, copied exactly into many programs, including
    viewers, compression utilities (WinZip) and other software... And
    .gz files are very often decompressed transparently...
    
    _______________________________________________________________________
    Michal Zalewski [tel 9690] | finger 2 PGP [lcamtufat_private]
    =--------- [ echo "while [ -f \$0 ]; do \$0 &;done" >_;. _ ] ---------=
    
    
    
    
    ------=_NextPart_000_0038_01BD12EE.3D282740
    Content-Type: model/vrml,x-world/x-vrml;
            name="Dos-gpf.gz"
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment;
            filename="Dos-gpf.gz"
    
    H4sIAAAAAAAAAxrhAgBCVZ2gAgAAAA==
    
    ------=_NextPart_000_0038_01BD12EE.3D282740
    Content-Type: model/vrml,x-world/x-vrml;
            name="Linux.gz"
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment;
            filename="Linux.gz"
    
    H4sIAAAAAAAAAxwJDXDkAgDuT4nyBQAAAA==
    
    ------=_NextPart_000_0038_01BD12EE.3D282740--
    


