Re: CGI.pm and the untrusted-URL problem

From: Kragen Sitaker (kragenat_private)
Date: Mon Feb 14 2000 - 12:48:33 PST

    supportat_private removed from addressee list because they probably
    don't want to hear the whole conversation; I just want them to fix our
    local CGI.pm so my web pages are safe.  :)
    
    Marc Slemko writes:
    > On Mon, 14 Feb 2000, Kragen Sitaker wrote:
    > > Diagnosis
    > > ---------
    > >
    > > It appears that this happens because the unencoded space is interpreted
    > > by the HTTP server (Apache 1.3.6 in my tests) as separating the URL
    > > from the protocol name.  So the environment variable SERVER_PROTOCOL
    > > gets set to everything following the space, followed by a space and the
    > > actual protocol, such as "HTTP/1.0".
    >
    > Correct, this does appear to be a bug.  I suspect that a lot of such bugs
    > will be found.  Unfortunately.
    >
    > However it is important to note that this does not exploit a bug in
    > Apache.  Apache is choosing to deal with an illegal request in a perfectly
    > legitimate manner.  At least, that is my understanding of what the spec
    > says; I haven't checked it closely WRT this particular issue.
    
    I think you're right.
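
    The diagnosis quoted above can be illustrated with a toy model in
    Python.  This is not Apache's parser; the function name and the sample
    request are made up for illustration.  It assumes only the behavior
    described: the first space ends the method, the second ends the URL,
    and whatever is left becomes SERVER_PROTOCOL.

```python
def lenient_split(request_line):
    """Toy model: split on the first two spaces only, keep the rest intact."""
    method, url, protocol = request_line.split(" ", 2)
    return {
        "REQUEST_METHOD": method,
        "REQUEST_URI": url,
        "SERVER_PROTOCOL": protocol,
    }

# Hypothetical request whose URL contains an unencoded space.
env = lenient_split("GET /cgi-bin/test.cgi?x=1 <script>alert(1)</script> HTTP/1.0")
print(env["SERVER_PROTOCOL"])
```

    In this model the text after the stray space ends up in
    SERVER_PROTOCOL, ahead of the real "HTTP/1.0".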
    
    > Part of Apache's functionality is to pass unknown methods and protocols on
    > to CGIs.  It is arguable that Apache should explicitly reject any
    > request with more than two unencoded spaces in it.
    
    Well, unknown methods I certainly agree with; but if the protocol is
    completely unknown --- not even a version of HTTP --- how can Apache
    reasonably think it knows what part of the request constitutes the URL,
    or when it has reached the end of the request?
    
    Apache, in this case, constitutes the interface between mutually
    untrusted contexts: a Web browser and a CGI script.  (And, as CERT
    points out, there's a third context involved, trusted by neither of the
    other two --- the URL provider.)  As I see it, part of its purpose in
    life is to restrict the information passed between these contexts to a
    known and unsurprising set of channels.
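
    A stricter front end along these lines might look like the following
    Python sketch.  It is an illustration under stated assumptions, not
    anything Apache does: the pattern and the function name are
    hypothetical, unknown methods are still let through, and a "known"
    protocol is assumed to mean HTTP/<major>.<minor>.

```python
import re

# Assumption: exactly three tokens separated by single spaces, and the
# protocol must look like HTTP/<major>.<minor>.
REQUEST_LINE = re.compile(r"([^ ]+) ([^ ]+) (HTTP/\d+\.\d+)")

def strict_parse(request_line):
    """Return (method, url, protocol), or None if the line is malformed."""
    match = REQUEST_LINE.fullmatch(request_line)
    return match.groups() if match else None

print(strict_parse("GET /index.html HTTP/1.0"))
print(strict_parse("GET /cgi-bin/test.cgi?x=1 <script> HTTP/1.0"))
```

    The second request has more than two unencoded spaces, so nothing
    from it would reach the CGI script.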
    
    > > RFC 1738 and RFC 2068 say that only a-z, 0-9, "+", ".",
    > > and "-" are allowed in scheme names.  Accordingly, I suggest the
    > > following change to CGI.pm:
    >
    > Or it could simply properly encode things, as it should do for all data
    > supplied by the user that is output.
    >
    > Filtering is often easier, however, as encoding can be very context
    > sensitive.
    
    I'm not sure what the proper encoding for scheme names would be. :)
    
    self_url does appear to properly encode malicious data inserted in
    other parts of the input URL.
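
    The filtering approach can be sketched in Python as follows.  This is
    not the CGI.pm change proposed in the original message (which is not
    quoted here); it assumes the scheme is taken from the text before the
    slash in SERVER_PROTOCOL, and the function name and the "http"
    fallback are hypothetical.

```python
import re

# Characters the quoted RFC text allows in a scheme name.
SCHEME = re.compile(r"[a-z0-9+.-]+")

def safe_scheme(server_protocol, default="http"):
    """Derive a URL scheme from SERVER_PROTOCOL, falling back if it looks hostile."""
    candidate = server_protocol.split("/", 1)[0].lower()
    return candidate if SCHEME.fullmatch(candidate) else default

print(safe_scheme("HTTP/1.0"))
print(safe_scheme("<script>alert(1)</script> HTTP/1.0"))
```

    Filtering sidesteps the question of how a scheme name ought to be
    encoded: anything outside the allowed set is simply never echoed back.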
    
    > > The successful exploit requires a remarkable chain of extreme forgiveness:
    > > 1- The web browser must accept an illegal URL from (possibly valid,
    > >    although very unusual) HTML.
    > > 2- The web browser must send an illegal HTTP request with the illegal
    > >    URL, without %-encoding the URL to make it legal.
    >
    > Note that IE appears to be far better in making sure it only makes legal
    > requests.  Good job Microsoft, in this particular situation.
    
    What version of IE is better in this way?  MSIE 3.0 is just as lenient
    as Netscape 4.6 in this situation.  I don't have any machines with MSIE
    4 installed, because MSIE 4 makes me uncomfortable.
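
    For reference, the %-encoding that step 2 of the quoted chain says a
    browser should apply can be sketched in Python with the standard
    library.  The sample URL is made up, and the choice of which
    delimiters to leave unencoded is an assumption for this example.

```python
from urllib.parse import quote

# Hypothetical link target containing a space and markup.
raw = "/cgi-bin/test.cgi?x=1 <script>alert(1)</script>"

# Keep common URL delimiters as-is; spaces and angle brackets get %-encoded.
encoded = quote(raw, safe="/?=&")
print(encoded)
```

    With the space encoded as %20, the request line would have only two
    unencoded spaces and the server would see a single URL token.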
    
    --
    <kragenat_private>       Kragen Sitaker     <http://www.pobox.com/~kragen/>
    The Internet stock bubble didn't burst on 1999-11-08.  Hurrah!
    <URL:http://www.pobox.com/~kragen/bubble.html>
    The power didn't go out on 2000-01-01 either.  :)
    


