Re: Webtrends HTTP Server %20 bug (UTF-8)

From: Peter W (peterwat_private)
Date: Fri Jun 08 2001 - 12:40:56 PDT

Next message: Theo de Raadt: "Re: SSH / X11 auth: needless complexity -> security problems?"

Previous message: Casper Dik: "Re: SSH / X11 auth: needless complexity -> security problems?"
In reply to: Glynn Clements: "RE: Webtrends HTTP Server %20 bug"
Next in thread: zsn: "Re: Webtrends HTTP Server %20 bug (UTF-8)"
Reply: zsn: "Re: Webtrends HTTP Server %20 bug (UTF-8)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri, Jun 08, 2001 at 04:51:57AM +0100, Glynn Clements wrote:
> 
> Eric Hacker wrote:

> > Conveniently, UTF8 uses the same
> > values as ASCII for ASCII representation. Above the standard ASCII 127
> > character representation, UTF8 uses multi-byte strings beginning with 0xC1.
> 
> No; the sequences for codes 128 to 255 begin with 0xC2 and 0xC3

And encodings for 256 through (2^31 - 1) use other values in the first octet.
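
A quick way to see which first octet goes with which range is to ask an existing encoder. This is only a sketch: `lead_octet` is a name made up for illustration, and Python's codec stops at U+10FFFF, so it cannot show the longer five- and six-octet forms.

```python
def lead_octet(code_point: int) -> int:
    """Return the first octet of the UTF-8 encoding of code_point."""
    return chr(code_point).encode("utf-8")[0]


# Print the full encoding for a few boundary values.
for cp in (0x7F, 0x80, 0xFF, 0x100, 0x7FF, 0x800, 0x10000):
    octets = chr(cp).encode("utf-8")
    print("U+%04X -> %s" % (cp, " ".join("%02X" % b for b in octets)))
```

Values 128 to 191 come out with a first octet of 0xC2, 192 to 255 with 0xC3, and 256 upward starts at 0xC4.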

Two points here:

 1) Eric wrote "As a URL cannot contain spaces or other special characters, 
URL encoding is used to transport them. Thus all UTF8 characters above ASCII 
are supposed to be URL encoded in order to be sent."

It's not at all clear to me a) that UTF-8 sequences are allowed in *any*
HTTP headers (request or response) or b) how a server or client would decide
whether a possible UTF-8 sequence like %C3%B3 is UTF-8 for the single
character 0xF3 or the two separate octets 0xC3 + 0xB3. Everything I can find
in the RFCs (2068, 1738, 1808) suggests that only single-octet characters,
0x00 - 0xFF, are expected in the various headers, and that no UTF-8,
double-byte, or other multi-octet representations are allowed.
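
To make the ambiguity in b) concrete, here is a small sketch using Python's standard `urllib.parse`. The variable names are mine, and treating the single-octet reading as ISO-8859-1 is an assumption for the sake of the example.

```python
from urllib.parse import unquote_to_bytes

# Undo the URL encoding only; this yields raw octets, not characters.
raw = unquote_to_bytes("%C3%B3")

# Reading 1: the octets are one UTF-8 sequence.
as_utf8 = raw.decode("utf-8")

# Reading 2: each octet is its own character (assumed ISO-8859-1 here).
as_single_octets = raw.decode("latin-1")

print([hex(ord(c)) for c in as_utf8])           # one character
print([hex(ord(c)) for c in as_single_octets])  # two characters
```

Nothing in the escaped string itself says which reading the sender intended; the receiver has to know by some other means.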

 2) The UTF-8 rules are kinda funny. 0xFE and 0xFF are illegal everywhere,
and other octets are illegal depending on their placement, e.g. a
"starting" octet with 2^7 on and 2^6 off (10xxxxxx), or a "subsequent" octet
that doesn't have 2^7 on and 2^6 off. I wouldn't be surprised if some UTF-8
parsing routines don't handle illegal octets gracefully, or if applications
don't gracefully trap errors reported by the UTF-8 parsing routines. This
might be worth some testing.
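
For anyone who wants to try that testing, here is a minimal sketch of the placement rules above. The function names are hypothetical, the lead-octet table assumes the RFC 2279 layout (up to six octets), and it only checks structure; it does not catch overlong forms such as 0xC0 0x80.

```python
def trailing_count(lead: int) -> int:
    """Continuation octets announced by a lead octet (RFC 2279 layout).

    Callers must not pass a 10xxxxxx octet, 0xFE, or 0xFF.
    """
    if lead < 0x80:
        return 0
    if (lead & 0xE0) == 0xC0:
        return 1
    if (lead & 0xF0) == 0xE0:
        return 2
    if (lead & 0xF8) == 0xF0:
        return 3
    if (lead & 0xFC) == 0xF8:
        return 4
    return 5  # 0xFC or 0xFD


def structure_errors(data: bytes) -> list:
    """Return (offset, message) pairs for structurally illegal octets."""
    errors = []
    pending = 0  # continuation octets still owed to the current sequence
    for i, b in enumerate(data):
        is_continuation = (b & 0xC0) == 0x80
        if pending:
            if is_continuation:
                pending -= 1
                continue
            errors.append((i, "sequence cut short"))
            pending = 0  # fall through and treat b as a new starting octet
        if b in (0xFE, 0xFF):
            errors.append((i, "0xFE/0xFF is never legal"))
        elif is_continuation:
            errors.append((i, "continuation octet in starting position"))
        else:
            pending = trailing_count(b)
    if pending:
        errors.append((len(data), "input ends mid-sequence"))
    return errors


# A few malformed inputs worth feeding to a parser under test.
for sample in (b"\xc3\xb3", b"\xb3", b"\xc3A", b"\xff", b"\xc3"):
    print(sample, structure_errors(sample))
```

The same sample inputs, URL-encoded, could be sent to a server to see whether it rejects them cleanly or does something less graceful.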

-Peter

This archive was generated by hypermail 2b30 : Sun Jun 10 2001 - 14:32:04 PDT