H D Moore said:
> A url-encoded character is NOT a unicode code character..
>
> On Sunday 03 June 2001 05:41 am, Auriemma Luigi wrote:
> > The bug is really simple. If the attacker insert an unicode space (%20)

Not exactly. A better way of saying it is that URL encoding is not the same as UTF8 encoding of unicode code points.

Unicode is a superset of ASCII, and thus all ASCII characters are Unicode characters. UTF8 is a way of encoding unicode code points for transport over the internet in a restricted character set. Conveniently, UTF8 uses the same byte values as ASCII for the ASCII characters. Above the standard ASCII range (code points above 127), UTF8 uses multi-byte sequences whose lead byte is 0xC2 or higher.

As a URL cannot contain spaces or other special characters, URL encoding is used to transport them. Thus all UTF8 characters above ASCII are supposed to be URL encoded in order to be sent. Therefore the original unicode code point is both UTF8 encoded and URL encoded.

Hopefully this has clarified some of the confusion around the terminology. This is, of course, a summary. For the real deal, check out http://www.unicode.org.

As an aside, yes I know that Microsoft's IIS will accept non-URL encoded UTF8 characters as well as UTF8 sequences beginning with 0xC0 (overlong forms, now deprecated). At least that was the case the last time I checked.

Eric Hacker, CISSP, GCIA, MCSE, CCSE
Network Security Consultant
Lucent Technologies Worldwide Services
Phone: 781-848-5500 x485
Email: ehackerat_private
PGP key: http://keyserver.pgp.com/pks/lookup?op=get&search=ehackerat_private
PGP Fingerprint: FADB 793E E98A 97BB 04D6 5973 7864 93A1 222B E0C7

"Long gone are the days when one's surname referred to the role one had in the community."
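
To make the two layers in the message above concrete (UTF8 encoding first, URL encoding second), here is a minimal Python sketch. It is only an illustration: the helper names `utf8_then_url_encode` and `strict_decode` are made up for this example, and the strict decoder stands in for a server that rejects overlong sequences such as 0xC0 0xAF. It does not model how IIS or any other particular server behaves.

```python
from urllib.parse import quote, unquote_to_bytes


def utf8_then_url_encode(text: str) -> str:
    """Hypothetical helper: UTF-8 encode, then percent-encode every byte
    that is not an unreserved URL character."""
    utf8_bytes = text.encode("utf-8")
    return quote(utf8_bytes, safe="")


def strict_decode(percent_encoded: str) -> str:
    """Hypothetical helper: undo URL encoding, then decode the bytes as
    strict UTF-8. Python's decoder raises UnicodeDecodeError for overlong
    forms such as a sequence starting with 0xC0."""
    raw = unquote_to_bytes(percent_encoded)
    return raw.decode("utf-8")


# A space is plain ASCII: UTF-8 leaves it as one byte, URL encoding gives %20.
space = utf8_then_url_encode(" ")

# U+00E9 is above ASCII: UTF-8 yields two bytes, each of which is URL encoded.
e_acute = utf8_then_url_encode("\u00e9")

# %C0%AF is an overlong two-byte encoding of "/"; a strict decoder rejects it.
try:
    strict_decode("%C0%AF")
    overlong_accepted = True
except UnicodeDecodeError:
    overlong_accepted = False

print(space, e_acute, overlong_accepted)
```

The point of the sketch is that `%20` involves no multi-byte UTF8 at all, while a character above ASCII passes through both encodings before it appears in a URL.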
This archive was generated by hypermail 2b30 : Thu Jun 07 2001 - 10:41:51 PDT