Rainer Gerhards wrote:
>
> > If the base level standard is 7-bit ASCII (not! 8-bit), it is
> > really easy to extend it to UTF-8 without breaking stuff.
> > Double-byte charset stuff is IMHO evil and should just plain
> > be avoided.
>
> Well, isn't UTF-8 a kind of DBCS encoding? And have you followed the
> limited acceptance Unicode receives in Japan? The problem is statements
> like yours, IMHO. If I were Japanese, I wouldn't like to read that the
> encoding I need to use to make things work is "evil".

Japanese and Chinese writing systems are evil, too :)
</flamebait>

Hrm, I might have gone a bit overboard there. DBCS using lead bytes
might still be easy to use (it doesn't insert NULs, does it?).

I was thinking more along the lines of Win32 Unicode, which I do
believe is nothing but evil, partly from a storage/protocol point
of view, but mostly from a programming point of view. I've been
forced to deal with Unicode in the past, only to get tripped up
by such trivial questions as "how the HELL do you store a Unicode
string in an SQL database? -- Whoops, can't be done, unless you
store it as a blob, and then you can't search on it".

UTF-8 doesn't really have such problems. It can be copied, stored,
etc. with normal string management routines, as long as you keep
the string intact and don't truncate it.

Is this also the case with DBCS encoding?

-- 
Mikael Olsson, Clavister AB
Storgatan 12, Box 393, SE-891 28 ÖRNSKÖLDSVIK, Sweden
Phone: +46 (0)660 29 92 00   Mobile: +46 (0)70 26 222 05
Fax: +46 (0)660 122 50       WWW: http://www.clavister.com

"Senex semper diu dormit"
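
The NUL-byte question raised above can be checked with a small sketch. Everything in it is an illustrative assumption rather than something taken from the thread: `c_string_view` is a hypothetical helper that mimics what a NUL-terminated routine such as `strlen` or `strcpy` would see, UTF-16LE stands in for Win32 wide strings, and Shift_JIS is picked as one example of a lead-byte DBCS.

```python
def c_string_view(data: bytes) -> bytes:
    """Hypothetical helper: the bytes a NUL-terminated C routine would see."""
    end = data.find(b"\x00")
    return data if end == -1 else data[:end]


text = "ÖRNSKÖLDSVIK"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")  # assumption: stands in for Win32 wide strings

# UTF-8 never emits a zero byte for a non-NUL character, so the whole
# string survives a NUL-terminated copy.
utf8_survives = c_string_view(utf8) == utf8

# UTF-16 pads every character below U+0100 with a zero byte, so the copy
# stops inside the first character.
utf16_survives = c_string_view(utf16) == utf16

# Shift_JIS (one example of a lead-byte DBCS): no NUL bytes, but the trail
# byte can fall in the ASCII range. Katakana "ソ" ends in 0x5C, a backslash.
sjis = "ソ".encode("shift_jis")
sjis_has_nul = b"\x00" in sjis
sjis_trail_is_ascii = sjis[1] < 0x80

# Cutting a UTF-8 string in the middle of a multi-byte character leaves
# bytes that no longer decode, hence "don't truncate it".
truncated_ok = True
try:
    utf8[:1].decode("utf-8")
except UnicodeDecodeError:
    truncated_ok = False

print(utf8_survives, utf16_survives, sjis_has_nul, sjis_trail_is_ascii, truncated_ok)
```

Under these assumptions, a lead-byte DBCS like Shift_JIS does pass through NUL-terminated routines intact. The catch is that code scanning for ASCII delimiters byte by byte can still match a trail byte, which cannot happen in UTF-8 because all bytes of a multi-byte sequence are 0x80 or above.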