RE: [logs] Charset selection (Was: Re: EventLog library)

rgerhardsat_private

> If the base level standard is 7-bit ASCII (not! 8-bit), it is 
> really easy to extend it to UTF-8 without breaking stuff. 
> Double-byte charset stuff is IMHO evil and should just plain 
> be avoided.

Well, isn't UTF-8 a kind of DBCS encoding? And have you followed the
limited acceptance Unicode receives in Japan? The problem is statements
like yours, IMHO. If I were Japanese, I wouldn't like to read that the
encoding I need to use to make things work is "evil".

I agree there are issues with DBCS and it is not easy to use. But there
are JIS, S-JIS, and EUC, and we need to live with them. If we don't, our
standards will probably not be of any interest in those markets that
need DBCS. And over the years, these markets will outgrow the
others, at least in number of people involved.
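
To make the byte-length question concrete, here is a minimal sketch in
Python. The sample characters are chosen only for illustration, and it
assumes Python's built-in "utf-8", "shift_jis" and "euc_jp" codecs. It
counts how many bytes each encoding spends per character: UTF-8 is
variable-length (one to four bytes), while the Japanese encodings use
one byte for ASCII and two for the kanji shown.

```python
# Illustrative sample characters (not taken from the thread).
samples = {
    "LATIN CAPITAL A": "A",
    "E WITH ACUTE": "\u00e9",
    "KANJI 'SUN/DAY'": "\u65e5",
    "CJK EXTENSION B": "\U00020000",
}

# Bytes per character in UTF-8: grows with the code point value.
utf8_lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}

# Bytes per character in the legacy Japanese encodings, for the two
# samples that both encodings can represent.
legacy_lengths = {
    codec: [len(ch.encode(codec)) for ch in ("A", "\u65e5")]
    for codec in ("shift_jis", "euc_jp")
}

for name, n in utf8_lengths.items():
    print(f"UTF-8      {name}: {n} byte(s)")
for codec, (ascii_len, kanji_len) in legacy_lengths.items():
    print(f"{codec:10} A: {ascii_len} byte(s), kanji: {kanji_len} byte(s)")
```

In every one of these encodings plain ASCII stays one byte per
character, which is why starting from a 7-bit base does not shut out
any of them later.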

However, I agree that first steps should be taken first. Let's get an
initial version running with ANSI. Then let's think about what we can do
for other encodings. BTW: this is something that BEEP has already solved
;)

> 
> Just keep in mind that a log receiver that only understands 
> ASCII could potentially parse a message COMPLETELY 
> differently from one that 
> understands UTF-8, since e.g. double quotes can be 
> (mis)represented in alternate UTF-8 encodings. :/  [1]

Agree - there should be an exchange on which charset is to be used... 

> [1] I'm of the view that any UTF-8 generator that uses UTF-8 
> escapes to 
> represent 7-bit ASCII characters is plain b0rken, and an 
> UTF-8 parser should just refuse to listen to it.  It is 
> unfortunate that it is even possible to _do_ this; the spec 
> should have been built so that an 
> encoded \x00 is \x80, but that's too late now. 

Fully agree on that - including that it's too late ;)
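
For anyone who has not seen the problem in practice, here is a small
sketch in Python. The helper naive_decode_2byte is hypothetical and
deliberately sloppy; it stands in for a decoder that does not check for
overlong forms. The two bytes C0 A2 are an overlong encoding of the
double quote (0x22): an ASCII-only receiver sees no quote byte at all,
the sloppy decoder produces a quote, and a strict decoder (here
Python's built-in one) refuses the input.

```python
def naive_decode_2byte(data: bytes) -> str:
    """Hypothetical lax decoder: accepts any 110xxxxx 10xxxxxx pair
    and does NOT reject overlong (non-shortest) forms."""
    lead, cont = data
    if lead & 0xE0 != 0xC0 or cont & 0xC0 != 0x80:
        raise ValueError("not a 2-byte UTF-8 sequence")
    # Combine the 5 payload bits of the lead byte with the 6 of the trailer.
    return chr(((lead & 0x1F) << 6) | (cont & 0x3F))


# Overlong two-byte form of '"' (U+0022); the valid form is the single byte 0x22.
OVERLONG_QUOTE = b"\xc0\xa2"

# 1. An ASCII-only receiver scanning for the quote byte finds nothing.
ascii_sees_quote = b'"' in OVERLONG_QUOTE

# 2. The lax decoder turns the same bytes into a double quote.
lax_result = naive_decode_2byte(OVERLONG_QUOTE)

# 3. A strict decoder rejects the sequence outright.
try:
    OVERLONG_QUOTE.decode("utf-8")
    strict_rejected = False
except UnicodeDecodeError:
    strict_rejected = True

print(ascii_sees_quote, repr(lax_result), strict_rejected)
```

So the same two bytes are "no quote", "a quote", or "garbage" depending
on who reads them, and refusing the input is the only behaviour that
keeps the receivers from disagreeing about where a field ends.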

Rainer
_______________________________________________
LogAnalysis mailing list
LogAnalysisat_private
http://lists.shmoo.com/mailman/listinfo/loganalysis