RE: [logs] Charset selection (Was: Re: EventLog library)

From: Rainer Gerhards (rgerhardsat_private)
Date: Wed Jan 08 2003 - 00:26:12 PST

    > If the base level standard is 7-bit ASCII (not! 8-bit), it is 
    > really easy to extend it to UTF-8 without breaking stuff. 
    > Double-byte charset stuff is IMHO evil and should just plain 
    > be avoided.
    Well, isn't UTF-8 a kind of DBCS encoding? And have you followed the
    limited acceptance Unicode receives in Japan? The problem is statements
    like yours, IMHO. If I were Japanese, I wouldn't like to read that the
    encoding I need to use to make things work is "evil".
    I agree there are issues with DBCS and it is not easy to use. But there
    are JIS, S-JIS and EUC, and we need to live with them. If we don't, our
    standards will probably not be of any interest in those markets that
    need DBCS. And over the years, these markets will outgrow the
    others, at least in number of people involved.
    However, I agree that first steps should be taken first. Let's get an
    initial version running with ANSI. Then let's think about what we can do
    for other encodings. BTW: this is something that BEEP has already solved.
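One concrete form of the DBCS difficulty mentioned above can be sketched in a few lines. The example assumes Python's standard "shift_jis" codec for S-JIS. The helper name ascii_bytes_in_non_ascii is hypothetical and exists only for this illustration. It lists bytes in the ASCII range that occur inside the encoding of a non-ASCII character, which is what confuses a parser that scans raw bytes for ASCII delimiters.

```python
# Sketch: which ASCII-range bytes show up inside the encoded form of a
# non-ASCII character? (Helper name is hypothetical, for illustration only.)

def ascii_bytes_in_non_ascii(text, encoding):
    """Return bytes < 0x80 found inside the encoding of non-ASCII characters."""
    hits = []
    for ch in text:
        if ord(ch) < 0x80:
            continue  # a genuine ASCII character, not of interest here
        hits.extend(b for b in ch.encode(encoding) if b < 0x80)
    return hits

sample = "\u30bd"  # KATAKANA LETTER SO

# UTF-8 encodes it as E3 82 BD: every byte has the high bit set.
utf8_hits = ascii_bytes_in_non_ascii(sample, "utf-8")

# Shift_JIS encodes it as 83 5C: the trail byte equals ASCII backslash.
sjis_hits = ascii_bytes_in_non_ascii(sample, "shift_jis")

print("UTF-8 hits:    ", [hex(b) for b in utf8_hits])
print("Shift_JIS hits:", [hex(b) for b in sjis_hits])
```

In well-formed UTF-8 the first list is always empty, which is why an ASCII base level extends to UTF-8 without disturbing delimiter scanning. With S-JIS, a byte-oriented parser has to track lead bytes before it can trust a backslash.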
    > Just keep in mind that a log receiver that only understands
    > ASCII could potentially parse a message COMPLETELY differently
    > from one that understands UTF-8, since e.g. double quotes can be
    > (mis)represented in alternate UTF-8 encodings. :/  [1]
    Agree - there should be an exchange on which charset is to be used... 
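The parsing divergence described in the quoted text can be sketched as follows. The decoder below is a deliberately lenient, hypothetical one (lenient_decode_2byte is not taken from any real library). It handles only one- and two-byte sequences and does not reject overlong forms. The log message and its field names are invented for the illustration.

```python
# Sketch: an ASCII-only receiver and a lenient UTF-8 receiver disagree on
# how many double quotes a message contains.

def lenient_decode_2byte(data):
    """Hypothetical lenient decoder: accepts overlong 2-byte sequences."""
    out = []
    i = 0
    while i < len(data):
        b0 = data[i]
        if b0 < 0x80:
            out.append(chr(b0))  # plain ASCII byte
            i += 1
        elif (0xC0 <= b0 <= 0xDF and i + 1 < len(data)
              and (data[i + 1] & 0xC0) == 0x80):
            # 110xxxxx 10yyyyyy -> code point xxxxxyyyyyy, no overlong check
            out.append(chr(((b0 & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            out.append("\ufffd")  # anything else: replacement character
            i += 1
    return "".join(out)

# C0 A2 is an overlong (invalid) two-byte encoding of U+0022, the double quote.
message = b'user="bob\xc0\xa2 role="admin"'

ascii_view_quotes = message.count(b'"')
lenient_view_quotes = lenient_decode_2byte(message).count('"')

print("quotes seen by ASCII-only scanner:", ascii_view_quotes)
print("quotes seen after lenient decode: ", lenient_view_quotes)
```

The ASCII-only scanner sees an unterminated first value, while the lenient decoder sees it closed after "bob". The two receivers end up with different field boundaries for the same bytes.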
    > [1] I'm of the view that any UTF-8 generator that uses UTF-8
    > escapes to represent 7-bit ASCII characters is plain b0rken, and
    > an UTF-8 parser should just refuse to listen to it.  It is
    > unfortunate that it is even possible to _do_ this; the spec
    > should have been built so that an encoded \x00 is \x80, but
    > that's too late now.
    Fully agree on that - including that it's too late ;)
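A parser that "refuses to listen" could be sketched along these lines. This is a minimal illustration, not a full UTF-8 validator: the hypothetical function first_overlong_offset only looks for lead bytes that can start nothing but an overlong (non-shortest) form, and it assumes the two-, three- and four-byte patterns of UTF-8.

```python
# Sketch: locate the first overlong UTF-8 sequence in a byte string.
# Not a complete validator; it checks for overlong forms only.

def first_overlong_offset(data):
    """Return the offset of the first overlong lead byte, or -1 if none."""
    for i, b in enumerate(data):
        nxt = data[i + 1] if i + 1 < len(data) else None
        if b in (0xC0, 0xC1):
            return i  # 2-byte form of a code point below U+0080
        if b == 0xE0 and nxt is not None and 0x80 <= nxt < 0xA0:
            return i  # 3-byte form of a code point below U+0800
        if b == 0xF0 and nxt is not None and 0x80 <= nxt < 0x90:
            return i  # 4-byte form of a code point below U+10000
    return -1

def accept(data):
    """Refuse any input that contains an overlong sequence."""
    if first_overlong_offset(data) != -1:
        raise ValueError("overlong UTF-8 sequence, input refused")
    return data.decode("utf-8")

print(first_overlong_offset(b"\xc0\x80"))            # overlong NUL
print(first_overlong_offset("caf\u00e9".encode("utf-8")))  # well-formed
```

Python's built-in strict decoder already rejects these sequences with UnicodeDecodeError, so the explicit check is mainly useful for reporting the offending offset or for languages whose decoders are more permissive.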