[logs] Re: on log standards

edward.j.sargisson@private

Actually, I think one of the useful first steps we could attempt is to 
pressure vendors to document all the possible log messages, what they mean 
and when they might occur.

This is something I've been doing internally for the logs my application 
writes. Every time I change what the application logs, I record it in a 
Word document so that other people understand what might come out. This is 
especially important as I'm logging using IBM Common Base Event, which 
picks up a lot of data, and I want to know what kind of data to expect.
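
As a rough illustration of that kind of message documentation, here is a 
minimal Python sketch that keeps the catalog next to the code instead of in 
a separate document. Everything in it is an assumption made up for the 
example: the message IDs, the templates, and the helper names. It does not 
use or imitate the Common Base Event API.

```python
import logging
from dataclasses import dataclass


@dataclass(frozen=True)
class LogMessage:
    """One documented log message (IDs and wording are hypothetical)."""
    msg_id: str
    template: str
    meaning: str
    when: str


# The catalog is the single source for both logging and documentation.
CATALOG = {
    m.msg_id: m
    for m in [
        LogMessage("APP001", "batch job %s completed",
                   "A scheduled batch job finished normally.",
                   "At the end of every successful batch run."),
        LogMessage("APP002", "process %s exited with status %d",
                   "A worker process ended abnormally.",
                   "When a child process returns a non-zero status."),
    ]
}


def log_documented(logger, level, msg_id, *args):
    """Log a message only if it has a catalog entry (KeyError otherwise)."""
    entry = CATALOG[msg_id]
    logger.log(level, "%s " + entry.template, msg_id, *args)


def render_docs():
    """Produce the human-readable reference from the same catalog."""
    return "\n".join(
        f"{m.msg_id}: {m.template}\n  Meaning: {m.meaning}\n  When: {m.when}"
        for m in CATALOG.values()
    )
```

Because an undocumented ID raises an error, the reference produced by 
render_docs() cannot silently fall behind what the code can emit.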

(My opinions are my own; not my employer's.)

Cheers,
Edward

Edward Sargisson BSc, BCom
Consultant
IBM Business Consulting Services
Wellington, New Zealand
DDI: + 64-4-462-3586, Mob: + 64-21-576-658
P O Box 38 993, Wellington, NEW ZEALAND
edward.j.sargisson@private

Christina Noren <cfrln@private>
Sent by: loganalysis-bounces+edward.j.sargisson=nz1.ibm.com@private
31/08/2006 05:41 AM

To: LogAnalysis <LogAnalysis@private>
cc:
Subject: [logs] Re: on log standards

Morty,

>   Host software is often created in an unstructured way, with ad-hoc
>   logging infrastructure that is added during debugging stages.  Log
>   content standards sound great, but in practice, they would probably
>   not work well with the way software is actually written.
>

I think you hit the nail on the head with this observation.

Logging standardization is wishful thinking on the part of log 
analysis vendors. Log standardization has really only worked out for 
web server access logs and network device SNMP and will never work 
out in the software world.

Most software companies and software developers put in error messages 
on an ad hoc basis. Only a code audit could possibly inventory all 
potential messages. Since most code is proprietary, those writing 
parsers for the *content* rather than *header* portion of log 
messages from a specific application are most often taking educated 
guesses about the messages that a product *will* put out based on the 
evidence of log samples collected from running systems. I've been 
part of concerted efforts to document error logs, both at commercial 
software companies and with in-house development teams, and they don't 
work.

Not to mention the amount of dependence on legacy software with 
legacy logging.

The challenge is to create useful tools for searching, reporting and 
alerting on logs that don't depend on either standardization or 
real-time interpretation/normalization by the log tool.

Some of the techniques for this are:

- full-text indexing of raw content
- automated machine categorization of event patterns: the machine 
  can't know that this content means that a process exited abnormally, 
  but it can tell that it's different content than another event saying 
  that a batch job completed
- exposing hooks for users to add semantic knowledge as they use the 
  log data and have that knowledge added to the raw data incrementally; 
  interpretation must be fluid and adaptable in this way
- noticing meta-patterns in a log data set (correlations, 
  abnormalities, etc.) based on the automated categorizations
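
To make the first three techniques concrete, here is a minimal Python 
sketch, not a description of any real product. The class name LogStore, the 
masking rules (hex values and digit runs are treated as variable), and the 
sample log lines are all assumptions chosen for illustration. Raw lines are 
kept untouched, indexed by word, grouped by a derived signature, and users 
can attach a meaning to a signature later.

```python
import re
from collections import defaultdict

# Assumed masking rules: hex values first, then any run of digits.
_MASKS = [
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\d+"), "<NUM>"),
]


def signature(line):
    """Reduce a raw line to a pattern by masking variable-looking tokens."""
    for pattern, placeholder in _MASKS:
        line = pattern.sub(placeholder, line)
    return line


class LogStore:
    def __init__(self):
        self.lines = []                      # raw content, never rewritten
        self.index = defaultdict(set)        # word -> line numbers
        self.categories = defaultdict(list)  # signature -> line numbers
        self.labels = {}                     # signature -> user-supplied meaning

    def add(self, line):
        n = len(self.lines)
        self.lines.append(line)
        for word in re.findall(r"\w+", line.lower()):
            self.index[word].add(n)
        self.categories[signature(line)].append(n)

    def search(self, *words):
        """Return the raw lines that contain every given word."""
        sets = [self.index.get(w.lower(), set()) for w in words]
        hits = set.intersection(*sets) if sets else set()
        return [self.lines[n] for n in sorted(hits)]

    def label(self, example_line, meaning):
        """Hook for a user to attach meaning to a whole category."""
        self.labels[signature(example_line)] = meaning

    def meaning_of(self, line):
        return self.labels.get(signature(line), "unlabeled")
```

The store never needs to know what "exited with status" means. It only 
notices that such lines share a signature that differs from "batch job 
<NUM> completed", and a label added once by a user then applies to every 
line in that category, past and future. Counting lines per signature would 
be one starting point for the meta-pattern idea.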

On Aug 28, 2006, at 5:39 PM, Mordechai T. Abzug wrote:

> On Fri, Aug 25, 2006 at 10:57:30PM -0500, Anton Chuvakin wrote:
>
>> So I was thinking a lot about log standards and taxonomies and the
>> release of CEF inspired me to finally finish my brief article on log
>> standards - check it out:
>
>> http://chuvakin.blogspot.com/2006/08/on-common-event-format-cef.html
>
> Comments:
>
> * There is one standard for log content that is in widespread use: the
>   MIBs used for SNMP traps.  SNMP traps are not used too widely in the
>   host world, but they're fairly widespread in the network world.  The
>   SNMP world had standardization from day one, so it's useful to look
>   at how standardization has helped SNMP.
>
>   Note: SNMP is more than just traps/events, but we can ignore the
>   other aspects for this discussion.
>
> * Even in the SNMP trap world, the standardizations often prove
>   insufficient.  That is, there are multiple standardized components:
>   there is a standard way to describe SNMP PDUs in machine-readable
>   format, i.e. ASN.1 and SMI to write MIBs, and then there are
>   actually standard MIBs, so that all platforms can express certain
>   common events in a vendor-independent way.  The former is mostly
>   what you would call a form standard, while the latter is mostly what
>   you would call a content standard.  But the content component is, in
>   practice, of limited utility, because most vendors end up wanting
>   various events that are not standard events.
>
>   In more concrete terms:
>
>   * There is a standard way to express "interface down".  Any device,
>     regardless of vendor, can send an SNMP trap saying "interface
>     down", and the NMS (network management station) can understand it
>     without knowing anything about that particular vendor.
>
>   * But if a device wants to say "SONET problem with certain
>     vendor-specific flags", it is likely that no standard trap exists.
>     The vendor is going to utilize a custom trap.  SNMP allows for
>     this.  This custom trap can be defined using a MIB written in
>     standard format, so the NMS can read in the MIB and then
>     immediately parse it, but actually understanding what to do with
>     it would still require trap-specific handling by the application
>     or by the administrator.
>
>   SNMP provides vendor-specific traps for a very good reason.  While
>   the standard MIBs provide a lot of useful traps, they cannot begin
>   to cover all the possible cases of existing technologies.  New
>   technologies that vendors use to differentiate themselves from each
>   other necessarily exist and need management before they are
>   standardized.
>
>   The lesson of SNMP traps is that having standard content can be
>   useful, but in practice, sooner or later, you will have to allow for
>   a standard format, and let vendors extend the content.
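
The split between standard and vendor-specific traps described in the 
quoted text can be sketched roughly as follows. This is a hedged Python 
illustration only: the linkDown and linkUp OIDs are the standard ones from 
SNMPv2-MIB, but the function name, the loaded_mibs dictionary, and the 
vendor OID and trap name used in the usage note are made up for the 
example. No real SNMP library is used.

```python
# Standard trap OIDs from SNMPv2-MIB (under snmpTraps, 1.3.6.1.6.3.1.1.5).
STANDARD_TRAPS = {
    "1.3.6.1.6.3.1.1.5.3": "linkDown",
    "1.3.6.1.6.3.1.1.5.4": "linkUp",
}

# Vendor-specific definitions live under the enterprises subtree.
ENTERPRISES_PREFIX = "1.3.6.1.4.1."


def classify_trap(trap_oid, loaded_mibs):
    """Return (kind, name) for a trap OID.

    loaded_mibs is a hypothetical dict mapping vendor trap OIDs to the
    names an NMS would obtain by reading the vendor's MIB files.
    """
    if trap_oid in STANDARD_TRAPS:
        # Understood without any vendor knowledge.
        return ("standard", STANDARD_TRAPS[trap_oid])
    if trap_oid.startswith(ENTERPRISES_PREFIX):
        name = loaded_mibs.get(trap_oid)
        if name is not None:
            # Parseable thanks to the standard format, but deciding what
            # to do with it still needs trap-specific handling.
            return ("vendor-decoded", name)
        return ("vendor-unknown", None)
    return ("unknown", None)
```

For example, classify_trap("1.3.6.1.6.3.1.1.5.3", {}) yields 
("standard", "linkDown"), while a made-up vendor OID such as 
"1.3.6.1.4.1.99999.0.1" is only named if its MIB entry has been loaded.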
>
> * In the host software world, life is worse than in the network world.
>   Host software is often created in an unstructured way, with ad-hoc
>   logging infrastructure that is added during debugging stages.  Log
>   content standards sound great, but in practice, they would probably
>   not work well with the way software is actually written.
>
> * The release of "CEF" seems like a non-event.  Looks like a
>   unilateral "standard" issued by one minor vendor without a lot of
>   buy-in from third-party vendors.  They haven't even really released
>   it; they're asking people to send them email to get a copy.  No
>   thanks.
>
> Morty
> _______________________________________________
> LogAnalysis mailing list
> LogAnalysis@private
> http://lists.shmoo.com/mailman/listinfo/loganalysis
