Jason Royes wrote:
>Databases (w/ good schema) excel when complex
>analysis is required.

Databases also require index inserts, support for transaction rollback, and all kinda crazy stuff that makes them completely unsuitable as logging systems. We (the collective unconscious "we") keep using them, though, because they're available and can be made to suit the purpose by throwing a bunch of hardware at the problem - which is cheaper, really, than understanding the problem or even thinking about it.

There are a lot of techniques that make more sense than using a generic SQL database - storing records in raw syslog files indexed only by offset into the file would save a _HUGE_ amount of space over what a database uses - a time_t and an off_t is all you need. Primary indexes can/should be created on the fly at query time (like with a glimpse database) rather than updated at insert time like they have to be with a commercial database - doing a sorted insert into a b+tree is orders of magnitude faster and more space efficient than a random-ordered insert/query, etc.

There are a lot of simplifying assumptions you can make about logs:
- they are inserted in event-sequence
- they are approximately clustered by time
- you seldom (if ever) will need to seek back 20 minutes and delete a single log record
- the fields you'll want to search on are either bounded fairly tightly (priority, source, time) or are free-form (regexp or string fragment) - so you'll want a very compact primary index for the bounded values and a patricia tree or inverted index for the strings

I'm not saying that current approaches won't work, because they will. But only 'cuz Moore's law overcomes a lot of the need to understand the problem. ;)

If you're thinking of implementing a database solution for searching logs, humor an old curmudgeon by researching text-retrieval systems - keyword-in-context, inverted indexes, and how to do bulk loads of search tables. ;)

mjr.
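To make the "a time_t and an off_t is all you need" idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not anything from the post itself: the names `append_log`, `load_index`, `read_from` and the 16-byte record layout are hypothetical, and it leans on the assumption above that records arrive in time order, so the index is already sorted and can be binary-searched.

```python
import bisect
import io
import struct

# Hypothetical index record: an 8-byte timestamp (standing in for time_t)
# followed by an 8-byte file offset (standing in for off_t).
RECORD = struct.Struct("<qq")


def append_log(log, index, timestamp, line):
    """Append one line to the raw log and one (timestamp, offset) pair to the index."""
    log.seek(0, io.SEEK_END)
    offset = log.tell()
    log.write(line.encode("utf-8") + b"\n")
    index.seek(0, io.SEEK_END)
    index.write(RECORD.pack(timestamp, offset))


def load_index(index):
    """Read the index back as two parallel lists: timestamps and offsets."""
    index.seek(0)
    pairs = list(RECORD.iter_unpack(index.read()))
    times = [t for t, _ in pairs]
    offsets = [o for _, o in pairs]
    return times, offsets


def read_from(log, times, offsets, start_time):
    """Return every log line whose timestamp is >= start_time.

    Assumes records were appended in time order, so `times` is sorted
    and a binary search finds the first matching record.
    """
    pos = bisect.bisect_left(times, start_time)
    if pos == len(offsets):
        return []
    log.seek(offsets[pos])
    return log.read().decode("utf-8").splitlines()


if __name__ == "__main__":
    # In-memory files keep the sketch self-contained; real use would open
    # the log and index files in binary mode instead.
    log, index = io.BytesIO(), io.BytesIO()
    append_log(log, index, 100, "sshd: login ok")
    append_log(log, index, 200, "kernel: link down")
    append_log(log, index, 300, "kernel: link up")
    times, offsets = load_index(index)
    print(read_from(log, times, offsets, 150))
```

Each log record costs 16 bytes of index in this layout, and nothing in the index is ever updated in place. Searching the free-form text would still need something like the inverted index mentioned above; this sketch only covers the time lookup.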
("once a database guy - always a database guy.")
---
Marcus J. Ranum http://www.ranum.com
Computer and Communications Security mjrat_private
_______________________________________________
LogAnalysis mailing list
LogAnalysisat_private
http://lists.shmoo.com/mailman/listinfo/loganalysis
This archive was generated by hypermail 2b30 : Tue Dec 03 2002 - 09:47:50 PST