This could be a fun project for someone who still codes (not me). Also, I suspect this already exists out there, but since it has not been mentioned on the list, here goes.

How about a search and replace tool that helps the user replace key identifying data? It should prompt for things that might be replaced, so you don't have to build a list ahead of time. A "keep unchanged" list would prevent prompting for some things (Tx, Rx, Port, Error, ... or even certain columns or hex codes in a column ...).

For example, it would:

- Recognize IP addresses and prompt the user to "Change all IP addresses in the first column that are from the 192.168.131.0 subnet to 10.10.x.x equivalents and scan other columns for replaced addresses and replace them (to retain matches)".
- Recognize time stamps and prompt to "change date time stamps +/- hhhhhh:mm:ss" (keeps time stamps relative).
- Prompt to replace any word or hex combination not in the [user editable] "keep unchanged" list with a word of my choosing, and pad or cut it to match length:

    Change "MyCompany" to: [Hit Enter to leave unchanged]
    I type: llama
    output: llama0001

Store a list of mapped values / rules to reduce prompting during future runs.

The program should be available on Windoze in a nice GUI with a browse option for filename inputs, on UNIX as a command line item, and maybe on other platforms, like Apple. Ease of use by novices or experts counts.

Output goes into a text file for user review. Instructions on how to attach/cut&paste and e-mail to log-submission-loganalysis@private could pop up on screen at the end of the run.

Post it on freeware sites as a search and replace tool and log anonymizer. Include info about the loganalysis list and the need for log samples in the banner or sign-off screen.

BTW, does log analysis have to be only on syslogs? How about output from applications (Oracle database log, binary logs, ...)?
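A minimal sketch of the IP address replacement idea above, in Python. The function name `remap_ips`, the /24 mask on the source subnet, and the choice to carry the last two octets over into 10.10.x.x are all assumptions made for illustration. The interactive prompting and the "keep unchanged" list are left out.

```python
import ipaddress
import re

# Loose pattern for dotted-quad candidates; real validation happens below.
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def remap_ips(lines, src_net="192.168.131.0/24", dst_prefix="10.10"):
    """Replace addresses from src_net with dst_prefix.x.x equivalents.

    The same original address always gets the same replacement, in every
    column, so matches between columns are retained.
    """
    src = ipaddress.ip_network(src_net)
    mapping = {}  # original address -> replacement, reusable in later runs

    def replace(match):
        text = match.group(0)
        try:
            addr = ipaddress.ip_address(text)
        except ValueError:
            return text  # looked like an IP but is not one (e.g. 999.1.1.1)
        if addr not in src:
            return text  # outside the subnet being anonymized
        if text not in mapping:
            octets = text.split(".")
            mapping[text] = f"{dst_prefix}.{octets[2]}.{octets[3]}"
        return mapping[text]

    return [IP_RE.sub(replace, line) for line in lines], mapping
```

The returned mapping is the kind of thing that could be saved as the "list of mapped values" to reduce prompting on future runs.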
Still learning,
Adam

-----Original Message-----
From: Jon Stearley [mailto:jrstear@private]
Sent: Friday, March 12, 2004 3:28 PM
To: Marcus J. Ranum; Rainer Gerhards
Cc: loganalysis@private
Subject: Re: [logs] Log Samples Requested

> Rainer Gerhards wrote:
> >Having said this, on to my request: I would appreciate if the list
> >members (you!) could send me a few lines of their actual syslog data.
>
> Rainer - we've been trying to establish a log codex on loganalysis.org
> for some time. Getting log data is like pulling teeth. :) Please, people
> if you have logs you are willing to share, send them to loganalysis.org
> as well.

i've repeatedly pondered [1] an anonymizer strong enough to convince people to donate their logs, but weak enough to enable meaningful analysis on the converted data. it can't just hash unique words, 'cause word similarity must be preserved. character-by-character hashing isn't strong enough (right?)... cryptography is too strong (i would think so...)

my current top idea is to hash characters to a unique word: ie "foo fum" becomes "foophlegmphlegm fooblehphar" when f->foo, o->phlegm, u->bleh, m->phar. the words would of course be generated in per-run pseudorandom fashion. with the hash, it can be reverse converted (and thus, so could the analysis results for review), but without it:

- is it sufficiently obfuscated that people would feel free to share their logs?
- would this preserve enough of the original data characteristics to make analysis meaningful?

for my analysis approach, the answer to the latter is "yes" (except that i require that whitespace be preserved, ie s/(\S+)/$1/g in perlspeak). for my logs, my answer to the former is "uhm, i think so..." ;)

i do (only) time analysis numerically, so the above wouldn't be acceptable for the timestamps (it'd lose the numeric properties i need - how about just converting timestamps to seconds since log start?). but some people probably treat various msg elements numerically...
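a rough sketch of the character-to-word idea above, in Python. the names `build_codebook`, `encode` and `decode` are made up for illustration, and the fixed word length is an assumption added here so that the output can be decoded without delimiters (the "foo"/"phlegm" example above uses words of varying length). whitespace is passed through untouched, and the codebook plays the role of "the hash" that allows reverse conversion.

```python
import random
import string

WORD_LEN = 4  # assumption: fixed-length code words keep decoding unambiguous


def build_codebook(text, seed=None):
    """Map every non-whitespace character in text to a unique random word."""
    rng = random.Random(seed)  # a different seed per run gives a different mapping
    codebook, used = {}, set()
    for ch in sorted(set(text)):
        if ch.isspace():
            continue
        word = "".join(rng.choice(string.ascii_lowercase) for _ in range(WORD_LEN))
        while word in used:  # retry until the word is unique
            word = "".join(rng.choice(string.ascii_lowercase) for _ in range(WORD_LEN))
        used.add(word)
        codebook[ch] = word
    return codebook


def encode(text, codebook):
    """Replace each character by its code word, leaving whitespace as-is."""
    return "".join(ch if ch.isspace() else codebook[ch] for ch in text)


def decode(text, codebook):
    """Reverse the conversion; only possible with the codebook in hand."""
    reverse = {word: ch for ch, word in codebook.items()}
    out, i = [], 0
    while i < len(text):
        if text[i].isspace():
            out.append(text[i])
            i += 1
        else:
            out.append(reverse[text[i:i + WORD_LEN]])
            i += WORD_LEN
    return "".join(out)
```

since the mapping is per character, tokens that share characters in the original still share the corresponding code words after conversion, which is the similarity property asked for above. whether that is obfuscated enough for people to share logs is exactly the open question.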
my sense is that someone will have to demonstrate a killer analysis before people will be sufficiently motivated to share their data. ie, impress me with what you can do with your own data - then i'll let you try my data...

my 2c

--
+--------------------------------------------------------------+
| Jon Stearley                  (505) 845-7571 (FAX 845-7442)  |
| Sandia National Laboratories  Scalable Systems Integration   |
+--------------------------------------------------------------+

[1] http://www.securityfocus.com/archive/116/277024/2002-06-14/2002-06-20/2

_______________________________________________
LogAnalysis mailing list
LogAnalysis@private
http://lists.shmoo.com/mailman/listinfo/loganalysis
This archive was generated by hypermail 2b30 : Fri Mar 12 2004 - 18:55:04 PST