On 27 Aug 2002, Russell Fulton wrote:

> Those who are not interested in perl please hit DELETE now.

Yet More Perl Ahead

> > Try analysing your data and putting your most common cases first, so
> > they will match sooner and return before the rest are executed.
>
> Given that the optimizer is working over multiple statements or
> expressions I don't think the order is actually material.

I think it would.  Imagine you have 3 types of log entries:

    Message 1 occurs 10% of the time
    Message 2 occurs 80% of the time
    Message 3 occurs 10% of the time

and that you order your function as follows:

    return 1 if /msg 1/;
    return 3 if /msg 3/;
    return 2 if /msg 2/;

Then Perl has to execute two extraneous (theoretically) pattern matches
80% of the time, because the most common message is tested last.

I think the upshot is to order the tests in a best-guess order of
frequency:

    return 2 if /msg 2/;
    return 1 if /msg 1/;
    return 3 if /msg 3/;

This all assumes that you have a good idea of what your data looks like
frequency-wise /before you look at it/.  I could see this community
getting that done by collating a bunch of sanitized logs, coming up with
tight REs to match various messages, and then grinding out the various
statistics.

I would also recommend playing with another "speed variable" -- ordering
your regular expressions according to how much static text they contain.
REs with more static text will be faster to match (or mismatch) than
those with variability (. [a-z] alternation, etc).  E.g.

    if (/seven/)

can fail more quickly against "eight" than can:

    if (/^....$/)

as the first can fail on the initial "s" vs "e" mismatch, whereas the
second only fails on the character count difference at the end.

-jeff

-- 
"Space is big. You just won't believe how vastly, hugely,
mind-bogglingly big it is. I mean, you may think it's a long way down
the road to the drug store, but that's just peanuts to space."
    -- The Hitchhiker's Guide to the Galaxy

_______________________________________________
LogAnalysis mailing list
LogAnalysisat_private
http://lists.shmoo.com/mailman/listinfo/loganalysis
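The frequency argument in the message can be checked with a rough sketch like the one below. It counts how many pattern matches are attempted for each ordering over a sample with the 10% / 80% / 10% mix. The names %pattern, classify, and total_tried, and the literal "msg N" lines, are assumptions made for illustration; they do not come from any real log format. Counting attempts is only a proxy for run time.

```perl
use strict;
use warnings;

# Hypothetical message patterns, keyed by message type (illustrative only).
my %pattern = (
    1 => qr/msg 1/,
    2 => qr/msg 2/,
    3 => qr/msg 3/,
);

# Return (type, number of pattern matches attempted) for one log line,
# trying the patterns in the order given.
sub classify {
    my ($line, @order) = @_;
    my $tried = 0;
    for my $type (@order) {
        $tried++;
        return ($type, $tried) if $line =~ $pattern{$type};
    }
    return (0, $tried);    # no pattern matched
}

# Sample data with the 10% / 80% / 10% mix from the example.
my @log = (('msg 1') x 10, ('msg 2') x 80, ('msg 3') x 10);

# Total matches attempted over the whole sample for a given ordering.
sub total_tried {
    my @order = @_;
    my $total = 0;
    for my $line (@log) {
        my (undef, $tried) = classify($line, @order);
        $total += $tried;
    }
    return $total;
}

printf "order 1,3,2: %d matches attempted\n", total_tried(1, 3, 2);
printf "order 2,1,3: %d matches attempted\n", total_tried(2, 1, 3);
```

On this 100-line sample the first ordering attempts 270 matches and the frequency-based ordering attempts 130. To measure actual time rather than attempts, the core Benchmark module could be used on real, sanitized logs.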
This archive was generated by hypermail 2b30 : Wed Aug 28 2002 - 10:17:27 PDT