On Sunday, August 25, 2002, at 10:18, Russell Fulton wrote:

> apply lots of regular expressions (REs) to each line of log files.
> Anyone know of any tricks to speed this up since this is the innermost
> loop of the process any gains here should be worthwhile. I know the RE
> optimizer is pretty smart and that it will do some optimization over
> statements but I have never figured out what the limitations are.

A couple of things come to mind:

- Use the extended (non-capturing) pattern (?:a|b|c) instead of the
  capturing subexpression (a|b|c). It's a little cheaper, because Perl
  doesn't have to save the matched text.

- Use the /o modifier to compile the pattern once. It's meant for
  patterns you run against a lot of data: Perl interpolates any embedded
  variables and compiles the pattern only the first time through. The
  drawback is exactly that: you only get variable substitution the first
  time you call it, so leave /o off anything you intend to call with
  different values of the embedded variables.

The other thing I'd do is abstract your matching code so that it checks
each line against a list of patterns loaded from a file or a quoted
block, and keeps a call count for each pattern. That lets you move the
most frequently matched patterns to the top of the list.

Also, perldoc -q 'match many' should pull out the "How do I efficiently
match many regular expressions at once?" entry in the Perl FAQ.

Chris

_______________________________________________
LogAnalysis mailing list
LogAnalysisat_private
http://lists.shmoo.com/mailman/listinfo/loganalysis
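
A minimal sketch of how the suggestions above might fit together. The
pattern names, the regexes, and the sample log lines are all made up
for illustration, and the helper names classify() and reorder() are
hypothetical. Note that it uses qr// to precompile each pattern instead
of /o; that is a substitution on my part, chosen because the patterns
live in a list rather than inline in a match.

```perl
use strict;
use warnings;

# Hypothetical pattern table. qr// compiles each pattern once, and
# (?:...) groups alternatives without capturing them.
my @patterns = (
    { name => 'ssh_fail', hits => 0,
      re   => qr/sshd\[\d+\]: Failed password for (?:invalid user )?\S+/ },
    { name => 'sudo',     hits => 0,
      re   => qr/sudo: +\S+ : (?:TTY|PWD)=/ },
    { name => 'kernel',   hits => 0,
      re   => qr/kernel: (?:IN|OUT)=\S*/ },
);

# Return the name of the first pattern that matches the line and bump
# its hit count; return nothing if no pattern matches.
sub classify {
    my ($line) = @_;
    for my $p (@patterns) {
        if ($line =~ $p->{re}) {
            $p->{hits}++;
            return $p->{name};
        }
    }
    return;
}

# Move the most frequently matched patterns to the front of the list.
sub reorder {
    @patterns = sort { $b->{hits} <=> $a->{hits} } @patterns;
}

# Made-up sample input, standing in for lines read from a log file.
my @sample = (
    'Aug 25 10:18:01 host kernel: IN=eth0 OUT= SRC=10.0.0.1',
    'Aug 25 10:18:02 host kernel: IN=eth0 OUT= SRC=10.0.0.2',
    'Aug 25 10:18:03 host sshd[123]: Failed password for invalid user bob',
);

classify($_) for @sample;
reorder();
printf "%-10s %d\n", $_->{name}, $_->{hits} for @patterns;
```

In a real run you would probably call reorder() every few thousand
lines rather than once at the end, so the ordering tracks what the log
actually contains. Reordering is only safe if the patterns don't
overlap, since the first match wins.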