The regexp offered as a solution to the various holes in the earlier one:

> The following can fix all of that:
>
> s/<\s+BODY\s+((([^">]+("(\\.|[^"])*")?)*)ONLOAD)*?\s+/<BODY $1 DEFANGED-ONLOAD/gi;

... won't work if the tag has no whitespace between the initial `<' and the word `Body', because `\s+' demands at least one whitespace character there (though the regexp is successfully case-insensitive). This leads me to suspect it was not tested before being mailed to this list.

Given that this is an untested, quite complicated regexp (which really ought to use Perl's facility for embedding whitespace, the /x modifier, so it can be split across multiple lines and commented), and given that it has an obvious error at the fourth character, it seems unsuitable for securing anything.

In general this brings to mind famous warnings about trying to sanitize arbitrary input.

All the best,
James Wetterau

"Vitiello, Eric (BHS)" says:

> > [From an anti-mail-exploit-procmail-filter-perl-script (see
> > http://www.wolfenet.com/~jhardin/procmail-security.html):]
> >
> > s/<BODY\s+(([^">]+("(\\.|[^"])*")?)*)ONLOAD/<BODY $1 DEFANGED-ONLOAD/gi;
> >
> > This pattern will catch lines like
> >   <body onload="badthings()">
> > converted to
> >   <BODY DEFANGED-ONLOAD="badthings()">
> > but not
> >   <body onload="badthings()" onload="badthings()">
> > converted to
> >   <BODY onload="badthings()" DEFANGED-ONLOAD="badthings()">]
> >
> > So one onload=... will stay and act.
> >
> > Also things like < body ... > won't be caught. I don't know if
> > those leading spaces are proper HTML, but even if not, one should
> > not suppose every bad HTML to be rejected.
>
> The following can fix all of that:
>
> s/<\s+BODY\s+((([^">]+("(\\.|[^"])*")?)*)ONLOAD)*?\s+/<BODY $1 DEFANGED-ONLOAD/gi;
>
> Eric Vitiello
> Webmaster^2, Baptist Healthcare System
> www.bhsi.com www.westernbaptist.com
> www.baptisteast.com www.centralbap.com
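
Two claims in this thread can be checked mechanically: the original pattern leaves the first of two onload attributes intact, and the proposed replacement cannot match a tag written without whitespace after `<'. What follows is only a minimal sketch. It assumes Python's `re` module as a stand-in for Perl (the constructs used by these patterns behave the same way in both), and the function names and sample strings are illustrative, not taken from either message. It also lays the original pattern out with `re.VERBOSE`, Python's counterpart to Perl's /x, to show the kind of commenting suggested above.

```python
import re

# The original pattern from the procmail sanitizer, ported as-is.
ORIGINAL = re.compile(
    r'<BODY\s+(([^">]+("(\\.|[^"])*")?)*)ONLOAD', re.IGNORECASE)

# The proposed replacement pattern, ported as-is (not corrected).
PROPOSED = re.compile(
    r'<\s+BODY\s+((([^">]+("(\\.|[^"])*")?)*)ONLOAD)*?\s+', re.IGNORECASE)

# The original pattern again, laid out in verbose mode with comments.
ORIGINAL_VERBOSE = re.compile(r'''
    <BODY \s+                     # tag name, then mandatory whitespace
    (                             # group 1: everything before the attribute
      (
        [^">]+                    # a run with no double quote and no >
        ( " ( \\. | [^"] )* " )?  # optionally a double-quoted string
      )*
    )
    ONLOAD                        # the attribute name to defang
''', re.IGNORECASE | re.VERBOSE)

REPLACEMENT = r'<BODY \1 DEFANGED-ONLOAD'


def defang_original(html: str) -> str:
    """Apply the original substitution (hypothetical helper name)."""
    return ORIGINAL.sub(REPLACEMENT, html)


def defang_proposed(html: str) -> str:
    """Apply the proposed substitution (hypothetical helper name)."""
    return PROPOSED.sub(REPLACEMENT, html)


single = '<body onload="badthings()">'
double = '<body onload="badthings()" onload="badthings()">'

print(defang_original(single))   # the only onload is renamed
print(defang_original(double))   # only the last onload is renamed
print(defang_proposed(single))   # no match, so the input comes back unchanged
```

The sketch only reproduces the behaviour described in the thread; it makes no attempt at a corrected sanitizer, which would need far more than a one-character fix.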
This archive was generated by hypermail 2b30 : Fri Apr 13 2001 - 14:12:06 PDT