SPAM/PORN DETECTED (was Re: CRIME Interesting way around spam filter)

From: SPAM/PORN FILTER (spam_porn_filter@private)
Date: Wed Jun 04 2003 - 00:36:13 PDT

Next message: Jacob Redding: "Re: SPAM/PORN DETECTED (was Re: CRIME Interesting way around spam filter)"

Previous message: Andrew Plato: "RE: CRIME Port scanning from an ISP"
In reply to: Crispin Cowan: "Re: CRIME Interesting way around spam filter"
Next in thread: Jacob Redding: "Re: SPAM/PORN DETECTED (was Re: CRIME Interesting way around spam filter)"
Reply: Jacob Redding: "Re: SPAM/PORN DETECTED (was Re: CRIME Interesting way around spam filter)"
Reply: Crispin Cowan: "Re: SPAM/PORN DETECTED (was Re: CRIME Interesting way around spam filter)"

This note has been flagged as 
	Likely PORN
	Possibly SPAM

The following test(s) were positive
	the word(s)
		'penis' 
	followed by the phrase(s) 
		'move', 'back and forth across', and 'forcing'

        Canadian Grammar/Phrasing/Spelling

The results of the test(s) show that this is
	77% likely PORN (23/30)
	82% likely SPAM (27/33)


On Tue, 2003-06-03 at 13:39, Crispin Cowan wrote:
> Shaun Savage wrote:
> 
> > It looks at raw text. The tokens are found using a fixed set of
> > delimiters.  The reason for this is the Mozilla spam filter uses the
> > HTML tags to help determine spam; a lot of spam uses 'color' fonts.  Also,
> > one of the delimiters is '<' '>', so it can't determine what is an HTML
> > tag.
> 
> Thanks!
> 
> Unfortunate that it is only looking at raw text. There is valuable info 
> in the formatted text, precisely because of this hack of splitting words 
> with HTML comments, so that word-recognizing filters like Bayes won't 
> recognize "pe<!-- interruption -->nis" as "penis". The spammer can move 
> the interruption back and forth across the word, put arbitrarily clean 
> text (e.g. from Project Gutenberg) in the "interruption", forcing 10X 
> training time on the Bayesian filter.
> 
> Crispin
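
To make the trick described above concrete, here is a minimal Python sketch. It is not the Mozilla filter's code: the delimiter set and the function names are assumptions made for illustration (the thread only says that '<' and '>' are among the delimiters). It shows how a raw-text tokenizer sees the split word as harmless fragments, and how removing HTML comments before tokenizing rejoins it.

```python
import re

# Assumed delimiter set: whitespace plus some punctuation, including '<' and '>'.
# Hypothetical; the real filter's delimiter list is not given in the thread.
DELIMITERS = re.compile(r"[\s<>!\-.,;:()]+")

# Matches an HTML comment, including comments that span several lines.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

def tokenize_raw(text):
    """Split text on the fixed delimiter set, as a raw-text filter might."""
    return [tok.lower() for tok in DELIMITERS.split(text) if tok]

def strip_comments(text):
    """Remove HTML comments so that words split by them are rejoined."""
    return HTML_COMMENT.sub("", text)

sample = "pe<!-- interruption -->nis"

# Raw tokenizing sees three innocent-looking fragments.
raw_tokens = tokenize_raw(sample)

# Stripping comments first recovers the word the spammer was hiding.
clean_tokens = tokenize_raw(strip_comments(sample))

print(raw_tokens)    # ['pe', 'interruption', 'nis']
print(clean_tokens)  # ['penis']
```

This only handles HTML comments. Other ways of splitting a word (empty tags, character entities) would need a real HTML parser rather than a regular expression.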


This archive was generated by hypermail 2b30 : Wed Jun 04 2003 - 01:22:46 PDT