Posted to users@spamassassin.apache.org by Jeff Heinen <je...@inherent.com> on 2004/02/05 22:24:23 UTC

Ways around Bayes filters?

With the list move and a few sick days, I'm a little behind, so I'm not
sure whether this has been brought up already. My boss sent me this BBC
article this morning and suggested I send it along.

http://news.bbc.co.uk/2/hi/technology/3458457.stm

To (over)summarize, the article states something that we here already know,
or should already know: given time and training, there will be certain
words that the filter learns as hammy, no matter what the situation. Names
of businesses, street addresses, and staff members seem to be likely
targets, as they are used daily in ham messages and learned as such. I'm
sure many of us here can find things like 'spamassassin', 'bayes' and
'procmail' scoring low somewhere in our own Bayes databases.
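
To make the concern concrete, here is a rough sketch of how padding a spam
with hammy tokens can drag down a combined score. It uses a simple naive
Bayes product rule, which is an assumption for illustration and not
necessarily how SpamAssassin combines token probabilities. The token table,
its numbers, and the function name are all hypothetical.

```python
import math

# Hypothetical per-token spam probabilities, as a trained filter might
# hold them. These numbers are invented for illustration only.
TOKEN_SPAM_PROB = {
    "viagra": 0.99,
    "mortgage": 0.95,
    "click": 0.90,
    "spamassassin": 0.02,
    "bayes": 0.03,
    "procmail": 0.02,
}

UNKNOWN_PROB = 0.5  # neutral value for tokens the filter has never seen


def spam_probability(tokens):
    """Combine per-token probabilities with the naive Bayes product rule."""
    log_spam = 0.0
    log_ham = 0.0
    for token in tokens:
        p = TOKEN_SPAM_PROB.get(token.lower(), UNKNOWN_PROB)
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)
    # P = prod(p) / (prod(p) + prod(1 - p)), computed in log space
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


plain_spam = ["viagra", "mortgage", "click"]
# The same message padded with words this site's filter has learned as ham.
padded_spam = plain_spam + ["spamassassin", "bayes", "procmail"]

print("plain :", round(spam_probability(plain_spam), 4))
print("padded:", round(spam_probability(padded_spam), 4))
```

With these made-up numbers the plain message scores above 0.99, while the
padded one falls below 0.5, even though the spammy tokens are all still
there.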

To some extent, we are already seeing this. At least here, there have been
reports of the random gibberish words being replaced with 'technical' terms
or excerpts from novels. So I've been asked if there are any suggestions to
combat, or at least keep up with, this current spam mutation. Or are we
reaching a point where the effectiveness of the current systems falls behind
and we are forced to the next step, whatever that may be?

-Jeff