Posted to users@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2006/10/24 21:02:51 UTC

Re: RFC: spam trapping with policyd-weight and DNSBLs?

mouss writes:
> Justin Mason wrote:
> > Hey --
> >
> > just to turn the tables for a bit ;), I've recently been considering a
> > problem and a possible solution, and could do with SpamAssassin users'
> > advice.
> >
> > These days, I've been forced to use SBL/XBL as an upfront anti-spam check,
> > rejecting spam at RCPT TO: time during the SMTP transaction. (Previously
> > I'd been running it from SpamAssassin in the usual manner.) That's great,
> > and it works well, rejecting a *lot* of spam and saving a lot of CPU time
> > by not running SpamAssassin. ;)
> >
> > However: it's important for SpamAssassin developers and mass-checkers to
> > get a "representative" feed of spam -- with all kinds of spam included --
> > so that the rules are measured against something close to reality.  This,
> > unfortunately, implies that discarding mails that hit SBL/XBL is a bad
> > thing, since those mails won't get into the mass-checked corpora -- and
> > what will be mass-checked from that point on is just the 25% of spam that
> > evades those rules.
> >
> > Bug 5096 suggests that we replace some of the mass-check corpora with
> > pure-spamtrap feeds to fix this.  Bit of a heavy fix :(
> >
> > There's another way, though.  If it were possible to change the SMTP
> > transaction flowchart to include this:
> >
> >   - is IP listed in SBL/XBL?
> >     - if not listed, deliver as normal;
> >     - else if listed, continue SMTP transaction as if normal delivery is
> >       underway, but deliver to a spamtrap mbox file or maildir.
> >   
> 
> CAVEAT: just because the client is listed on sbl-xbl does not mean the 
> message is spam. In particular:
> - a legit user may be sending through a listed server.
> - a spammer may "corpus-corrupt" you by sending ham messages (slightly 
> modified copies from mailing lists)
> 
> you can of course consider that the first is not a critical issue 
> (statistically speaking, at least). but if spammers know what you're 
> doing, the second point may become an issue (this is true of 
> spamtraps too; I don't know why spammers don't do it...).

yeah.  generally we've been able to detect bad stuff creeping in, since it
makes odd rules fire in mass-checks, so I'm not too worried about that.
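
For illustration, the routing step from the quoted flowchart could be sketched
roughly as below. This is only a sketch under assumptions, not a policyd-weight
configuration: the function names, the "inbox"/"spamtrap" labels and the
injectable `resolve` parameter are all made up for the example, and it assumes
the usual DNSBL convention of reversing the IPv4 octets, querying under the
zone, and treating a 127.0.0.x answer as "listed". A failed lookup is treated
as "not listed", so delivery proceeds as normal.

```python
import socket

# Zone name as used in this thread; adjust for whichever list is in use.
DNSBL_ZONE = "sbl-xbl.spamhaus.org"


def dnsbl_query_name(ip, zone=DNSBL_ZONE):
    """Build the DNSBL query name: reversed IPv4 octets followed by the zone."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address: %r" % ip)
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone=DNSBL_ZONE, resolve=socket.gethostbyname):
    """Return True if the DNSBL answers with a 127.0.0.x address for this IP."""
    try:
        answer = resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such record (or the lookup failed): treat as not listed.
        return False
    return answer.startswith("127.0.0.")


def choose_destination(ip, normal="inbox", trap="spamtrap",
                       resolve=socket.gethostbyname):
    """Pick a delivery target; the SMTP dialogue looks the same either way."""
    return trap if is_listed(ip, resolve=resolve) else normal
```

The `resolve` parameter is there only so the logic can be exercised without
touching the network; a real setup would do this inside the MTA's policy hook
and hand the result to the delivery agent.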

--j.