Posted to dev@spamassassin.apache.org by Daniel Quinlan <qu...@pathname.com> on 2004/06/08 04:22:22 UTC

SURBL works well on envelope sender

Here's a test on my last 4 days of ham and spam.  The T_DNS_FROM* rules
use the envelope sender (that is, the MAIL FROM added by my MTA to the
message headers).  They have a reasonably good hit rate (6.43% of spam
hit one of the tested SURBL zones) and a 0% FP rate in this test.  Only
7 of those 103 spam messages did not also hit one of the URIBL rules,
but the T_DNS_FROM* rules got their hits with zero FPs.  (The URIBL
rules get FPs mostly because people discuss spam domains in some of the
ham tested here.  That's another issue, though.)
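
As a rough illustration of the kind of lookup described above (not the
actual SpamAssassin rule code), the domain of the envelope sender is
prepended to a SURBL zone and the resulting name is queried over DNS.
This sketch assumes the zones are named ws.surbl.org, sc.surbl.org and
be.surbl.org (inferred from the rule names), that the whole domain
after the "@" is queried, and that any A record means "listed".  The
function names are hypothetical.

```python
import socket

# Assumed zone names, inferred from the WS/SC/BE rule names.
ZONES = ("ws.surbl.org", "sc.surbl.org", "be.surbl.org")


def sender_domain(envelope_sender: str) -> str:
    """Return the lowercased domain part of a MAIL FROM address."""
    address = envelope_sender.strip().strip("<>")
    _, sep, domain = address.rpartition("@")
    if not sep or not domain:
        # Covers the null sender "<>" and addresses without a domain.
        raise ValueError(f"no domain in envelope sender: {envelope_sender!r}")
    return domain.lower().rstrip(".")


def surbl_query_name(envelope_sender: str, zone: str) -> str:
    """Build the DNS name to query: <sender domain>.<zone>."""
    return f"{sender_domain(envelope_sender)}.{zone}"


def is_listed(envelope_sender: str, zone: str) -> bool:
    """Assumption: the name resolving to any A record means 'listed'.

    Resolver failures (including timeouts) are treated as 'not listed'.
    """
    try:
        socket.gethostbyname(surbl_query_name(envelope_sender, zone))
        return True
    except socket.gaierror:
        return False
```

Note that the sender domain is used as-is here; it is never resolved to
an IP address first.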

Maybe it would be worthwhile factoring this into future development.
That is, also list known spammer envelope sender domains, and maybe get
SpamCop to provide lists for that too.  I suspect there's some overlap
in the other direction as well.

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
   2598     1602      996    0.617   0.00    0.00  (all messages)
100.000  61.6628  38.3372    0.617   0.00    0.00  (all messages as %)
  3.580   5.8052   0.0000    1.000   1.00    0.01  T_DNS_FROM_SURBL_WS
  0.885   1.4357   0.0000    1.000   0.67    0.01  T_DNS_FROM_SURBL_SC
  9.161  14.5443   0.5020    0.967   0.56    1.00  URIBL_BE_SURBL
 36.066  57.5531   1.5060    0.974   0.44    1.00  URIBL_WS_SURBL
  0.423   0.6866   0.0000    1.000   0.33    0.01  T_DNS_FROM_SURBL_BE
 29.908  47.5655   1.5060    0.969   0.11    1.00  URIBL_SC_SURBL
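
For readers unfamiliar with the columns, the S/O values above appear
consistent with SPAM% / (SPAM% + HAM%).  The sketch below is only a
check of that reading against a few rows of the table, not the code
that generated it; the function name is hypothetical.  Because the
percentages in the table are already rounded, the last digit of a
recomputed value can differ slightly from the one printed.

```python
def s_over_o(spam_pct: float, ham_pct: float) -> float:
    """Share of a rule's hit rate that comes from spam (assumed formula)."""
    total = spam_pct + ham_pct
    if total == 0:
        raise ValueError("rule hit neither spam nor ham")
    return spam_pct / total


# (SPAM%, HAM%) pairs copied from the table above.
rows = {
    "T_DNS_FROM_SURBL_WS": (5.8052, 0.0000),
    "URIBL_BE_SURBL": (14.5443, 0.5020),
    "URIBL_SC_SURBL": (47.5655, 1.5060),
}

for name, (spam_pct, ham_pct) in rows.items():
    print(f"{s_over_o(spam_pct, ham_pct):.3f}  {name}")
```

A rule that never hits ham gets an S/O of exactly 1, which is why all
three T_DNS_FROM* rules show 1.000.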

How's the multi roll-out going?  It would definitely be handy for this
test (the code to support it already exists).

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/

Re: [SURBL-Discuss] SURBL works well on envelope sender

Posted by Jeff Chan <je...@surbl.org>.
On Monday, June 7, 2004, 7:22:22 PM, Daniel Quinlan wrote:
> Here's a test on my last 4 days of ham and spam.  The T_DNS_FROM* rules
> use the envelope sender (that is, the MAIL FROM added by my MTA to the
> message headers).  They have a reasonably good hit rate (6.43% of spam
> hit one of the tested SURBL zones) and a 0% FP rate in this test.  Only
> 7 of those 103 spam messages did not also hit one of the URIBL rules,
> but the T_DNS_FROM* rules got their hits with zero FPs.  (The URIBL
> rules get FPs mostly because people discuss spam domains in some of the
> ham tested here.  That's another issue, though.)

> Maybe it would be worthwhile factoring this into future development.
> That is, also list known spammer envelope sender domains, and maybe get
> SpamCop to provide lists for that too.  I suspect there's some overlap
> in the other direction as well.

> OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
>    2598     1602      996    0.617   0.00    0.00  (all messages)
> 100.000  61.6628  38.3372    0.617   0.00    0.00  (all messages as %)
>   3.580   5.8052   0.0000    1.000   1.00    0.01  T_DNS_FROM_SURBL_WS
>   0.885   1.4357   0.0000    1.000   0.67    0.01  T_DNS_FROM_SURBL_SC
>   9.161  14.5443   0.5020    0.967   0.56    1.00  URIBL_BE_SURBL
>  36.066  57.5531   1.5060    0.974   0.44    1.00  URIBL_WS_SURBL
>   0.423   0.6866   0.0000    1.000   0.33    0.01  T_DNS_FROM_SURBL_BE
>  29.908  47.5655   1.5060    0.969   0.11    1.00  URIBL_SC_SURBL

Thanks for the results.  While SURBLs are not intended to be used
with domains in envelopes, I can see how they could apply to a
small percentage of spams, as you found.  I'm glad there are
zero false positives, though I'm somewhat concerned about the
1.5% FPs using SURBLs with urirhsbl.  Hopefully those are from
spam under discussion in the corpora, as you propose.

Regarding adding sender domains to SURBLs, I'm 99% sure we're
not going to specifically go out and do that.  Our focus will
remain on message bodies (spamvertised sites).

On the other hand, I can see the value in being able to handle
sender domains without needing to DNS resolve them into IP
addresses.  That advantage would be similar to the one SURBLs
have over the numeric URIBLs.

However, another reason not to do it is that sender domains
are routinely forged, so over time they would probably become
an unreliable predictor of spam.  By contrast, forging a
message body URI would do spammers little good, since it would
take traffic away from their spam web site.  Accordingly, I
think we'll stick with URI domains for SURBLs.  :D

> How's the multi roll-out going?  It would definitely be handy for this
> test (the code to support it already exists).

Thanks for the prompting; we got around to doing it at last,
as seen in the other messages.

Jeff C.