Posted to dev@spamassassin.apache.org by Daniel Quinlan <qu...@pathname.com> on 2004/04/26 23:51:32 UTC

AHBL: promote or delete?

Using current NET results from quinlan, jm, parkerm, theo, daf:

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
 275491   240492    34999    0.873   0.00    0.00  (all messages)
100.000  87.2958  12.7042    0.873   0.00    0.00  (all messages as %)
 11.191  12.8079   0.0829    0.994   0.90    0.01  T_RCVD_IN_AHBL_PROXY
 16.714  19.1183   0.1914    0.990   0.90    0.00  __RCVD_IN_AHBL
  5.643   6.4489   0.1057    0.984   0.87    0.01  T_RCVD_IN_AHBL_SPAM
  7.986   9.1130   0.2400    0.974   0.85    0.01  T_RCVD_IN_AHBL_RHSBL
 [the rest is not noteworthy]
  0.225   0.2524   0.0343    0.880   0.62    0.01  T_RCVD_IN_AHBL_SPAM_SUPPORT
  0.042   0.0478   0.0029    0.944   0.76    0.01  T_RCVD_IN_AHBL_UNKNOWN_1
  0.370   0.0316   2.6972    0.012   0.88   -0.01  T_RCVD_IN_AHBL_EXEMPT_T
  0.249   0.0237   1.7943    0.013   0.87   -0.01  T_RCVD_IN_AHBL_EXEMPT_O
  0.015   0.0166   0.0000    1.000   0.90    0.01  T_RCVD_IN_AHBL_CMPR_DDOS
  0.013   0.0154   0.0000    1.000   0.90    0.01  T_RCVD_IN_AHBL_CMPR_RELAY
  0.010   0.0116   0.0000    1.000   0.90    0.01  T_RCVD_IN_AHBL_CMPR_VIRUS
  0.002   0.0021   0.0000    1.000   0.90    0.01  T_RCVD_IN_AHBL_SMTP
  0.000   0.0000   0.0000    0.500   0.11    0.01  T_RCVD_IN_AHBL_5XXI
 [more zeroes left out]

So it's mostly good for PROXY, and perhaps also RHSBL and SPAM.  The
rest is pretty much noise.
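
For anyone reading the S/O column: in these tables it works out as SPAM% / (SPAM% + HAM%), with rules that hit nothing shown as 0.500. Here is a minimal sketch that reproduces a few of the values above. The function name is made up for illustration, and this is not the actual hit-frequencies code:

```python
def s_o_ratio(spam_pct: float, ham_pct: float) -> float:
    """S/O as it appears in the table: spam% / (spam% + ham%)."""
    total = spam_pct + ham_pct
    if total == 0:
        # Rules with no hits at all are listed with 0.500 in the table.
        return 0.5
    return spam_pct / total


# (SPAM%, HAM%) pairs copied from the rows above.
rows = {
    "T_RCVD_IN_AHBL_PROXY": (12.8079, 0.0829),
    "T_RCVD_IN_AHBL_RHSBL": (9.1130, 0.2400),
    "T_RCVD_IN_AHBL_EXEMPT_T": (0.0316, 2.6972),
    "T_RCVD_IN_AHBL_5XXI": (0.0, 0.0),
}

for name, (spam_pct, ham_pct) in rows.items():
    print(f"{s_o_ratio(spam_pct, ham_pct):.3f}  {name}")
```

Values near 1.000 mean the rule hits almost only spam; the EXEMPT rules sit near 0.000 because they hit almost only ham.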

------------------------------------------------------------------------

PROXY vs. XBL, DSBL, and other open proxy blacklists:

 65.624  75.1177   0.3886    0.995   0.99    1.00  RCVD_IN_XBL
 57.331  65.5776   0.6686    0.990   0.97    1.10  RCVD_IN_DSBL
  1.911   2.1872   0.0143    0.994   0.89    1.62  RCVD_IN_SORBS_SOCKS
  0.373   0.4270   0.0029    0.993   0.89    2.90  RCVD_IN_SORBS_WEB
 10.356  11.8345   0.2000    0.983   0.88    1.20  RCVD_IN_SORBS_MISC
 13.778  15.7394   0.3029    0.981   0.88    1.20  RCVD_IN_NJABL_PROXY
  9.198  10.5060   0.2143    0.980   0.87    1.20  RCVD_IN_SORBS_HTTP
  0.301   0.3435   0.0114    0.968   0.82    2.70  RCVD_IN_SORBS_ZOMBIE

vs.

 11.191  12.8079   0.0829    0.994   0.90    0.01  T_RCVD_IN_AHBL_PROXY

and overlap:

28412   0.981   0.188   T_RCVD_IN_AHBL_PROXY,RCVD_IN_DSBL
28208   0.974   0.176   T_RCVD_IN_AHBL_PROXY,__RCVD_IN_SORBS
26397   0.912   0.135   T_RCVD_IN_AHBL_PROXY,__RCVD_IN_SBL_XBL
26363   0.911   0.148   T_RCVD_IN_AHBL_PROXY,RCVD_IN_XBL
25413   0.878   0.411   T_RCVD_IN_AHBL_PROXY,__RCVD_IN_NJABL
24282   0.839   0.691   T_RCVD_IN_AHBL_PROXY,RCVD_IN_NJABL_PROXY
22344   0.772   0.841   T_RCVD_IN_AHBL_PROXY,RCVD_IN_SORBS_MISC
19433   0.671   0.826   T_RCVD_IN_AHBL_PROXY,RCVD_IN_SORBS_HTTP
...
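
A note on reading the overlap listings: the columns appear to be the number of messages hit by both rules, then that count as a fraction of the first rule's hits, then as a fraction of the second rule's hits. That reading is an assumption, since the output doesn't label its columns, but the first fraction implies the same PROXY hit count on every row above. A minimal sketch with made-up message IDs (the function name and data are hypothetical, not the real overlap script):

```python
def overlap(hits_a: set, hits_b: set) -> tuple:
    """Return (both, both/|A|, both/|B|) for two sets of message IDs."""
    both = len(hits_a & hits_b)
    frac_a = both / len(hits_a) if hits_a else 0.0
    frac_b = both / len(hits_b) if hits_b else 0.0
    return both, frac_a, frac_b


# Hypothetical message IDs, not taken from the corpus.
ahbl_proxy = {1, 2, 3, 4}
dsbl = {2, 3, 4, 5, 6, 7, 8, 9}

both, frac_a, frac_b = overlap(ahbl_proxy, dsbl)
print(f"{both}   {frac_a:.3f}   {frac_b:.3f}   T_RCVD_IN_AHBL_PROXY,RCVD_IN_DSBL")
```

A high first fraction with a low second fraction, as in most of the rows above, means nearly everything the AHBL rule catches is already caught by the other list, while the other list catches much more besides.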

------------------------------------------------------------------------

SPAM and overlap:

13855   0.980   0.071   T_RCVD_IN_AHBL_SPAM,__RCVD_IN_SBL_XBL
13321   0.943   0.675   T_RCVD_IN_AHBL_SPAM,RCVD_IN_SBL
10167   0.719   0.063   T_RCVD_IN_AHBL_SPAM,__RCVD_IN_SORBS
8336    0.590   0.075   T_RCVD_IN_AHBL_SPAM,RCVD_IN_BL_SPAMCOP_NET
7746    0.548   0.125   T_RCVD_IN_AHBL_SPAM,__RCVD_IN_NJABL
...

------------------------------------------------------------------------

AHBL vs. other multi-result blacklists:


 36.047  41.2629   0.2086    0.995   0.95    2.55  RCVD_IN_SORBS_DUL
  1.911   2.1872   0.0143    0.994   0.89    1.62  RCVD_IN_SORBS_SOCKS
  0.373   0.4270   0.0029    0.993   0.89    2.90  RCVD_IN_SORBS_WEB
 10.356  11.8345   0.2000    0.983   0.88    1.20  RCVD_IN_SORBS_MISC
  9.198  10.5060   0.2143    0.980   0.87    1.20  RCVD_IN_SORBS_HTTP
  0.792   0.9048   0.0143    0.984   0.86    1.20  RCVD_IN_SORBS_SMTP
  0.301   0.3435   0.0114    0.968   0.82    2.70  RCVD_IN_SORBS_ZOMBIE
  0.000   0.0000   0.0000    0.500   0.11    0.00  RCVD_IN_SORBS_BLOCK

Hmmm.... SORBS seems a bit better due to the huge DUL hit rate.

  5.355   6.1333   0.0057    0.999   0.91    0.62  RCVD_IN_NJABL_DIALUP
  3.210   3.6750   0.0171    0.995   0.90    0.74  RCVD_IN_NJABL_SPAM
 13.778  15.7394   0.3029    0.981   0.88    1.20  RCVD_IN_NJABL_PROXY
  0.142   0.1597   0.0200    0.889   0.63    1.41  RCVD_IN_NJABL_RELAY
  0.000   0.0000   0.0000    0.500   0.11    0.10  RCVD_IN_NJABL_CGI
  0.000   0.0000   0.0000    0.500   0.11    0.10  RCVD_IN_NJABL_MULTI

Hmmm.... pretty even.

------------------------------------------------------------------------

RHSBL and overlap:

17934   0.871   0.092   T_RCVD_IN_AHBL_RHSBL,__RCVD_IN_SBL_XBL
14374   0.698   0.090   T_RCVD_IN_AHBL_RHSBL,__RCVD_IN_SORBS
13134   0.638   0.074   T_RCVD_IN_AHBL_RHSBL,RCVD_IN_XBL
10697   0.520   0.071   T_RCVD_IN_AHBL_RHSBL,RCVD_IN_DSBL
10612   0.516   0.096   T_RCVD_IN_AHBL_RHSBL,RCVD_IN_BL_SPAMCOP_NET
7454    0.362   0.079   T_RCVD_IN_AHBL_RHSBL,RCVD_IN_SORBS_DUL

Lower overlap than the others, so this might be worth keeping (and it's a
separate query anyway).

Maybe we should let the perceptron take a whack at it.  So, why aren't
we running the perceptron on nightly/weekly results?  ;-)

Daniel

-- 
Daniel Quinlan                     anti-spam (SpamAssassin), Linux,
http://www.pathname.com/~quinlan/    and open source consulting

Re: AHBL: promote or delete?

Posted by Daniel Quinlan <qu...@pathname.com>.
Daniel Quinlan wrote:

>> I don't think it's CPU time that's the issue...

Kelsey Cummings <kg...@sonic.net> writes:

> Has the box been sluggish?
>
> I can probably prop something up underneath it if you need more juice.

*grin*

I meant that it's people time.  Thanks, though.  :-)

Daniel

Re: AHBL: promote or delete?

Posted by Kelsey Cummings <kg...@sonic.net>.
On Mon, Apr 26, 2004 at 03:52:48PM -0700, Daniel Quinlan wrote:
> Justin Mason <jm...@jmason.org> writes:
> 
> > go for it!  bugzilla.spamassassin.org is crying out for new uses for
> > CPU time ;)
> 
> I don't think it's CPU time that's the issue...

Has the box been sluggish?

I can probably prop something up underneath it if you need more juice.

-- 
Kelsey Cummings - kgc@sonic.net           sonic.net, inc.
System Administrator                      2260 Apollo Way
707.522.1000 (Voice)                      Santa Rosa, CA 95407
707.547.2199 (Fax)                        http://www.sonic.net/
Fingerprint = D5F9 667F 5D32 7347 0B79  8DB7 2B42 86B6 4E2C 3896

Re: AHBL: promote or delete?

Posted by Daniel Quinlan <qu...@pathname.com>.
Justin Mason <jm...@jmason.org> writes:

> go for it!  bugzilla.spamassassin.org is crying out for new uses for
> CPU time ;)

I don't think it's CPU time that's the issue...

Daniel
