Posted to users@spamassassin.apache.org by da...@chaosreigns.com on 2011/10/18 00:03:14 UTC

DNSWL.org enforcement of free usage limits

http://www.dnswl.org/news/archives/24-Abusive-use-of-dnswl.org-infrastructure-enforcing-limits.html

This came up in the "Spam email many have RCVD_IN_DNSWL_MED" thread.
DNSWL.org made an announcement about it with more details.  

Basically, free use only allows 100,000 queries per organization per day.
Go over that enough, and you may get "RCVD_IN_DNSWL_HI" hitting all your
email.

If you're handling more than 100,000 emails a day, and don't want to pay
for dnswl.org data, add to your spamassassin config:

score RCVD_IN_DNSWL_HI 0
score RCVD_IN_DNSWL_MED 0
score RCVD_IN_DNSWL_LOW 0
score RCVD_IN_DNSWL_NONE 0

Disclaimer:  I'm a dnswl.org admin.


More discussion of network test free usage limits here:
http://www.spamtips.org/2011/01/usage-limits-of-spamassassin-network.html

Yes, it would still be nice if spamassassin had an option to just disable
all of these.  Maybe just commented out options in a config file?
Something like this, based on that last link:

# spamhaus.org
score DKIMDOMAIN_IN_DWL 0 
score DKIMDOMAIN_IN_DWL_UNKNOWN 0 
score RCVD_IN_CSS 0 
score RCVD_IN_PBL 0 
score RCVD_IN_SBL 0 
score RCVD_IN_XBL 0 
score URIBL_DBL_ERROR 0 
score URIBL_DBL_SPAM 0 
score URIBL_SBL 0 

# Others
score RCVD_IN_PSBL 0
score RCVD_IN_BL_SPAMCOP_NET 0
score RCVD_IN_BRBL_LASTEXT 0
score DNS_FROM_AHBL_RHSBL 0

# Sorbs.net
score RCVD_IN_SORBS_HTTP    0
score RCVD_IN_SORBS_SOCKS   0
score RCVD_IN_SORBS_MISC    0
score RCVD_IN_SORBS_SMTP    0
score RCVD_IN_SORBS_WEB     0
score RCVD_IN_SORBS_BLOCK   0
score RCVD_IN_SORBS_ZOMBIE  0
score RCVD_IN_SORBS_DUL     0

# NJABL.org
score RCVD_IN_NJABL_RELAY 0
score RCVD_IN_NJABL_SPAM 0
score RCVD_IN_NJABL_MULTI 0
score RCVD_IN_NJABL_CGI 0
score RCVD_IN_NJABL_PROXY 0

# rfc-ignorant.org
score DNS_FROM_RFC_DSN 0
score DNS_FROM_RFC_BOGUSMX 0

# DNSWL.org
score RCVD_IN_DNSWL_HI 0
score RCVD_IN_DNSWL_MED 0
score RCVD_IN_DNSWL_LOW 0
score RCVD_IN_DNSWL_NONE 0

# ReturnPath.net
score RCVD_IN_RP_CERTIFIED 0
score RCVD_IN_RP_RNBL 0
score RCVD_IN_RP_SAFE 0

# SuretyMail / isipp.com
score RCVD_IN_IADB_VOUCHED 0
score RCVD_IN_IADB_DK 0
score RCVD_IN_IADB_DOPTIN 0
score RCVD_IN_IADB_DOPTIN_GT50 0
score RCVD_IN_IADB_DOPTIN_LT50 0
score RCVD_IN_IADB_EDDB 0
score RCVD_IN_IADB_EPIA 0
score RCVD_IN_IADB_GOODMAIL 0
score RCVD_IN_IADB_LISTED 0
score RCVD_IN_IADB_LOOSE 0
score RCVD_IN_IADB_MI_CPEAR 0
score RCVD_IN_IADB_MI_CPR_30 0
score RCVD_IN_IADB_MI_CPR_MAT 0
score RCVD_IN_IADB_ML_DOPTIN 0
score RCVD_IN_IADB_NOCONTROL 0
score RCVD_IN_IADB_OOO 0
score RCVD_IN_IADB_OPTIN 0
score RCVD_IN_IADB_OPTIN_GT50 0
score RCVD_IN_IADB_OPTIN_LT50 0
score RCVD_IN_IADB_OPTOUTONLY 0
score RCVD_IN_IADB_RDNS 0
score RCVD_IN_IADB_SENDERID 0
score RCVD_IN_IADB_SPF 0
score RCVD_IN_IADB_UNVERIFIED_1 0
score RCVD_IN_IADB_UNVERIFIED_2 0
score RCVD_IN_IADB_UT_CPEAR 0
score RCVD_IN_IADB_UT_CPR_30 0
score RCVD_IN_IADB_UT_CPR_MAT 0

# SURBL.org
score URIBL_SC_SURBL 0
score URIBL_WS_SURBL 0
score URIBL_PH_SURBL 0
score URIBL_OB_SURBL 0
score URIBL_AB_SURBL 0
score URIBL_JP_SURBL 0

# DCC
score DCC_CHECK 0
score DCC_REPUT_00_12 0
score DCC_REPUT_70_89 0
score DCC_REPUT_90_94 0
score DCC_REPUT_95_98 0
score DCC_REPUT_99_100 0

-- 
"The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself.  Therefore all progress
depends on the unreasonable man." - George Bernard Shaw
http://www.ChaosReigns.com

Re: DNSWL.org enforcement of free usage limits

Posted by Benny Pedersen <me...@junc.org>.
On Tue, 18 Oct 2011 21:55:11 -0400, David F. Skoll wrote:
> X-CanIt-Geo: No geolocation information available for 192.168.10.23

bill me for that one :-)

> My original measurements and script are here:
> 
> http://article.gmane.org/gmane.mail.spam.spamassassin.general/132047/match=cache

BIND can log to syslog, so it's possible to have Perl parse the logs in
real time, or simply use rndc querylog. There is a lot of DNS logging
available; it is disabled by default.
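
As a rough illustration of the log-parsing idea, here is a minimal Python sketch (not the script linked above) that counts logged queries per list zone. The sample lines are hypothetical and only approximate the BIND query log layout, which carries timestamps and differs between versions; the function name and zone list are made up for this example. Note that the query log records what clients ask the local resolver, cache hits included, so the count is an upper bound on what reaches the list operator.

```python
import re
from collections import Counter

# Matches the "query: <name> IN <type>" part of a BIND query log line.
QUERY_RE = re.compile(r"query: (\S+) IN (\S+)")

def count_zone_queries(lines, zones):
    """Count logged queries whose name falls under each zone in `zones`."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.search(line)
        if not match:
            continue  # not a query log line
        name = match.group(1).rstrip(".").lower()
        for zone in zones:
            if name == zone or name.endswith("." + zone):
                counts[zone] += 1
                break
    return counts

# Hypothetical sample lines (assumption: real logs have more fields).
sample = [
    "client 127.0.0.1#40001 (4.3.2.1.list.dnswl.org): query: 4.3.2.1.list.dnswl.org IN A + (127.0.0.1)",
    "client 127.0.0.1#40002 (8.7.6.5.list.dnswl.org): query: 8.7.6.5.list.dnswl.org IN A + (127.0.0.1)",
    "client 127.0.0.1#40003 (4.3.2.1.zen.spamhaus.org): query: 4.3.2.1.zen.spamhaus.org IN A + (127.0.0.1)",
    "client 127.0.0.1#40004 (www.example.com): query: www.example.com IN AAAA + (127.0.0.1)",
]

counts = count_zone_queries(sample, ["list.dnswl.org", "zen.spamhaus.org"])
for zone, n in counts.most_common():
    print(zone, n)
```

In practice the lines would come from the log file or a syslog pipe rather than a list.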

Re: DNSWL.org enforcement of free usage limits

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Tue, 2011-10-18 at 21:55 -0400, David F. Skoll wrote:
> On Wed, 19 Oct 2011 03:12:34 +0200, Karsten Bräckelmann wrote:
> 
> > > That's true, though caching is much less effective than you may
> > > suppose.  In real-life measurements on real mail servers, I found a
> > > very low cache hit rate for common DNS{B,W}Ls, on the order of only
> > > 25-50% hits.
> 
> > As in cache hits? That's quite substantial.
> 
> I didn't think so.  It means that between 50-75% of DNS lookups must
> go all the way to the authoritative name server.

With spam making up more than 90% of the mail volume (according to almost
any published stats), even 25% cache hits means that caching works not
only for ham, but for spam, too.

Anyway, it means that the volume of messages you can handle before hitting
the free usage limit is 25-50% higher than the commonly perceived, and
frequently incorrectly claimed, limit (which assumes that one message
equals one query for IP-based lists). These numbers tell a different
story -- up to half the query limit again in terms of mail.
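
To put numbers on this, here is a small Python sketch of the arithmetic. It assumes a simplified model that is not stated in the thread: exactly one list lookup per message, and a fixed fraction h of lookups answered from the local cache. Under that model a quota of N upstream queries covers N / (1 - h) messages, so the headroom comes out somewhat larger than the hit rate itself. The function name is made up for this example.

```python
def messages_per_quota(query_limit, cache_hit_rate):
    """Messages covered by `query_limit` upstream queries.

    Assumes one lookup per message and that `cache_hit_rate` (0 <= h < 1)
    of all lookups never leave the local caching resolver.
    """
    if not 0 <= cache_hit_rate < 1:
        raise ValueError("cache_hit_rate must be in [0, 1)")
    return query_limit / (1 - cache_hit_rate)

# 100,000 is the free daily limit quoted in the announcement;
# 25% and 50% are the measured hit rates mentioned in this thread.
for rate in (0.0, 0.25, 0.50):
    capacity = messages_per_quota(100_000, rate)
    print(f"{rate:.0%} cache hits: about {capacity:,.0f} messages/day")
```

Real traffic will not match this exactly, since a message can trigger more than one lookup and hit rates differ per list.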


> > Also, is this overall, somehow a mix of both black and white-lists, as
> > well as different types (IP vs URI)?
> 
> My measurements were against IP blacklists.
> 
> > Given the very different TTL for different types of lists, I suspect
> > actual cache hit rates vary a lot.
> 
> Not without pretty high TTLs, in our experience.  And DNSBL operators

I was talking about different *types*. As in IP vs URI. Where TTLs do
vary a lot -- 3 minutes for SURBL, 12 hours for DNSWL.

> have two motivations for having relatively low TTLs: One is to make
> sure the data is fresh, and two is to detect high-volume users so they
> can be billed.


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: DNSWL.org enforcement of free usage limits

Posted by "David F. Skoll" <df...@roaringpenguin.com>.
On Wed, 19 Oct 2011 03:12:34 +0200
Karsten Bräckelmann <gu...@rudersport.de> wrote:

> > That's true, though caching is much less effective than you may
> > suppose.  In real-life measurements on real mail servers, I found a
> > very low cache hit rate for common DNS{B,W}Ls, on the order of only
> > 25-50% hits.

> As in cache hits? That's quite substantial.

I didn't think so.  It means that between 50-75% of DNS lookups must
go all the way to the authoritative name server.

> Also, is this overall, somehow a mix of both black and white-lists, as
> well as different types (IP vs URI)?

My measurements were against IP blacklists.

> Given the very different TTL for different types of lists, I suspect
> actual cache hit rates vary a lot.

Not without pretty high TTLs, in our experience.  And DNSBL operators
have two motivations for having relatively low TTLs: One is to make
sure the data is fresh, and two is to detect high-volume users so they
can be billed.

My original measurements and script are here:
http://article.gmane.org/gmane.mail.spam.spamassassin.general/132047/match=cache

Regards,

David.

Re: DNSWL.org enforcement of free usage limits

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Tue, 2011-10-18 at 20:24 -0400, David F. Skoll wrote:
> On Tue, 18 Oct 2011 23:55:41 +0200, Karsten Bräckelmann wrote:
> 
> > The DNS TTL appears to be 12 hours, and a good share of mail
> > (definitely true for ham, only partly for spam) is received from a
> > rather limited number of distinct SMTP servers, only. With a local,
> > caching DNS server the number of mail a system can handle per day
> > before exceeding the free usage limit is *much* higher.
> 
> > number of mail != number of DNS lookups
> 
> That's true, though caching is much less effective than you may
> suppose.  In real-life measurements on real mail servers, I found a
> very low cache hit rate for common DNS{B,W}Ls, on the order of only
> 25-50% hits.

As in cache hits? That's quite substantial.

Also, is this overall, somehow a mix of both black and white-lists, as
well as different types (IP vs URI)? Given the very different TTL for
different types of lists, I suspect actual cache hit rates vary a lot.
Your users and their peers can make a huge difference, too.

And of course other related filtering, like blocking at SMTP level.
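
A toy Python simulation can show how both the TTL and the mix of senders drive the hit rate. Everything here is an assumption for illustration: the traffic pattern is invented, the addresses are documentation ranges, and the cache treats every answer alike (a real resolver caches negative answers under a separate negative TTL). The 12 hour and 3 minute values are the TTLs mentioned in this thread.

```python
def upstream_queries(events, ttl_seconds):
    """Count lookups that miss a simple TTL cache.

    `events` is an iterable of (timestamp_seconds, client_ip) pairs in
    time order; each pair stands for one message and one list lookup.
    """
    expires = {}  # client_ip -> time at which the cached answer expires
    misses = 0
    for now, ip in events:
        if expires.get(ip, float("-inf")) <= now:
            misses += 1                      # must ask upstream
            expires[ip] = now + ttl_seconds  # cache the fresh answer
    return misses

# Invented traffic: one busy relay sending every minute for an hour,
# plus 60 senders that each show up exactly once.
busy = [(t, "192.0.2.1") for t in range(0, 3600, 60)]
one_off = [(t + 30, f"198.51.100.{t // 60}") for t in range(0, 3600, 60)]
events = sorted(busy + one_off)

for label, ttl in (("12 hours", 12 * 3600), ("3 minutes", 180)):
    misses = upstream_queries(events, ttl)
    hit_rate = 1 - misses / len(events)
    print(f"TTL {label}: {misses} upstream queries, {hit_rate:.0%} hits")
```

With this made-up traffic the one-off senders always miss, so even a long TTL cannot push the hit rate much past the share of mail coming from repeat senders.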




Re: DNSWL.org enforcement of free usage limits

Posted by "David F. Skoll" <df...@roaringpenguin.com>.
On Tue, 18 Oct 2011 23:55:41 +0200
Karsten Bräckelmann <gu...@rudersport.de> wrote:

> The DNS TTL appears to be 12 hours, and a good share of mail
> (definitely true for ham, only partly for spam) is received from a
> rather limited number of distinct SMTP servers, only. With a local,
> caching DNS server the number of mail a system can handle per day
> before exceeding the free usage limit is *much* higher.

> number of mail != number of DNS lookups

That's true, though caching is much less effective than you may
suppose.  In real-life measurements on real mail servers, I found a
very low cache hit rate for common DNS{B,W}Ls, on the order of only
25-50% hits.

Regards,

David.

Re: DNSWL.org enforcement of free usage limits

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Tue, 2011-10-18 at 23:55 +0200, Karsten Bräckelmann wrote:
> The DNS TTL appears to be 12 hours, and a good share of mail (definitely
> true for ham, only partly for spam) is received from a rather limited
> number of distinct SMTP servers, only. With a local, caching DNS server
> the number of mail a system can handle per day before exceeding the free
> usage limit is *much* higher.

Oops -- higher, though not that much higher. Unless your local, caching
DNS also does negative caching...




Re: DNSWL.org enforcement of free usage limits

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Tue, 2011-10-18 at 23:55 +0200, Karsten Bräckelmann wrote:
> > Basically, free use only allows 100,000 queries per organization per day.
> > If you're handling more than 100,000 emails a day,
> 
> That's a theoretical lower bound, and incorrect in real life.
> 
> The DNS TTL appears to be 12 hours, and a good share of mail (definitely
> true for ham, only partly for spam) is received from a rather limited
> number of distinct SMTP servers, only. With a local, caching DNS server
> the number of mail a system can handle per day before exceeding the free
> usage limit is *much* higher.
> 
> number of mail != number of DNS lookups

... at the dnswl.org DNS mirror infrastructure I mean, obviously.




Re: DNSWL.org enforcement of free usage limits

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Mon, 2011-10-17 at 18:03 -0400, darxus@chaosreigns.com wrote:
> http://www.dnswl.org/news/archives/24-Abusive-use-of-dnswl.org-infrastructure-enforcing-limits.html

> Basically, free use only allows 100,000 queries per organization per day.

> If you're handling more than 100,000 emails a day,

That's a theoretical lower bound, and incorrect in real life.

The DNS TTL appears to be 12 hours, and a good share of mail (definitely
true for ham, only partly for spam) is received from a rather limited
number of distinct SMTP servers, only. With a local, caching DNS server
the number of mail a system can handle per day before exceeding the free
usage limit is *much* higher.

number of mail != number of DNS lookups


> and don't want to pay
> for dnswl.org data, add to your spamassassin config:
> 
> score RCVD_IN_DNSWL_HI 0
> score RCVD_IN_DNSWL_MED 0
> score RCVD_IN_DNSWL_LOW 0
> score RCVD_IN_DNSWL_NONE 0

You missed the eval rule actually doing the DNS lookup...

  meta __RCVD_IN_DNSWL  0

