Posted to dev@spamassassin.apache.org by "Daryl C. W. O'Shea" <sp...@dostech.ca> on 2007/09/06 04:16:21 UTC

net mass-checks triggering (URI)DNSBL provider blocks?

Random thoughts on frequent re-scoring mass-checks...

If we do more frequent --net mass-checks we may individually run the 
chance of being blocked by the providers of the (URI)DNSBLs such as 
Spamhaus.

Has anyone been blocked to date?  Probably not given the once a week 
frequency.

Are the hit-rates of the lists high enough that the queries which aren't 
cached by the use of --reuse are still few enough to fall under the 
block-triggering level?  Either way, I guess we should get around to 
figuring out a way of caching the non-hits.  I'm thinking of a method 
that assumes you ran the rules (based on the SA version in the message 
header) unless you've specifically told it you don't run a particular rule.
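
That assumption could be sketched roughly like this (a toy model of the 
idea only; the function, the version tuples and the opt-out list are all 
hypothetical, not existing mass-check code):

```python
def assume_rule_ran(rule, scanned_with_version, rule_first_shipped_in,
                    locally_disabled_rules):
    """Toy model of the proposed non-hit cache: treat a net rule as
    already run against a message if the SA version recorded in the
    message header already shipped that rule, unless the corpus owner
    has declared that they don't run it."""
    if rule in locally_disabled_rules:
        return False
    # Versions compared as (major, minor, micro) tuples, e.g. (3, 2, 3).
    return scanned_with_version >= rule_first_shipped_in[rule]
```

So a message whose header says it was scanned with 3.2.3 would count as 
having run any rule that shipped in 3.1.0 or earlier, unless that rule 
is on the owner's "I don't run this" list.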

Should we look at getting zone transfers from the various providers and 
hosting a copy on the zone that committers could use?


Daryl



Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by Theo Van Dinter <fe...@apache.org>.
On Thu, Sep 06, 2007 at 08:29:24AM -0400, Matt Kettler wrote:
> I don't think the intent was to allow <the_world>, merely <the_committers>.

If the idea is to help people doing the weekly/net runs, then it's not
<the_committers>, it's <anyone_who_does_the_weekly/net_runs>.

And that list is an open-ended list of anyone who asks to do it.  Most of whom
probably don't have static IPs, which means either keeping the ACL updated
limiting access, or opening to <the_world> or some subset thereof.

-- 
Randomly Selected Tagline:
"4. Alan Greenspan thanks you for ending the recession."
         - Top Ten Clues You Have Been Spending Too Much Time Shopping On-line

Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by Matt Kettler <mk...@verizon.net>.
Theo Van Dinter wrote:
>> If so, it would be really easy to add a forwarding zone to forward all
>> queries for a particular domain to the zone machine.
>
> Yes.  Of course it's possible to forward the requests to the zone machine.
>
> But that's not really a solution.  The ASF folks already have problems
> with us using so many resources, they're not going to be happy with
> <the_world> using the machine for DNS. 
I don't think the intent was to allow <the_world>, merely <the_committers>.

Going back to the top of the thread:

"Should we look at getting zone transfers from the various providers and
hosting a copy on the zone that committers could use? "

>  From a sysadmin perspective,
> that's horrible.  I also don't think all these places would let us offer
> open access to their zone data.  
I don't think that was the intent, and I agree they wouldn't allow that.

Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by Theo Van Dinter <fe...@apache.org>.
On Thu, Sep 06, 2007 at 12:26:03AM -0400, Matt Kettler wrote:
> > That's great if we use the zone machine for DNS, that doesn't really work for
> > individuals running on our own machines...  ;)
> 
> Do you run a simple caching named on your machine?

No, I run a full multi-domain named on my machine.  But it does caching. :)

>  If so, it would be really easy to add a forwarding zone to forward all
> queries for a particular domain to the zone machine.

Yes.  Of course it's possible to forward the requests to the zone machine.

But that's not really a solution.  The ASF folks already have problems
with us using so many resources, they're not going to be happy with
<the_world> using the machine for DNS.  From a sysadmin perspective,
that's horrible.  I also don't think all these places would let us offer 
open access to their zone data.  So I'd expect us to limit the usage to
only the local machine.

> But I think --reuse should suffice. However, we should be on the lookout
> for the fact that spamhaus is auto-detecting and auto-blacklisting sites
> making lots of queries. That could dramatically change the scoring of
> the rules.

Perhaps we should talk to these services and figure out a way to make it all
work?

> Which also brings up a second issue. Should we disable Spamhaus by
> default as we've done in the past for razor and DCC? They're no longer
> "free for everyone", and actually even reasonably small networks can't
> use them for free (100 user limit).

Sounds reasonable to me.

-- 
Randomly Selected Tagline:
"Windows 98 -- Go for the bloat!"      - Theo Van Dinter

Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
Matt Kettler wrote:

> Which also brings up a second issue. Should we disable Spamhaus by
> default as we've done in the past for razor and DCC? They're no longer
> "free for everyone", and actually even reasonably small networks can't
> use them for free (100 user limit).

This has been on my mind for a while.  I'd hate to lose accurate scoring 
on a set of rules that hits such a large proportion of mail.

Also on my mind... a documentation project for somebody.  To help to 
prevent abuse of the public (URI)DNSBL servers, we should maintain a 
user friendly list of all the network resources SA uses along with rsync 
access contact info.


Daryl


Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by Matt Kettler <mk...@verizon.net>.
Theo Van Dinter wrote:
>
> That's great if we use the zone machine for DNS, that doesn't really work for
> individuals running on our own machines...  ;)
>   

Do you run a simple caching named on your machine?

If so, it would be really easy to add a forwarding zone to forward all
queries for a particular domain to the zone machine.

A quick named.conf example would be something like this:

options {
    forwarders { <INSERT ISP DNS SERVERS HERE>; };
    forward only;
};

zone "example.com" IN {
    type forward;
    forward only;
    forwarders { <INSERT ZONE MACHINE HERE>; };
};

But I think --reuse should suffice. However, we should be on the lookout
for the fact that spamhaus is auto-detecting and auto-blacklisting sites
making lots of queries. That could dramatically change the scoring of
the rules.


Which also brings up a second issue. Should we disable Spamhaus by
default as we've done in the past for razor and DCC? They're no longer
"free for everyone", and actually even reasonably small networks can't
use them for free (100 user limit).





Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
Theo Van Dinter wrote:
> On Wed, Sep 05, 2007 at 10:16:21PM -0400, Daryl C. W. O'Shea wrote:
>> If we do more frequent --net mass-checks we may individually run the 
>> chance of being blocked by the providers of the (URI)DNSBLs such as 
>> Spamhaus.
>>
>> Has anyone been blocked to date?  Probably not given the once a week 
>> frequency.
> 
> If done correctly, this isn't an issue.  This is another benefit of
> --reuse.  :)
> 
>> Are the hit-rates of the lists high enough that the results that aren't 
>> cached by the use of --reuse low enough to fall under the block 
>> triggering level?  Either way, I guess we should get around to figuring 
> 
> You want as much as possible to be able to use --reuse.
> 
>> out a way of caching the non-hits.  I'm thinking of a method that 
> 
> It does this now, doesn't it?  IIRC, --reuse says that if there is an X-Spam-Status
> header, it's assumed all the net rules were run and so they're not run again.

Well, sort of... (as below) new rules can't be distinguished from a 
no-hit or a never-tried rule.  Not sure why I was thinking a rule counted 
as run if there was no indication of it hitting before.

>> assumes you ran the rules (based on the SA version in the message 
>> header) unless you've specifically told it you don't run a particular rule.
> 
> I started working on, but never fully implemented, the NetCache plugin.
> The idea is that all network requests and responses (or lack thereof)
> would be stored as a header in the message.  Then on the mass-check run, that
> data would be used for responses.  This way, even some new rules could use
> this information depending on what they're looking for...

I had remembered you wanting to do this and had forgotten all about the 
NetCache plugin.

>> Should we look at getting zone transfers from the various providers and 
>> hosting a copy on the zone that committers could use?
> 
> That's great if we use the zone machine for DNS, that doesn't really work for
> individuals running on our own machines...  ;)

Well of course.  You'd have to forward those zones in your local caching 
server (like anyone else using rbldnsd), or transfer/rsync the zones to 
your own machine for it to be of any use.  Pretty much a non-issue 
though given that --reuse doesn't allow the queries like I was thinking.


Daryl


Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by Theo Van Dinter <fe...@apache.org>.
On Wed, Sep 05, 2007 at 10:16:21PM -0400, Daryl C. W. O'Shea wrote:
> If we do more frequent --net mass-checks we may individually run the 
> chance of being blocked by the providers of the (URI)DNSBLs such as 
> Spamhaus.
> 
> Has anyone been blocked to date?  Probably not given the once a week 
> frequency.

If done correctly, this isn't an issue.  This is another benefit of
--reuse.  :)

> Are the hit-rates of the lists high enough that the results that aren't 
> cached by the use of --reuse low enough to fall under the block 
> triggering level?  Either way, I guess we should get around to figuring 

You want as much as possible to be able to use --reuse.

> out a way of caching the non-hits.  I'm thinking of a method that 

It does this now, doesn't it?  IIRC, --reuse says that if there is an X-Spam-Status
header, it's assumed all the net rules were run and so they're not run again.

> assumes you ran the rules (based on the SA version in the message 
> header) unless you've specifically told it you don't run a particular rule.

I started working on, but never fully implemented, the NetCache plugin.
The idea is that all network requests and responses (or lack thereof)
would be stored as a header in the message.  Then on the mass-check run, that
data would be used for responses.  This way, even some new rules could use
this information depending on what they're looking for...
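
A rough sketch of what such a cache header could look like (the format 
below is purely illustrative; the real, unfinished NetCache plugin may 
store things quite differently):

```python
def encode_netcache(results):
    """Flatten DNS lookup results into one header-friendly string.
    A None answer records a query that was made but got no hit, so
    negative results get cached alongside positive ones."""
    return "; ".join(f"{query}={answer if answer is not None else '-'}"
                     for query, answer in sorted(results.items()))

def decode_netcache(header_value):
    """Rebuild the lookup table from the header value; '-' entries come
    back as None, i.e. a cached 'queried, no answer' result."""
    cached = {}
    for entry in header_value.split("; "):
        query, _, answer = entry.partition("=")
        cached[query] = None if answer == "-" else answer
    return cached
```

On a later mass-check run a rule would consult the decoded table instead 
of issuing a live query; only a query absent from the table would still 
need the network.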

> Should we look at getting zone transfers from the various providers and 
> hosting a copy on the zone that committers could use?

That's great if we use the zone machine for DNS, that doesn't really work for
individuals running on our own machines...  ;)

-- 
Randomly Selected Tagline:
"A leader leads from in front, by the power of example. A ruler pushes
 from behind, by means of the club, the whip, the power of fear."
         - Edward Abbey

Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
Michael Parker wrote:
> Daryl C. W. O'Shea wrote:
>> Michael Parker wrote:
>>
>>> Maybe we should add a --force-reuse that would ignore any msgs that
>>> can't be reused.
>> I'm thinking that should be the only option for reuse.
>>
> 
> This is how it originally worked, but a large portion of the spam traps
> had never been run through SA so it wasn't all that feasible.  So I
> added in a hack that would swap in and out the regular/reuse configs
> depending on if the X-Spam-Status header exists.

Ah, I didn't realize it enabled the checks if no X-Spam-Status header 
was present.  This is probably what I was recalling when I started this 
whole thread.

> Are folks now running all their spamtrap mail through SA?  Maybe we
> should remove it.

I am, but I don't think jm is (I know he was dealing with load issues on 
his box).  Not sure about anyone else.


Daryl


Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by Michael Parker <pa...@pobox.com>.
Daryl C. W. O'Shea wrote:
> Michael Parker wrote:
> 
>> Maybe we should add a --force-reuse that would ignore any msgs that
>> can't be reused.
> 
> I'm thinking that should be the only option for reuse.
> 

This is how it originally worked, but a large portion of the spam traps
had never been run through SA so it wasn't all that feasible.  So I
added in a hack that would swap in and out the regular/reuse configs
depending on whether the X-Spam-Status header exists.

Are folks now running all their spamtrap mail through SA?  Maybe we
should remove it.

Michael

Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
Michael Parker wrote:
> Daryl C. W. O'Shea wrote:

>> Are the hit-rates of the lists high enough that the results that aren't
>> cached by the use of --reuse low enough to fall under the block
>> triggering level?  Either way, I guess we should get around to figuring
>> out a way of caching the non-hits.  I'm thinking of a method that
>> assumes you ran the rules (based on the SA version in the message
>> header) unless you've specifically told it you don't run a particular rule.

> --reuse should take care of this.  Everyone should save their X-Spam-*
> headers in their corpus msgs.  Reuse sets the rule score to zero, so for
> msgs that didn't hit a rule but still have their X-Spam-Status header
> present we shouldn't be doing any sort of lookup.

Ah, that's right, thanks.  For some reason I was thinking that any 
message that didn't previously have a hit recorded would have the tests run.

> Maybe we should add a --force-reuse that would ignore any msgs that
> can't be reused.

I'm thinking that should be the only option for reuse.


Daryl



Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by Michael Parker <pa...@pobox.com>.
Daryl C. W. O'Shea wrote:
> Random thoughts on frequent re-scoring mass-checks...
> 
> If we do more frequent --net mass-checks we may individually run the
> chance of being blocked by the providers of the (URI)DNSBLs such as
> Spamhaus.
> 
> Has anyone been blocked to date?  Probably not given the once a week
> frequency.
> 
> Are the hit-rates of the lists high enough that the results that aren't
> cached by the use of --reuse low enough to fall under the block
> triggering level?  Either way, I guess we should get around to figuring
> out a way of caching the non-hits.  I'm thinking of a method that
> assumes you ran the rules (based on the SA version in the message
> header) unless you've specifically told it you don't run a particular rule.
> 
> Should we look at getting zone transfers from the various providers and
> hosting a copy on the zone that committers could use?
> 

--reuse should take care of this.  Everyone should save their X-Spam-*
headers in their corpus msgs.  Reuse sets the rule score to zero, so for
msgs that didn't hit a rule but still have their X-Spam-Status header
present we shouldn't be doing any sort of lookup.
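
In other words, the per-message decision --reuse makes might be sketched 
like this (a simplified model with an illustrative header format, not 
mass-check's actual code):

```python
def plan_net_checks(headers, net_rules):
    """Simplified model of mass-check --reuse: with an X-Spam-Status
    header present, rules listed in its tests= field count as hits and
    every other net rule as a cached non-hit (score contribution zero,
    no lookup); without the header, every net rule must run live."""
    status = headers.get("X-Spam-Status")
    if status is None:
        return {rule: "run" for rule in net_rules}
    hits = set()
    for token in status.split():
        if token.startswith("tests="):
            hits = set(filter(None, token[len("tests="):].split(",")))
    return {rule: ("hit" if rule in hits else "no-hit") for rule in net_rules}
```

For example, a message carrying "X-Spam-Status: Yes, score=12.1 
tests=RCVD_IN_SBL,URIBL_BLACK" would reuse RCVD_IN_SBL as a hit and 
treat any other net rule as a cached non-hit, with no DNS traffic.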

Maybe we should add a --force-reuse that would ignore any msgs that
can't be reused.

Michael

Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by Jeff Chan <je...@surbl.org>.
Quoting Matthias Leisi <ma...@leisi.net>:

> > I couldn't say for sure what a safe level of queries would be. If I were
> > to guess, I'd say somewhere under 20,000 per 24 hour period.
>
> Speaking for dnswl.org, we consider sites[*] with > 100k queries per 24
> hour as "heavy users" who should rsync our data and run a local mirror.

This pretty much agrees with SURBL policies.  However, IMO blacklist operators
should probably make exceptions for open source anti-spam research purposes
such as the SA mass checks and allow rsync access for your test servers' local
DNS servers.  Spamhaus ought to see the value in that too.

Cheers,

Jeff C.

P.S. SURBL is starting to ask for donations from large ISPs and anti-spam
vendors.   We intend to use that to offset some of our time and expenses while
continuing to provide free DNS access for smaller users, i.e., something like
the previous Spamhaus model.


Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by Matthias Leisi <ma...@leisi.net>.
> I couldn't say for sure what a safe level of queries would be. If I were
> to guess, I'd say somewhere under 20,000 per 24 hour period.

Speaking for dnswl.org, we consider sites[*] with > 100k queries per 24
hours as "heavy users" who should rsync our data and run a local mirror.
20k will only show on the third or fourth page of the report, which we
hardly ever look at.

A local mirror for net mass-checks may make sense for speed and
reliability reasons; rsync access for dnswl.org is free. Since our data is
rather small (at least when compared to a typical blocklist), it can also
be handled as a BIND zone file (i.e. no rbldnsd is required, which makes
handling local zones a bit easier).

[*] We consider both individual querying IP addresses and the sum from
queries from /24s.
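
The accounting described in that footnote can be sketched as follows (a 
toy IPv4-only illustration of the stated policy, not dnswl.org's actual 
tooling):

```python
from collections import Counter

def heavy_users(query_source_ips, threshold=100_000):
    """Count queries per individual IPv4 address and per /24 network,
    flagging any source over the threshold (dnswl.org's stated
    heavy-user line being >100k queries per 24 hours)."""
    per_ip = Counter(query_source_ips)
    # Collapse each address to its /24 by dropping the final octet.
    per_net = Counter(ip.rsplit(".", 1)[0] + ".0/24" for ip in query_source_ips)
    flagged_ips = {ip for ip, n in per_ip.items() if n > threshold}
    flagged_nets = {net for net, n in per_net.items() if n > threshold}
    return flagged_ips, flagged_nets
```

So a farm of mass-check boxes in one /24 can cross the line together 
even if no single machine does, which is worth keeping in mind for 
shared test setups.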

-- Matthias



Re: net mass-checks triggering (URI)DNSBL provider blocks?

Posted by Duane Hill <d....@yournetplus.com>.
On Wed, 5 Sep 2007 at 22:16 -0400, spamassassin@dostech.ca confabulated:

> Has anyone been blocked to date?  Probably not given the once a week 
> frequency.

Not speaking of mass-checks, I'm guessing it's a pretty high ratio.  Our 
servers are blocked.  The average number of 5xx rejections in a given 24 
hour period was around 1.75 to 2 million at the MTA level.

We are at present looking into the fee for the zone transfer.

I couldn't say for sure what a safe level of queries would be. If I were 
to guess, I'd say somewhere under 20,000 per 24 hour period.

According to their web site, the blacklists are free for light traffic. 
They consider light traffic to be a server filtering mail for under 100 
mailboxes.

------
   _|_
  (_| |