Posted to users@spamassassin.apache.org by Andy Dills <an...@xecu.net> on 2010/06/11 16:42:31 UTC

Should Spamhaus default to disabled?

After recently upgrading to a new mail cluster with SA 3.3.1, we were 
contacted (at every imaginable POC address) with a solicitation to 
purchase access to utilize the Spamhaus blacklists, or they'll stop 
answering our queries.

We felt the amount of money being asked for was unreasonable, as we felt 
we likely wouldn't see an increase in spam if we turned them off.

So, local.cf got:

score URIBL_DBL_SPAM 0
score URIBL_DBL_ERROR 0
score RCVD_IN_ZEN 0

I think those are the only queries that generate lookups against Spamhaus, 
but I'm not positive.
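
The same local.cf lines can be generated from a list of rule names. This is only a sketch: the three rule names are the ones quoted above, the helper name zero_score_lines is made up for illustration, and the list is not claimed to cover every rule that queries Spamhaus.

```python
# Rule names taken from the post above; NOT claimed to be a complete
# list of the rules that trigger Spamhaus lookups.
RULES = ["URIBL_DBL_SPAM", "URIBL_DBL_ERROR", "RCVD_IN_ZEN"]


def zero_score_lines(rule_names):
    """Return one 'score NAME 0' line per rule, in the order given.

    zero_score_lines is a hypothetical helper, not part of SpamAssassin.
    """
    return [f"score {name} 0" for name in rule_names]


if __name__ == "__main__":
    # Print lines suitable for pasting into local.cf.
    print("\n".join(zero_score_lines(RULES)))
```

Whether zeroing these scores stops every Spamhaus lookup is exactly the open question raised above; the sketch only automates the typing.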

Regardless, we noticed no increase in spam after disabling these tests. 
I imagine there's lots of overlap on the blacklists.

I think the maintainers of SA should strongly consider defaulting Spamhaus 
to "off". At the very least, it should be better documented how to entirely 
disable Spamhaus queries.

They have the right to charge for their data, but I question whether it's 
appropriate for an open-source project to generate sales leads in this 
manner.

Andy

---
Andy Dills
Xecunet, Inc.
www.xecu.net
301-682-9972
---

Re: Should Spamhaus default to disabled?

Posted by Yet Another Ninja <sa...@alexb.ch>.
On 2010-06-11 16:42, Andy Dills wrote:
> After recently upgrading to a new mail cluster with SA 3.3.1, we were 
> contacted (at every imaginable POC address) with a solicitation to 
> purchase access to utilize the Spamhaus blacklists, or they'll stop 
> answering our queries.
> 
> We felt the amount of money being asked for was unreasonable, as we felt 
> we likely wouldn't see an increase in spam if we turned them off.
> 
> So, local.cf got:
> 
> score URIBL_DBL_SPAM 0
> score URIBL_DBL_ERROR 0
> score RCVD_IN_ZEN 0
> 
> I think those are the only queries that generate lookups against Spamhaus, 
> but I'm not positive.
> 
> Regardless, we noticed no increase in spam after disabling these tests. 
> I imagine there's lots of overlap on the blacklists.
> 
> I think the maintainers of SA should strongly consider defaulting Spamhaus 
> to "off". At the very least, it should be better documented how to entirely 
> disable Spamhaus queries.
> 
> They have the right to charge for their data, but I question whether it's 
> appropriate for an open-source project to generate sales leads in this 
> manner.

This horse is very dead... Your traffic generated the sales lead, not SA.





Re: Should Spamhaus default to disabled?

Posted by Joseph Brennan <br...@columbia.edu>.
Andy Dills <an...@xecu.net> wrote:

> We felt the amount of money being asked for was unreasonable, as we felt
> we likely wouldn't see an increase in spam if we turned them off.


We're paying customers of Spamhaus. Their lists account for about 85%
of our spam rejects. I agree it's not cheap, but it's really effective
and very accurate.

But our strategy is to check Spamhaus and SURBL first, and run SA on
what passes those tests. Since those are cheap and fast tests, and
running SA takes more time, we think we win by running SA on only
the remaining 15% of incoming. Even if you are right that SA would
catch pretty much the same messages, we'd need significantly more
hardware to do it only with SA.
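
The ordering described above, cheap tests first and the expensive scanner only on what survives, can be sketched as follows. Everything here is illustrative: filter_stream, the two check callables and the toy message fields are assumptions, and the 85/15 split simply mirrors the figures quoted above.

```python
def filter_stream(messages, cheap_check, expensive_check):
    """Run cheap_check first; run expensive_check only on what passes.

    Both checks are hypothetical callables that return True for spam.
    Returns (rejected_early, rejected_late, accepted, expensive_calls).
    """
    rejected_early, rejected_late, accepted = [], [], []
    expensive_calls = 0
    for msg in messages:
        if cheap_check(msg):
            # Rejected by the cheap test; the expensive scanner never runs.
            rejected_early.append(msg)
            continue
        expensive_calls += 1
        if expensive_check(msg):
            rejected_late.append(msg)
        else:
            accepted.append(msg)
    return rejected_early, rejected_late, accepted, expensive_calls


if __name__ == "__main__":
    # Toy data: 100 messages, 85 of which the cheap check rejects.
    msgs = [{"listed": i < 85, "spammy": 85 <= i < 90} for i in range(100)]
    early, late, ok, calls = filter_stream(
        msgs,
        cheap_check=lambda m: m["listed"],
        expensive_check=lambda m: m["spammy"],
    )
    print(len(early), len(late), len(ok), calls)  # prints: 85 5 10 15
```

The point of the design is the last number: the expensive scanner is invoked 15 times instead of 100.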

I realize this is separate from the question of whether SA should run
Spamhaus tests by default. I just want to make a point about Spamhaus.

Joseph Brennan
Columbia University Information Technology



Re: Should Spamhaus default to disabled?

Posted by Ted Mittelstaedt <te...@ipinc.net>.

On 6/11/2010 8:00 AM, Matus UHLAR - fantomas wrote:
> On 11.06.10 10:42, Andy Dills wrote:
>> After recently upgrading to a new mail cluster with SA 3.3.1, we were
>> contacted (at every imaginable POC address) with a solicitation to
>> purchase access to utilize the Spamhaus blacklists, or they'll stop
>> answering our queries.
>
> You apparently generate too much traffic for them.
>
>> I think the maintainers of SA should strongly consider defaulting Spamhaus
>> to "off". At the very least, it should be better documented how to entirely
>> disable Spamhaus queries.
>
> Their limits accommodate most companies, but not yours. Like any service,
> they have a usage policy that some companies won't fulfill. But that's no
> reason why it should not default to on.
>
>> They have the right to charge for their data, but I question whether it's
>> appropriate for an open-source project to generate sales leads in this
>> manner.
>

Just one thought - on our mailservers SA is only run on mail that
makes it past antivirus scanning, greylisting, and a bunch of other
spam checks.  The majority of spam or junk is peeled off the incoming
mail stream before SA gets it.  I realize this increases CPU processing
of mail but hardware is dirt-cheap these days.  Just a thought.

Ted

Re: Should Spamhaus default to disabled?

Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
On 11.06.10 10:42, Andy Dills wrote:
> After recently upgrading to a new mail cluster with SA 3.3.1, we were 
> contacted (at every imaginable POC address) with a solicitation to 
> purchase access to utilize the Spamhaus blacklists, or they'll stop 
> answering our queries.

You apparently generate too much traffic for them.

> I think the maintainers of SA should strongly consider defaulting Spamhaus 
> to "off". At the very least, it should be better documented how to entirely 
> disable Spamhaus queries.

Their limits accommodate most companies, but not yours. Like any service,
they have a usage policy that some companies won't fulfill. But that's no
reason why it should not default to on.

> They have the right to charge for their data, but I question whether it's 
> appropriate for an open-source project to generate sales leads in this 
> manner.

SA does not generate sales for Spamhaus. It just uses their free service,
which won't be a problem for the majority of SA users.

-- 
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
We are but packets in the Internet of life (userfriendly.org)

Re: Should Spamhaus default to disabled?

Posted by "corpus.defero" <co...@idnet.com>.
> On Fri, 11 Jun 2010 10:42:31 -0400 (EDT)
> Andy Dills <an...@xecu.net> wrote:
> 
> 
>  I think the maintainers of SA should strongly consider defaulting
>  Spamhaus to "off". At the very least, it should be better documented
>  how to entirely disable Spamhaus queries.

I think the maintainers of SA should strongly consider turning the
Return Path accreditation rules off by default, but it ain't gonna
happen ;-)


Re: Should Spamhaus default to disabled?

Posted by RW <rw...@googlemail.com>.
On Fri, 11 Jun 2010 10:42:31 -0400 (EDT)
Andy Dills <an...@xecu.net> wrote:


> I think the maintainers of SA should strongly consider defaulting
> Spamhaus to "off". At the very least, it should be better documented
> how to entirely disable Spamhaus queries.

IMO defaults should be set for the benefit of SOHO users. Professional
admins are paid not to be clueless about the software they administer.


Re: Should Spamhaus default to disabled?

Posted by Benny Pedersen <me...@junc.org>.
On Fri 11 Jun 2010 04:42:31 PM CEST, Andy Dills wrote

> After recently upgrading to a new mail cluster with SA 3.3.1, we were
> contacted (at every imaginable POC address) with a solicitation to
> purchase access to utilize the Spamhaus blacklists, or they'll stop
> answering our queries.

Mail clusters have their own local DNS cache, so a cluster should not
generate that much DNS traffic, unless I have not seen or understood the
real problem. Ask them to extend the TTL for hits so the local DNS cache
can cache more and generate less traffic to their too-few servers :(

A low TTL does not always pay off.

My own DNS hoster always uses 12h. Why? The SOA gets updated on
updates. If Spamhaus can't be cool there, it's their own problem :=)

-- 
xpoint http://www.unicom.com/pw/reply-to-harmful.html


Re: Should Spamhaus default to disabled?

Posted by Nataraj <in...@rjl.com>.
Andy Dills wrote:
> That's fair. Except, we're not a "large organization" by any stretch of 
> the imagination.
>   
Have you checked that you are legitimately exceeding the free query 
limit and that something isn't misconfigured and causing multiple 
lookups of the same IP address when it could be accessed from a properly 
configured caching DNS?  If the SpamAssassin code is working correctly, 
checking blocklists that are part of a larger composite list such as 
Zen should not cause multiple queries. However, checking the Spamhaus 
URIBL "DBL" is a separate query from Zen, so you might be able to access 
either Zen or DBL, but not both, and still stay within the limit.

I would run tcpdump and monitor the actual queries being sent to 
spamhaus to see if it's all working correctly.
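
Once the query names have been pulled out of such a capture, tallying them is straightforward. The sketch below assumes the names are already available as plain strings (how they are extracted from tcpdump output is left open), and count_by_zone is a hypothetical helper. The zone names are supplied by the caller; the ones used in the example are the public Spamhaus zone names as commonly documented, not something stated in this thread.

```python
from collections import Counter


def count_by_zone(query_names, zones):
    """Count queries per blocklist zone and count exact repeats.

    query_names: iterable of DNS query names (strings).
    zones: zone suffixes to look for, e.g. "zen.spamhaus.org".
    Returns (Counter of queries per zone, number of repeated queries).
    """
    per_zone = Counter()
    seen = Counter()
    for name in query_names:
        # Normalize: drop a trailing root dot and ignore case.
        name = name.rstrip(".").lower()
        for zone in zones:
            if name == zone or name.endswith("." + zone):
                per_zone[zone] += 1
                seen[name] += 1
                break
    # Every occurrence beyond the first is a repeat that a cache
    # could in principle have answered.
    repeats = sum(n - 1 for n in seen.values())
    return per_zone, repeats


if __name__ == "__main__":
    names = [
        "2.0.0.127.zen.spamhaus.org.",
        "2.0.0.127.zen.spamhaus.org",
        "example.com.dbl.spamhaus.org",
        "www.example.org",
    ]
    print(count_by_zone(names, ["zen.spamhaus.org", "dbl.spamhaus.org"]))
```

A high repeat count relative to the total would suggest that lookups are bypassing the cache.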

Nataraj


Re: Should Spamhaus default to disabled?

Posted by RW <rw...@googlemail.com>.
On Sat, 12 Jun 2010 18:30:08 -0700
Ted Mittelstaedt <te...@ipinc.net> wrote:

> I can't see as how the CEO of Spamhaus is making out like the
> CEO of your typical public company, so knock it off.
> 
> There is nothing wrong with a for-profit organization running an
> open source division and making sales calls into users of the products
> of that division. 

It's the other way around. Spamhaus is a non-profit organisation run
by volunteers. SpamTEQ is allowed to market Spamhaus's data and, in
return, provides infrastructure to the Spamhaus project.

Spamhaus is not an open-source division of a commercial company, or
any kind of loss-leader marketing ploy. 

Re: Should Spamhaus default to disabled?

Posted by Ted Mittelstaedt <te...@ipinc.net>.

On 6/12/2010 7:09 AM, Andy Dills wrote:
> On Sat, 12 Jun 2010, Yet Another Ninja wrote:
>
>> On 2010-06-12 15:20, Andy Dills wrote:
>>> 300,000 queries per day...per server? per CIDR? What is the delimiter?
>>>
>>> Because there is certainly no single IP generating 300,000 queries per day.
>>
>> That is probably your problem... use a central DNS resolver and your query
>> count will instantly decrease
>>
>> I bet you're querying from:
>>
>> 216.127.136.200 dns02.xecu.net
>> 216.127.136.247 mail-out07.xecu.net
>> 216.127.136.242 mail-out02.xecu.net
>> 216.127.136.246 mail-out06.xecu.net
>> 216.127.136.196 mg6.xecu.net
>> 216.127.136.241 mail-out01.xecu.net
>> 216.127.136.245 mail-out05.xecu.net
>> 216.127.136.243 mail-out03.xecu.net
>> 216.127.136.244 mail-out04.xecu.net
>
> Those and a few others.
>
> That's why I'm asking how the limits are designed. In the past I had
> problems with a certain other blacklist wanting money. We were using a central
> resolver. Their thresholds were based on queries per IP, not network.
>
> Using a central resolver put us over their threshold. Distributing out to
> the individual servers put us under their threshold. I pointed out the
> silliness of this, as it actually increased overall traffic, but they
> weren't interested in my opinion, just my money. I would prefer to just
> rsync the data, resolve it locally and save everybody the hassle. But
> nooooo, that costs even more! Because remember, this isn't about defraying
> costs (reasonable), this is about generating revenue (reasonable, but not
> for a default-enabled option in free software).

Andy, grow up.

While it would be great if every open source/free project out there had 
a sugar daddy, not all do.  I can't speak for either this company you
were snookering or for Spamhaus as to what their cash flow is but 
somebody is paying the bill for a machine, somewhere, in each of those
orgs, and those orgs are doing the best they can to recoup their costs.
I can't see as how the CEO of Spamhaus is making out like the CEO
of your typical public company, so knock it off.

There is nothing wrong with a for-profit organization running an
open source division and making sales calls into users of the products
of that division.  This is a legitimate business model, one that
IMHO gives far more value to the community than some company like
Microsoft, which is almost 100% closed source, and has a long history
of using code and standards developed by the free community when
it suits their purpose.  Microsoft used the BSD TCP/IP networking
stack in their code and never contributed a speck of code back into
the BSD community, nor have they contributed any usable code to any
open source community except that which requires the users to use
their products.  Do you want all software producing organizations
to be like that?

You can simply politely tell the salesperson making the call that
you're not interested and be done with it.  You might also consider
that it costs Spamhaus money to pay the salary of that salesperson
so they have an incentive NOT to contact users that they have a good
idea won't buy their stuff.

> I really just wish the various policies of the pseudo-free blacklists were
> all well-documented, so that sites can evaluate how best to conform, or if
> not, how to disable queries.
>

This is IMHO something that YOU could do, yourself, with a few hours of
time.  You could then contribute this documentation back to the
SpamAssassin maintainers for inclusion into SA.

> But then again, if it's well documented, they don't get a chance to
> generate sales leads!
>

Incorrect, actually it HELPS them, because ANY press at all, good or
bad, is good advertising.

This thread you started as a matter of fact is probably going to
result in a few more sales to Spamhaus.

My guess, Andy, is that if you sent the transcript of this thread
to Xecunet, Inc.'s sales manager, he or she would set you
straight.

Ted

> Andy
>
> ---
> Andy Dills
> Xecunet, Inc.
> www.xecu.net
> 301-682-9972
> ---

Re: Should Spamhaus default to disabled?

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Sat, 2010-06-12 at 10:09 -0400, Andy Dills wrote:
> On Sat, 12 Jun 2010, Yet Another Ninja wrote:

> > > Because there is certainly no single IP generating 300,000 queries per day.
> > 
> > That is probably your problem... use a central DNS resolver and your query
> > count will instantly decrease

> Those and a few others.
> 
> That's why I'm asking how the limits are designed. In the past I had 

You want to ask Spamhaus the question.

Btw, you did not answer my question *what* Spamhaus asked you about for
feedback. We cannot even tell if you're actually giving feedback
(publicly, without directing it to Spamhaus) or just venting opinions.


> problems with a certain other blacklist wanting money. We were using a central 
> resolver. Their thresholds were based on queries per IP, not network.
> 
> Using a central resolver put us over their threshold. Distributing out to 
> the individual servers put us under their threshold. I pointed out the 
> silliness of this, as it actually increased overall traffic, but they 
> weren't interested in my opinion, just my money. I would prefer to just 

Well, Spamhaus uses the term "you". IIRC they are smart about usage, and
identifying users. As opposed to IPs.

Anyway, so you just said, that you deliberately traded off caching, to
fly under the free-usage terms of another service. In order not to pay.
Now this bites, because it generates more queries for another service.
Got to love that irony!


> rsync the data, resolve it locally and save everybody the hassle. But 
> nooooo, that costs even more! Because remember, this isn't about defraying 
> costs (reasonable), this is about generating revenue (reasonable, but not 
> for a default-enabled option in free software).

You are exclusively using free as in free beer here. However, SA also is
free as in speech. You got the source (without paying a dime), and you
are allowed to modify the code. Please do so. We do not guarantee
anything. In particular, we do not guarantee that you can use all
supported features, enabled by default or not, without any further cost.


> I really just wish the various policies of the pseudo-free blacklists were 
> all well-documented, so that sites can evaluate how best to conform, or if 
> not, how to disable queries.

This is open source. Feel like contributing back something to the
project you are using? Like, maybe, some docs how to selectively disable
BLs, once you got your head wrapped around it...


> But then again, if it's well documented, they don't get a chance to 
> generate sales leads!

Spamhaus does not use SA to generate sales. SA does not generate sales
for Spamhaus. Please stop repeating this claim.


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: Should Spamhaus default to disabled?

Posted by Michelle Konzack <li...@tamay-dogan.net>.
Hello Andy Dills,

On 2010-06-12 10:09:03, you typed out the following:
> That's why I'm asking how the limits are designed. In the past I had 
> problems with a certain other blacklist wanting money. We were using a central 
> resolver. Their thresholds were based on queries per IP, not network.
> 
> Using a central resolver put us over their threshold. Distributing out to 
> the individual servers put us under their threshold. I pointed out the 
> silliness of this, as it actually increased overall traffic,

Ehm, I get around 60,000 legitimate messages and around 15 million
spams per day using 8 inbound servers, and I do not exceed the Spamhaus
limit using a central caching DNS...  How can this be?

> but they 
> weren't interested in my opinion, just my money. I would prefer to just 
> rsync the data, resolve it locally and save everybody the hassle. But 
> nooooo, that costs even more! Because remember, this isn't about defraying 
> costs (reasonable), this is about generating revenue (reasonable, but not 
> for a default-enabled option in free software).

Sorry, but you must have a weird setup...

Thanks, Greetings and nice Day/Evening
    Michelle Konzack

-- 
##################### Debian GNU/Linux Consultant ######################
   Development of Intranet and Embedded Systems with Debian GNU/Linux

itsystems@tdnet France EURL       itsystems@tdnet UG (limited liability)
Owner Michelle Konzack            Owner Michelle Konzack

Apt. 917 (homeoffice)
50, rue de Soultz                 Kinzigstraße 17
67100 Strasbourg/France           77694 Kehl/Germany
Tel: +33-6-61925193 mobil         Tel: +49-177-9351947 mobil
Tel: +33-9-52705884 fix

<http://www.itsystems.tamay-dogan.net/>  <http://www.flexray4linux.org/>
<http://www.debian.tamay-dogan.net/>         <http://www.can4linux.org/>

Jabber linux4michelle@jabber.ccc.de
ICQ    #328449886

Linux-User #280138 with the Linux Counter, http://counter.li.org/

Re: Should Spamhaus default to disabled?

Posted by Andy Dills <an...@xecu.net>.
On Sat, 12 Jun 2010, Yet Another Ninja wrote:

> On 2010-06-12 15:20, Andy Dills wrote:
> > 300,000 queries per day...per server? per CIDR? What is the delimiter?
> > 
> > Because there is certainly no single IP generating 300,000 queries per day.
> 
> That is probably your problem... use a central DNS resolver and your query
> count will instantly decrease
> 
> I bet you're querying from:
> 
> 216.127.136.200 dns02.xecu.net
> 216.127.136.247 mail-out07.xecu.net
> 216.127.136.242 mail-out02.xecu.net
> 216.127.136.246 mail-out06.xecu.net
> 216.127.136.196 mg6.xecu.net
> 216.127.136.241 mail-out01.xecu.net
> 216.127.136.245 mail-out05.xecu.net
> 216.127.136.243 mail-out03.xecu.net
> 216.127.136.244 mail-out04.xecu.net

Those and a few others.

That's why I'm asking how the limits are designed. In the past I had 
problems with a certain other blacklist wanting money. We were using a central 
resolver. Their thresholds were based on queries per IP, not network.

Using a central resolver put us over their threshold. Distributing out to 
the individual servers put us under their threshold. I pointed out the 
silliness of this, as it actually increased overall traffic, but they 
weren't interested in my opinion, just my money. I would prefer to just 
rsync the data, resolve it locally and save everybody the hassle. But 
nooooo, that costs even more! Because remember, this isn't about defraying 
costs (reasonable), this is about generating revenue (reasonable, but not 
for a default-enabled option in free software).

I really just wish the various policies of the pseudo-free blacklists were 
all well-documented, so that sites can evaluate how best to conform, or if 
not, how to disable queries.

But then again, if it's well documented, they don't get a chance to 
generate sales leads!

Andy

---
Andy Dills
Xecunet, Inc.
www.xecu.net
301-682-9972
---

Re: More large spam....

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Sun, 2010-06-13 at 11:35 -0400, Charles Gregory wrote:
> On Sat, 12 Jun 2010, Karsten Bräckelmann wrote:

> > There are just a very few rules "scanning" non-textual parts of a mail.
> > Large-ish binary attachments don't have much of an impact on
> > performance. Large-ish textual attachments potentially do.
> 
> Now THAT is a curious comment. All the usage guidelines I have ever read 
> implied or outright stated that scanning mails over a certain size was a 
> significant degradation to system performance. Am I confusing the 

Well, a large message internally of course needs more memory and
slightly more time for parsing.

However, most RE rules, which account for the bulk of the load, are
operating on headers and rendered textual parts. They won't be run
against images, zip files, etc.


> guidelines for antivirus programs with those for SA? Would it be 'safe' to 
> run SA on messages with larger attachments? Anyone ever tested this?

Mind trying it yourself? If you're using spamc, just save such a message
and feed it to spamc with an appropriately large -s option. Does it take
significantly longer, or is it just about any other spam?
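
That test can be scripted. The sketch below assumes spamc is on the PATH and a spamd is running to answer it; spamc_command and time_spamc are hypothetical helper names. The only spamc option used is -s, the maximum message size mentioned above.

```python
import subprocess
import time


def spamc_command(max_size_bytes):
    """Build the spamc argument list with the -s (max message size) option."""
    return ["spamc", "-s", str(max_size_bytes)]


def time_spamc(message_path, max_size_bytes):
    """Feed one saved message to spamc on stdin and return elapsed seconds.

    Assumes spamc is installed and spamd is reachable; the scanned
    output is discarded because only the timing is of interest here.
    """
    with open(message_path, "rb") as fh:
        data = fh.read()
    start = time.monotonic()
    subprocess.run(
        spamc_command(max_size_bytes),
        input=data,
        stdout=subprocess.DEVNULL,
        check=False,
    )
    return time.monotonic() - start
```

Running time_spamc on one large saved message and one ordinary message, several times each, gives a rough comparison; a single run says little because of caching and system load.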

Also, do that test with ham. This is important, since, as you said, you
are getting less than one of these per week as spam. How many hams that
size do you get?


As a general thought -- though I believe I stated this before -- how
many messages are affected anyway? Both ham and spam. How many messages
larger than 500k and, say, less than 1M do you get in total? In percent
of your mail stream? Are you really afraid your system cannot cope with
a handful of larger mails per week?

Or, to put it in other words: Even if processing such a mail does take
twice or three times as long burning your CPU, at the end of the week,
would you even notice the increased load?




Re: More large spam....

Posted by Charles Gregory <cg...@hwcn.org>.
On Sat, 12 Jun 2010, Karsten Bräckelmann wrote:
> Please do not hijack a thread. Please do not hit Reply, if you do not
> intend to reply and contribute to that thread. Removing all quoted text
> and changing the Subject does *not* make it a new thread or post.
> (Hint: In-Reply-To and References headers.)

(grumble grumble) Stupid mail programs.... (grumble grumble)
Yeah.... okay. Not so stupid. I'll comply....

Footnote: and I was refraining from commenting on another thread on how 
people 'complain' about features of SA that don't work in ways that match 
*their* style of thinking.... Oh, the irony.... :)

>> Has there been any progress...
> No changes since this has been asked the last time.

(nod) Alright. So far this is still a less than once a week phenomenon, 
for me personally. I just raise it occasionally to put a data point into 
the archives. If my inquiry had shaken loose a bunch of 'me too' comments, 
it might have led somewhere. But it hasn't, so the issue remains on the 
far back burner.... :)

> There are just a very few rules "scanning" non-textual parts of a mail.
> Large-ish binary attachments don't have much of an impact on
> performance. Large-ish textual attachments potentially do.

Now THAT is a curious comment. All the usage guidelines I have ever read 
implied or outright stated that scanning mails over a certain size was a 
significant degradation to system performance. Am I confusing the 
guidelines for antivirus programs with those for SA? Would it be 'safe' to 
run SA on messages with larger attachments? Anyone ever tested this?

- C

Re: More large spam....

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
Please do not hijack a thread. Please do not hit Reply, if you do not
intend to reply and contribute to that thread. Removing all quoted text
and changing the Subject does *not* make it a new thread or post.

(Hint: In-Reply-To and References headers.)


On Sat, 2010-06-12 at 09:50 -0400, Charles Gregory wrote:
> I got another 1MB spam today.
> 
> I still don't want to kill my system by attempting to scan every large 
> mail that comes in.

How many messages between 500k and 1M do you get per day?

> Has there been any progress on an 'option' to scan only text portions of 
> mail past a certain size limit and/or scan only the first X bytes? The 
> former is preferable because it avoids any issues with incomplete mail, or 
> text sections being last....

No changes since this has been asked the last time. There are features
for this in 3.3, used by Amavis. This is not used by spamc.

There are just a very few rules "scanning" non-textual parts of a mail.
Large-ish binary attachments don't have much of an impact on
performance. Large-ish textual attachments potentially do.




More large spam....

Posted by Charles Gregory <cg...@hwcn.org>.
I got another 1MB spam today.

I still don't want to kill my system by attempting to scan every large 
mail that comes in.

Has there been any progress on an 'option' to scan only text portions of 
mail past a certain size limit and/or scan only the first X bytes? The 
former is preferable because it avoids any issues with incomplete mail, or 
text sections being last....

- Charles

Re: Should Spamhaus default to disabled?

Posted by Yet Another Ninja <sa...@alexb.ch>.
On 2010-06-12 15:20, Andy Dills wrote:
> 300,000 queries per day...per server? per CIDR? What is the delimiter?
> 
> Because there is certainly no single IP generating 300,000 queries per 
> day.

That is probably your problem... use a central DNS resolver and your 
query count will instantly decrease

I bet you're querying from:

216.127.136.200 dns02.xecu.net
216.127.136.247 mail-out07.xecu.net
216.127.136.242 mail-out02.xecu.net
216.127.136.246 mail-out06.xecu.net
216.127.136.196 mg6.xecu.net
216.127.136.241 mail-out01.xecu.net
216.127.136.245 mail-out05.xecu.net
216.127.136.243 mail-out03.xecu.net
216.127.136.244 mail-out04.xecu.net
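
Why a central resolver lowers the query count can be shown with a toy simulation. It assumes each lookup is a (timestamp, name) pair and that answers are cached for a fixed TTL; the function name and the numbers are made up for illustration and say nothing about Spamhaus's actual TTLs or how they count queries.

```python
def upstream_queries(lookups, ttl):
    """Count lookups that miss a TTL cache and therefore go upstream.

    lookups: iterable of (timestamp_seconds, query_name), in time order.
    ttl: number of seconds an answer stays cached.
    """
    expires = {}  # query_name -> time at which the cached answer expires
    misses = 0
    for now, name in lookups:
        if name not in expires or expires[name] <= now:
            misses += 1
            expires[name] = now + ttl
    return misses


if __name__ == "__main__":
    # Two mail servers that happen to look up the same two names.
    server_a = [(0, "x"), (10, "y")]
    server_b = [(5, "x"), (20, "y")]
    ttl = 60

    # Each server resolving on its own: every name is fetched twice.
    separate = upstream_queries(server_a, ttl) + upstream_queries(server_b, ttl)
    # One shared resolver: the second lookup of each name is a cache hit.
    shared = upstream_queries(sorted(server_a + server_b), ttl)
    print(separate, shared)  # prints: 4 2
```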

Re: Should Spamhaus default to disabled?

Posted by Andy Dills <an...@xecu.net>.
On Sat, 12 Jun 2010, Karsten Bräckelmann wrote:

> On Sat, 2010-06-12 at 00:19 -0400, Andy Dills wrote:
> > On Fri, 11 Jun 2010, Karsten Bräckelmann wrote:
> > > The most important argument for me to keep it enabled by default is
> > > simple. Small organizations and home users DO NOT have the knowledge and
> > > admin power to care about all that stuff themselves. For them, SA should
> > > work as well as possible out of the box. On the other hand, large
> > > organizations that generate a *substantial* amount of BL queries per day
> > > DO have the required power to tweak SA according to their specific needs
> > > and environment.
> > 
> > That's fair. Except, we're not a "large organization" by any stretch of 
> > the imagination.
> 
> More than 300.000 queries per day. And a "mail cluster", as you stated
> in your OP.

300,000 queries per day...per server? per CIDR? What is the delimiter?

Because there is certainly no single IP generating 300,000 queries per 
day.

Andy

---
Andy Dills
Xecunet, Inc.
www.xecu.net
301-682-9972
---

Re: Should Spamhaus default to disabled?

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Sat, 2010-06-12 at 14:07 +0100, RW wrote:
> On Sat, 12 Jun 2010 13:06:23 +0200
> Karsten Bräckelmann <gu...@rudersport.de> wrote:
> 
> > No need to stretch the term "large". That's a throughput of more than
> > 1 mail per second -- 100k SMTP connections per day. And that is
> > without any local caching at all. With caching, the throughput would
> > be considerably higher, before you ever cross the threshold and get on
> > their heavy-user radar.
> 
> I think it's worth pointing-out that SA does deep-checking on zen to
> catch spammers in SBL that are relaying through other people's servers.
> 
> If you reject on zen at the SMTP level you not only do fewer lookups,
> but you should also get a higher hit-rate at the DNS cache.

True -- just doesn't affect the math above. :)

For the numbers I used the "100k SMTP connections" limit for Spamhaus
free usage, rather than the "300k queries". So there's room left.


Pointing out deep-parsing for SBL is a good one, though. While there's a
single limit, there are multiple lists and query styles. PBL and XBL are
a single query per mail. SBL does deep-parsing, and DBL is RHS -- these
are most likely to result in more queries per mail. Without caching...
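
The two query styles differ in how the DNS name is built, which is why the per-mail count differs. A minimal sketch, assuming the usual DNSBL convention (reversed IPv4 octets prepended to the zone) and the usual RHS convention (the domain prepended to the zone); the helper names are hypothetical and the zone is passed in by the caller.

```python
import ipaddress


def ip_query_name(ip, zone):
    """Return the DNSBL query name for an IPv4 address: reversed octets + zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def domain_query_name(domain, zone):
    """Return the right-hand-side (domain based) query name: domain + zone."""
    return f"{domain.rstrip('.').lower()}.{zone}"


if __name__ == "__main__":
    print(ip_query_name("192.0.2.1", "zen.spamhaus.org"))
    print(domain_query_name("Example.com.", "dbl.spamhaus.org"))
```

Each distinct relay IP or URI domain found in a message yields its own query name, so checks that look beyond the connecting IP can produce several lookups for a single mail.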




Re: Should Spamhaus default to disabled?

Posted by RW <rw...@googlemail.com>.
On Sat, 12 Jun 2010 13:06:23 +0200
Karsten Bräckelmann <gu...@rudersport.de> wrote:


> No need to stretch the term "large". That's a throughput of more than
> 1 mail per second -- 100k SMTP connections per day. And that is
> without any local caching at all. With caching, the throughput would
> be considerably higher, before you ever cross the threshold and get on
> their heavy-user radar.

I think it's worth pointing-out that SA does deep-checking on zen to
catch spammers in SBL that are relaying through other people's servers.

If you reject on zen at the SMTP level you not only do fewer lookups,
but you should also get a higher hit-rate at the DNS cache.

Re: Should Spamhaus default to disabled?

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Sat, 2010-06-12 at 00:19 -0400, Andy Dills wrote:
> On Fri, 11 Jun 2010, Karsten Bräckelmann wrote:
> > The most important argument for me to keep it enabled by default is
> > simple. Small organizations and home users DO NOT have the knowledge and
> > admin power to care about all that stuff themselves. For them, SA should
> > work as well as possible out of the box. On the other hand, large
> > organizations that generate a *substantial* amount of BL queries per day
> > DO have the required power to tweak SA according to their specific needs
> > and environment.
> 
> That's fair. Except, we're not a "large organization" by any stretch of 
> the imagination.

More than 300.000 queries per day. And a "mail cluster", as you stated
in your OP.

No need to stretch the term "large". That's a throughput of more than 1
mail per second -- 100k SMTP connections per day. And that is without
any local caching at all. With caching, the throughput would be
considerably higher, before you ever cross the threshold and get on
their heavy-user radar.
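
Spelled out, assuming the load is spread evenly over a day of 86,400
seconds (the queries-per-mail ratio is only what the two figures above
imply, not a number anyone in this thread has stated):

  100,000 connections / 86,400 s   =  about 1.16 mails per second
  300,000 queries / 86,400 s       =  about 3.47 queries per second
  300,000 queries / 100,000 mails  =  3 queries per mail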


> To be fair, they've contacted me asking for feedback, which I figured I 
> would give publicly:

May I ask what the question was? Feedback to what?

> As much as I respect that people should get compensated for their 
> contributions, that doesn't negate the economics of value. What they're 
> charging is unreasonable for the utility it provides. 

You are free to disable BLs, enabled by default in the free product you
use -- SpamAssassin. Free as in both beer and speech.

You are running a cluster dedicated to mail. You do have paid staff
caring about the machines and software. You didn't pay for SA, so it
isn't unreasonable for us to expect you to dedicate part of your paid
staff to tweak SA according to your specific needs and environment.

Actually, I can tell you did. There is pretty much no way a vanilla SA
would fit your environment and load without configuration changes. And
most likely, other services.


> DCC is a great example of how I think it should be handled. He has a free 
> (to all) service, and a paid (to all) service. The free service in fact 
> generates the data from which he determines the reputations of sending 
> IPs, which is the basis of the paid service, so it's a win-win. The more 
> people he has querying the free product, the more his paying customers 
> benefit.

Spamhaus operates entirely differently, and what you just outlined doesn't
apply at all. Not to mention it is irrelevant in the context of BLs
enabled by default.

  guenther


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: Should Spamhaus default to disabled?

Posted by Andy Dills <an...@xecu.net>.
On Fri, 11 Jun 2010, Karsten Bräckelmann wrote:

> On Fri, 2010-06-11 at 10:42 -0400, Andy Dills wrote:
> > score URIBL_DBL_SPAM 0
> > score URIBL_DBL_ERROR 0
> > score RCVD_IN_ZEN 0
> > 
> > I think those are the only queries that generate lookups against Spamhaus, 
> > but I'm not positive.
> 
> IIRC that doesn't disable all DNS lookups against ZEN. You'd also need
> to disable the non-scoring eval() that does the actual lookup.
> 
>   meta __RCVD_IN_ZEN  (0)
> 
> You also missed XBL, PBL, and URIBL_SBL.

I misunderstood the documentation.

http://wiki.apache.org/spamassassin/DnsBlocklists
---
At present, the query trigger rule for SpamHaus looks like this: 
header RCVD_IN_ZEN eval:check_rbl('zen', 'zen.spamhaus.org.') 

So to disable it you'd use: 
score RCVD_IN_ZEN 0
---

I grepped the ruleset before I even googled, so I misunderstood that to 
mean that by setting that rule score, I was disabling the meta, and thus 
disabling the other rules that query zen.spamhaus.org (which to me seems 
like a reasonable design choice, so I didn't question it). After all, 
there's no longer a rule "RCVD_IN_ZEN", and I've yet to have any need to 
address meta rules.

That is why I didn't include any of the other tests you mention: I assumed 
they all query zen through that one rule. Clearly they do not, and I've now 
explicitly scored everything to 0.

> The most important argument for me to keep it enabled by default is
> simple. Small organizations and home users DO NOT have the knowledge and
> admin power to care about all that stuff themselves. For them, SA should
> work as good as possible out of the box. On the other hand, large
> organizations that generate a *substantial* amount of BL queries per day
> DO have the required power to tweak SA according to their specific needs
> and environment.

That's fair. Except, we're not a "large organization" by any stretch of 
the imagination.

To be fair, they've contacted me asking for feedback, which I figured I 
would give publicly:

As much as I respect that people should get compensated for their 
contributions, that doesn't negate the economics of value. What they're 
charging is unreasonable for the utility it provides. 


DCC is a great example of how I think it should be handled. He has a free 
(to all) service, and a paid (to all) service. The free service in fact 
generates the data from which he determines the reputations of sending 
IPs, which is the basis of the paid service, so it's a win-win. The more 
people he has querying the free product, the more his paying customers 
benefit.

We've run a (free) DCC peer for many years now, and I can't remember 
Vernon ever pushing his paid service. I bet it's great, I've read about it 
and considered it in the past, and if I find that disabling the spamhaus 
queries affects FN rates, I'd more likely consider paying him to add his 
reputation-based scoring tool (which is certainly more valuable than "just 
another blacklist").

> That said, better documentation on this issue would not hurt. However...

Yeah, that's probably the root of the issue.

Andy

---
Andy Dills
Xecunet, Inc.
www.xecu.net
301-682-9972
---

Re: Should Spamhaus default to disabled?

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Fri, 2010-06-11 at 10:42 -0400, Andy Dills wrote:
> score URIBL_DBL_SPAM 0
> score URIBL_DBL_ERROR 0
> score RCVD_IN_ZEN 0
> 
> I think those are the only queries that generate lookups against Spamhaus, 
> but I'm not positive.

IIRC that doesn't disable all DNS lookups against ZEN. You'd also need
to disable the non-scoring eval() that does the actual lookup.

  meta __RCVD_IN_ZEN  (0)

You also missed XBL, PBL, and URIBL_SBL.
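
Putting those pieces together, a local.cf along the following lines
should cover the rules named in this thread. Treat it as a sketch, not a
verified list: the XBL and PBL rules are assumed here to be named
RCVD_IN_XBL and RCVD_IN_PBL, and RCVD_IN_SBL is included on the same
assumption. Grep your installed rule-set for "spamhaus" to confirm the
names before relying on it.

  # Stop the non-scoring eval that does the actual ZEN lookup
  meta __RCVD_IN_ZEN  (0)

  # Scoring rules tied to ZEN (rule names assumed, see above)
  score RCVD_IN_SBL 0
  score RCVD_IN_XBL 0
  score RCVD_IN_PBL 0

  # URI-based Spamhaus lookups
  score URIBL_SBL 0
  score URIBL_DBL_SPAM 0
  score URIBL_DBL_ERROR 0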


> I think the maintainers of SA should strongly consider defaulting Spamhaus 
> to "off". At the very least, it should be better documented how to entirely 
> disable Spamhaus queries.

Strong -1.

This topic has been discussed a few times before, so you are free to
check bugzilla and the list archives for full discussions.

The most important argument for me to keep it enabled by default is
simple. Small organizations and home users DO NOT have the knowledge and
admin power to care about all that stuff themselves. For them, SA should
work as good as possible out of the box. On the other hand, large
organizations that generate a *substantial* amount of BL queries per day
DO have the required power to tweak SA according to their specific needs
and environment.


That said, better documentation on this issue would not hurt. However...

Asking google already yields quite a lot of results. Including hints and
discussions that your above local.cf changes are insufficient.

Grepping for spamhaus in the default rule-set also trivially shows that
you missed a check_rbl() eval rule that generates queries.


You didn't come here to complain and ask for better docs without doing
some research, did you?

  guenther


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: Should Spamhaus default to disabled?

Posted by Michael Scheidell <sc...@secnap.net>.
On 6/11/10 10:42 AM, Andy Dills wrote:
>
> We felt the amount of money being asked for was unreasonable, as we felt
> we likely wouldn't see an increase in spam if we turned them off.
>
>    
Should I mention that the (optional) reputation-based dcc.pm (which is 
supported in SA 3.3.x) is really CHEAP?
(No, I don't work for them, don't own part of them, don't get 
commissions from them, and yes, you need to download and recompile a 
new dccd client and/or server.)
If you do less than 100K in queries per day (100K passing through SA), 
you don't even need to run your own servers. Above 100K, the bandwidth 
would suggest that you would want to run one or more caching DCC servers 
in your network.

I would use Spamhaus if their costs were similar to Rhyolite's for the 
commercial DCC.

<http://www.rhyolite.com/dcc/reputations.html>

(Yes, Spamhaus, when you told us we were in non-compliance, we tried 
several times to contact you to license it, and we got ignored. 
We only know of the costs via third parties who reported them. We admit 
we have no direct knowledge of the costs because you never responded to 
our requests for pricing.)

look at the DCC rules for a hint at the new features.


-- 
Michael Scheidell, CTO
Phone: 561-999-5000, x 1259
SECNAP Network Security Corporation

    * Certified SNORT Integrator
    * 2008-9 Hot Company Award Winner, World Executive Alliance
    * Five-Star Partner Program 2009, VARBusiness
    * Best Anti-Spam Product 2008, Network Products Guide
    * King of Spam Filters, SC Magazine 2008
