Posted to users@spamassassin.apache.org by Warren Togami <wt...@redhat.com> on 2009/10/05 01:42:06 UTC

Spam Eating Monkey?

http://spameatingmonkey.com

Anyone have any experience using these DNSBL and URIBL's?

Is anyone from this site on this list?

I wonder if we should add these rules to the sandbox for masschecks as well.

Warren Togami
wtogami@redhat.com

Re: Spam Eating Monkey?

Posted by Marc Perkel <ma...@perkel.com>.

Warren Togami wrote:
> http://spameatingmonkey.com
>
> Anyone have any experience using these DNSBL and URIBL's?
>
> Is anyone from this site on this list?
>
> I wonder if we should add these rules to the sandbox for masschecks as 
> well.
>
> Warren Togami
> wtogami@redhat.com
>

I've been using them for a few weeks and I don't have stats, but it seems 
like a high-quality list. Definitely worth testing.


Re: Spam Eating Monkey?

Posted by Blaine Fleming <gr...@digital-z.com>.
Warren Togami wrote:
> http://spameatingmonkey.com/usage.html
> 
> Are these URI rules really valid syntax?  They don't look right, and
> spamassassin lint rejects them.

I'm using all of those rules except for the backscatter one with no
problems.  They also lint fine for me.  Are you watching for line wrap?

--Blaine

Re: Spam Eating Monkey?

Posted by Benny Pedersen <me...@junc.org>.
On Fri, 09 Oct 2009 23:23:06 CEST, Warren Togami wrote:
> http://spameatingmonkey.com/usage.html
> Are these URI rules really valid syntax?  They don't look right, and  
> spamassassin lint rejects them.

no lint error here

-- 
xpoint


Re: Spam Eating Monkey?

Posted by Mark Martinec <Ma...@ijs.si>.
> Rules are alright. What I can see is that build/mkrules intentionally
> does not load plugins (except for the Plugin::Check), which means
> the 'urirhssub' directive in your .cf file is not recognized.

Actually, the proper solution is probably just to enclose your
rules between:

ifplugin Mail::SpamAssassin::Plugin::URIDNSBL
...
endif
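
For the SEM rules discussed in this thread, the wrapped form would look something like the sketch below (the rule name and zone hostname are taken from the lint output earlier in the thread; the describe text and score are placeholders, not recommendations):

```
ifplugin Mail::SpamAssassin::Plugin::URIDNSBL
  # Query the SEM URI blacklist zone; fires when a URI's domain is listed
  urirhssub  SEM_URI  uribl.spameatingmonkey.net.  A  2
  body       SEM_URI  eval:check_uridnsbl('SEM_URI')
  describe   SEM_URI  URI domain listed on SEM URIBL
  score      SEM_URI  0.1   # placeholder score for sandbox testing
endif
```

With the ifplugin guard, parsers that don't load URIDNSBL (like build/mkrules) simply skip the block instead of failing on the unrecognized directive.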


  Mark

Re: Spam Eating Monkey?

Posted by Mark Martinec <Ma...@ijs.si>.
Warren,

> http://spameatingmonkey.com/usage.html
> 
> Are these URI rules really valid syntax?  They don't look right, and
> spamassassin lint rejects them.

>rulesrc/sandbox/wtogami/20_unsafe.cf: 0 active rules, 5 other
>lint: config: failed to parse line, skipping, in "rules/70_sandbox.cf": urirhssub SEM_FRESH fresh.spameatingmonkey.net. A 2 at build/mkrules line 253.
>lint: config: failed to parse line, skipping, in "rules/70_sandbox.cf": urirhssub SEM_URI uribl.spameatingmonkey.net. A 2 at build/mkrules line 253.
>lint: config: failed to parse line, skipping, in "rules/70_sandbox.cf": urirhssub SEM_URIRED urired.spameatingmonkey.net. A 2 at build/mkrules line 253.

Rules are alright. What I can see is that build/mkrules intentionally
does not load plugins (except for the Plugin::Check), which means
the 'urirhssub' directive in your .cf file is not recognized.

A quick-and-dirty hack (without understanding the consequences) is:

--- build/mkrules       (revision 823750)
+++ build/mkrules       (working copy)
@@ -239,7 +239,7 @@
       # debug => 1,
       local_tests_only => 1,
       dont_copy_prefs => 1,
-      config_text => $pretext.$text
+    # config_text => $pretext.$text
   });

   my $errors = 0;


I don't know what the proper solution is, or where the core of
the problem actually lies.

  Mark

Re: Spam Eating Monkey?

Posted by Warren Togami <wt...@redhat.com>.
http://spameatingmonkey.com/usage.html

Are these URI rules really valid syntax?  They don't look right, and 
spamassassin lint rejects them.

Warren

Re: consolidating DNSBLs into a single query (was Spam Eating Monkey?)

Posted by Royce Williams <ro...@gmail.com>.
On Tue, Oct 6, 2009 at 8:19 PM, Rob McEwen <ro...@invaluement.com> wrote:
> Warren Togami wrote:
>> You are misunderstanding the question.  A single DNS query could
>> respond with different numbers, meaning hits on different lists.
>> Your lists that are subsets or supersets of other lists can easily use
>> this.  The querying software only needs to know what each result means.
>
> Not saying that this is a bad idea, but it does have its limitations.
> For example, some lists run into the hundreds of megabytes, and
> getting the whole file rsynced and updated can take more than several
> minutes. Often, such lists update only once or twice per hour, if even
> that often.

Hmm ... interesting.  If implemented via rbldnsd, each list could be
maintained in a separate file, and since rbldnsd can be configured to
build a single zone using multiple files on the back end, different
lists could be refreshed at different rates.

Your comments about tradeoffs and bitmasking still stand, of course.

Royce

Re: consolidating DNSBLs into a single query (was Spam Eating Monkey?)

Posted by Rob McEwen <ro...@invaluement.com>.
Mike Cardwell wrote:
> I don't understand the logic of that. Ie, why you'd need to use
> bitmasking? zen.spamhaus.org is a combination of various different
> lists and returns multiple values like this:<SNIP>

If every list is an "outright block" list, then you are correct. My
point applies to situations where some lists are used in scoring mode,
and where there is a desire to be able to calculate a score based on
exactly which lists hit on a particular sending IP.

But even if someone tries this with all "outright block lists", and uses
rbldnsd's built in ability to consolidate lists, then there are still
two problems:

(a) for auditing purposes, there'd be no way to tell *which* lists hit
on that IP since many use the same return codes

(b) some hundreds-of-MB-large lists which previously could have used the
lower-memory "ip4tset" would have to revert back to slower and
higher-memory-usage "ip4set", fwiw

Again, not saying these problems can't be solved, only pointing them out
so that anyone who cares to try can know what they need to do, or need
to expect.

-- 
Rob McEwen
http://dnsbl.invaluement.com/
rob@invaluement.com
+1 (478) 475-9032



Re: consolidating DNSBLs into a single query (was Spam Eating Monkey?)

Posted by Mike Cardwell <sp...@lists.grepular.com>.
On 07/10/2009 05:19, Rob McEwen wrote:

> Also, this loses the ability to *score* on multiple lists... unless you
> use a bitmasked scoring system whereby one list gets assigned ".2",
> another ".4", another ".8", and so on up to ".128". But that leaves a
> maximum of only 7 lists. Sure, you can add more than 7 by employing
> other octets in the "answer IP", but that severely complicates matters.
>
> And as it stands, you'd also have the complexity of getting the spam
> filter to parse, understand, and react properly to those bitmasks.

I don't understand the logic of that. Ie, why you'd need to use 
bitmasking? zen.spamhaus.org is a combination of various different lists 
and returns multiple values like this:

mike@haven:~$ host -t a 2.0.0.127.zen.spamhaus.org
2.0.0.127.zen.spamhaus.org      A       127.0.0.4
2.0.0.127.zen.spamhaus.org      A       127.0.0.10
2.0.0.127.zen.spamhaus.org      A       127.0.0.2
mike@haven:~$

It's perfectly easy for SpamAssassin to see that three different values 
have been returned, so 127.0.0.2 is on three separate lists and that an 
extra score should be applied for each of those three.
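
Scoring off those multiple A records is just set membership. A rough Python sketch of the idea (the 127.0.0.x mappings follow Spamhaus's documented ZEN return codes, but the scores here are made-up placeholders):

```python
# Map zen.spamhaus.org answer addresses to the sub-list each represents.
# 127.0.0.2 = SBL, 127.0.0.4 = XBL, 127.0.0.10 = PBL (per Spamhaus docs).
SUBLISTS = {
    "127.0.0.2":  ("SBL", 2.5),   # placeholder scores, not recommendations
    "127.0.0.4":  ("XBL", 3.0),
    "127.0.0.10": ("PBL", 1.0),
}

def score_hits(answers):
    """Given the set of A records returned for one combined-zone lookup,
    return the names of the lists that hit and the total score."""
    hits = [(name, score) for addr, (name, score) in SUBLISTS.items()
            if addr in answers]
    names = [name for name, _ in hits]
    total = sum(score for _, score in hits)
    return names, total
```

For the example lookup above, `score_hits({"127.0.0.2", "127.0.0.4", "127.0.0.10"})` would report hits on all three sub-lists with a single DNS query.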

It's also quite easy to do it in Exim, eg if I wanted to block an email 
in Exim if the sending ip is on both sbl.spamhaus.org and 
xbl.spamhaus.org I could either do two dns lookups like this:

deny dnslists = sbl.spamhaus.org
      dnslists = xbl.spamhaus.org

Or I could do it with a single dns lookup like this:

deny dnslists = zen.spamhaus.org=127.0.0.2
      dnslists = zen.spamhaus.org=127.0.0.4

You can be 100% backwards compatible by leaving all of your lists as 
they are, but then adding another one which is a combined version of all 
of them...

-- 
Mike Cardwell - IT Consultant and LAMP developer
Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/

consolidating DNSBLs into a single query (was Spam Eating Monkey?)

Posted by Rob McEwen <ro...@invaluement.com>.
Warren Togami wrote:
> You are misunderstanding the question.  A single DNS query could
> respond with different numbers, meaning hits on different lists.
> Your lists that are subsets or supersets of other lists can easily use
> this.  The querying software only needs to know what each result means.

Not saying that this is a bad idea, but it does have its limitations.
For example, some lists run into the hundreds of megabytes, and
getting the whole file rsynced and updated can take more than several
minutes. Often, such lists update only once or twice per hour, if even
that often.

In contrast, some lists are smaller and faster reacting and update every
few minutes.

Trying to merge all such lists into a single list every several minutes
is no trivial task in terms of having enough CPU cycles and RAM to get
that done correctly and within a reasonably short time.

Likewise, doing the merge hourly loses the benefit of some of the
smaller-footprint faster-reacting lists which can react to emerging spam
threats faster.

Not saying such a consolidation can't be done... and maybe a few
tradeoffs here are worthwhile? But if these issues are not dealt with
smartly and competently, then one could easily find that the
all-in-one comprehensive DNSBL is not as effective as querying the
lists separately.

Also, this loses the ability to *score* on multiple lists... unless you
use a bitmasked scoring system whereby one list gets assigned ".2",
another ".4", another ".8", and so on up to ".128". But that leaves a
maximum of only 7 lists. Sure, you can add more than 7 by employing
other octets in the "answer IP", but that severely complicates matters.

And as it stands, you'd also have the complexity of getting the spam
filter to parse, understand, and react properly to those bitmasks.
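
The decoding itself is not complicated, for what it's worth. A minimal Python sketch of the scheme Rob describes, where each bit of the last octet flags one list (the list names here are hypothetical, purely for illustration):

```python
# Hypothetical bit-to-list assignments: bits 2, 4, 8, ... 128 in the
# last octet of the answer IP each flag membership in one list.
BIT_LISTS = {2: "list-a", 4: "list-b", 8: "list-c", 16: "list-d",
             32: "list-e", 64: "list-f", 128: "list-g"}

def decode_bitmask(answer_ip):
    """Decode a bitmasked DNSBL answer like '127.0.0.6' into list names."""
    last_octet = int(answer_ip.rsplit(".", 1)[1])
    return [name for bit, name in sorted(BIT_LISTS.items())
            if last_octet & bit]
```

So an answer of 127.0.0.6 (bits 2 and 4 set) decodes to hits on the first two lists. As Rob notes, the real work is teaching each spam filter to do this decoding, not the decoding itself.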

-- 
Rob McEwen
http://dnsbl.invaluement.com/
rob@invaluement.com
+1 (478) 475-9032



Re: Spam Eating Monkey?

Posted by Warren Togami <wt...@redhat.com>.
On 10/06/2009 11:15 PM, Blaine Fleming wrote:
> Warren Togami wrote:
>> I'll add your existing rules to the Sandbox for testing.
>
> Thank you!
>
>> But have you considered putting all the DNSBL's and URIBL's into
>> aggregated zones so you can cut down on redundant queries?
>
> Actually, the uri red list is an aggregate zone of my uri black, red and
> yellow lists.  The main reason I haven't merged the black list with any
> of the other IP zones is because I haven't had enough user response on
> the other lists yet.

You are misunderstanding the question.  A single DNS query could respond 
with different numbers, meaning hits on different lists.  Your lists 
that are subsets or supersets of other lists can easily use this.  The 
querying software only needs to know what each result means.

Warren

Re: Spam Eating Monkey?

Posted by Blaine Fleming <gr...@digital-z.com>.
Warren Togami wrote:
> I'll add your existing rules to the Sandbox for testing.

Thank you!

> But have you considered putting all the DNSBL's and URIBL's into
> aggregated zones so you can cut down on redundant queries?

Actually, the uri red list is an aggregate zone of my uri black, red and
yellow lists.  The main reason I haven't merged the black list with any
of the other IP zones is because I haven't had enough user response on
the other lists yet.

Basically, the relevant zones are SEM-URIRED and SEM-BLACK, and each
of them needs to be its own query because the two datasets are
completely different.

--Blaine


Re: Spam Eating Monkey?

Posted by Warren Togami <wt...@redhat.com>.
On 10/04/2009 09:32 PM, Blaine Fleming wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Warren Togami wrote:
>> http://spameatingmonkey.com
>>
>> Anyone have any experience using these DNSBL and URIBL's?
>>
>> Is anyone from this site on this list?
>>
>> I wonder if we should add these rules to the sandbox for masschecks as
>> well.
>
> Since someone is bound to ask I figure I'll state right now that I have
> no objections to the SEM lists being included in the masschecks.  In
> fact, I'm quite curious.
>
> I would also recommend adding AnonWhois.org to the list.
>

I'll add your existing rules to the Sandbox for testing.

But have you considered putting all the DNSBL's and URIBL's into 
aggregated zones so you can cut down on redundant queries?

http://wiki.junkemailfilter.com/index.php/Spam_DNS_Lists
For example, one DNSBL lookup here can respond with 127.0.0.[1-5] 
depending on which list it is.
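
The query itself, for any such aggregated IP list, is just the client address with its octets reversed, prepended to the zone name. A minimal sketch (no actual DNS lookup is performed; the zone name is only an example):

```python
def dnsbl_query_name(ip, zone):
    """Build a DNSBL query name: reverse the IPv4 octets, append the zone."""
    reversed_ip = ".".join(reversed(ip.split(".")))
    return f"{reversed_ip}.{zone}"
```

E.g. `dnsbl_query_name("127.0.0.2", "zen.spamhaus.org")` yields `2.0.0.127.zen.spamhaus.org`; resolving that name's A records tells you which lists hit, all in one round trip.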

Warren Togami
wtogami@redhat.com

Re: Spam Eating Monkey?

Posted by Blaine Fleming <gr...@digital-z.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Warren Togami wrote:
> http://spameatingmonkey.com
> 
> Anyone have any experience using these DNSBL and URIBL's?
> 
> Is anyone from this site on this list?
> 
> I wonder if we should add these rules to the sandbox for masschecks as
> well.

Since someone is bound to ask, I figure I'll state right now that I have
no objections to the SEM lists being included in the masschecks.  In
fact, I'm quite curious.

I would also recommend adding AnonWhois.org to the list.

- --Blaine Fleming
SEM Admin
http://spameatingmonkey.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)

iEYEARECAAYFAkrJTLYACgkQLp9/dJH6k+Mc4ACeII1l3SSA2y2hz30A7ulqzp1Q
yWIAnjxIj63wAbqYDdzrU0DW/Rsj1eSz
=X6Nx
-----END PGP SIGNATURE-----