Posted to commits@spamassassin.apache.org by qu...@apache.org on 2004/05/28 21:39:05 UTC

svn commit: rev 20548 - in incubator/spamassassin/trunk: lib/Mail/SpamAssassin rules

Author: quinlan
Date: Fri May 28 12:39:04 2004
New Revision: 20548

Modified:
   incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Conf.pm
   incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Dns.pm
   incubator/spamassassin/trunk/rules/20_dnsbl_tests.cf
   incubator/spamassassin/trunk/rules/70_testing.cf
Log:
revise and update the DNSBL documentation
change SenderBase tests to be prefixed with "sb:" instead of relying on
  the set name to distinguish them
lower magnitude cut-off for SB_NSP_VOLUME_SPIKE based on test rule results
  and remove those test rules


Modified: incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Conf.pm
==============================================================================
--- incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Conf.pm	(original)
+++ incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Conf.pm	Fri May 28 12:39:04 2004
@@ -2128,50 +2128,62 @@
 a method on the C<Mail::SpamAssassin::EvalTests> object.  C<arguments>
 are optional arguments to the function call.
 
-=item header SYMBOLIC_TEST_NAME eval:check_rbl('set', 'zone')
+=item header SYMBOLIC_TEST_NAME eval:check_rbl('set', 'zone' [, 'sub-test'])
 
-Check a DNSBL (DNS blacklist), also known as RBLs (realtime blacklists).  This
-will retrieve Received headers from the mail, parse the IP addresses, select
-which ones are 'untrusted' based on the C<trusted_networks> logic, and query
-that blacklist.  There's a few things to note:
+Check a DNSBL (a DNS blacklist or whitelist).  This will retrieve Received:
+headers from the message, extract the IP addresses, select which ones are
+'untrusted' based on the C<trusted_networks> logic, and query that DNSBL
+zone.  There's a few things to note:
 
 =over 4
 
-=item Duplicated or reserved IPs
+=item duplicated or reserved IPs
 
-These are stripped, and the DNSBLs will not be queried for them.  Reserved IPs
-are those listed in <http://www.iana.org/assignments/ipv4-address-space>,
-<http://duxcw.com/faq/network/privip.htm>, or
-<http://duxcw.com/faq/network/autoip.htm>.
+Duplicated IPs are only queried once and reserved IPs are not queried.
+Reserved IPs are those listed in
+<http://www.iana.org/assignments/ipv4-address-space>,
+<http://duxcw.com/faq/network/privip.htm>,
+<http://duxcw.com/faq/network/autoip.htm>, or
+<ftp://ftp.rfc-editor.org/in-notes/rfc3330.txt>
 
-=item The first argument, 'set'
+=item the 'set' argument
 
-This is used as a 'zone ID'.  If you want to look up a multi-meaning zone like
-relays.osirusoft.com, you can then query the results from that zone using it;
+This is used as a 'zone ID'.  If you want to look up a multiple-meaning zone
+like NJABL or SORBS, you can then query the results from that zone using it;
 but all check_rbl_sub() calls must use that zone ID.
 
-Also, if an IP gets a hit in one lookup in a zone using that ID, any further
-hits in other rules using that zone ID will *not* be added to the score.
+Also, if more than one IP address gets a DNSBL hit for a particular rule, it
+does not affect the score because rules only trigger once per message.
 
-=item Selecting all IPs except for the originating one
+=item the 'zone' argument
 
-This is accomplished by naming the set 'foo-notfirsthop'.  Useful for querying
-against DNS lists which list dialup IP addresses; the first hop may be a
-dialup, but as long as there is at least one more hop, via their outgoing
-SMTP server, that's legitimate, and so should not gain points.  If there
-is only one hop, that will be queried anyway, as it should be relaying
-via its outgoing SMTP server instead of sending directly to your MX.
+This is the root zone of the DNSBL, ending in a period.
 
-=item Selecting IPs by whether they are trusted
+=item the 'sub-test' argument
+
+This optional argument behaves the same as the sub-test argument in
+C<check_rbl_sub()> below.
+
+=item selecting all IPs except for the originating one
+
+This is accomplished by placing '-notfirsthop' at the end of the set name.
+This is useful for querying against DNS lists which list dialup IP
+addresses; the first hop may be a dialup, but as long as there is at least
+one more hop, via their outgoing SMTP server, that's legitimate, and so
+should not gain points.  If there is only one hop, that will be queried
+anyway, as it should be relaying via its outgoing SMTP server instead of
+sending directly to your MX (mail exchange).
+
+=item selecting IPs by whether they are trusted
 
 When checking a 'nice' DNSBL (a DNS whitelist), you cannot trust the IP
-addresses in Received headers that were not added by trusted relays.  To test
-the first IP address that can be trusted, name the set 'foo-firsttrusted'.
-That should test the IP address of the relay that connected to the most remote
-trusted relay.
+addresses in Received headers that were not added by trusted relays.  To
+test the first IP address that can be trusted, place '-firsttrusted' at the
+end of the set name.  That should test the IP address of the relay that
+connected to the most remote trusted relay.
 
-In addition, you can test all untrusted IP addresses by naming the set
-'foo-untrusted'.
+In addition, you can test all untrusted IP addresses by placing '-untrusted'
+at the end of the set name.
 
 Note that this requires that SpamAssassin know which relays are trusted.  For
 simple cases, SpamAssassin can make a good estimate.  For complex cases, you
@@ -2192,7 +2204,12 @@
 using the zone ID from the original query.  The sub-test may either be an
 IPv4 dotted address for RBLs that return multiple A records or a
 non-negative decimal number to specify a bitmask for RBLs that return a
-single A record containing a bitmask of results.
+single A record containing a bitmask of results, a SenderBase test
+beginning with "sb:", or (if none of the preceding options seem to fit) a
+regular expression.
+
+Note: the set name must be exactly the same for as the main query rule,
+including selections like '-notfirsthop' appearing at the end of the set name.
 
 =cut
 

Modified: incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Dns.pm
==============================================================================
--- incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Dns.pm	(original)
+++ incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Dns.pm	Fri May 28 12:39:04 2004
@@ -206,7 +206,7 @@
       $self->dnsbl_hit($rule, $question, $answer);
     }
     # senderbase
-    elsif ($set =~ /^senderbase/) {
+    elsif ($subtest =~ s/^sb://) {
       $rdatastr =~ s/^"?\d+-//;
       $rdatastr =~ s/"$//;
       my %sb = ($rdatastr =~ m/(?:^|\|)(\d+)=([^|]+)/g);

Modified: incubator/spamassassin/trunk/rules/20_dnsbl_tests.cf
==============================================================================
--- incubator/spamassassin/trunk/rules/20_dnsbl_tests.cf	(original)
+++ incubator/spamassassin/trunk/rules/20_dnsbl_tests.cf	Fri May 28 12:39:04 2004
@@ -184,12 +184,12 @@
 # http://www.senderbase.org/dnsresponses.html
 # sa.senderbase.org for SpamAssassin queries
 # query.senderbase.org for other queries
-header __SENDERBASE eval:check_rbl_txt('senderbase', 'sa.senderbase.org.')
+header __SENDERBASE eval:check_rbl_txt('sb', 'sa.senderbase.org.')
 tflags __SENDERBASE net
 
 # S23 = domain daily magnitude
 # S25 = date of first message from this domain
-header SB_NEW_BULK		eval:check_rbl_sub('senderbase', 'S23 > 6.2 && (time - S25 < 120*86400)')
+header SB_NEW_BULK		eval:check_rbl_sub('sb', 'sb:S23 > 6.2 && (time - S25 < 120*86400)')
 describe SB_NEW_BULK		Sender domain is new and very high volume
 tflags SB_NEW_BULK		net
 
@@ -197,14 +197,14 @@
 # S40 = IP daily magnitude
 # S41 = IP monthly magnitude
 # note: accounting for rounding, "> 0.3" means at least a 59% volume spike
-header SB_NSP_VOLUME_SPIKE	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.2 && S40 - S41 > 0.3')
+header SB_NSP_VOLUME_SPIKE	eval:check_rbl_sub('sb', 'sb:S5 =~ /NSP/ && S41 > 3.8 && S40 - S41 > 0.3')
 describe SB_NSP_VOLUME_SPIKE	Sender IP hosted at NSP has a volume spike
 tflags SB_NSP_VOLUME_SPIKE	net
 
 # S2 = organization daily magnitude
 # S9 = IP addresses used by this organization
 # note: this rule does not work as well on older mail
-header SB_HIGH_VOLUME_PER_IP	eval:check_rbl_sub('senderbase', '(S2 / S9) > 5.00')
+header SB_HIGH_VOLUME_PER_IP	eval:check_rbl_sub('sb', 'sb:(S2 / S9) > 5.00')
 describe SB_HIGH_VOLUME_PER_IP	Sender organization has high volume per IP
 tflags SB_HIGH_VOLUME_PER_IP	net
 

Modified: incubator/spamassassin/trunk/rules/70_testing.cf
==============================================================================
--- incubator/spamassassin/trunk/rules/70_testing.cf	(original)
+++ incubator/spamassassin/trunk/rules/70_testing.cf	Fri May 28 12:39:04 2004
@@ -71,50 +71,6 @@
 
 ########################################################################
 
-# possible replacements for SB_NSP_VOLUME_SPIKE
-# accounting for rounding, "> 0.3" means at least a 59% volume spike
-header T_NSP_S41_38_03	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 3.8 && S40 - S41 > 0.3')
-header T_NSP_S41_39_03	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 3.9 && S40 - S41 > 0.3')
-header T_NSP_S41_40_03	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.0 && S40 - S41 > 0.3')
-header T_NSP_S41_41_03	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.1 && S40 - S41 > 0.3')
-header T_NSP_S41_42_03	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.2 && S40 - S41 > 0.3')
-header T_NSP_S41_43_03	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.3 && S40 - S41 > 0.3')
-header T_NSP_S41_44_03	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.4 && S40 - S41 > 0.3')
-header T_NSP_S41_45_03	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.5 && S40 - S41 > 0.3')
-header T_NSP_S41_46_03	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.6 && S40 - S41 > 0.3')
-
-# accounting for rounding, "> 0.4" means at least a 251% volume spike
-header T_NSP_S41_38_04	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 3.8 && S40 - S41 > 0.4')
-header T_NSP_S41_39_04	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 3.9 && S40 - S41 > 0.4')
-header T_NSP_S41_40_04	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.0 && S40 - S41 > 0.4')
-header T_NSP_S41_41_04	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.1 && S40 - S41 > 0.4')
-header T_NSP_S41_42_04	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.2 && S40 - S41 > 0.4')
-header T_NSP_S41_43_04	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.3 && S40 - S41 > 0.4')
-header T_NSP_S41_44_04	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.4 && S40 - S41 > 0.4')
-header T_NSP_S41_45_04	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.5 && S40 - S41 > 0.4')
-header T_NSP_S41_46_04	eval:check_rbl_sub('senderbase', 'S5 =~ /NSP/ && S41 > 4.6 && S40 - S41 > 0.4')
-
-tflags T_NSP_S41_38_03 net
-tflags T_NSP_S41_39_03 net
-tflags T_NSP_S41_40_03 net
-tflags T_NSP_S41_41_03 net
-tflags T_NSP_S41_42_03 net
-tflags T_NSP_S41_43_03 net
-tflags T_NSP_S41_44_03 net
-tflags T_NSP_S41_45_03 net
-tflags T_NSP_S41_46_03 net
-tflags T_NSP_S41_38_04 net
-tflags T_NSP_S41_39_04 net
-tflags T_NSP_S41_40_04 net
-tflags T_NSP_S41_41_04 net
-tflags T_NSP_S41_42_04 net
-tflags T_NSP_S41_43_04 net
-tflags T_NSP_S41_44_04 net
-tflags T_NSP_S41_45_04 net
-tflags T_NSP_S41_46_04 net
-
-########################################################################
-
 # let's see how this works
 meta T_SPF_PASS_NO_SBL		(SPF_PASS && !RCVD_IN_SBL)
 meta T_SPF_HELO_PASS_NO_SBL	(SPF_HELO_PASS && !RCVD_IN_SBL)
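
For illustration only, here is a minimal sketch of how the revised
check_rbl() and check_rbl_sub() documentation above might be applied in a
rules file.  The zone names (dnsbl.example.org., bl.example.net.), the rule
names, and the return addresses are hypothetical and are not part of this
commit; only the argument order and the '-notfirsthop' suffix come from the
documentation text.

```
# Hypothetical zones, rule names and return values, for illustration only.

# Main query: set name 'example-notfirsthop', zone ending in a period.
header __RCVD_IN_EXAMPLE	eval:check_rbl('example-notfirsthop', 'dnsbl.example.org.')
tflags __RCVD_IN_EXAMPLE	net

# Sub-test by IPv4 address; the set name matches the main query exactly,
# including the '-notfirsthop' suffix.
header RCVD_IN_EXAMPLE_DUL	eval:check_rbl_sub('example-notfirsthop', '127.0.0.3')
describe RCVD_IN_EXAMPLE_DUL	Relay listed as dialup in example zone
tflags RCVD_IN_EXAMPLE_DUL	net

# Optional third argument: sub-test given directly in the main query rule.
header RCVD_IN_EXAMPLE2		eval:check_rbl('example2', 'bl.example.net.', '127.0.0.2')
describe RCVD_IN_EXAMPLE2	Relay listed in second example zone
tflags RCVD_IN_EXAMPLE2		net
```

The sub-test rule reuses the set name of the main query verbatim, which is
the constraint the new note in Conf.pm calls out.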