Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2008/06/23 17:50:54 UTC
[Bug 5930] New: Shortcircuiting before DNS lookups
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5930
Summary: Shortcircuiting before DNS lookups
Product: Spamassassin
Version: SVN Trunk (Latest Devel Version)
Platform: Other
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P4
Component: Plugins
AssignedTo: dev@spamassassin.apache.org
ReportedBy: hege@hege.li
Created an attachment (id=4342)
--> (https://issues.apache.org/SpamAssassin/attachment.cgi?id=4342)
Hacky test patch
Shortcircuiting ALL_TRUSTED, USER_IN_WHITELIST etc. could use a tiny improvement.
Currently DNS queries are sent and the message is decoded before these simple
rules are evaluated, wasting bandwidth and CPU.
I made a quick hack for myself. I set these rules at priority -10000 and run
them manually before run_rbl_eval_tests.
I couldn't come up with any elegant solution for general use. Perhaps we could
allocate a priority range that is run before DNS? But that would mean users
could accidentally prioritize rules that need the decoded message, and decoding
should be done only after DNS launching. Ideas?
--
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
[Bug 5930] Shortcircuiting before DNS lookups
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5930
--- Comment #12 from AXB <ax...@gmail.com> ---
Depending on how many thousand q/s you make, it can make a massive difference.
The concept of shortcircuiting is to prevent any other rule from applying a
score. If network lookups are NOT bypassed, network hits with high custom
scores, or metas with lookup rules, could "break", for example, a
whitelist_from_rcvd.
[Bug 5930] Shortcircuiting before DNS lookups
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5930
--- Comment #6 from Henrik Krohns <he...@hege.li> ---
Here you go, I've been running these patches for years without hiccups.
Then you just need something like:
priority USER_IN_BLACKLIST -10050
Anything <= -10000 is shortcircuited before lookups are launched.
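
As a rough illustration of that behaviour, here is a minimal standalone sketch,
not the actual Check.pm code. The function run_checks, the rule hash layout and
the SOME_BODY_RULE entry are hypothetical; only the -10000 cutoff and the
USER_IN_BLACKLIST priority come from this comment.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the priority loop: rules are grouped by priority,
# lowest first, and DNS lookups are marked as launched only once a priority
# above the cutoff is reached. A short-circuiting hit ends the scan at once.
sub run_checks {
    my ($rules, $cutoff) = @_;
    my %state = (dns_launched => 0, hits => []);

    for my $priority (sort { $a <=> $b } keys %$rules) {
        # Launch lookups lazily, the first time we pass the cutoff.
        if (!$state{dns_launched} && $priority > $cutoff) {
            $state{dns_launched} = 1;
        }
        for my $rule (@{ $rules->{$priority} }) {
            next unless $rule->{test}->();
            push @{ $state{hits} }, $rule->{name};
            return \%state if $rule->{shortcircuit};
        }
    }
    return \%state;
}

my $rules = {
    -10050 => [ { name => 'USER_IN_BLACKLIST', shortcircuit => 1, test => sub { 1 } } ],
    0      => [ { name => 'SOME_BODY_RULE',    shortcircuit => 0, test => sub { 1 } } ],
};

my $result = run_checks($rules, -10000);
printf "hits=%s dns_launched=%d\n",
    join(',', @{ $result->{hits} }), $result->{dns_launched};
```

When the blacklist rule hits, the scan returns while dns_launched is still 0;
when it does not hit, the loop reaches priority 0 and the lookups start as
usual.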
[Bug 5930] Shortcircuiting before DNS lookups
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5930
--- Comment #7 from Henrik Krohns <he...@hege.li> ---
For quick reference:
===================================================================
--- Check.pm (revision 1626743)
+++ Check.pm (working copy)
@@ -77,10 +77,8 @@
# rbl calls.
$pms->extract_message_metadata();
- # Here, we launch all the DNS RBL queries and let them run while we
- # inspect the message
- $self->run_rbl_eval_tests($pms);
my $needs_dnsbl_harvest_p = 1; # harvest needs to be run
+ my $rbls_running = 0;
my $decoded = $pms->get_decoded_stripped_body_text_array();
my $bodytext = $pms->get_decoded_body_text_array();
@@ -108,6 +106,13 @@
last;
}
+ # Here, we launch all the DNS RBL queries and let them run while we
+ # inspect the message
+ if (!$rbls_running && $priority > -10000) {
+ $rbls_running = 1;
+ $self->run_rbl_eval_tests($pms);
+ }
+
my $timer = $self->{main}->time_method("tests_pri_".$priority);
dbg("check: running tests for priority: $priority");
--- URIDNSBL.pm (revision 1626743)
+++ URIDNSBL.pm (working copy)
@@ -327,18 +327,21 @@
return $self;
}
-# this is just a placeholder; in fact the results are dealt with later
-sub check_uridnsbl {
- return 0;
-}
-
# ---------------------------------------------------------------------------
# once the metadata is parsed, we can access the URI list. So start off
# the lookups here!
-sub parsed_metadata {
- my ($self, $opts) = @_;
- my $pms = $opts->{permsgstatus};
+
+#
+# only parse and run after first check_uridnsbl call
+#
+sub check_uridnsbl {
+ my ($self, $pms, @args) = @_;
+
+ # only parse once
+ return 0 if $pms->{uridnsbl_done};
+ $pms->{uridnsbl_done} = 1;
+
my $conf = $pms->{conf};
return 0 if $conf->{skip_uribl_checks};
@@ -494,7 +497,7 @@
# and query
$self->query_hosts_or_domains($pms, \%hostlist);
- return 1;
+ return 0;
}
# Accepts argument in one of the following forms: m, n1-n2, or n/m,
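
The URIDNSBL part of the patch relies on a run-once guard stored on the
per-message object. A minimal standalone sketch of that pattern follows; the
names check_uridnsbl_sketch and launch_lookups, and the plain hash standing in
for $pms, are assumptions for illustration and not SpamAssassin API.

```perl
use strict;
use warnings;

my $launch_count = 0;

# Stand-in for the expensive step (collecting URIs and sending DNS queries).
sub launch_lookups { $launch_count++; return; }

# Eval-rule sketch: the first call for a message triggers the lookups, later
# calls for the same message are no-ops. It always returns 0, since the
# results are dealt with later rather than by this call.
sub check_uridnsbl_sketch {
    my ($pms) = @_;
    return 0 if $pms->{uridnsbl_done};
    $pms->{uridnsbl_done} = 1;
    launch_lookups();
    return 0;
}

my $pms = {};    # hypothetical per-message state
check_uridnsbl_sketch($pms) for 1 .. 3;
print "lookups launched $launch_count time(s)\n";
```

Because the flag lives on the per-message hash, a new message starts with a
clean state and gets its own single launch.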
[Bug 5930] Shortcircuiting before DNS lookups
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5930
Henrik Krohns <he...@hege.li> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #4342|0                           |1
        is obsolete|                            |
                 CC|                            |hege@hege.li
--- Comment #4 from Henrik Krohns <he...@hege.li> ---
Created attachment 5239
--> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5239&action=edit
Patch to run rbls after priority -10000
[Bug 5930] Shortcircuiting before DNS lookups
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5930
--- Comment #3 from AXB <ax...@gmail.com> ---
While testing Henrik's TLD patches with blacklist_from, etc., I noticed that
net lookups were still happening, which is not really necessary.
Could we reactivate this "discussion"?
I see quite a bit of potential to save many cycles and network traffic if we
could implement this idea.
[Bug 5930] Shortcircuiting before DNS lookups
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5930
--- Comment #8 from Kevin A. McGrail <km...@pccc.com> ---
I don't use short-circuiting, nor does priority really have good
documentation at this time, as shown by the discussion with Philip Prindeville
re: bug 7060.
What's the path forward you recommend?
[Bug 5930] Shortcircuiting before DNS lookups
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5930
--- Comment #10 from Henrik Krohns <he...@hege.li> ---
Given these defaults already exist, we could just come up with an arbitrary
priority like -100 after which RBLs launch:
60_shortcircuit.cf:priority USER_IN_WHITELIST -1000
60_shortcircuit.cf:priority USER_IN_DEF_WHITELIST -1000
60_shortcircuit.cf:priority USER_IN_ALL_SPAM_TO -1000
60_shortcircuit.cf:priority SUBJECT_IN_WHITELIST -1000
60_shortcircuit.cf:priority ALL_TRUSTED -950
60_shortcircuit.cf:priority SUBJECT_IN_BLACKLIST -900
60_shortcircuit.cf:priority USER_IN_BLACKLIST_TO -900
60_shortcircuit.cf:priority USER_IN_BLACKLIST -900
60_shortcircuit.cf:priority BAYES_99 -400
[Bug 5930] Shortcircuiting before DNS lookups
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5930
Kevin A. McGrail <km...@pccc.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kmcgrail@pccc.com
--- Comment #9 from Kevin A. McGrail <km...@pccc.com> ---
Following up on my own comment: with years of testing behind it, it sounds
like we should get it committed, get it documented and get a few rules that
use it. If you can do that, it'll make 3.4.1rc1 for sure.
[Bug 5930] Shortcircuiting before DNS lookups
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5930
RW <rw...@googlemail.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rwmaillists@googlemail.com
--- Comment #11 from RW <rw...@googlemail.com> ---
The gratuitous DNS requests are not necessarily wasted because they bring the
results into cache. A relatively small number of slow lookups taken off the
critical path would make this a win in terms of throughput.
[Bug 5930] Shortcircuiting before DNS lookups
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5930
--- Comment #5 from Henrik Krohns <he...@hege.li> ---
Created attachment 5240
--> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5240&action=edit
Patch to launch uribl lookups only after check_uridnsbl is actually called
[Bug 5930] Shortcircuiting before DNS lookups
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5930
--- Comment #2 from Michael Parker <pa...@pobox.com> 2008-06-23 09:39:05 PST ---
You could create a new Check plugin, subclass really, and override the
check_main to do what you want.
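
A minimal sketch of that subclassing idea follows, using stand-in classes so it
runs without SpamAssassin installed. BaseCheck plays the role of the stock
Check plugin; EarlyCheck, the cheap_rule_hit argument and the log entries are
hypothetical and only illustrate the override pattern.

```perl
use strict;
use warnings;

# Stand-in base class (hypothetical): its check_main launches DNS and then
# runs every rule, as the stock check does.
package BaseCheck;
sub new { my ($class) = @_; return bless { log => [] }, $class; }
sub check_main {
    my ($self, $args) = @_;
    push @{ $self->{log} }, 'launch_dns', 'run_all_rules';
    return 1;
}

package EarlyCheck;
our @ISA = ('BaseCheck');

# Override: run the cheap rules first and fall through to the inherited
# check only if none of them settles the verdict.
sub check_main {
    my ($self, $args) = @_;
    push @{ $self->{log} }, 'run_cheap_rules';
    return 1 if $args->{cheap_rule_hit};
    return $self->SUPER::check_main($args);
}

package main;

my $checker = EarlyCheck->new;
$checker->check_main({ cheap_rule_hit => 1 });
print join(',', @{ $checker->{log} }), "\n";
```

With a cheap rule hitting, the inherited check_main is never reached, so no
DNS launch is logged.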
[Bug 5930] Shortcircuiting before DNS lookups
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5930
--- Comment #1 from Justin Mason <jm...@jmason.org> 2008-06-23 09:31:15 PST ---
(In reply to comment #0)
> I couldn't come up with any elegant solution for general use. Perhaps we could
> allocate some priority-range that are run before DNS?
That would work for me, I think.
> But that would mean users
> could accidentally prioritize rules that need decoded message, and decoding
> should be done only after DNS launching. Ideas?
If message decoding is "lazily evaluated" -- which it is -- then that wouldn't
really matter...
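
To make the "lazily evaluated" point concrete, here is a minimal standalone
sketch of decode-on-first-use. The helpers new_message and decoded_body, and
the lowercase-and-split "decoding", are hypothetical stand-ins and not the
real message API.

```perl
use strict;
use warnings;

my $decode_calls = 0;

# Hypothetical message object: only the raw text is stored up front.
sub new_message { my ($raw) = @_; return { raw => $raw }; }

# The decoded body is computed on first access and cached, so rules that
# never ask for it cost nothing.
sub decoded_body {
    my ($msg) = @_;
    $msg->{decoded} //= do {
        $decode_calls++;
        [ map { lc $_ } split /\n/, $msg->{raw} ];   # stand-in for real decoding
    };
    return $msg->{decoded};
}

my $msg = new_message("Hello\nWorld");
# A header-only rule would finish here without touching the body.
print "decodes so far: $decode_calls\n";
decoded_body($msg) for 1 .. 2;
print "decodes after two body rules: $decode_calls\n";
```

Under this scheme an early priority range costs a decode only if one of its
rules actually asks for the body.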