You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by bu...@spamassassin.apache.org on 2020/06/07 09:46:30 UTC
[Bug 7822] New: HashBL not examining all addresses in a message
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7822
Bug ID: 7822
Summary: HashBL not examining all addresses in a message
Product: Spamassassin
Version: 3.4.4
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P2
Component: Plugins
Assignee: dev@spamassassin.apache.org
Reporter: mgrant@grant.org
Target Milestone: Undefined
HashBL isn't examining all email addresses in a message.
On line 344 of HashBO.pm, it gets the stripped body of lines back in an array
reference. The join is returning an array and not a string in this case. (see
https://www.perlmonks.org/?node_id=222012). So all we need to do is deref it
into an array.
Here's the fix:
*** /usr/share/perl5/Mail/SpamAssassin/Plugin/HashBL.pm 2020-06-07
05:38:02.517917366 -0400
--- /usr/share/perl5/Mail/SpamAssassin/Plugin/HashBL.pm.orig 2019-11-10
23:09:44.000000000 -0500
***************
*** 341,347 ****
}
}
}
! my $body = join('', @{$pms->get_decoded_stripped_body_text_array()});
if ($opts =~ /\bnouri\b/) {
# strip urls with possible emails inside
$body =~ s#<?https?://\S{0,255}(?:\@|%40)\S{0,255}# #gi;
--- 341,347 ----
}
}
}
! my $body = join('', $pms->get_decoded_stripped_body_text_array());
if ($opts =~ /\bnouri\b/) {
# strip urls with possible emails inside
$body =~ s#<?https?://\S{0,255}(?:\@|%40)\S{0,255}# #gi;
--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 7822] HashBL not examining all addresses in a message
Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7822
--- Comment #8 from Henrik Krohns <ap...@hege.li> ---
Multiple emails now logged.
Sending spamassassin-3.4/lib/Mail/SpamAssassin/Plugin/HashBL.pm
Sending trunk/lib/Mail/SpamAssassin/Plugin/HashBL.pm
Transmitting file data ..done
Committing transaction...
Committed revision 1878574.
Note that 3.4 requires also Revision 1878572, this doesn't work without
check_cleanup callback.
--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 7822] HashBL not examining all addresses in a message
Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7822
Michael Grant <mg...@grant.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |mgrant@grant.org
--- Comment #1 from Michael Grant <mg...@grant.org> ---
(sorry that was a "reverse patch" but you see what I did)
--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 7822] HashBL not examining all addresses in a message
Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7822
Henrik Krohns <ap...@hege.li> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|NEW |RESOLVED
CC| |apache@hege.li
--- Comment #2 from Henrik Krohns <ap...@hege.li> ---
Thanks fixed.
Sending spamassassin-3.4/lib/Mail/SpamAssassin/Plugin/HashBL.pm
Sending trunk/lib/Mail/SpamAssassin/Plugin/HashBL.pm
Transmitting file data ..done
Committing transaction...
Committed revision 1878559.
--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 7822] HashBL not examining all addresses in a message
Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7822
--- Comment #9 from Henrik Krohns <ap...@hege.li> ---
(In reply to Henrik Krohns from comment #8)
> Multiple emails now logged.
>
> Sending spamassassin-3.4/lib/Mail/SpamAssassin/Plugin/HashBL.pm
> Sending trunk/lib/Mail/SpamAssassin/Plugin/HashBL.pm
> Transmitting file data ..done
> Committing transaction...
> Committed revision 1878574.
>
> Note that 3.4 requires also Revision 1878572, this doesn't work without
> check_cleanup callback.
FYI this change was reverted in 3.4, due to Bug 7897.
--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 7822] HashBL not examining all addresses in a message
Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7822
--- Comment #6 from Michael Grant <mg...@grant.org> ---
Not shouting at you. With such a long report, I am merely trying to draw
attention to what I thought was the relavant part of the message, as in bold.
Thanks for pointing out that those were not deprecated, that's my misreading of
the docs then.
I did spend about 3 full days digging into these things. I'm no expert at
spamassassin modules. I hope at least I shedded some light on what's going on
or at minimum make it clear that there seems to be problems that I couldn't fix
and someone with better knowledge/experience with these modules can take it
further.
At minimum, with the changes I did, the module *seems* usable. (my emphasis,
not shouting!)
--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 7822] HashBL not examining all addresses in a message
Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7822
--- Comment #4 from Henrik Krohns <ap...@hege.li> ---
(In reply to Michael Grant from comment #3)
> HashBL uses this bgsend_and_start_lookup function which seems to be
> DEPRECATED.
What makes you think this? It's not.
In trunk the workings were streamlined and register_async_rule_* was deprecated
as redundant there.
> It then calls _fininsh_query() which calls got_hit() which I think
> may be not causing it to collect up any further evidence.
All DNS queries are independent and each of them will hit _finish_query() in
the end, nothing stopping the rest from hitting.
> PLEASE REPORT ALL ADDRESSES that get hits in the hashbl.
Shouting doesn't help, but I'll try to have a look.
--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 7822] HashBL not examining all addresses in a message
Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7822
--- Comment #7 from Henrik Krohns <ap...@hege.li> ---
(In reply to Michael Grant from comment #6)
>
> At minimum, with the changes I did, the module *seems* usable. (my emphasis,
> not shouting!)
Your logic is in the right direction, one can save all the hits without calling
got_hit() and only later check if there was a hit. But it needs to be done in
slightly more correct way, I'll see if I can make it today..
--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 7822] HashBL not examining all addresses in a message
Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7822
--- Comment #5 from Henrik Krohns <ap...@hege.li> ---
(In reply to Henrik Krohns from comment #4)
>
> Shouting doesn't help, but I'll try to have a look.
I've created Bug 7824, but it's not a simple fix. So it will only be
implemented in 4.0+ if someone has time for it.
--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 7822] HashBL not examining all addresses in a message
Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7822
--- Comment #3 from Michael Grant <mg...@grant.org> ---
While debugging this, I also see that HashBL.pm is only reporting the first
hit.
HashBL.pm looks though a message and extracts out all the email addresses in
the message headers and body.
* 0.1 HASHBL_EMAIL Message contains email address found on
* the EBL
* [user[at]example.com]
Here's what's going on. On lines 646,647:
646│ $ent = $pms->{async}->bgsend_and_start_lookup($lookup, $type, undef,
$ent,
647│ sub { my ($ent, $pkt) = @_; $self->_finish_query($pms, $ent, $pkt); },
HashBL uses this bgsend_and_start_lookup function which seems to be DEPRECATED.
It then calls _fininsh_query() which calls got_hit() which I think may be not
causing it to collect up any further evidence.
One could argue that if a message contains at least one address on the hashbl,
why continue? However, we have already made the queries for each address (or
gotten them from the cache). If you don't put them in the headers, you obscure
why valuable information in downstream spam reporting as to why this message
was flagged as spam. PLEASE REPORT ALL ADDRESSES that get hits in the hashbl.
Here's a larger fix for that, but I don't know if I did this the best, cleanest
way. Probably updating this module not to use the deprecated
bgsend_and_start_lookup() function and rewriting whatever replacement for the
finish_query() function is probably the better way to fix this, but for now,
this at least shows me all the addresses that get hits in a , separated list.
I did not know how to get the addresses to come out on separate lines, and
adding a space after the , (as in ", ") seems to force a separate line in a
strange way.
*** /usr/share/perl5/Mail/SpamAssassin/Plugin/HashBL.pm.orig 2019-11-10
23:09:44.000000000 -0500
--- /usr/share/perl5/Mail/SpamAssassin/Plugin/HashBL.pm 2020-06-07
06:53:06.459383712 -0400
***************
*** 150,155 ****
--- 150,157 ----
$self->register_eval_rule("check_hashbl_uris");
$self->register_eval_rule("check_hashbl_bodyre");
$self->set_config($mailsa->{conf});
+ $self->{outstanding_queries}=0;
+ $self->{hits}="";
return $self;
}
***************
*** 341,347 ****
}
}
}
! my $body = join('', $pms->get_decoded_stripped_body_text_array());
if ($opts =~ /\bnouri\b/) {
# strip urls with possible emails inside
$body =~ s#<?https?://\S{0,255}(?:\@|%40)\S{0,255}# #gi;
--- 343,349 ----
}
}
}
! my $body = join('', @{$pms->get_decoded_stripped_body_text_array()});
if ($opts =~ /\bnouri\b/) {
# strip urls with possible emails inside
$body =~ s#<?https?://\S{0,255}(?:\@|%40)\S{0,255}# #gi;
***************
*** 643,648 ****
--- 645,651 ----
value => $value,
subtest => $subtest,
};
+ $self->{outstanding_queries}++;
$ent = $pms->{async}->bgsend_and_start_lookup($lookup, $type, undef, $ent,
sub { my ($ent, $pkt) = @_; $self->_finish_query($pms, $ent, $pkt); },
master_deadline => $pms->{master_deadline}
***************
*** 665,676 ****
if ($rr->address =~ $dnsmatch) {
dbg("$ent->{rulename}: $ent->{zone} hit '$ent->{value}'");
$ent->{value} =~ s/\@/[at]/g;
! $pms->test_log($ent->{value});
$pms->got_hit($ent->{rulename}, '', ruletype => 'eval');
$pms->register_async_rule_finish($ent->{rulename});
- return;
- }
}
}
# Version features
--- 668,684 ----
if ($rr->address =~ $dnsmatch) {
dbg("$ent->{rulename}: $ent->{zone} hit '$ent->{value}'");
$ent->{value} =~ s/\@/[at]/g;
! if ($self->{hits}) { $self->{hits} .= ","; }
! $self->{hits} .= $ent->{value};
! }
! }
! $self->{outstanding_queries}--;
! if ($self->{outstanding_queries}==0) {
! $pms->test_log($self->{hits});
$pms->got_hit($ent->{rulename}, '', ruletype => 'eval');
$pms->register_async_rule_finish($ent->{rulename});
}
+ return;
}
# Version features
--
You are receiving this mail because:
You are the assignee for the bug.