Posted to users@spamassassin.apache.org by Amir Caspi <ce...@3phase.com> on 2015/10/19 20:07:04 UTC

Misbehaving HEADER_HOST_IN_BLACKLIST? And no SPF on SA list host?

Hi,

	I didn't realize this until now, but for at least the last six months or so, a few emails from users@spamassassin have been dropped into my spam folder due to what I perceive to be a bug in the HEADER_HOST_IN_BLACKLIST rule.  Specifically, I've got some blacklist_uri_host entries, but because I don't want those to be poison pills, I've adjusted URI_HOST_IN_BLACKLIST to score only 3 points nominally (technically, 4  3.5  4  3, but for me that's almost always "3").  That score redefinition works fine, but then HEADER_HOST_IN_BLACKLIST hits with 100 points, even though the blacklisted URI is NOT in the headers, and in any case I haven't blacklisted a header host, only a URI host.
	I've ALSO got a whitelist_from_spf rule for the SA list host, but that doesn't seem to be hitting at all.

To be specific, here's the relevant excerpt from local.cf:

# Don't scan the SA mailing list
whitelist_from_spf *@spamassassin.apache.org

# Add spamminess to blacklisted hosts but not poison-pill
ifplugin Mail::SpamAssassin::Plugin::WLBLEval
score URI_HOST_IN_BLACKLIST 4  3.5  4  3
blacklist_uri_host [redacted for brevity]
[repeat above a few times]
endif

And, here are a couple of spamples that have hit this problem, one from April 2015, one from just last week:
http://pastebin.com/vpXAVjaH
http://pastebin.com/B3kFg4Xn

In #1, note that the blacklisted URI appears only in the body, NOT in the headers.  In #2, note that the blacklisted URI appears in a _quoted_ header in the body, but again not in a real header.  In neither email does my SPF whitelist rule hit... for #2 that appears to be because the mail came from nike.apache.org (with SPF pass), but for #1 (and, in fact, ALL recent messages), it appears the mail came from mail.apache.org aka hermes.apache.org, with _NO_ SPF pass.

(1) What's going on with HEADER_HOST_IN_BLACKLIST?  This appears to be a bug, to me...
(2) Why is there no SPF pass on the SA mailing list host @apache, and does this mean I need to use a whitelist_to instead of whitelist_from_spf?

Thanks.

--- Amir


Re: Misbehaving HEADER_HOST_IN_BLACKLIST? And no SPF on SA list host?

Posted by RW <rw...@googlemail.com>.
On Wed, 21 Oct 2015 18:34:27 -0700
Kevin A. McGrail wrote:


> On October 20, 2015 11:39:36 AM PDT, Amir Caspi <ce...@3phase.com>
> wrote:
> >On Oct 19, 2015, at 1:16 PM, RW <rw...@googlemail.com> wrote:
> >  
> >> body   URI_HOST_IN_BLACKLIST    eval:check_uri_host_in_blacklist()
> >> header HEADER_HOST_IN_BLACKLIST eval:check_uri_host_listed('BLACK')
> >> 
> >> These appear to be the same thing. The first call is just a
> >> shorthand form for the second. I don't see where headers come into
> >> it. I think the second rule is probably just a mistake.
> >
> >So, following up on this... do any of the main devs see the second
> >rule as a problem?  It seems to me that a header rule shouldn't be
> >checking URI hosts, but even if so, it absolutely shouldn't be
> >hitting when those hosts aren't even in the headers (per the two
> >spamples I posted).

> I want to run the samples you provided and see if I can duplicate the
> issue but it definitely sounds odd. Regards,
> KAM

See

 https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7256

If you look at the code it's fairly obvious that they are the same
rule.  check_uri_host_in_blacklist() just passes BLACK to
check_uri_host_listed(). 

All the work is done in the first invocation of
check_uri_host_listed(), and the cached matches are indexed only by the
list name (BLACK) without any header/body distinction.

From a cursory look at _check_uri_host_listed() it appears to be
doing what the name implies - it checks URIs.
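
The behaviour described above can be pictured with a small toy model. This is only a sketch in Python, not the plugin's actual Perl code: the Message class, the HOST_LISTS and RULES tables, run_rules() and the host names are all made up for illustration. The only details taken from the thread are the two rule definitions and the observation that cached results are keyed by list name alone.

```python
# Toy model of the behaviour described above. NOT SpamAssassin's Perl code;
# every class, table and host name here is a hypothetical stand-in.

class Message:
    def __init__(self, header_hosts, body_uri_hosts):
        self.header_hosts = set(header_hosts)      # never consulted below
        self.body_uri_hosts = set(body_uri_hosts)  # hosts taken from URIs
        self.cache = {}  # keyed by list name only, e.g. "BLACK"


HOST_LISTS = {"BLACK": {"bad.example"}}


def check_uri_host_listed(msg, list_name):
    """Return True if any URI host in the message is on the named list."""
    if list_name not in msg.cache:
        # The rule type (header vs. body) never enters the lookup:
        # only the URI hosts extracted from the message are checked.
        listed = HOST_LISTS.get(list_name, set())
        msg.cache[list_name] = bool(msg.body_uri_hosts & listed)
    return msg.cache[list_name]


def check_uri_host_in_blacklist(msg):
    # Shorthand that just passes BLACK to the generic check.
    return check_uri_host_listed(msg, "BLACK")


RULES = [
    ("body", "URI_HOST_IN_BLACKLIST", check_uri_host_in_blacklist),
    ("header", "HEADER_HOST_IN_BLACKLIST",
     lambda m: check_uri_host_listed(m, "BLACK")),
]


def run_rules(msg):
    # The rule kind is carried along but has no effect on the result.
    return [name for _kind, name, fn in RULES if fn(msg)]


msg = Message(header_hosts=["mail.apache.org"], body_uri_hosts=["bad.example"])
hits = run_rules(msg)
print(hits)
```

In this model both rule names come back for a message whose only listed host is in the body, which matches the double hit reported earlier in the thread. Whether the real eval was meant to respect the rule type is exactly the open question in bug 7256.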

Re: Misbehaving HEADER_HOST_IN_BLACKLIST? And no SPF on SA list host?

Posted by Amir Caspi <ce...@3phase.com>.
On Oct 21, 2015, at 7:34 PM, Kevin A. McGrail <KM...@PCCC.com> wrote:

> I want to run the samples you provided and see if I can duplicate the issue but it definitely sounds odd.

I've got four more of them, if you want.  (Includes a reply to one of the spamples, a separate two-message thread, and another single message.)

--- Amir


Re: Misbehaving HEADER_HOST_IN_BLACKLIST? And no SPF on SA list host?

Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
I want to run the samples you provided and see if I can duplicate the issue but it definitely sounds odd.
Regards,
KAM

On October 20, 2015 11:39:36 AM PDT, Amir Caspi <ce...@3phase.com> wrote:
>On Oct 19, 2015, at 1:16 PM, RW <rw...@googlemail.com> wrote:
>
>> body   URI_HOST_IN_BLACKLIST    eval:check_uri_host_in_blacklist()
>> header HEADER_HOST_IN_BLACKLIST eval:check_uri_host_listed('BLACK')
>> 
>> These appear to be the same thing. The first call is just a shorthand
>> form for the second. I don't see where headers come into it. I think the
>> second rule is probably just a mistake.
>
>So, following up on this... do any of the main devs see the second rule
>as a problem?  It seems to me that a header rule shouldn't be checking
>URI hosts, but even if so, it absolutely shouldn't be hitting when
>those hosts aren't even in the headers (per the two spamples I posted).
>
>Kevin, John, others?
>
>Obviously this is only causing a few rare FPs, and presumably it would
>most likely affect this or some other spam-discussion list... but it
>appears to be a bug, no?
>
>Thanks!
>
>--- Amir

Re: Misbehaving HEADER_HOST_IN_BLACKLIST? And no SPF on SA list host?

Posted by RW <rw...@googlemail.com>.
On Tue, 20 Oct 2015 11:58:11 -0700 (PDT)
John Hardin wrote:

> On Tue, 20 Oct 2015, Amir Caspi wrote:
> 
> > On Oct 19, 2015, at 1:16 PM, RW <rw...@googlemail.com> wrote:
> >
> >> body   URI_HOST_IN_BLACKLIST    eval:check_uri_host_in_blacklist()
> >> header HEADER_HOST_IN_BLACKLIST eval:check_uri_host_listed('BLACK')
> >>
> >> These appear to be the same thing. The first call is just a
> >> shorthand form for the second. I don't see where headers come into
> >> it. I think the second rule is probably just a mistake.
> >
> > So, following up on this... do any of the main devs see the second
> > rule as a problem?  It seems to me that a header rule shouldn't be
> > checking URI hosts, but even if so, it absolutely shouldn't be
> > hitting when those hosts aren't even in the headers (per the two
> > spamples I posted).
> 
> My default assumption for the behavior of a header eval() rule would
> be that it only checks message headers. If that's not the case (as
> you describe) then I'd agree the rule is a problem, especially if it
> leads to duplicate hits.
> 
> Whether that's a bug in the documentation, or a bug in the rules, or
> a bug in eval(), or a bug in the implementation of check_uri_host_*,
> I can't really say at this point.

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7256

Re: Misbehaving HEADER_HOST_IN_BLACKLIST? And no SPF on SA list host?

Posted by John Hardin <jh...@impsec.org>.
On Tue, 20 Oct 2015, Amir Caspi wrote:

> On Oct 19, 2015, at 1:16 PM, RW <rw...@googlemail.com> wrote:
>
>> body   URI_HOST_IN_BLACKLIST    eval:check_uri_host_in_blacklist()
>> header HEADER_HOST_IN_BLACKLIST eval:check_uri_host_listed('BLACK')
>>
>> These appear to be the same thing. The first call is just a shorthand
>> form for the second. I don't see where headers come into it. I think the
>> second rule is probably just a mistake.
>
> So, following up on this... do any of the main devs see the second rule 
> as a problem?  It seems to me that a header rule shouldn't be checking
> URI hosts, but even if so, it absolutely shouldn't be hitting when those 
> hosts aren't even in the headers (per the two spamples I posted).

My default assumption for the behavior of a header eval() rule would be 
that it only checks message headers. If that's not the case (as you 
describe) then I'd agree the rule is a problem, especially if it leads to 
duplicate hits.

Whether that's a bug in the documentation, or a bug in the rules, or a bug 
in eval(), or a bug in the implementation of check_uri_host_*, I can't 
really say at this point.

Speculation: If the check_uri_host_* eval()s are looking only at the URI 
list regardless of the rule type (i.e. it always behaves as if it was a 
uri rule) then I'd say that needs to be documented clearly (if it isn't 
documented by more than just an example uri rule) and the rules fixed to 
remove the duplicate hits. If the intent of the eval()s was to respect the 
rule type, it's apparently not doing that.

I don't have time at the moment to dig around in the code to see what it's 
doing and whether it's a documentation/rule issue or an eval() code issue.

> Kevin, John, others?
>
> Obviously this is only causing a few rare FPs, and presumably it would 
> most likely affect this or some other spam-discussion list... but it 
> appears to be a bug, no?
>
> Thanks!
>
> --- Amir
>

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79

Re: Misbehaving HEADER_HOST_IN_BLACKLIST? And no SPF on SA list host?

Posted by Amir Caspi <ce...@3phase.com>.
On Oct 19, 2015, at 1:16 PM, RW <rw...@googlemail.com> wrote:

> body   URI_HOST_IN_BLACKLIST    eval:check_uri_host_in_blacklist()
> header HEADER_HOST_IN_BLACKLIST eval:check_uri_host_listed('BLACK')
> 
> These appear to be the same thing. The first call is just a shorthand
> form for the second. I don't see where headers come into it. I think the
> second rule is probably just a mistake.

So, following up on this... do any of the main devs see the second rule as a problem?  It seems to me that a header rule shouldn't be checking URI hosts, but even if so, it absolutely shouldn't be hitting when those hosts aren't even in the headers (per the two spamples I posted).

Kevin, John, others?

Obviously this is only causing a few rare FPs, and presumably it would most likely affect this or some other spam-discussion list... but it appears to be a bug, no?

Thanks!

--- Amir


Re: Misbehaving HEADER_HOST_IN_BLACKLIST? And no SPF on SA list host?

Posted by RW <rw...@googlemail.com>.
On Mon, 19 Oct 2015 13:26:09 -0600
Amir Caspi wrote:

> On Oct 19, 2015, at 1:16 PM, RW <rw...@googlemail.com> wrote:
> > 
> > IIWY I wouldn't try to rescore the blacklisted URIs. I'd create a
> > separate list for the TLDs
> 
> Why? It might avoid this issue but IMHO the second rule is a bug, 

Yes

> so that's a band-aid rather than a solution.

I wasn't suggesting it as a workaround. You can have as many lists as
you like, BLACK and WHITE are just 2 predefined labels. If it were me
I'd rather keep the blacklist for scoring hosts/domains that are
controlled by spammers - even if I didn't currently have anything to
put in it.

Re: Misbehaving HEADER_HOST_IN_BLACKLIST? And no SPF on SA list host?

Posted by Amir Caspi <ce...@3phase.com>.
On Oct 19, 2015, at 1:16 PM, RW <rw...@googlemail.com> wrote:
> 
> IIWY I wouldn't try to rescore the blacklisted URIs. I'd create a
> separate list for the TLDs

Why? It might avoid this issue but IMHO the second rule is a bug, so that's a band-aid rather than a solution. I don't want a 100-point poison pill in general, hence the rescoring. Much easier than multiple-line rules.

Thanks.

--- Amir
thumbed via iPhone

Re: Misbehaving HEADER_HOST_IN_BLACKLIST? And no SPF on SA list host?

Posted by RW <rw...@googlemail.com>.
On Mon, 19 Oct 2015 12:07:04 -0600
Amir Caspi wrote:

> Hi,
> 
> 	I didn't realize this until now but it looks like, for at
> least the last 6 months or so, a few emails from users@spamassassin
> have been dropped into my spam folder due to what I perceive to be a
> bug in the HEADER_HOST_IN_BLACKLIST rule.  Specifically, I've got
> some blacklist_uri_host rules, but because I don't want those to be
> poison pills, I've adjusted URI_HOST_IN_BLACKLIST to score only 3
> points nominally (technically, 4  3.5  4  3, but for me that's almost
> always "3").  That score redef works fine, but then
> HEADER_HOST_IN_BLACKLIST hits with 100 points, even though the
> blacklisted URI is NOT in the headers, 

I haven't really paid much attention to uri host rules, so I'm not
certain what's supposed to be happening but the definitions are:

body   URI_HOST_IN_BLACKLIST    eval:check_uri_host_in_blacklist()
header HEADER_HOST_IN_BLACKLIST eval:check_uri_host_listed('BLACK')

These appear to be the same thing. The first call is just a shorthand
form for the second. I don't see where headers come into it. I think the
second rule is probably just a mistake.



If I were you, I wouldn't try to rescore the blacklisted URIs. I'd create a
separate list for the TLDs:

enlist_uri_host (NEW_TLDS) science xxx 
...

body   URI_NEW_TLDS    eval:check_uri_host_listed('NEW_TLDS')
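
One way to picture the separate-list idea is the toy Python model below. It is a sketch under stated assumptions, not SpamAssassin code: the HOST_LISTS and RULES tables, host_on_list(), score_hosts(), the score of 3.0 for URI_NEW_TLDS and the example host names are all hypothetical. It also assumes that a list entry matches a host when it equals the host or one of its parent-domain suffixes, which may not be exactly how the plugin matches.

```python
# Toy model of keeping new TLDs on their own named list.
# Names, scores and the matching rule are illustrative assumptions.

HOST_LISTS = {
    "BLACK": set(),                  # kept for spammer-controlled hosts
    "NEW_TLDS": {"science", "xxx"},  # from the enlist_uri_host line above
}

# rule name -> (list name, score); both scores are assumed values
RULES = {
    "URI_HOST_IN_BLACKLIST": ("BLACK", 100.0),  # poison-pill style score
    "URI_NEW_TLDS": ("NEW_TLDS", 3.0),          # made-up modest score
}


def host_on_list(host, list_name):
    """Assumed matching: an entry matches the host or any parent suffix."""
    labels = host.lower().rstrip(".").split(".")
    suffixes = {".".join(labels[i:]) for i in range(len(labels))}
    return bool(suffixes & HOST_LISTS[list_name])


def score_hosts(uri_hosts):
    """Return {rule name: score} for every rule whose list matches a host."""
    hits = {}
    for rule, (list_name, score) in RULES.items():
        if any(host_on_list(h, list_name) for h in uri_hosts):
            hits[rule] = score
    return hits


hits = score_hosts(["promo.example.science", "www.example.org"])
print(hits)
```

With the TLDs on their own list, only the URI_NEW_TLDS rule fires in this model, so nothing tied to the BLACK list is involved and no rescoring of the blacklist rules is needed.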