Posted to users@spamassassin.apache.org by Paul Pace <pa...@mostlybsd.com> on 2022/05/07 14:42:59 UTC

Spamhaus spurious positives - how does SpamAssassin check Spamhaus?

I have set up SpamAssassin with the following in 
/etc/spamassassin/mycustomscores.cf:

score RCVD_IN_SBL	10.0
score RCVD_IN_XBL	10.0
score RCVD_IN_PBL	10.0
score RCVD_IN_SBL_CSS	10.0
score URIBL_SBL		10.0
score URIBL_CSS		10.0
score URIBL_CSS_A	10.0
score URIBL_SBL_A	10.0

I do not otherwise block using Spamhaus at the MTA or elsewhere.

I occasionally see false positives because of these scores and it is 
when a domain is in the body of a message. When I check the Spamhaus 
website[1], the domain is not there. Each time this has occurred, it has 
been for a website currently in the news and usually something to do 
with politics.

A few days ago I happened to be on my computer exactly when one of these 
false positives came in[2]. I immediately went and checked the Spamhaus 
site and the domain was not listed. I checked several times throughout 
the day and never saw the domain there.

So I am trying to figure out why there is a disparity between what 
SpamAssassin reports and the Spamhaus website reports, but I'm not clear 
how SpamAssassin checks Spamhaus, and since these are usually domains I 
rarely have in a message any place, I don't have a good feel for whether 
or not this is some regular problem.

If anyone can point me to how this check is performed, that would be 
very helpful.

Thank you,


Paul

[1] https://check.spamhaus.org/
[2] Scores:
	*   10 URIBL_SBL_A Contains URL's A record listed in the Spamhaus SBL
	*      blocklist
	*      [URIs: wikileaksdotorg]
	*   10 URIBL_SBL Contains an URL's NS IP listed in the Spamhaus SBL
	*      blocklist
	*      [URIs: wikileaksdotorg]

Re: Spamhaus spurious positives - how does SpamAssassin check Spamhaus?

Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 2022-05-07 at 10:42:59 UTC-0400 (Sat, 07 May 2022 07:42:59 -0700)
Paul Pace <pa...@mostlybsd.com>
is rumored to have said:

> I have set up SpamAssassin with the following in /etc/spamassassin/mycustomscores.cf:
>
> score RCVD_IN_SBL	10.0
> score RCVD_IN_XBL	10.0
> score RCVD_IN_PBL	10.0
> score RCVD_IN_SBL_CSS	10.0

Not entirely unreasonable. Cheaper to do most of that in the MTA, unless you have complex whitelisting needs.

> score URIBL_SBL		10.0
> score URIBL_CSS		10.0
> score URIBL_CSS_A	10.0
> score URIBL_SBL_A	10.0

I'm surprised that this is anywhere near usable.

> I do not otherwise block using Spamhaus at the MTA or elsewhere.
>
> I occasionally see false positives because of these scores and it is when a domain is in the body of a message.

So: the URIBL_* rules.

> When I check the Spamhaus website[1], the domain is not there. Each time this has occurred, it has been for a website currently in the news and usually something to do with politics.
>
> A few days ago I happened to be on my computer exactly when one of these false positives came in[2]. I immediately went and checked the Spamhaus site and the domain was not listed. I checked several times throughout the day and never saw the domain there.

The Spamhaus SBL will never show any domain name as listed because it does not list domain names. It lists IP addresses.

> So I am trying to figure out why there is a disparity between what SpamAssassin reports and the Spamhaus website reports, but I'm not clear how SpamAssassin checks Spamhaus, and since these are usually domains I rarely have in a message any place, I don't have a good feel for whether or not this is some regular problem.
>
> If anyone can point me to how this check is performed, that would be very helpful.
>
> Thank you,
>
>
> Paul
>
> [1] https://check.spamhaus.org/
> [2] Scores:
> 	*   10 URIBL_SBL_A Contains URL's A record listed in the Spamhaus SBL
> 	*      blocklist
> 	*      [URIs: wikileaksdotorg]
> 	*   10 URIBL_SBL Contains an URL's NS IP listed in the Spamhaus SBL
> 	*      blocklist
> 	*      [URIs: wikileaksdotorg]


Read the rule descriptions carefully. Also see the rule definitions and `perldoc Mail::SpamAssassin::Plugin::URIDNSBL`.

SBL, including its CSS component, lists IP addresses, NOT domain names. In these cases, as documented, SA looks up a specific record type (A, NS, or MX) for a name extracted from a URL to get one or more IP addresses, and then checks those IP addresses against the DNSBL.
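
The two-step lookup described above can be sketched roughly as follows. This is only an illustration in Python, not SpamAssassin's actual Perl implementation: `check_uri_domain`, `resolve_ips`, and `ip_is_listed` are hypothetical names, and the domain and addresses in the usage example are documentation-range placeholders, not real listings.

```python
from typing import Callable, Dict, Iterable, List


def check_uri_domain(
    domain: str,
    resolve_ips: Callable[[str, str], Iterable[str]],
    ip_is_listed: Callable[[str], bool],
    record_types: Iterable[str] = ("A", "NS"),
) -> Dict[str, List[str]]:
    """Return {record_type: [listed IPs]} for a domain extracted from a URL.

    resolve_ips(domain, rtype) is a stand-in for the DNS step that turns the
    domain into IP addresses (for "NS", the addresses of its name servers).
    ip_is_listed(ip) is a stand-in for the DNSBL query on one IP address.
    """
    hits: Dict[str, List[str]] = {}
    for rtype in record_types:
        # Only IP addresses are ever checked against the blocklist.
        listed = [ip for ip in resolve_ips(domain, rtype) if ip_is_listed(ip)]
        if listed:
            hits[rtype] = listed
    return hits


# Usage with made-up data: the name server's address is "listed", the
# web server's address is not.
fake_dns = {
    ("example.org", "A"): ["192.0.2.10"],
    ("example.org", "NS"): ["198.51.100.53"],
}
fake_blocklist = {"198.51.100.53"}

result = check_uri_domain(
    "example.org",
    lambda name, rtype: fake_dns.get((name, rtype), []),
    lambda ip: ip in fake_blocklist,
)
print(result)  # {'NS': ['198.51.100.53']}
```

In this sketch the domain name itself is never sent to the blocklist, which is why searching for the domain on the Spamhaus site can come up empty even though a rule fired.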

-- 
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire

Re: Spamhaus spurious positives - how does SpamAssassin check Spamhaus?

Posted by Pedro David Marco <pe...@yahoo.com>.
 To me it looks like a DNS cache timing issue...
Paul, what resolver are you using?
Is your server under heavy load when this happens? If it is Linux, run    netstat -suna    and check for any errors in the Udp area. On FreeBSD, run    netstat -sa
----Pedro.

   On Saturday, May 7, 2022, 06:36:43 PM GMT+2, Paul Pace <pa...@mostlybsd.com> wrote:  
 
> On 2022-05-07 07:53, Benny Pedersen wrote:
>> On 2022-05-07 16:42, Paul Pace wrote:
>>> I have set up SpamAssassin with the following in
>>> /etc/spamassassin/mycustomscores.cf:
>>
>>>     *  10 URIBL_SBL Contains an URL's NS IP listed in the Spamhaus SBL
>>>     *      blocklist
>>>     *      [URIs: wikileaksdotorg]
>>
>> add to /etc/spamassassin/mycustomskipuribl.cf:
>>
>> skip_uribl_domains wikileaksdotorg
>
> The problem with this solution is I don't know which domain is going to
> be next, plus I'm not so much looking for a solution to this specific
> result, but rather I want to understand why there is a disparity between
> what SpamAssassin is reporting and what the Spamhaus website is
> reporting.
>
>>
>> or reduce spamhaus score
>
> With this I will get more spam in my inbox, especially spam sent from
> compromised accounts which usually have lots of positive modifiers.

Re: Spamhaus spurious positives - how does SpamAssassin check Spamhaus?

Posted by Paul Pace <pa...@mostlybsd.com>.
On 2022-05-07 10:37, Matija Nalis wrote:
> On Sat, May 07, 2022 at 09:35:31AM -0700, Paul Pace wrote:
>> On 2022-05-07 07:53, Benny Pedersen wrote:
>> > On 2022-05-07 16:42, Paul Pace wrote:
>> > > 	*   10 URIBL_SBL Contains an URL's NS IP listed in the Spamhaus SBL
>> > > 	*      blocklist
>> > > 	*      [URIs: wikileaksdotorg]
>> 
>> The problem with this solution is I don't know which domain is going to be
>> next, plus I'm not so much looking for a solution to this specific result,
>> but rather I want to understand why there is a disparity between what
>> SpamAssassin is reporting and what the Spamhaus website is reporting.
> 
> If you do:
> 
> grep -r URIBL_SBL /var/lib/spamassassin/
> you'll see it does this:
> 
> /var/lib/spamassassin/3.004006/updates_spamassassin_org/25_uribl.cf:uridnssub       URIBL_SBL        zen.spamhaus.org.       A   127.0.0.2
> /var/lib/spamassassin/3.004006/updates_spamassassin_org/25_uribl.cf:body            URIBL_SBL        eval:check_uridnsbl('URIBL_SBL')
> /var/lib/spamassassin/3.004006/updates_spamassassin_org/25_uribl.cf:describe        URIBL_SBL        Contains an URL's NS IP listed in the Spamhaus SBL blocklist
> 
> which means that if it wanted to check (for example) 195.35.109.44, it would do
> a DNS A record lookup on "44.109.35.195.zen.spamhaus.org" (note the reversed quads)
> and check whether the result is "127.0.0.2" (which happens to be true in this case
> at the moment, but might not be some time later):
> 
> % host -t a 44.109.35.195.zen.spamhaus.org
> 44.109.35.195.zen.spamhaus.org has address 127.0.0.2
> 
> The same procedure can be used for other RBLs.
> 
> As to why the web lookup returns a different result, it might be because
> the DNS result was cached earlier (maybe by some previous spam message),
> and/or because you did not look it up fast enough. Data on RBL
> servers changes all the time, and there is usually a delay between
> their current database (which is likely what the web interface looks
> up directly) and their published DNS records (which would lag behind
> it).
> 
> Anyway, if you do the DNS check at the same time (or very close; I think
> the default TTL there is 60 seconds) as SpamAssassin does it, you should
> get the same result. If you do it minutes or hours later, the results
> might be different again (how often they change depends on the RBL in
> question, as well as your luck).

Thank you, this is exactly what I was looking for. Using dig it looks 
like the TTL is 2100.

Re: Spamhaus spurious positives - how does SpamAssassin check Spamhaus?

Posted by Matija Nalis <mn...@voyager.hr>.
On Sat, May 07, 2022 at 09:35:31AM -0700, Paul Pace wrote:
> On 2022-05-07 07:53, Benny Pedersen wrote:
> > On 2022-05-07 16:42, Paul Pace wrote:
> > > 	*   10 URIBL_SBL Contains an URL's NS IP listed in the Spamhaus SBL
> > > 	*      blocklist
> > > 	*      [URIs: wikileaksdotorg]
> 
> The problem with this solution is I don't know which domain is going to be
> next, plus I'm not so much looking for a solution to this specific result,
> but rather I want to understand why there is a disparity between what
> SpamAssassin is reporting and what the Spamhaus website is reporting.

If you do:

grep -r URIBL_SBL /var/lib/spamassassin/

you'll see it does this:

/var/lib/spamassassin/3.004006/updates_spamassassin_org/25_uribl.cf:uridnssub       URIBL_SBL        zen.spamhaus.org.       A   127.0.0.2
/var/lib/spamassassin/3.004006/updates_spamassassin_org/25_uribl.cf:body            URIBL_SBL        eval:check_uridnsbl('URIBL_SBL')
/var/lib/spamassassin/3.004006/updates_spamassassin_org/25_uribl.cf:describe        URIBL_SBL        Contains an URL's NS IP listed in the Spamhaus SBL blocklist

which means that if it wanted to check (for example) 195.35.109.44, it would do
a DNS A record lookup on "44.109.35.195.zen.spamhaus.org" (note the reversed quads)
and check whether the result is "127.0.0.2" (which happens to be true in this case
at the moment, but might not be some time later):

% host -t a 44.109.35.195.zen.spamhaus.org
44.109.35.195.zen.spamhaus.org has address 127.0.0.2

The same procedure can be used for other RBLs.
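
For anyone who wants to repeat this from a script, here is a minimal Python sketch of the same query. `dnsbl_query_name` and `dnsbl_answers` are names made up for this example; the zone and the address are the ones from the example above. The live lookup needs network access, and its answer depends on your resolver and on when you ask, so treat it as illustrative only.

```python
import ipaddress
import socket
from typing import List


def dnsbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the DNSBL query name: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # validates the input
    return ".".join(reversed(octets)) + "." + zone


def dnsbl_answers(ip: str, zone: str = "zen.spamhaus.org") -> List[str]:
    """Do the live A lookup and return the addresses in the answer.

    An empty list means the lookup failed, which includes the normal
    "not listed" (NXDOMAIN) case but also any other resolver error.
    """
    try:
        return socket.gethostbyname_ex(dnsbl_query_name(ip, zone))[2]
    except OSError:
        return []


print(dnsbl_query_name("195.35.109.44"))  # 44.109.35.195.zen.spamhaus.org
```

Calling `dnsbl_answers("195.35.109.44")` and checking whether `"127.0.0.2"` is in the returned list corresponds to the `host -t a` check above.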

As to why the web lookup returns a different result, it might be because
the DNS result was cached earlier (maybe by some previous spam message),
and/or because you did not look it up fast enough. Data on RBL
servers changes all the time, and there is usually a delay between
their current database (which is likely what the web interface looks
up directly) and their published DNS records (which would lag behind
it).

Anyway, if you do the DNS check at the same time (or very close; I think
the default TTL there is 60 seconds) as SpamAssassin does it, you should
get the same result. If you do it minutes or hours later, the results
might be different again (how often they change depends on the RBL in
question, as well as your luck).
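
The caching effect can be illustrated with a toy resolver cache. This is a simplified Python sketch, not how any particular resolver is implemented: `TtlCache`, the query name "q.example", and the 60-second TTL are assumptions for the example, and time is passed in explicitly so the behaviour is easy to follow.

```python
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Toy DNS cache: remembers each answer until its TTL runs out."""

    def __init__(self) -> None:
        # name -> (answer, time at which the cached answer expires)
        self._store: Dict[str, Tuple[Optional[str], float]] = {}

    def get(
        self,
        name: str,
        now: float,
        fetch: Callable[[str], Optional[str]],
        ttl: float,
    ) -> Optional[str]:
        entry = self._store.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]  # still fresh: the upstream server is not asked
        answer = fetch(name)  # expired or unknown: ask upstream again
        self._store[name] = (answer, now + ttl)
        return answer


# The "authoritative" data changes while a cached copy is still alive.
authoritative = {"q.example": "127.0.0.2"}
cache = TtlCache()


def fetch(name: str) -> Optional[str]:
    return authoritative.get(name)


first = cache.get("q.example", now=0, fetch=fetch, ttl=60)    # listed, cached
del authoritative["q.example"]                                # entry delisted
stale = cache.get("q.example", now=30, fetch=fetch, ttl=60)   # cached copy
fresh = cache.get("q.example", now=61, fetch=fetch, ttl=60)   # refetched
print(first, stale, fresh)  # 127.0.0.2 127.0.0.2 None
```

Within the TTL the cache keeps reporting the old listing, so a check made through a different path during that window can disagree with what the mail server saw.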

-- 
Opinions above are GNU-copylefted.

Re: Spamhaus spurious positives - how does SpamAssassin check Spamhaus?

Posted by Paul Pace <pa...@mostlybsd.com>.
On 2022-05-07 07:53, Benny Pedersen wrote:
> On 2022-05-07 16:42, Paul Pace wrote:
>> I have set up SpamAssassin with the following in
>> /etc/spamassassin/mycustomscores.cf:
> 
>> 	*   10 URIBL_SBL Contains an URL's NS IP listed in the Spamhaus SBL
>> 	*      blocklist
>> 	*      [URIs: wikileaksdotorg]
> 
> add to /etc/spamassassin/mycustomskipuribl.cf:
> 
> skip_uribl_domains wikileaksdotorg

The problem with this solution is I don't know which domain is going to 
be next, plus I'm not so much looking for a solution to this specific 
result, but rather I want to understand why there is a disparity between 
what SpamAssassin is reporting and what the Spamhaus website is 
reporting.

> 
> or reduce spamhaus score

With this I will get more spam in my inbox, especially spam sent from 
compromised accounts which usually have lots of positive modifiers.

Re: Spamhaus spurious positives - how does SpamAssassin check Spamhaus?

Posted by Benny Pedersen <me...@junc.eu>.
On 2022-05-07 16:42, Paul Pace wrote:
> I have set up SpamAssassin with the following in
> /etc/spamassassin/mycustomscores.cf:

> 	*   10 URIBL_SBL Contains an URL's NS IP listed in the Spamhaus SBL
> 	*      blocklist
> 	*      [URIs: wikileaksdotorg]

add to /etc/spamassassin/mycustomskipuribl.cf:

skip_uribl_domains wikileaksdotorg

or reduce spamhaus score