Posted to users@spamassassin.apache.org by Jeremy Ardley <je...@ardley.org> on 2022/04/07 03:31:46 UTC

Sequential spamassassin scans get different results

I have a mail setup with an internet-facing postfix mail server "edge" 
(LAN name "firewall") and an internal LAN postfix with dovecot server 
"internal".

They both run the same version of SA with the same rules.

"edge" receives internet mail, scans it with spamassassin, and then 
forwards it to "internal" which also scans it with spamassassin.

The problem in this instance is "edge" got a spam score of 21.3, while 
"internal" got a score of 3.3

This is puzzling. Any explanations?

< Below headers from "internal" >

X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on internal.lan
X-Spam-Level: ***
X-Spam-Status: No, score=3.3 required=5.0 tests=ALL_TRUSTED,DATE_IN_PAST_03_06,
     FROM_MISSPACED,HK_NAME_MR_MRS,HTML_MESSAGE,MISSING_HEADERS,
     T_FILL_THIS_FORM_SHORT autolearn=no autolearn_force=no version=3.4.6
Received: from edge.<redacted>
     by <...> (Postfix) with ESMTPS id E64F48601CD
     for <...>; Thu,  7 Apr 2022 09:32:58 +0800 (AWST)

< below headers and content from "edge" aka "firewall" >

Received: by edge.<...> (Postfix, from userid 115)
     id DC8554188D; Thu,  7 Apr 2022 09:32:58 +0800 (AWST)
Received: from localhost by firewall.lan
     with SpamAssassin (version 3.4.6);
     Thu, 07 Apr 2022 09:32:58 +0800

From: "MR. CHRISTOPHER TOWE."<ma...@thaidevhost.com>
Subject: MR. CHRISTOPHER TOWE.Director Airport Inspection Officer United Nations.
Date: Wed, 6 Apr 2022 15:09:53 -0700
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----------=_624E3F4A.AF957A3D"
Message-Id: <20220407013258.DC8554188D@edge.<redacted>

This is a multi-part message in MIME format.

------------=_624E3F4A.AF957A3D
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

Spam detection software, running on the system "<firewall>.lan",
has identified this incoming email as possible spam.  The original
message has been attached to this so you can view it or label
similar future email.  If you have any questions, see
@@CONTACT_ADDRESS@@ for details.

Content preview:  Good day. Thanks, how are you doing today? Hope you are doing
    very fine? I am newly transferred from Hartsfield-Jackson Atlanta International
    Airport to Laguardia International Airport New York City for an impo [...]


Content analysis details:   (21.3 points, 5.0 required)

  pts rule name              description
---- ---------------------- --------------------------------------------------
  1.0 NSL_RCVD_FROM_USER     Received from User
  1.0 FSL_CTYPE_WIN1251      Content-Type only seen in 419 spam
  1.2 MISSING_HEADERS        Missing To: header
  0.1 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
  0.0 HTML_MESSAGE           BODY: HTML included in message
  0.1 MISSING_MID            Missing Message-Id: header
  1.0 AXB_XMAILER_MIMEOLE_OL_024C2 Yet another X header trait
  1.0 HK_NAME_MR_MRS         No description available.
  1.0 FROM_MISSP_USER        From misspaced, from "User"
  0.0 FORGED_OUTLOOK_HTML    Outlook can't send HTML message only
  1.0 FROM_MISSP_MSFT        From misspaced + supposed Microsoft tool
  1.0 FSL_NEW_HELO_USER      Spam's using Helo and User
  0.6 FORGED_OUTLOOK_TAGS    Outlook can't send HTML in this format
  1.9 REPLYTO_WITHOUT_TO_CC  No description available.
  2.5 FREEMAIL_FORGED_REPLYTO Freemail in Reply-To, but not From
  1.0 FROM_MISSP_REPLYTO     From misspaced, has Reply-To
  1.0 TO_NO_BRKTS_FROM_MSSP  Multiple header formatting problems
  1.0 FROM_MISSPACED         From: missing whitespace
  1.0 TO_NO_BRKTS_MSFT       To: lacks brackets and supposed Microsoft tool
  0.0 T_FILL_THIS_FORM_SHORT Fill in a short form with personal
                             information
  2.8 FORGED_MUA_OUTLOOK     Forged mail pretending to be from MS Outlook
  1.0 FORM_FRAUD_3           Fill a form and several fraud phrases

The original message was not completely plain text, and may be unsafe to
open with some email clients; in particular, it may contain a virus,
or confirm that your address can receive spam.  If you wish to view
it, it may be safer to save it to a file and open it with an editor.

-- 
Jeremy


Re: Sequential spamassassin scans get different results

Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
>On 7/4/22 3:09 pm, Matus UHLAR - fantomas wrote:
>>your edge sends the original message as an attachment, so your 
>>internal server cannot process many of the checks.  The SA option 
>>"report_safe" does this.
>>
>>You should either trust the edge server's decision, or not do 
>>scanning there. If you do scan there, set "report_safe 0".

On 07.04.22 15:19, Jeremy Ardley wrote:
>I realised that dual scanning was redundant, so now I have it all done 
>on internal and none on edge. All I need to do now is wait for some 
>juicy spam!
>
>Off topic, I'm now running postscreen on the mail servers. Is that an 
>efficient way to limit spammers? Would I get better results if I 
>greylisted as well as postscreen? Or instead of?

As I understand it, postscreen is a quite effective replacement for greylisting...
-- 
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Linux is like a teepee: no Windows, no Gates and an apache inside...

Re: Sequential spamassassin scans get different results

Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 2022-04-07 at 03:19:34 UTC-0400 (Thu, 7 Apr 2022 15:19:34 +0800)
Jeremy Ardley <je...@ardley.org>
is rumored to have said:

> Off topic, I'm now running postscreen on the mail servers. Is that an efficient way to limit spammers? Would I get better results if I greylisted as well as postscreen? Or instead of?

Postfix's postscreen layer is very good, and pretty much eliminates the most obnoxious 20-80% of spam. Where in that range you land depends on the sort of spam you are targeted by. You should probably NOT enable its "after 220 server greeting" tests. Read the POSTSCREEN_README document in the Postfix docs.

Greylisting can catch more spam, including nearly all of what a conservatively configured postscreen would catch. However, it will also delay a lot of legitimate mail, and on a large enough (or unlucky enough) system it will need oversight to maintain a set of special cases for senders whose perfectly legitimate practices cause problems with greylisting.

I don't use greylisting anywhere, because it creates a support burden. People will complain about delayed legit mail. Some legit mail will be blocked until it is given special accommodation. The same is true of the "after 220 server greeting" tests in postscreen, which are effectively a sloppy sort of greylisting.
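
For readers unfamiliar with the mechanism, the delay described above can be sketched in a few lines of Python. This is only an illustration of the general greylisting idea under stated assumptions: keying on the (client IP, sender, recipient) triplet and a 300-second minimum delay are assumptions here, and the `Greylist` class and its `check` method are hypothetical names, not the behaviour of any particular greylisting package or of postscreen.

```python
from dataclasses import dataclass, field


@dataclass
class Greylist:
    """Toy greylist: defer a triplet until it has been known for min_delay seconds."""

    min_delay: float = 300.0                        # assumed delay, in seconds
    first_seen: dict = field(default_factory=dict)  # triplet -> time of first attempt

    def check(self, client_ip: str, sender: str, recipient: str, now: float) -> str:
        """Return "defer" (a temporary rejection) or "accept" for one delivery attempt."""
        key = (client_ip, sender.lower(), recipient.lower())
        # Remember when this triplet was first seen; later attempts reuse that time.
        seen = self.first_seen.setdefault(key, now)
        if now - seen < self.min_delay:
            return "defer"
        return "accept"


if __name__ == "__main__":
    gl = Greylist()
    # First attempt is always deferred, so even legitimate mail waits for a retry.
    print(gl.check("192.0.2.10", "a@example.com", "b@example.org", now=0.0))
    # A retry after the delay from the same IP is accepted.
    print(gl.check("192.0.2.10", "a@example.com", "b@example.org", now=600.0))
    # A retry from a different IP counts as a new triplet and is deferred again.
    print(gl.check("192.0.2.11", "a@example.com", "b@example.org", now=600.0))
```

The last call shows one kind of special case that needs maintenance: a sender that retries from a different address looks like a brand-new triplet each time.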

-- 
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire

Re: Sequential spamassassin scans get different results

Posted by Jeremy Ardley <je...@ardley.org>.
On 7/4/22 3:09 pm, Matus UHLAR - fantomas wrote:
>
> your edge sends the original message as an attachment, so your 
> internal server cannot process many of the checks.  The SA option 
> "report_safe" does this.
>
> You should either trust the edge server's decision, or not do 
> scanning there. If you do scan there, set "report_safe 0".
>
>

I realised that dual scanning was redundant, so now I have it all done 
on internal and none on edge. All I need to do now is wait for some 
juicy spam!

Off topic, I'm now running postscreen on the mail servers. Is that an 
efficient way to limit spammers? Would I get better results if I 
greylisted as well as postscreen? Or instead of?

-- 
Jeremy

Re: Sequential spamassassin scans get different results

Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
On 07.04.22 11:31, Jeremy Ardley wrote:
>I have a mail setup with an internet-facing postfix mail server "edge" 
>(LAN name "firewall") and an internal LAN postfix with dovecot server 
>"internal".
>
>They both run the same version of SA with the same rules.
>
>"edge" receives internet mail, scans it with spamassassin, and then 
>forwards it to "internal" which also scans it with spamassassin.

>< Below headers from "internal" >

>X-Spam-Status: No, score=3.3 required=5.0 tests=ALL_TRUSTED,DATE_IN_PAST_03_06,
>    FROM_MISSPACED,HK_NAME_MR_MRS,HTML_MESSAGE,MISSING_HEADERS,
>    T_FILL_THIS_FORM_SHORT autolearn=no autolearn_force=no version=3.4.6


>< below headers and content from "edge" aka "firewall" >

>Spam detection software, running on the system "<firewall>.lan",
>has identified this incoming email as possible spam.  The original
>message has been attached to this so you can view it or label
>similar future email.  If you have any questions, see
>@@CONTACT_ADDRESS@@ for details.

Your edge sends the original message as an attachment, so your internal 
server cannot process many of the checks.  The SA option "report_safe" does this.

You should either trust the edge server's decision, or not do scanning 
there. If you do scan there, set "report_safe 0".


-- 
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
I drive way too fast to worry about cholesterol.

Re: Sequential spamassassin scans get different results

Posted by Jeremy Ardley <je...@ardley.org>.
On 7/4/22 11:05 pm, Bill Cole wrote:
> 2. There appears to be a spurious blank line before the From: line, 
> which logically breaks the header block, so the lines after that are 
> technically not headers. This MAY be an artifact of how you copied the 
> headers into your message rather than something in the original. 

My error in cutting and pasting the headers. Thinking about it, I 
needn't have bothered to edit them, as almost all the redacted 
information is available in the headers of my post to this list, just 
under less obvious names than the placeholders 'edge' and 'internal'.


>> Spam detection software, running on the system "<firewall>.lan",
>> has identified this incoming email as possible spam.  The original
>> message has been attached to this so you can view it or label
>> similar future email.  If you have any questions, see
>> @@CONTACT_ADDRESS@@ for details.
> SpamAssassin on 'edge' is not properly installed. The '@@CONTACT_ADDRESS@@' token there is a placeholder used in the SA source which is substituted in by the package's Makefile. I'm not sure how one could manage to get that to show up in production.

@@CONTACT_ADDRESS@@ is still present in messages processed solely by 'internal'. I'm using the package as supplied by Debian/Armbian; I assume it is the 3.4.6-1_all deb.

/var/cache/apt/archives/spamassassin_3.4.6-1_all.deb

X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on 'internal'

-- 
Jeremy


Re: Sequential spamassassin scans get different results

Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 2022-04-06 at 23:31:46 UTC-0400 (Thu, 7 Apr 2022 11:31:46 +0800)
Jeremy Ardley <je...@ardley.org>
is rumored to have said:

> I have a mail setup with an internet-facing postfix mail server "edge" (LAN name "firewall") and an internal LAN postfix with dovecot server "internal".
>
> They both run the same version of SA with the same rules.
>
> "edge" receives internet mail, scans it with spamassassin, and then forwards it to "internal" which also scans it with spamassassin.
>
> The problem in this instance is "edge" got a spam score of 21.3, while "internal" got a score of 3.3
>
> This is puzzling. Any explanations?

Very common. Many tests used by SpamAssassin involve the nature of the relay path by which the message arrived. It may be possible to set up the internal machine and the edge machine to get mail to score the same on both, but that will not necessarily mean the same config on both machines.

>
> < Below headers from "internal" >
>
> X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on internal.lan
> X-Spam-Level: ***
> X-Spam-Status: No, score=3.3 required=5.0 tests=ALL_TRUSTED,DATE_IN_PAST_03_06,
>     FROM_MISSPACED,HK_NAME_MR_MRS,HTML_MESSAGE,MISSING_HEADERS,
>     T_FILL_THIS_FORM_SHORT autolearn=no autolearn_force=no version=3.4.6

Note that the list of matched rules begins with "ALL_TRUSTED," which means that 'internal' analyzed the Received headers and didn't see any which indicated a pass through an untrusted machine. This is a hint that you probably are modifying the message on 'edge' in some way that hides Received headers from being seen by 'internal.'

Most likely this is due to having report_safe set to something other than 0 on 'edge' but it can also happen if the MTA there is configured in some way that makes it impossible for SA on 'internal' to analyze the full set of Received headers.
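
When comparing the two hosts, one quick way to spot that hint is to parse the X-Spam-Status header and look at its tests list. Below is a minimal Python sketch, assuming the header layout shown in this thread (a verdict, then score=, required= and a comma-separated tests= list that may be folded across lines). The function name `parse_spam_status` is made up for illustration and is not part of SpamAssassin.

```python
import re

# Header value as reported by 'internal' in this thread (folded over three lines).
SAMPLE = """No, score=3.3 required=5.0 tests=ALL_TRUSTED,DATE_IN_PAST_03_06,
     FROM_MISSPACED,HK_NAME_MR_MRS,HTML_MESSAGE,MISSING_HEADERS,
     T_FILL_THIS_FORM_SHORT autolearn=no autolearn_force=no version=3.4.6"""


def parse_spam_status(value: str) -> dict:
    """Extract verdict, score, threshold and rule names from an X-Spam-Status value."""
    text = " ".join(value.split())        # unfold continuation lines
    text = re.sub(r",\s+", ",", text)     # rejoin a list that was folded after commas
    verdict = text.split(",", 1)[0]
    score = float(re.search(r"\bscore=(-?[\d.]+)", text).group(1))
    required = float(re.search(r"\brequired=(-?[\d.]+)", text).group(1))
    tests_field = re.search(r"\btests=(\S*)", text).group(1)
    tests = [name for name in tests_field.split(",") if name]
    return {"verdict": verdict, "score": score, "required": required, "tests": tests}


if __name__ == "__main__":
    status = parse_spam_status(SAMPLE)
    print(status["score"], status["required"])
    if "ALL_TRUSTED" in status["tests"]:
        print("ALL_TRUSTED hit: no untrusted relay was seen in the Received headers")
```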

> < below headers and content from "edge" aka "firewall" >

More precisely, *AFTER* 'edge' has modified the message, i.e. as seen when it hits 'internal'.

> Received: by edge.<...> (Postfix, from userid 115)
>     id DC8554188D; Thu,  7 Apr 2022 09:32:58 +0800 (AWST)
> Received: from localhost by firewall.lan
>     with SpamAssassin (version 3.4.6);
>     Thu, 07 Apr 2022 09:32:58 +0800
>
> From: "MR. CHRISTOPHER TOWE."<ma...@thaidevhost.com>
> Subject: MR. CHRISTOPHER TOWE.Director Airport Inspection Officer United Nations.
> Date: Wed, 6 Apr 2022 15:09:53 -0700
> MIME-Version: 1.0
> Content-Type: multipart/mixed; boundary="----------=_624E3F4A.AF957A3D"
> Message-Id: <20220407013258.DC8554188D@edge.<redacted>

3 interesting features:

1. The last Received header there is an artificial one created by SpamAssassin when report_safe is non-zero to terminate the Received chain. Having report_safe non-zero means that any system downstream (e.g. 'internal') will receive a 'wrapper' message with the original message embedded as a message/rfc822 or text/plain attachment.

2. There appears to be a spurious blank line before the From: line, which logically breaks the header block, so the lines after that are technically not headers. This MAY be an artifact of how you copied the headers into your message rather than something in the original.

3. The Message-Id was invented and added on 'edge' because the original message had none.
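
To illustrate point 1, here is a minimal Python sketch (standard library only) of why the original headers stop being visible as top-level headers once a message is wrapped. It does not reproduce SpamAssassin's own wrapping code. The addresses, header values and function names are hypothetical, chosen only to show the effect.

```python
from email.message import EmailMessage


def build_sample_original() -> EmailMessage:
    """Build a stand-in for the original spam (all values are hypothetical)."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["Reply-To"] = "someone@example.net"
    msg["X-Mailer"] = "Microsoft Outlook Express 6.00"
    msg["Subject"] = "Sample subject"
    msg.set_content("Good day. Please fill in the form.")
    return msg


def wrap_like_report_safe(original: EmailMessage, report_text: str) -> EmailMessage:
    """Wrap `original` as a message/rfc822 attachment below a plain-text report."""
    wrapper = EmailMessage()
    wrapper["From"] = str(original["From"])
    wrapper["Subject"] = str(original["Subject"])
    wrapper.set_content(report_text)   # the text/plain report part
    wrapper.add_attachment(original)   # the original, as a message/rfc822 part
    return wrapper


if __name__ == "__main__":
    wrapped = wrap_like_report_safe(build_sample_original(), "Spam report goes here.")
    # Headers such as Reply-To and X-Mailer are absent from the wrapper itself.
    print("Top-level Reply-To:", wrapped["Reply-To"])
    for part in wrapped.iter_attachments():
        inner = part.get_content()     # the embedded original message
        print(part.get_content_type(), "carries Reply-To:", inner["Reply-To"])
```

A downstream scanner that evaluates header rules against the wrapper's own headers has nothing like Reply-To or X-Mailer to match on, which is consistent with the much shorter hit list seen on 'internal'.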

>
> This is a multi-part message in MIME format.
>
> ------------=_624E3F4A.AF957A3D
> Content-Type: text/plain; charset=iso-8859-1
> Content-Disposition: inline
> Content-Transfer-Encoding: 8bit
>
> Spam detection software, running on the system "<firewall>.lan",
> has identified this incoming email as possible spam.  The original
> message has been attached to this so you can view it or label
> similar future email.  If you have any questions, see
> @@CONTACT_ADDRESS@@ for details.

SpamAssassin on 'edge' is not properly installed. The '@@CONTACT_ADDRESS@@' token there is a placeholder used in the SA source which is substituted in by the package's Makefile. I'm not sure how one could manage to get that to show up in production.


The very long list of hits below is for the original message as it hit 'edge'. It is much larger than the list as scored on 'internal' because by the time the message hits 'internal' it has been wrapped by SA (due to report_safe being 1 or 2) and the original headers are no longer present.
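
The difference between the two hit lists can be made explicit with a small Python sketch. It assumes the report table layout quoted in this thread (a score column followed by the rule name) and uses the tests list from the X-Spam-Status header on 'internal'. The helper name `rules_in_report` is hypothetical.

```python
import re

# A rule row starts with a score and a rule name; an optional "> " quote prefix
# is tolerated. Header, separator and continuation lines do not match.
RULE_ROW = re.compile(r"^[>\s]*(-?\d+\.\d+)\s+(\w+)")

EDGE_REPORT = """
 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.0 NSL_RCVD_FROM_USER     Received from User
 1.0 FSL_CTYPE_WIN1251      Content-Type only seen in 419 spam
 1.2 MISSING_HEADERS        Missing To: header
 0.1 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.1 MISSING_MID            Missing Message-Id: header
 1.0 AXB_XMAILER_MIMEOLE_OL_024C2 Yet another X header trait
 1.0 HK_NAME_MR_MRS         No description available.
 1.0 FROM_MISSP_USER        From misspaced, from "User"
 0.0 FORGED_OUTLOOK_HTML    Outlook can't send HTML message only
 1.0 FROM_MISSP_MSFT        From misspaced + supposed Microsoft tool
 1.0 FSL_NEW_HELO_USER      Spam's using Helo and User
 0.6 FORGED_OUTLOOK_TAGS    Outlook can't send HTML in this format
 1.9 REPLYTO_WITHOUT_TO_CC  No description available.
 2.5 FREEMAIL_FORGED_REPLYTO Freemail in Reply-To, but not From
 1.0 FROM_MISSP_REPLYTO     From misspaced, has Reply-To
 1.0 TO_NO_BRKTS_FROM_MSSP  Multiple header formatting problems
 1.0 FROM_MISSPACED         From: missing whitespace
 1.0 TO_NO_BRKTS_MSFT       To: lacks brackets and supposed Microsoft tool
 0.0 T_FILL_THIS_FORM_SHORT Fill in a short form with personal
                            information
 2.8 FORGED_MUA_OUTLOOK     Forged mail pretending to be from MS Outlook
 1.0 FORM_FRAUD_3           Fill a form and several fraud phrases
"""

INTERNAL_TESTS = (
    "ALL_TRUSTED,DATE_IN_PAST_03_06,FROM_MISSPACED,HK_NAME_MR_MRS,"
    "HTML_MESSAGE,MISSING_HEADERS,T_FILL_THIS_FORM_SHORT"
)


def rules_in_report(report_text: str) -> set:
    """Collect the rule names from a 'Content analysis details' table."""
    names = set()
    for line in report_text.splitlines():
        match = RULE_ROW.match(line)
        if match:
            names.add(match.group(2))
    return names


if __name__ == "__main__":
    edge = rules_in_report(EDGE_REPORT)
    internal = set(INTERNAL_TESTS.split(","))
    print("Hit only on edge:    ", sorted(edge - internal))
    print("Hit only on internal:", sorted(internal - edge))
    print("Hit on both:         ", sorted(edge & internal))
```

The rules that hit only on 'edge' are largely header-based ones, which fits the explanation that the original headers are hidden inside the wrapper.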

> Content analysis details:   (21.3 points, 5.0 required)
>
>  pts rule name              description
> ---- ---------------------- --------------------------------------------------
>  1.0 NSL_RCVD_FROM_USER     Received from User
>  1.0 FSL_CTYPE_WIN1251      Content-Type only seen in 419 spam
>  1.2 MISSING_HEADERS        Missing To: header
>  0.1 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
>  0.0 HTML_MESSAGE           BODY: HTML included in message
>  0.1 MISSING_MID            Missing Message-Id: header
>  1.0 AXB_XMAILER_MIMEOLE_OL_024C2 Yet another X header trait
>  1.0 HK_NAME_MR_MRS         No description available.
>  1.0 FROM_MISSP_USER        From misspaced, from "User"
>  0.0 FORGED_OUTLOOK_HTML    Outlook can't send HTML message only
>  1.0 FROM_MISSP_MSFT        From misspaced + supposed Microsoft tool
>  1.0 FSL_NEW_HELO_USER      Spam's using Helo and User
>  0.6 FORGED_OUTLOOK_TAGS    Outlook can't send HTML in this format
>  1.9 REPLYTO_WITHOUT_TO_CC  No description available.
>  2.5 FREEMAIL_FORGED_REPLYTO Freemail in Reply-To, but not From
>  1.0 FROM_MISSP_REPLYTO     From misspaced, has Reply-To
>  1.0 TO_NO_BRKTS_FROM_MSSP  Multiple header formatting problems
>  1.0 FROM_MISSPACED         From: missing whitespace
>  1.0 TO_NO_BRKTS_MSFT       To: lacks brackets and supposed Microsoft tool
>  0.0 T_FILL_THIS_FORM_SHORT Fill in a short form with personal
>                             information
>  2.8 FORGED_MUA_OUTLOOK     Forged mail pretending to be from MS Outlook
>  1.0 FORM_FRAUD_3           Fill a form and several fraud phrases
>
> The original message was not completely plain text, and may be unsafe to
> open with some email clients; in particular, it may contain a virus,
> or confirm that your address can receive spam.  If you wish to view
> it, it may be safer to save it to a file and open it with an editor.
>
> -- 
> Jeremy


-- 
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire