Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2017/12/11 01:05:00 UTC

[Bug 7519] BODY_SINGLE_WORD triggers on base64 encoded text with more than one word.

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7519

Bill Cole <sa...@billmail.scconsult.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sa-bugz-20080315@billmail.s
                   |                            |cconsult.com
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #1 from Bill Cole <sa...@billmail.scconsult.com> ---
That message is badly malformed. The Content-Type header is invalid (missing
spaces), there is no MIME-Version header, the Message-ID header is invalid
(missing angle brackets), and some of the putative MIME parts are improperly
encoded into lines an order of magnitude longer than MIME allows.
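
As a rough illustration of the defects listed above, here is a minimal Python
sketch that flags three of them in a raw message. It is not how SpamAssassin
checks anything; the function name find_defects and the sample inputs are
hypothetical. The 76-character figure is the RFC 2045 limit for encoded lines,
and applying it to every body line is a simplification. The Content-Type
spacing problem is not checked here.

```python
from email import message_from_string

# RFC 2045 limits base64 and quoted-printable encoded lines to 76 characters.
MAX_ENCODED_LINE = 76


def find_defects(raw):
    """Return a list of human-readable defects found in a raw message.

    Hypothetical helper for illustration only; it covers three of the
    problems named in the comment, not the full set of MIME rules.
    """
    raw = raw.replace("\r\n", "\n")
    msg = message_from_string(raw)
    defects = []

    # A MIME message is expected to carry a MIME-Version header.
    if msg["MIME-Version"] is None:
        defects.append("missing MIME-Version header")

    # A Message-ID value is expected to be wrapped in angle brackets.
    msgid = (msg["Message-ID"] or "").strip()
    if not (msgid.startswith("<") and msgid.endswith(">")):
        defects.append("Message-ID missing or lacks angle brackets")

    # Simplification: treat every body line as if it were an encoded line.
    _headers, _sep, body = raw.partition("\n\n")
    longest = max((len(line) for line in body.splitlines()), default=0)
    if longest > MAX_ENCODED_LINE:
        defects.append(
            "body line of %d characters exceeds %d" % (longest, MAX_ENCODED_LINE)
        )

    return defects
```

Calling find_defects() on a message that has a bare Message-ID, no
MIME-Version header, and a single base64 line several hundred characters long
reports all three problems; a well-formed message yields an empty list.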

As a result, there is no formally correct way to parse this message. That any
software can make any sense of it is a tribute to how lenient mail software is.
It is unclear to me why it is hitting BODY_SINGLE_WORD, but it is also hitting
HTML_IMAGE_ONLY_20 and BODY_URI_ONLY incorrectly, and I expect that all of these
are due to SA being confused by the compound pathology of the message. Note
that the rules it correctly hits (BASE64_LENGTH_79_INF, BAYES_50,
MIME_HEADER_CTYPE_ONLY, MISSING_SUBJECT, and INVALID_MSGID) add up to 5.3, so
even if we figured out precisely how the 3 bogus hits happened and fixed that,
SA would (by default) still call it spam.

The "garbage in, garbage out" principle applies here. It is not a bug for
SpamAssassin to misparse a message that technically has no correct parsing.
