Posted to users@spamassassin.apache.org by Pedro David Marco <pe...@yahoo.com> on 2017/02/01 11:17:35 UTC
fake base64 encoding
Hi!
I have noticed that when an email contains these (wrong) headers:
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: base64
as SMTP headers, not MIME headers, and the email body is not base64 encoded, email clients such as Thunderbird show the content correctly but SpamAssassin body rules are blind.
Example:
Suppose a rule like:
body TEST_TEXT_DETECTED /TEST TEXT/
score TEST_TEXT_DETECTED 1
describe TEST_TEXT_DETECTED Test text detected
With this .eml: (minimized)
# cat test.eml
From: LinkedIn Email Confirmation <em...@linkedin.naver.com>
To: "jcornago@" <sia.es jcornago@sia.es>
Date: Tue, 8 Mar 2016 04:18:08 +0000
Subject: Please confirm your email address
TEST TEXT
#
The rule triggers OK! Now I add the two headers:
# cat test.eml
From: LinkedIn Email Confirmation <em...@linkedin.naver.com>
To: "jcornago@" <sia.es jcornago@sia.es>
Subject: Please confirm your email address
Date: Tue, 8 Mar 2016 04:18:08 +0000
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: base64
TEST TEXT
#
And the rule never triggers!
It makes sense, since SA tries to decode the body before applying rules, but Thunderbird shows the email correctly in both cases (the email is human readable). Can anyone please try it as well, to rule out that it is only me? Just add those 2 headers at the end of the SMTP headers section.
Thanks!
--------Pedro
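The effect described above can be reproduced outside SpamAssassin. The following is a minimal Python sketch, not SpamAssassin's own decoder: it assumes a lenient base64 decoder that, like Python's `base64.b64decode` in its default mode, discards characters outside the base64 alphabet instead of rejecting the input. The variable names are illustrative only.

```python
import base64
import re

# The un-encoded body from the second test.eml, which the headers label as base64.
raw_body = b"TEST TEXT\n"

# In its default (non-strict) mode, Python's decoder drops characters that are
# not in the base64 alphabet (here the space and the newline) and decodes the rest.
decoded = base64.b64decode(raw_body)

# Same pattern as the TEST_TEXT_DETECTED body rule.
pattern = re.compile(rb"TEST TEXT")

found_in_raw = pattern.search(raw_body) is not None
found_in_decoded = pattern.search(decoded) is not None

print("pattern in raw body:    ", found_in_raw)
print("pattern in decoded body:", found_in_decoded)
```

Under that assumption the pattern matches the raw text but not the decoded bytes, which is consistent with the body rule going blind once the message is labelled as base64.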
Re: fake base64 encoding
Posted by Pedro David Marco <pe...@yahoo.com>.
Correction:
Some Outlook versions do show the email just as Thunderbird does, so most users can see the email but SA cannot.
From: Pedro David Marco <pe...@yahoo.com>
To: Kevin A. McGrail <KM...@PCCC.com>; SA Mailing List <us...@spamassassin.apache.org>
Sent: Thursday, February 2, 2017 5:30 AM
Subject: Re: fake base64 encoding
Thanks Kevin,
I wrote a similar rule to detect it, but with a higher score (3), since we are seeing a huge LinkedIn phishing campaign using this technique which, on purpose or by mistake, is evading most SA rules...
I agree that Thunderbird may be doing it wrong. Outlook seems to do it right.
>I would say Thunderbird is not parsing it correctly. Looking to see if this is a spam indicator.
>I ran some test cases with this rule:
>#Bad UTF--8 content type and transfer encoding
>header __KAM_BAD_UTF8_1 Content-Type =~ /text\/html; charset=\"utf-8\"/i
>header __KAM_BAD_UTF8_2 Content-Transfer-Encoding =~ /base64/i
>meta KAM_BAD_UTF8 (__KAM_BAD_UTF8_1 + __KAM_BAD_UTF8_2 >= 2)
>score KAM_BAD_UTF8 1.0
>describe KAM_BAD_UTF8 Bad Content Type and Transfer Encoding that attempts to evade SA scanning
>
>
>So far not seeing any sign it's in the wild. Have you?
-----
Pedro
Re: fake base64 encoding
Posted by John Wilcock <jo...@tradoc.fr>.
On 02/02/2017 15:50, RW wrote:
> On Thu, 2 Feb 2017 05:43:24 -0500
> Kevin A. McGrail wrote:
...
>> I will score much higher since it is in the wild. Can you throw a
>> spample up on pastebin?
> Perhaps text/html makes a big difference, but base64 encoded utf-8
> text is not uncommon these days - particularly outside North America.
>
> To score it higher you might want to include a "full" rule that checks
> for base64 encoding in the headers followed by illegal whitespace near
> the beginning of what should be the base64 text.
Indeed. In my (very small) corpus, I see lots of base64-encoded utf-8
text/html parts of multipart messages, but very few non-multipart examples.
All of the latter really are base64-encoded, rather than plain text
labelled as base64, but that may simply be due to the small size of my
corpus. As it happens they are all spam, but I'm not convinced that
hitting on any utf-8 text/html message that purports to be
base64-encoded, regardless of whether it is actually base64 or not, is a
good idea.
FWIW,
John
Re: fake base64 encoding
Posted by RW <rw...@googlemail.com>.
On Thu, 2 Feb 2017 05:43:24 -0500
Kevin A. McGrail wrote:
> On 2/1/2017 11:30 PM, Pedro David Marco wrote:
> > I did a similar rule to detect it but with higher score (3) since
> > we are seeing a huge LinkedIn Phishing campaign using this
> > technique, that on purpose or by mistake is evading most SA
> > rules...
> I will score much higher since it is in the wild. Can you throw a
> spample up on pastebin?
Perhaps text/html makes a big difference, but base64 encoded utf-8
text is not uncommon these days - particularly outside North America.
To score it higher you might want to include a "full" rule that checks
for base64 encoding in the headers followed by illegal whitespace near
the beginning of what should be the base64 text.
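The check suggested above can be sketched in Python to show the logic. This is only an illustration of the heuristic, not a SpamAssassin "full" rule; the function name `looks_like_fake_base64`, the `window` parameter, and the sample messages (which use example.com addresses and a blank line between headers and body) are all assumptions made for the example.

```python
import base64
import re

# Top-level header that labels the body as base64.
CTE_BASE64 = re.compile(
    r"^Content-Transfer-Encoding:[ \t]*base64[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)
# A space or tab between two printable characters, which cannot occur inside
# a line of real base64 text.
INNER_WHITESPACE = re.compile(r"\S[ \t]+\S")


def looks_like_fake_base64(raw: str, window: int = 200) -> bool:
    """Return True if the headers claim base64 but the start of the body
    contains whitespace that real base64 text would not contain."""
    headers, sep, body = raw.replace("\r\n", "\n").partition("\n\n")
    if not sep or not CTE_BASE64.search(headers):
        return False
    # Only inspect the beginning of the body, as suggested in the thread.
    return INNER_WHITESPACE.search(body[:window]) is not None


HEADERS = (
    "From: sender@example.com\n"
    "Subject: Please confirm your email address\n"
    'Content-Type: text/html; charset="utf-8"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
)
fake = HEADERS + "TEST TEXT\n"
real = HEADERS + base64.b64encode(b"<p>TEST TEXT</p>").decode("ascii") + "\n"

print(looks_like_fake_base64(fake))
print(looks_like_fake_base64(real))
```

Because the body is inspected as well as the headers, a message that really is base64-encoded is not flagged, unlike a check on the two headers alone. The sketch only looks at the top-level headers and does not handle multipart messages.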
Re: fake base64 encoding
Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
On 2/2/2017 5:43 AM, Kevin A. McGrail wrote:
> On 2/1/2017 11:30 PM, Pedro David Marco wrote:
>> I did a similar rule to detect it but with higher score (3) since we
>> are seeing a huge LinkedIn Phishing campaign using this technique,
>> that on purpose or by mistake is evading most SA rules...
> I will score much higher since it is in the wild. Can you throw a
> spample up on pastebin?
I've also created a bug for this:
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7388
Regards,
KAM
Re: fake base64 encoding
Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
On 2/1/2017 11:30 PM, Pedro David Marco wrote:
> I did a similar rule to detect it but with higher score (3) since we
> are seeing a huge LinkedIn Phishing campaign using this technique,
> that on purpose or by mistake is evading most SA rules...
I will score much higher since it is in the wild. Can you throw a
spample up on pastebin?
Regards,
KAM
Re: fake base64 encoding
Posted by Pedro David Marco <pe...@yahoo.com>.
Thanks Kevin,
I wrote a similar rule to detect it, but with a higher score (3), since we are seeing a huge LinkedIn phishing campaign using this technique which, on purpose or by mistake, is evading most SA rules...
I agree that Thunderbird may be doing it wrong. Outlook seems to do it right.
>I would say Thunderbird is not parsing it correctly. Looking to see if this is a spam indicator.
>I ran some test cases with this rule:
>#Bad UTF--8 content type and transfer encoding
>header __KAM_BAD_UTF8_1 Content-Type =~ /text\/html; charset=\"utf-8\"/i
>header __KAM_BAD_UTF8_2 Content-Transfer-Encoding =~ /base64/i
>meta KAM_BAD_UTF8 (__KAM_BAD_UTF8_1 + __KAM_BAD_UTF8_2 >= 2)
>score KAM_BAD_UTF8 1.0
>describe KAM_BAD_UTF8 Bad Content Type and Transfer Encoding that attempts to evade SA scanning
>
>
>So far not seeing any sign it's in the wild. Have you?
-----
Pedro
Re: fake base64 encoding
Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
On 2/1/2017 9:35 PM, Kevin A. McGrail wrote:
> I agree. The test does not trigger
>
> The second test will trigger utf8_mode on
>
> Feb 1 21:29:32.246 [26958] dbg: message: HTML::Parser utf8_mode on
> (assumed UTF-8 octets)
> Content-Type: text/html; charset="utf-8"
>> It makes sense since SA tries to decode the body before applying
>> rules but Thunderbird shows the email correctly in
>> both cases (the email is human readable).
>> Can anyone please try it as well to discard it is only me... just
>> add those 2 headers at the end of smtp headers section..
>>
>
> I would say Thunderbird is not parsing it correctly. Looking to see
> if this is a spam indicator.
I ran some test cases with this rule:
#Bad UTF-8 content type and transfer encoding
header __KAM_BAD_UTF8_1 Content-Type =~ /text\/html; charset=\"utf-8\"/i
header __KAM_BAD_UTF8_2 Content-Transfer-Encoding =~ /base64/i
meta KAM_BAD_UTF8 (__KAM_BAD_UTF8_1 + __KAM_BAD_UTF8_2 >= 2)
score KAM_BAD_UTF8 1.0
describe KAM_BAD_UTF8 Bad Content Type and Transfer Encoding that attempts to evade SA scanning
So far not seeing any sign it's in the wild. Have you?
Regards,
KAM
Re: fake base64 encoding
Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
On 2/1/2017 6:17 AM, Pedro David Marco wrote:
> Hi!
>
> I have noticed that when an email contains these (wrong) headers:
>
> Content-Type: text/html; charset="utf-8"
> Content-Transfer-Encoding: base64
>
> as SMTP headers, not MIME headers, and the email body is not base64
> encoded, email clients such as Thunderbird show the content correctly but
> SpamAssassin body rules are blind.
>
> Example:
>
> suppose a rule like
> body TEST_TEXT_DETECTED /TEST TEXT/
> score TEST_TEXT_DETECTED 1
> describe TEST_TEXT_DETECTED Test text detected
>
> With this .eml: (minimized)
>
> # cat test.eml
> From: LinkedIn Email Confirmation <em...@linkedin.naver.com>
> To: "jcornago@" <sia.es jcornago@sia.es>
> Date: Tue, 8 Mar 2016 04:18:08 +0000
> Subject: Please confirm your email address
> TEST TEXT
> #
>
> The rule triggers ok! Now i add headers:
>
> # cat test.eml
> From: LinkedIn Email Confirmation <em...@linkedin.naver.com>
> To: "jcornago@" <sia.es jcornago@sia.es>
> Subject: Please confirm your email address
> Date: Tue, 8 Mar 2016 04:18:08 +0000
> Content-Type: text/html; charset="utf-8"
> Content-Transfer-Encoding: base64
> TEST TEXT
> #
>
> And the rule never triggers!
I agree. The test does not trigger.
The second test will trigger utf8_mode on:
Feb 1 21:29:32.246 [26958] dbg: message: HTML::Parser utf8_mode on (assumed UTF-8 octets)
Content-Type: text/html; charset="utf-8"
> It makes sense since SA tries to decode the body before applying rules
> but Thunderbird shows the email correctly in
> both cases (the email is human readable).
> Can anyone please try it as well to discard it is only me... just add
> those 2 headers at the end of smtp headers section..
>
I would say Thunderbird is not parsing it correctly. Looking to see if
this is a spam indicator.