Posted to users@spamassassin.apache.org by Derek Harding <de...@innovyx.com> on 2006/01/05 00:58:16 UTC

SUBJECT_ENCODED_TWICE question

This may be more a dev question but I thought I'd start here.

I've been seeing this rule (SUBJECT_ENCODED_TWICE) trigger recently and
it is confusing me.

3.1.0 defines it as:

header SUBJECT_ENCODED_TWICE   Subject:raw =~ /=\?\S+\?[BQ]\?.*=\?\S+\?[BQ]\?/i
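
To see what the pattern does and does not catch, here is a rough Python sketch of the same regular expression. It is only an illustration, not how SpamAssassin evaluates the rule: the function name and the sample subjects are made up, and it assumes the Subject header has already been unfolded onto a single line.

```python
import re

# Python translation of the pattern from the 3.1.0 rule quoted above.
# Assumption: the raw Subject value is on one line (already unfolded).
SUBJECT_ENCODED_TWICE = re.compile(
    r"=\?\S+\?[BQ]\?.*=\?\S+\?[BQ]\?", re.IGNORECASE
)


def encoded_twice(raw_subject: str) -> bool:
    """Return True if the raw subject contains two encoded-word openers."""
    return SUBJECT_ENCODED_TWICE.search(raw_subject) is not None


# Hypothetical sample subjects, for illustration only.
two_words = "=?UTF-8?Q?Hello?= =?UTF-8?Q?_world?="
one_word = "=?UTF-8?Q?Hello_world?="
plain = "Hello world"

print(encoded_twice(two_words))
print(encoded_twice(one_word))
print(encoded_twice(plain))
```

The pattern only looks for two "=?charset?B?" or "=?charset?Q?" openers in the same subject; it does not check whether the charsets differ or whether the encoded text is valid.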

It checks for a subject line containing two encoded sections, but I'm not
sure why it does this. I've checked RFC 2047, and two encoded sections
do not appear to be a violation. In fact, it gives an example of
exactly this:

From http://www.ietf.org/rfc/rfc2047.txt:

8. Examples

   The following are examples of message headers containing 'encoded-
   word's:

   From: =?US-ASCII?Q?Keith_Moore?= <mo...@cs.utk.edu>
   To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <ke...@dkuug.dk>
   CC: =?ISO-8859-1?Q?Andr=E9?= Pirard <PI...@vm1.ulg.ac.be>
   Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
    =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
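
As a sanity check that the Subject example really is two separate encoded-words that join into one sentence, it can be decoded with Python's standard email.header.decode_header. This is just a sketch: the helper name decode_subject is made up, and the folded header has been written on one line with a single space between the two words.

```python
from email.header import decode_header

# The Subject value from the RFC 2047 example, unfolded onto one line.
subject = (
    "=?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?= "
    "=?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?="
)


def decode_subject(raw: str) -> str:
    """Decode each encoded-word with its own charset and join the pieces."""
    pieces = []
    for text, charset in decode_header(raw):
        if isinstance(text, bytes):
            # Unlabelled byte runs are treated as ASCII in this sketch.
            text = text.decode(charset or "ascii", errors="replace")
        pieces.append(text)
    return "".join(pieces)


decoded = decode_subject(subject)
print(decoded)
```

The whitespace between two adjacent encoded-words is dropped on decoding, which is why the two halves join up mid-word.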

So why the disconnect?

Derek


Re: SUBJECT_ENCODED_TWICE question

Posted by Matt Kettler <mk...@evi-inc.com>.
Derek Harding wrote:
> This may be more a dev question but I thought I'd start here.
> 
> I've been seeing this rule (SUBJECT_ENCODED_TWICE) trigger recently and
> it is confusing me.
> 
> It checks for a subject line containing two encoded sections, but I'm not
> sure why it does this. I've checked RFC 2047, and two encoded sections
> do not appear to be a violation. In fact, it gives an example of
> exactly this:

I don't think this rule is trying to imply that a two-encoding subject is an RFC
violation.

There are plenty of rules that are based on perfectly RFC legal things like
obfuscated drug names. Just because the rule exists, don't assume it's for RFC
reasons.

In fact, RFC violations alone are never a reason for an SA rule to exist. SA
rules are created to look for things spammers do fairly often, but normal people
and businesses don't. Some of these happen to be RFC violations, many aren't.

> 
> From http://www.ietf.org/rfc/rfc2047.txt:

> So why the disconnect?

I think this is a case of something that is extraordinarily rare in the real
world, but not horribly uncommon in spam. In the SA mass-checks, 95% of email
matching this rule was spam, and about 5% was nonspam.

However, this rule has a fairly low score (under 1.8), so on its own it really
shouldn't be causing you any trouble.