Posted to users@spamassassin.apache.org by Thomas Arend <ml...@arend-whv.info> on 2005/02/24 11:42:03 UTC
Character Sets in Subject and To/From
Hello,
I got lots of messages with subjects of the form:
Subject: =?utf-8?q?Wholesale Rolex Watc?=
=?utf-8?q?hes?=
Mail addresses also use this type of obfuscation.
My question: How are these character set changes handled by SpamAssassin rules
and Bayesian filtering?
Best regards
Thomas Arend
--
icq:133073900
http://www.t-arend.de
Re: Character Sets in Subject and To/From
Posted by Matt Kettler <mk...@evi-inc.com>.
At 02:33 PM 2/24/2005, Thomas Arend wrote:
>If I understand you correctly, my rolex rule is defeated by this trick.
>
>Because
>header LOCAL_ENCSUBJECT Subject: =~ /rolex/i
>
>will not fire on these subjects.
You misunderstood me completely.
That rule should fire on those subject lines just fine.
SA will automatically decode the character sets and then feed the decoded
text to your rule. You don't need to take any extra action to try to detect
encoded text. SA handles this for you by default.
SA always decodes the header unless the rule uses "Subject:raw" instead of "Subject:".
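
To illustrate the decoded-versus-raw distinction described above, here is a minimal sketch in Python. It is not SpamAssassin's own code; it only assumes that the standard library's email.header.decode_header handles these encoded words in a comparable way. The pattern /rolex watches/i is a hypothetical rule body chosen for illustration, because the phrase is split across two encoded words in the raw header.

```python
import re
from email.header import decode_header

# Subject header from the original post, including the folded second line.
RAW_SUBJECT = "=?utf-8?q?Wholesale Rolex Watc?=\n =?utf-8?q?hes?="


def decode_subject(raw: str) -> str:
    """Decode RFC 2047 encoded words and join the pieces into one string."""
    pieces = []
    for text, charset in decode_header(raw):
        if isinstance(text, bytes):
            # Encoded words come back as bytes plus their declared charset.
            text = text.decode(charset or "ascii", errors="replace")
        pieces.append(text)
    return "".join(pieces)


decoded = decode_subject(RAW_SUBJECT)

# Hypothetical stand-in for a rule body; not taken from any shipped rule set.
pattern = re.compile(r"rolex watches", re.IGNORECASE)

matches_decoded = pattern.search(decoded) is not None   # like "Subject:"
matches_raw = pattern.search(RAW_SUBJECT) is not None   # like "Subject:raw"
print(decoded, matches_decoded, matches_raw)
```

The decoded form is what a plain "Subject:" rule is described as seeing; the raw form corresponds to "Subject:raw".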
Re: Character Sets in Subject and To/From
Posted by Thomas Arend <ml...@arend-whv.info>.
On Thursday, 24 February 2005 19:12, Matt Kettler wrote:
> At 05:42 AM 2/24/2005, Thomas Arend wrote:
> >I got lots of messages with subjects of the form:
> >
> >Subject: =?utf-8?q?Wholesale Rolex Watc?=
> > =?utf-8?q?hes?=
> >
> >Mail addresses also use this type of obfuscation.
> >
> >My question: How are these character set changes handled by SpamAssassin
> >rules and Bayesian filtering?
>
> Normal rules and bayes see them after they've been decoded. So as far as
> 90% of SA is concerned, the character set changes aren't there.
>
> Rules that specifically want to detect this stuff can do so by using the
> :raw modifier, e.g.:
>
> header LOCAL_ENCSUBJECT Subject:raw =~ /\=\?.*\?\=/i
>
> Matches subject lines like:
>
> Subject: =?iso-8859-8?Q?=F2=EC_=E7=EB=EE=FA_=E4=E9=EC=E3?=
If I understand you correctly, my rolex rule is defeated by this trick.
Because
header LOCAL_ENCSUBJECT Subject: =~ /rolex/i
will not fire on these subjects.
Thomas
--
icq:133073900
http://www.t-arend.de
Re: Character Sets in Subject and To/From
Posted by Matt Kettler <mk...@evi-inc.com>.
At 05:42 AM 2/24/2005, Thomas Arend wrote:
>I got lots of messages with subjects of the form:
>
>Subject: =?utf-8?q?Wholesale Rolex Watc?=
> =?utf-8?q?hes?=
>
>Mail addresses also use this type of obfuscation.
>
>My question: How are these character set changes handled by SpamAssassin
>rules and Bayesian filtering?
Normal rules and bayes see them after they've been decoded. So as far as
90% of SA is concerned, the character set changes aren't there.
Rules that specifically want to detect this stuff can do so by using the
:raw modifier, e.g.:
header LOCAL_ENCSUBJECT Subject:raw =~ /\=\?.*\?\=/i
Matches subject lines like:
Subject: =?iso-8859-8?Q?=F2=EC_=E7=EB=EE=FA_=E4=E9=EC=E3?=
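
The raw-header pattern above can be tried outside SpamAssassin with a short Python sketch. It assumes Python's re module treats this pattern the same way as the Perl regex in the rule, which holds for these simple constructs; the helper name looks_encoded is made up for this example.

```python
import re

# Same pattern as the Subject:raw rule above, without the redundant escapes.
ENCODED_WORD = re.compile(r"=\?.*\?=", re.IGNORECASE)


def looks_encoded(raw_subject: str) -> bool:
    """Return True if the undecoded header contains an '=?...?=' sequence."""
    return ENCODED_WORD.search(raw_subject) is not None


encoded_example = "=?iso-8859-8?Q?=F2=EC_=E7=EB=EE=FA_=E4=E9=EC=E3?="
plain_example = "Wholesale Rolex Watches"

print(looks_encoded(encoded_example))
print(looks_encoded(plain_example))
```

Note that the pattern only flags the presence of an encoded word; it says nothing about which character set was used or whether the encoding was needed.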
Re: Character Sets in Subject and To/From
Posted by Thomas Arend <ml...@arend-whv.info>.
On Friday, 25 February 2005 02:41, Robert Menschel wrote:
> Hello Thomas,
>
> Thursday, February 24, 2005, 2:42:03 AM, you wrote:
>
> TA> Hello,
>
> TA> I got lots of messages with subjects of the form:
> TA> Subject: =?utf-8?q?Wholesale Rolex Watc?=
> TA> =?utf-8?q?hes?=
> TA> Mail addresses also use this type of obfuscation.
> TA> My question: How are these character set changes handled by SpamAssassin rules
> TA> and Bayesian filtering?
>
> See the UTF8 rules in
> http://www.rulesemporium.com/rules/70_sare_genlsubj_eng.cf
>
> Haven't found this process useful in addresses yet, but if you'll send
> me an example so I can verify what's going on, I'll run a few tests.
>
> Bob Menschel
I have sent you the example by private mail to avoid spoiling other people's
spam filters.
Thomas
--
icq:133073900
http://www.t-arend.de
Re: Character Sets in Subject and To/From
Posted by Robert Menschel <Ro...@Menschel.net>.
Hello Thomas,
Thursday, February 24, 2005, 2:42:03 AM, you wrote:
TA> Hello,
TA> I got lots of messages with subjects of the form:
TA> Subject: =?utf-8?q?Wholesale Rolex Watc?=
TA> =?utf-8?q?hes?=
TA> Mail addresses also use this type of obfuscation.
TA> My question: How are these character set changes handled by SpamAssassin rules
TA> and Bayesian filtering?
See the UTF8 rules in
http://www.rulesemporium.com/rules/70_sare_genlsubj_eng.cf
I haven't found this process useful in addresses yet, but if you'll send
me an example so I can verify what's going on, I'll run a few tests.
Bob Menschel