Posted to users@spamassassin.apache.org by Philipp Ewald <ph...@digionline.de> on 2021/01/13 15:57:55 UTC

What does that rule mean "SUBJ_OBFU_PUNCT FEW"

Hello,

We are trying to deliver mail to GMX/WEB, but we got frequency-blocked because our "ro-reply@ Mails" hit the following rules:

SUBJ_OBFU_PUNCT_FEW -> Possible punctuation-obfuscated Subject: header

SUBJ_OBFU_PUNCT_MANY ->  Punctuation-obfuscated Subject: header

I can't find a good explanation of these rules. Can someone please explain them (as simply as possible)?
Do they have to do with ".", ";", ":" in headers?

Many thanks!


kind regards
Philipp

-- 
Philipp Ewald
Administrator

DigiOnline GmbH, Probsteigasse 15 - 19, 50670 Köln
Fax: +49 221 6500-690, E-Mail: philipp.ewald@digionline.de

AG Köln HRB 27711, St.-Nr. 5215 5811 0640
Geschäftsführer: Werner Grafenhain

Informationen zum Datenschutz: www.digionline.de/ds

Re: What does that rule mean "SUBJ_OBFU_PUNCT FEW"

Posted by John Hardin <jh...@impsec.org>.
On Wed, 13 Jan 2021, RW wrote:

> On Wed, 13 Jan 2021 17:43:41 +0100
> Alex Woick wrote:
>
>> Which means:
>> (?!<[a-z][a-z])                                            -> don't
>> match if the next 3 chars are "<" followed by 2 letters
>
> I suspect that this was intended to be (?<![a-z][a-z]).

That's an attempt to avoid matching bracketed email addresses, which often
have embedded punctuation. It's probably not enough by itself.

> As it stands the negative look-ahead never affects anything,

Right, because the remainder would only match "<[a-z](other punct)"

> but the negative look-behind would avoid matches where the first 
> punctuation character is on the end of a multi-letter word.

That wasn't the intent. It's not the punctuation character alone. It's 
(punct)(letter)(punct) or (letter)(punct)(letter). And only multiple 
instances of that occurring are actually scored.

>> In short: it tries to match a sequence of 5 characters.
>> don't match <ab..
>> match something like  :a::a
>> match something like  :aa:a
>> match something like  :a :a
>
> You missed a "|", it's looking for punctuation bracketing a letter or
> vice versa, e.g. "a:b" or ".g:"
>
> FWIW in my mail the SUBJ_OBFU_PUNCT_* rules have only ever matched
> urls in the subject - a spam sign in its own right in my experience.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org                         pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  4 days until Benjamin Franklin's 315th Birthday

Re: What does that rule mean "SUBJ_OBFU_PUNCT FEW"

Posted by RW <rw...@googlemail.com>.
On Wed, 13 Jan 2021 17:43:41 +0100
Alex Woick wrote:


> Which means:
> (?!<[a-z][a-z])                                            -> don't 
> match if the next 3 chars are "<" followed by 2 letters

I suspect that this was intended to be (?<![a-z][a-z]). As it stands
the negative look-ahead never affects anything, but the negative
look-behind would avoid matches where the first punctuation character
is on the end of a multi-letter word.
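
The claim that the look-ahead never changes the outcome can be checked by brute force. The Python sketch below re-creates only the first alternative of the rule and compares it with and without the look-ahead over every 3-character string drawn from a small alphabet. The translation from the Perl pattern to Python's re module, the variable names, and the sample alphabet are all assumptions made for illustration.

```python
import itertools
import re

# Punctuation class copied from the first alternative of __SUBJ_OBFU_PUNCT.
PUNCT = '-~`"!@#$%^&*()_+={}|\\/?<>,.:;'
P = "[" + re.escape(PUNCT) + "]"
P_OR_SPACE = "[" + re.escape(PUNCT) + r"\s]"

with_lookahead = re.compile("(?!<[a-z][a-z])" + P + "[a-z]" + P_OR_SPACE, re.I)
without_lookahead = re.compile(P + "[a-z]" + P_OR_SPACE, re.I)

# Hypothetical sample alphabet: "<", letters, other punctuation, a space.
ALPHABET = "<>aZ.:- "


def disagreements():
    """Return the 3-character strings on which the two patterns differ."""
    differing = []
    for chars in itertools.product(ALPHABET, repeat=3):
        s = "".join(chars)
        if bool(with_lookahead.fullmatch(s)) != bool(without_lookahead.fullmatch(s)):
            differing.append(s)
    return differing


print(disagreements())  # prints []
```

The list is empty because the look-ahead can only reject "<" followed by two letters, and the rest of the alternative already requires the third character to be punctuation or whitespace.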

> In short: it tries to match a sequence of 5 characters.
> don't match <ab..
> match something like  :a::a
> match something like  :aa:a
> match something like  :a :a

You missed a "|", it's looking for punctuation bracketing a letter or
vice versa, e.g. "a:b" or ".g:"

FWIW in my mail the SUBJ_OBFU_PUNCT_* rules have only ever matched
urls in the subject - a spam sign in its own right in my experience.
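
To make the two branches of the alternation concrete, here is a Python approximation of the full pattern. It is a sketch only: the port from Perl to Python's re module and the helper names (char_class, first_hit) are assumptions, not something taken from the SpamAssassin sources.

```python
import re

PUNCT_AROUND = '-~`"!@#$%^&*()_+={}|\\/?<>,.:;'  # punct-letter-punct branch
PUNCT_BETWEEN = '~`"!@#$%^&*()_+={}|\\?<>,.:;'   # letter-punct-letter branch (no "-" or "/")


def char_class(chars, extra=""):
    """Build a regex character class from literal characters."""
    return "[" + re.escape(chars) + extra + "]"


RULE = re.compile(
    "(?:(?!<[a-z][a-z])"
    + char_class(PUNCT_AROUND) + "[a-z]" + char_class(PUNCT_AROUND, r"\s")
    + "|"
    + "[a-z]" + char_class(PUNCT_BETWEEN) + "[a-z]"
    + ")",
    re.IGNORECASE,
)


def first_hit(subject):
    """Return the first matching 3-character sequence, or None."""
    m = RULE.search(subject)
    return m.group(0) if m else None


print(first_hit("a:b"))            # a:b   (letter-punct-letter)
print(first_hit(".g:"))            # .g:   (punct-letter-punct)
print(first_hit("a-b"))            # None  ("-" is not in the second class)
print(first_hit("plain subject"))  # None
```

Each hit is three characters long, not five, which is the consequence of the "|" splitting the pattern into two alternatives.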

Re: What does that rule mean "SUBJ_OBFU_PUNCT FEW"

Posted by Alex Woick <al...@wombaz.de>.
Philipp Ewald wrote on 13.01.2021 at 16:57:
> we try to deliver mails to GMX/WEB but we got frequency blocked 
> because "ro-reply@ Mails" hits following rules:
>
> SUBJ_OBFU_PUNCT_FEW -> Possible punctuation-obfuscated Subject: header
>
> SUBJ_OBFU_PUNCT_MANY ->  Punctuation-obfuscated Subject: header
>
> i can't find any good declaration for this rules.. can some one 
> explain please? (easy as possible)
> Does that has todo with ".", ";", ":" in Headers?
Yes, it does, in the subject.
Congratulations, this rule uses one of the more complicated regular 
expressions, the kind that usually takes a magician or quantum physicist to decipher.
This is part of its definition:

$ grep "__SUBJ_OBFU_PUNCT" *
72_active.cf:meta        SUBJ_OBFU_PUNCT_FEW    __SUBJ_OBFU_PUNCT > 1 && !__THREADED && !__RP_MATCHES_RCVD && !__NOT_SPOOFED && !__LCL__ENV_AND_HDR_FROM_MATCH
72_active.cf:meta        SUBJ_OBFU_PUNCT_MANY   __SUBJ_OBFU_PUNCT > 2 && !__THREADED && !__RP_MATCHES_RCVD && !__NOT_SPOOFED && !__LCL__ENV_AND_HDR_FROM_MATCH
72_active.cf:header      __SUBJ_OBFU_PUNCT      Subject =~ /(?:(?!<[a-z][a-z])[-~`"!@\#$%^&*()_+={}|\\\/?<>,.:;][a-z][-~`"!@\#$%^&*()_+={}|\\\/?<>,.:;\s]|[a-z][~`"!@\#$%^&*()_+={}|\\?<>,.:;][a-z])/i

It deals with the subject:
/(?:(?!<[a-z][a-z])[-~`"!@\#$%^&*()_+={}|\\\/?<>,.:;][a-z][-~`"!@\#$%^&*()_+={}|\\\/?<>,.:;\s]|[a-z][~`"!@\#$%^&*()_+={}|\\?<>,.:;][a-z])/i

It can be divided into these parts:
(?!<[a-z][a-z])
[-~`"!@\#$%^&*()_+={}|\\\/?<>,.:;]
[a-z]
[-~`"!@\#$%^&*()_+={}|\\\/?<>,.:;\s]|[a-z]
[~`"!@\#$%^&*()_+={}|\\?<>,.:;]
[a-z]

Which means:
(?!<[a-z][a-z])
    -> don't match if the next 3 chars are "<" followed by 2 letters
[-~`"!@\#$%^&*()_+={}|\\\/?<>,.:;]
    -> match one punctuation character
[a-z]
    -> match one letter
[-~`"!@\#$%^&*()_+={}|\\\/?<>,.:;\s]|[a-z]
    -> match one punctuation character, or space, or letter character
[~`"!@\#$%^&*()_+={}|\\?<>,.:;]
    -> match one punctuation character (a few fewer than in the class before)
[a-z]
    -> match one letter

In short: it tries to match a sequence of 5 characters.
don't match <ab..
match something like  :a::a
match something like  :aa:a
match something like  :a :a

If your subjects contain single letter+punctuation combinations like 
these, avoid them.

There are also a number of exclusion rules which might be easy to hit. 
__NOT_SPOOFED, for example, matches DKIM-signed messages. If you manage to 
DKIM-sign your outgoing messages, the SUBJ_OBFU_PUNCT_* rules would not 
match at all. The same applies if you use SPF to define your outgoing mail servers.
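
The way the two meta rules combine the hit count with the exclusion sub-rules can be sketched in Python as plain boolean logic. This only mirrors the meta expressions quoted above. The function and parameter names are made up for illustration, and how each sub-rule is actually computed is not shown in this thread.

```python
def meta_fires(hits, min_hits, threaded=False, rp_matches_rcvd=False,
               not_spoofed=False, env_and_hdr_from_match=False):
    """Mirror of: __SUBJ_OBFU_PUNCT > min_hits && !__THREADED && ..."""
    return (
        hits > min_hits
        and not threaded
        and not rp_matches_rcvd
        and not not_spoofed
        and not env_and_hdr_from_match
    )


def subj_obfu_punct_few(hits, **subrules):
    # FEW needs more than one hit of the __SUBJ_OBFU_PUNCT sub-rule.
    return meta_fires(hits, 1, **subrules)


def subj_obfu_punct_many(hits, **subrules):
    # MANY needs more than two hits.
    return meta_fires(hits, 2, **subrules)


print(subj_obfu_punct_few(2))                     # True
print(subj_obfu_punct_many(2))                    # False
print(subj_obfu_punct_many(3))                    # True
print(subj_obfu_punct_many(3, not_spoofed=True))  # False: an exclusion applies
```

Any single exclusion being true is enough to suppress both rules, regardless of how many punctuation hits the subject has.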

Alex



Re: What does that rule mean "SUBJ_OBFU_PUNCT FEW"

Posted by Alex Woick <al...@wombaz.de>.
Philipp Ewald wrote on 13.01.2021 at 18:40:
> Subject: <USER>: Mailservice: Neue Mail

The rule actually matches if you have usernames like "anton.b", which 
produce a subject like this:
Subject: <anton.b>: Mailservice: Neue Mail

However, the rule scores a measly 0.749, which is not enough on its own 
to mark a message as spam or to cause a rejection. There have to be 
additional matching rules. By default, a message is considered spam with 
a total score of 5 or above.

Alex

Re: What does that rule mean "SUBJ_OBFU_PUNCT FEW"

Posted by John Hardin <jh...@impsec.org>.
On Wed, 13 Jan 2021, Philipp Ewald wrote:

>>> SUBJ_OBFU_PUNCT_FEW -> Possible punctuation-obfuscated Subject: header
>>> 
>>> SUBJ_OBFU_PUNCT_MANY ->  Punctuation-obfuscated Subject: header
>
> We send mails Like this: (You got a E-Mail)
>
> Subject: <USER>: Mailservice: Neue Mail

Ok. I will assume <USER> is an email address, like:

   Subject: <ph...@digionline.de>: Mailservice: Neue Mail

That would hit due to the punctuation embedded in the email address.
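
As a rough check of that assumption, the following Python sketch counts hits in such a subject. Several things here are assumptions: the port of the Perl pattern to Python's re module, the idea that the sub-rule counts non-overlapping matches the way finditer does, and the address first.last@example.com, which is a hypothetical stand-in because the real address is elided in the archive.

```python
import re

PUNCT_AROUND = '-~`"!@#$%^&*()_+={}|\\/?<>,.:;'
PUNCT_BETWEEN = '~`"!@#$%^&*()_+={}|\\?<>,.:;'

RULE = re.compile(
    "(?:(?!<[a-z][a-z])"
    + "[" + re.escape(PUNCT_AROUND) + "][a-z][" + re.escape(PUNCT_AROUND) + r"\s]"
    + "|[a-z][" + re.escape(PUNCT_BETWEEN) + "][a-z])",
    re.IGNORECASE,
)


def hits(subject):
    """List the non-overlapping matches, scanning left to right."""
    return [m.group(0) for m in RULE.finditer(subject)]


# Hypothetical address standing in for <USER>.
print(hits("<first.last@example.com>: Mailservice: Neue Mail"))
# ['t.l', 't@e', 'e.c']

# The literal placeholder has no embedded punctuation, so nothing matches.
print(hits("<USER>: Mailservice: Neue Mail"))
# []
```

Three hits would exceed both the "> 1" and "> 2" thresholds in the meta rules, provided none of the exclusion sub-rules apply.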

If my assumption is incorrect please let me know.

Question: is the email address in <USER> the same as the email address in 
the To: header?

If you can send me the full unedited headers of one such message in 
private email I'll test exclusions for it. Note: any changes you make to 
that will potentially interfere with the accuracy of the exclusion.

Thanks!


Re: What does that rule mean "SUBJ_OBFU_PUNCT FEW"

Posted by Philipp Ewald <ph...@digionline.de>.
Ah, sorry: I meant that our "no-reply" (system notification) e-mails hit these spam rules:

>> SUBJ_OBFU_PUNCT_FEW -> Possible punctuation-obfuscated Subject: header
>>
>> SUBJ_OBFU_PUNCT_MANY ->  Punctuation-obfuscated Subject: header

We send mails like this (a "you have new mail" notification):

X-To: <@web.de>
From: "" <no-reply@>
Reply-To: "" <no-reply@>
Date: Mon, 07 Sep 2020 07:14:19 +0200
Subject: <USER>: Mailservice: Neue Mail
X-Date: Mon, 07 Sep 2020 07:14:19 +0200
To: @web.de
Message-ID:
X-User-Message: X-User-Message-013
X-Auto-Response-Suppress: All
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 7bit
X-Mime-Autoconverted: from 8bit to 7bit by courier 1.0

On 1/13/21 5:02 PM, Antony Stone wrote:
> On Wednesday 13 January 2021 at 16:57:55, Philipp Ewald wrote:
> 
>> Hello,
>>
>> we try to deliver mails to GMX/WEB but we got frequency blocked because
>> "ro-reply@ Mails" hits following rules:
> 
> Sorry, but what do you mean by "ro-reply@ Mails"?
> 
>> SUBJ_OBFU_PUNCT_FEW -> Possible punctuation-obfuscated Subject: header
>>
>> SUBJ_OBFU_PUNCT_MANY ->  Punctuation-obfuscated Subject: header
> 
> Can you give us an example of the Subject line you're trying to send the
> emails with?
> 
> 
> Antony.
> 

Re: What does that rule mean "SUBJ_OBFU_PUNCT FEW"

Posted by Antony Stone <An...@spamassassin.open.source.it>.
On Wednesday 13 January 2021 at 16:57:55, Philipp Ewald wrote:

> Hello,
> 
> we try to deliver mails to GMX/WEB but we got frequency blocked because
> "ro-reply@ Mails" hits following rules:

Sorry, but what do you mean by "ro-reply@ Mails"?

> SUBJ_OBFU_PUNCT_FEW -> Possible punctuation-obfuscated Subject: header
> 
> SUBJ_OBFU_PUNCT_MANY ->  Punctuation-obfuscated Subject: header

Can you give us an example of the Subject line you're trying to send the 
emails with?


Antony.

-- 
"I think both KDE and Gnome suck - I'm quite unbiased in that, because I use a 
Mac."

 - Jason Isitt

                                                   Please reply to the list;
                                                         please *don't* CC me.

Re: What does that rule mean "SUBJ_OBFU_PUNCT FEW"

Posted by Philipp Ewald <ph...@digionline.de>.
No, the support only said: "Yes, you're listed because your "no-reply@" is hitting the following rules..." and nothing *else*.




On 1/13/21 6:07 PM, John Hardin wrote:
> The scores on those rules are rather low - they are not "poison pills". What *else* are those mails hitting?

Re: What does that rule mean "SUBJ_OBFU_PUNCT FEW"

Posted by John Hardin <jh...@impsec.org>.
On Wed, 13 Jan 2021, Philipp Ewald wrote:

> Hello,
>
> we try to deliver mails to GMX/WEB but we got frequency blocked because 
> "ro-reply@ Mails" hits following rules:
>
> SUBJ_OBFU_PUNCT_FEW -> Possible punctuation-obfuscated Subject: header
>
> SUBJ_OBFU_PUNCT_MANY ->  Punctuation-obfuscated Subject: header

The scores on those rules are rather low - they are not "poison pills". 
What *else* are those mails hitting?

An actual sample of a problematic subject text would be very helpful to 
allow us to suggest how you could fix the problem or to add an exception 
for the rule if it's a valid FP.

> i can't find any good declaration for this rules.. can some one explain 
> please? (easy as possible)
> Does that has todo with ".", ";", ":" in Headers?

Alex did a good job. Basically: multiple instances of letter-punct-letter 
or punct-letter-punct in the message subject.

Spammers have used punctuation to obfuscate "trigger words" in subjects, 
like:

    :B:U:Y: :Y:O:U:R: :C:H:E:A:P: :V:I:A:G:R:A: :H:E:R:E: :T:O:D:A:Y:

in an attempt to bypass naïve text matching filters. These rules are 
intended to detect that.
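
A small Python sketch of the idea: a naive keyword check misses the obfuscated word, while the punctuation pattern picks it up. The port of the rule to Python's re module, the helper names, and the keyword "BUY" are assumptions for illustration, and counting non-overlapping matches with finditer is only an approximation of how hits are tallied.

```python
import re

PUNCT_AROUND = '-~`"!@#$%^&*()_+={}|\\/?<>,.:;'
PUNCT_BETWEEN = '~`"!@#$%^&*()_+={}|\\?<>,.:;'

RULE = re.compile(
    "(?:(?!<[a-z][a-z])"
    + "[" + re.escape(PUNCT_AROUND) + "][a-z][" + re.escape(PUNCT_AROUND) + r"\s]"
    + "|[a-z][" + re.escape(PUNCT_BETWEEN) + "][a-z])",
    re.IGNORECASE,
)


def naive_keyword_hit(subject, keyword="BUY"):
    """Case-insensitive substring test, like a naive text filter."""
    return keyword.lower() in subject.lower()


def punct_hits(subject):
    """Count non-overlapping matches of the punctuation pattern."""
    return sum(1 for _ in RULE.finditer(subject))


print(naive_keyword_hit(":B:U:Y:"), punct_hits(":B:U:Y:"))  # False 2
print(naive_keyword_hit("Buy now"), punct_hits("Buy now"))  # True 0
```

In ":B:U:Y:" the two hits are ":B:" (punct-letter-punct) and "U:Y" (letter-punct-letter), so even a single obfuscated word is enough to pass the "> 1" threshold.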

