Posted to users@spamassassin.apache.org by micah anderson <mi...@riseup.net> on 2020/06/17 16:08:20 UTC

homograph spam

Are there any plugins or techniques that can deal with UTF-8 homographs?
In particular, I'm seeing a lot of attempts to get past filters that
would match on a word like 'amazon', but do not catch it because the 'm'
has been replaced by a Unicode look-alike of 'm' that appears identical.

I understand that UTF-8 From and Subject are legitimate, so I do not
want to just block those, but it seems like we should look for typical
homographs in the middle of words and add a weighted score for these.

I do have 'normalize_charset 1' set here.

-- 
        micah

Re: homograph spam

Posted by John Hardin <jh...@impsec.org>.
On Wed, 17 Jun 2020, micah anderson wrote:

> Are there any plugins or techniques that can deal with UTF-8 homographs?
> In particular, I'm seeing a lot of attempts to get past filters that
> would match on a word like 'amazon', but do not catch it because the 'm'
> has been replaced by a Unicode look-alike of 'm' that appears identical.

Yes, look at the FUZZY_* rules, the ReplaceTags plugin and the 
25_replace.cf rules file in the base ruleset.

> I understand that UTF-8 From and Subject are legitimate, so I do not
> want to just block those, but it seems like we should look for typical
> homographs in the middle of words and add a weighted score for these.

Unfortunately that sort of obfuscation requires specific rules, as there's 
no general way to detect it in arbitrary words.

You'd probably want something like:

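ifplugin Mail::SpamAssassin::Plugin::ReplaceTags
   # <A>, <M>, etc. are replace tags (see 25_replace.cf); the lookahead
   # keeps the rule from firing when the tail is plainly spelled "mazon"
   body           FUZZY_AMAZON   /\s<A>(?!mazon)<M><A><Z><O><N>/i
   replace_rules  FUZZY_AMAZON
   describe       FUZZY_AMAZON   Obfuscated "Amazon"
endif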
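
To see what a rule of that shape does, here is a rough Python sketch of
the same idea: each <X> tag is expanded into a character class of
look-alikes, and the negative lookahead skips the plain spelling. This
is only an illustration, not how SpamAssassin implements ReplaceTags.
The TAGS table, expand() and is_obfuscated() are hypothetical names, and
the look-alike sets are a tiny made-up sample, not the definitions in
25_replace.cf.

```python
import re

# Illustrative look-alike sets only (assumption): NOT the tag definitions
# shipped in 25_replace.cf, which cover far more characters.
TAGS = {
    "A": "a@\u0430\u00e0\u00e1",  # Latin a, '@', Cyrillic a, a-grave, a-acute
    "M": "m\u043c\uff4d",         # Latin m, Cyrillic em, fullwidth m
    "Z": "z\uff5a",               # Latin z, fullwidth z
    "O": "o0\u043e",              # Latin o, digit zero, Cyrillic o
    "N": "n\uff4e",               # Latin n, fullwidth n
}


def expand(pattern: str) -> str:
    """Replace each <X> tag in the pattern with a character class from TAGS."""
    return re.sub(
        r"<([A-Z])>",
        lambda m: "[" + re.escape(TAGS[m.group(1)]) + "]",
        pattern,
    )


# Same shape as the rule body above: whitespace, a fuzzy 'a', then a fuzzy
# "mazon" that is not spelled with plain letters.
RULE = r"\s<A>(?!mazon)<M><A><Z><O><N>"
FUZZY_AMAZON = re.compile(expand(RULE), re.IGNORECASE)


def is_obfuscated(text: str) -> bool:
    """Return True if the text contains a look-alike spelling of 'amazon'."""
    return FUZZY_AMAZON.search(text) is not None


if __name__ == "__main__":
    for sample in ("your amazon order", "your a\uff4dazon order"):
        print(repr(sample), is_obfuscated(sample))
```

In this sketch the lookahead only inspects the letters after the first
one, so a word whose sole substitution is the leading 'a' is not flagged;
a substitution anywhere in the "mazon" tail is.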


-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79