Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2006/01/05 06:53:50 UTC

[Bug 4753] New: New encoding for listwashing addresses, not ROT-13

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4753

           Summary: New encoding for listwashing addresses, not ROT-13
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: Rules (Eval Tests)
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: lwilton@earthlink.net


Matt Kettler analyzed an encoding that spammers now commonly use for email
addresses.  It is designed to bypass the simple ROT-13 check: addresses are
encoded into a ROT-17 *reversed* alphabet.

This isn't trivially checkable with a standard rule, but it should be
reasonably easy to catch in the eval code that performs the current ROT-13
check.

Matt sayeth:

Spammers have been embedding encoded versions of our email addresses in 
spam and web links for listwashing purposes for a long time.

One of the early popular encodings was a variant of rot-13.

More recently I've noticed a lot of the geocities exploit spams are using a 
new encoding.

Before:
mkettler@evi-inc.com
Encoded by rot-13 into:
zxrggyre^riv-vap(pbz

Now I'm seeing a lot using:
XZfQQYfS.fOb-bWh,hVX

Which I could tell was a simple character substitution, but not constant 
addition or XOR.
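
The observation above can be checked mechanically. The following is a minimal
sketch, not part of Matt's original analysis; the helper names are
hypothetical, and only the local part of each sample is compared so that
punctuation does not get in the way. A constant-addition cipher such as rot-13
yields a single letter shift, while the newer sample yields several distinct
shifts and several distinct XOR values.

```python
PLAIN = "mkettler"
ROT13 = "zxrggyre"   # local part of the rot-13 sample above
NEWER = "XZfQQYfS"   # local part of the newer sample above


def letter_shifts(plain, cipher):
    """Set of per-letter shifts, ignoring case, modulo 26."""
    return {(ord(c.lower()) - ord(p.lower())) % 26
            for p, c in zip(plain, cipher)}


def byte_xors(plain, cipher):
    """Set of per-character XOR values between the two strings."""
    return {ord(p) ^ ord(c) for p, c in zip(plain, cipher)}


print(letter_shifts(PLAIN, ROT13))       # {13}: one constant shift
print(len(letter_shifts(PLAIN, NEWER)))  # more than one distinct shift
print(len(byte_xors(PLAIN, NEWER)))      # more than one distinct XOR value
```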

So I did some digging and got more examples from NANAS posts:
http://groups.google.com/group/news.admin.net-abuse.sightings/msg/5724bf90fa6fae6e
http://groups.google.com/group/news.admin.net-abuse.sightings/msg/ba468b2d26bb3494
http://groups.google.com/group/news.admin.net-abuse.sightings/msg/570cfd7a517a598f

And built up an alphabet table. The results are amusing.

The new version takes a backwards alphabet, and reverse-rotates it 17 
characters.

Here's my table. I extrapolated the obvious for the items in ().
Plain -> encoded
------------

a       j
b       i
c       h
d       g
e       f
f       e
g       (d)
h       c
i       b
j       (a)
k       Z
l       Y
m       X
n       W
o       V
p       U
q       (T)
r       S
s       (R)
t       Q
u       (P)
v       O
w       (N)
x       M
y       L
z       K
------------

They're also using both upper- and lower-case alphabets here, so you can
continue the list upward from the top, such that Z encodes to 'k'.
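
One way to read the table is as a single ring of 52 letters (A-Z followed by
a-z) in which the letter at position i maps to position (9 - i) mod 52. That
formula is an inference from the table and the "Z encodes to 'k'" note, not
something confirmed from the spammers' own code, and the names below are
hypothetical. A minimal sketch under that assumption:

```python
import string

# Ring of 52 letters: A-Z followed by a-z. The ordering is an assumption
# inferred from the table above.
RING = string.ascii_uppercase + string.ascii_lowercase


def substitute(text):
    """Map the letter at ring index i to index (9 - i) mod 52.

    Non-letters are passed through unchanged; the sample above shows that
    punctuation is also altered, which this sketch does not model.
    """
    out = []
    for ch in text:
        i = RING.find(ch)
        out.append(RING[(9 - i) % 52] if i >= 0 else ch)
    return "".join(out)


print(substitute("mkettler"))   # XZfQQYfS
print(substitute("XZfQQYfS"))   # mkettler
```

Under this reading the mapping is its own inverse, so the same function both
encodes and decodes.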

Cute eh?

I might suggest encoding up your own email domains into a body rule.
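
As a rough illustration of that suggestion, the sketch below encodes a domain
and prints a candidate body rule line. The rule name LOCAL_LISTWASH_DOMAIN and
the helper names are hypothetical. The letter mapping is the (9 - i) mod 52
reading of the table, and the punctuation mapping is taken from the single
sample above ('.' became ',' and '-' was unchanged), so both should be treated
as assumptions to verify against real spam before use.

```python
import string

RING = string.ascii_uppercase + string.ascii_lowercase
# Punctuation as seen in the one sample above; thin evidence, so an assumption.
PUNCT = {".": ",", "-": "-"}


def encode_domain(domain):
    """Encode a domain using the inferred letter and punctuation mappings."""
    out = []
    for ch in domain:
        i = RING.find(ch)
        if i >= 0:
            out.append(RING[(9 - i) % 52])
        else:
            out.append(PUNCT.get(ch, ch))
    return "".join(out)


def body_rule(domain, name="LOCAL_LISTWASH_DOMAIN"):
    """Build a body rule line whose regex matches the encoded domain."""
    encoded = encode_domain(domain)
    # Backslash-escape every non-alphanumeric character for the regex.
    pattern = "".join(c if c.isalnum() else "\\" + c for c in encoded)
    return "body {} /{}/".format(name, pattern)


print(body_rule("evi-inc.com"))
```

The regex deliberately has no /i flag, since letter case carries information
in this encoding.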



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4753] New encoding for listwashing addresses, not ROT-13

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4753


felicity@apache.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX




------- Additional Comments From felicity@apache.org  2006-12-05 11:59 -------
I don't think we're really going to be able to handle generic substitution
cipher detection as a spam rule. :|


