Posted to users@spamassassin.apache.org by Scott A Crosby <sc...@cs.rice.edu> on 2005/03/02 02:24:08 UTC

Re: Obfuscation

On Mon, 28 Feb 2005 15:34:13 +0000, jm@jmason.org (Justin Mason) writes:

> A paper at the spam conference suggested using an Edit Distance algorithm
> with very good results; the idea being, the edit distance from "cialis" to
> "C 1 a l | s" isn't as far as it is to "specialized" or so on.
>
> if I recall correctly, someone submitted an implementation quite a while
> ago on our BZ, but I think the FP rates were too high.   Given the
> recent paper's published results, though, it may be there are good ways
> to tweak it to get FPs at a tolerable rate.

I did an implementation of it some time ago, but I didn't get a chance
to take it far enough to test its effectiveness. I have heard remarks
that naively applying edit distance is too slow. To keep the FP rate
from getting too high, the edit-distance costs are parameterized, so
some edits are much cheaper than others. E.g.:

# Cost of replacing a character with punctuation in the obfuscated string.
setreps ("bcdfghijklmnpqrstvwxyz","*?.-",.08);
setreps ("aeiou","*?.-",.03);

# Cost to insert these into the obfuscated string is cheap
setins ("/\|()=-'!*`;:?+[]\"^",.01);
setins ("_,.",.01);

So, 'v.agr.' and 'v..ia...gra' both cost <.10  
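
For illustration only, here is a minimal Python sketch of how costs like
these could feed a weighted edit distance. It is not the prototype code
mentioned in this post. The name obfu_cost, the default cost of 1.0 for
any edit not listed, and the inclusion of the backslash in the insert
set are all assumptions; setreps and setins are re-created here only to
mirror the calls above.

```python
DEFAULT_COST = 1.0  # assumption: any edit not listed below costs 1.0

rep_cost = {}  # (clean char, obfuscated char) -> replacement cost
ins_cost = {}  # inserted char -> insertion cost


def setreps(originals, replacements, cost):
    """Allow each char in `originals` to be replaced by each char in `replacements`."""
    for o in originals:
        for r in replacements:
            rep_cost[(o, r)] = cost


def setins(chars, cost):
    """Allow each char in `chars` to be inserted into the obfuscated string."""
    for c in chars:
        ins_cost[c] = cost


setreps("bcdfghijklmnpqrstvwxyz", "*?.-", 0.08)
setreps("aeiou", "*?.-", 0.03)
# Assumption: the backslash is meant to be part of this set.
setins("/\\|()=-'!*`;:?+[]\"^", 0.01)
setins("_,.", 0.01)


def obfu_cost(word, text):
    """Weighted edit distance from a clean word to a candidate obfuscation.

    Standard dynamic programming over two rows; both strings are compared
    in full, so this does not search for the word inside a longer text.
    """
    # Row for the empty prefix of `word`: every char of `text` is an insertion.
    prev = [0.0]
    for ch in text:
        prev.append(prev[-1] + ins_cost.get(ch, DEFAULT_COST))

    for w in word:
        cur = [prev[0] + DEFAULT_COST]  # `w` dropped with nothing in its place
        for j, ch in enumerate(text, start=1):
            sub = 0.0 if w == ch else rep_cost.get((w, ch), DEFAULT_COST)
            cur.append(min(
                prev[j - 1] + sub,                            # match or replace
                prev[j] + DEFAULT_COST,                       # drop `w`
                cur[j - 1] + ins_cost.get(ch, DEFAULT_COST),  # insert `ch`
            ))
        prev = cur
    return prev[-1]
```

Under these assumed defaults, obfu_cost("viagra", "v.agr.") is two vowel
replacements and obfu_cost("viagra", "v..ia...gra") is five cheap
insertions, so both stay under .10, while an ordinary word such as
"specialized" is far from "cialis" because each unlisted edit costs 1.0.
The table is O(len(word) * len(text)) per comparison, which is one reason
a naive scan over every word in a message could be slow.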


Got a bugzilla# that I can attach the prototype code to?  (Also, is it
possible to report a bug/attach the code without creating a bugzilla
account?)

Scott

Re: Obfuscation

Posted by Martin Hepworth <ma...@solid-state-logic.com>.
All

There is a nice obfuscation generator at:

http://sandgnat.com/cmos/cmos.jsp

--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300


