Posted to users@spamassassin.apache.org by Clayton Keller <in...@ruraltel.net> on 2005/05/03 20:13:01 UTC

Re: Extra Sare Rules for meds?

Jesse Houwing wrote:
> Keith Ivey wrote:
> 
>> Jesse Houwing wrote:
>>
>>> BODY TABLEOBFU 
>>> m{<td([^>]+|"[^"]+)>(<([^>]+|"[^"]+)>)*[a-z]{1,2}(<([^>]+|"[^"]+)>)*</td([^>]+|"[^"]+)>}i 
>>
>> I think you may want a * after the ) inside the <>.  As it is, you're 
>> looking for either a bunch of characters that are not > or a quote 
>> followed by a bunch of characters that are not quote.  In fact, I 
>> think what was really intended was something more like this (note that 
>> this also requires an ending quote on contained quoted strings and 
>> allows ""):
>>
>> m{<td([^>"]+|"[^"]*")*>(<([^>"]+|"[^"]*")*>)*[a-z]{1,2}(<([^>"]+|"[^"]*")*>)*</td([^>"]+|"[^"]*")*>}i 
>>
>>
>> The other problem with the pattern as written (with no *) is that the 
>> subpatterns don't match plain <td> or </td>, since they require at 
>> least one character between the td and the >.
>>
> It was late ;)
> 
> I'm currently running tests on a couple of alternatives:
> 
> rawbody tblobfu_opttag 
> /<td(?:[^>'"]|"[^"]*"|'[^']*')*>(?:<(?!\/?td)(?:[^>'"]|"[^"]*"|'[^']*')*>){0,5}(?![oi][ns]|an?|en|of|de|l[ae]|us|no|tm)[a-z]{1,2}(?:<(?!\/?td)(?:[^>'"]|"[^"]*"|'[^']*')*>){0,5}<\/td(?:[^>'"]|"[^"]*"|'[^']*')*>/i 
> 
> rawbody tblobfu_tag 
> /<td(?:[^>'"]|"[^"]*"|'[^']*')*>(?:<(?!\/?td)(?:[^>'"]|"[^"]*"|'[^']*')*>){1,5}(?![oi][ns]|an?|en|of|de|l[ae]|us|no|tm)[a-z]{1,2}(?:<(?!\/?td)(?:[^>'"]|"[^"]*"|'[^']*')*>){1,5}<\/td(?:[^>'"]|"[^"]*"|'[^']*')*>/i 
> 
> 
> Please note that before making this final I will be replacing the splats 
> (*) with some usable limits, but I want to compare the number of 
> ham/spam hits first before making the final rules.
> 
> Jesse
> 
> 
>


Does anyone have any updated information regarding the effectiveness of 
these rules, or possibly any updated alternatives to put in place 
against messages involving the use of tables?

Thanks
Clay
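
Keith's two objections to the first pattern (no repetition inside the <>, and no match on a plain <td> or </td>) can be checked with a small script. This is only a sketch: it uses Python's re module rather than SpamAssassin itself, on the assumption that these particular constructs behave the same as in Perl, and the sample cells are made up for illustration rather than taken from real spam.

```python
import re

# Jesse's first pattern, as posted: every tag needs at least one character
# after the tag name, and the group inside the <> cannot repeat.
ORIGINAL = re.compile(
    r'<td([^>]+|"[^"]+)>(<([^>]+|"[^"]+)>)*[a-z]{1,2}'
    r'(<([^>]+|"[^"]+)>)*</td([^>]+|"[^"]+)>',
    re.I,
)

# Keith's suggested correction: the group may repeat zero or more times,
# and quoted strings must be closed (and may be empty).
KEITH = re.compile(
    r'<td([^>"]+|"[^"]*")*>(<([^>"]+|"[^"]*")*>)*[a-z]{1,2}'
    r'(<([^>"]+|"[^"]*")*>)*</td([^>"]+|"[^"]*")*>',
    re.I,
)

# Hypothetical test cells (illustrative only).
SAMPLES = [
    '<td>v</td>',             # plain tags, one letter per cell
    '<td class="x">v</td>',   # attribute on the opening tag only
    '<td class="x">v</td >',  # extra character in both tags
    '<td>hello</td>',         # ordinary cell with a longer word
]

def hits(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None

for sample in SAMPLES:
    print(sample, 'original:', hits(ORIGINAL, sample), 'keith:', hits(KEITH, sample))
```

On these samples the original pattern only fires when both the opening and the closing tag carry at least one extra character, which is the limitation Keith describes. Note that the corrected pattern nests a + inside a *, so it may backtrack heavily on long non-matching input.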

Re[2]: Extra Sare Rules for meds?

Posted by Robert Menschel <Ro...@Menschel.net>.
Hello Clayton,

Tuesday, May 3, 2005, 11:13:01 AM, you wrote:

CK> Jesse Houwing wrote:
>> Keith Ivey wrote:
>>> Jesse Houwing wrote:
>>>> BODY TABLEOBFU
>>>> m{<td([^>]+|"[^"]+)>(<([^>]+|"[^"]+)>)*[a-z]{1,2}(<([^>]+|"[^"]+)>)*</td([^>]+|"[^"]+)>}i
>>>
>>> m{<td([^>"]+|"[^"]*")*>(<([^>"]+|"[^"]*")*>)*[a-z]{1,2}(<([^>"]+|"[^"]*")*>)*</td([^>"]+|"[^"]*")*>}i
>>>
>> rawbody tblobfu_opttag
>> /<td(?:[^>'"]|"[^"]*"|'[^']*')*>(?:<(?!\/?td)(?:[^>'"]|"[^"]*"|'[^']*')*>){0,5}(?![oi][ns]|an?|en|of|de|l[ae]|us|no|tm)[a-z]{1,2}(?:<(?!\/?td)(?:[^>'"]|"[^"]*"|'[^']*')*>){0,5}<\/td(?:[^>'"]|"[^"]*"|'[^']*')*>/i
>> rawbody tblobfu_tag
>> /<td(?:[^>'"]|"[^"]*"|'[^']*')*>(?:<(?!\/?td)(?:[^>'"]|"[^"]*"|'[^']*')*>){1,5}(?![oi][ns]|an?|en|of|de|l[ae]|us|no|tm)[a-z]{1,2}(?:<(?!\/?td)(?:[^>'"]|"[^"]*"|'[^']*')*>){1,5}<\/td(?:[^>'"]|"[^"]*"|'[^']*')*>/i
>> 

CK> Does anyone have any updated information regarding the effectiveness of
CK> these rules, or possibly any updated alternatives to put in place 
CK> against messages involving the use of tables?

Nothing useful from here, but now that the first round of this
generation of OBFU rules from SARE is out, I've got some rules which
are worth testing, and will be mass-checking those tonight.  Hopefully
I'll have something worth publishing within a week or two (sooner if
they exceed expectations).

Bob Menschel
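
For anyone comparing the two rawbody alternatives quoted above: they differ only in whether tags between the <td> and the letters are optional ({0,5}) or required ({1,5}). The sketch below rebuilds both patterns from shared pieces in Python's re module (an assumption; SpamAssassin runs them as Perl regexes) and tries them on a few invented cells. The names ATTRS, INNER_TAG, NOT_SHORT_WORD and build are made up for this example.

```python
import re

# Tag contents: any character that is not > or a quote, or a complete
# double-quoted or single-quoted string.
ATTRS = r'''(?:[^>'"]|"[^"]*"|'[^']*')*'''

# Any tag other than <td ...> or </td ...>.
INNER_TAG = r'<(?!\/?td)' + ATTRS + '>'

# Negative lookahead from the posted rules: skip cells whose text starts
# with one of these short strings.
NOT_SHORT_WORD = r'(?![oi][ns]|an?|en|of|de|l[ae]|us|no|tm)'

def build(min_tags):
    """Assemble the rule with min_tags..5 inner tags on each side of the letters."""
    tags = '(?:' + INNER_TAG + '){%d,5}' % min_tags
    return re.compile(
        '<td' + ATTRS + '>' + tags + NOT_SHORT_WORD + '[a-z]{1,2}'
        + tags + r'<\/td' + ATTRS + '>',
        re.I,
    )

TBLOBFU_OPTTAG = build(0)  # inner tags optional
TBLOBFU_TAG = build(1)     # at least one inner tag on each side

# Hypothetical cells (illustrative only).
for cell in ['<td>v</td>', '<td><b>v</b></td>', '<td>of</td>']:
    print(cell,
          'opttag:', TBLOBFU_OPTTAG.search(cell) is not None,
          'tag:', TBLOBFU_TAG.search(cell) is not None)
```

The lookahead also excludes any cell text that begins with "a", since the an? alternative matches a lone "a"; that may or may not be intended, and it is worth keeping in mind when comparing ham/spam hit counts.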




Re: Extra Sare Rules for meds?

Posted by Jim Maul <jm...@elih.org>.
Clayton Keller wrote:
> Jesse Houwing wrote:
> 
>> Keith Ivey wrote:
>>
>>> Jesse Houwing wrote:
>>>
>>>> BODY TABLEOBFU 
>>>> m{<td([^>]+|"[^"]+)>(<([^>]+|"[^"]+)>)*[a-z]{1,2}(<([^>]+|"[^"]+)>)*</td([^>]+|"[^"]+)>}i 
>>>
>>> I think you may want a * after the ) inside the <>.  As it is, you're 
>>> looking for either a bunch of characters that are not > or a quote 
>>> followed by a bunch of characters that are not quote.  In fact, I 
>>> think what was really intended was something more like this (note 
>>> that this also requires an ending quote on contained quoted strings 
>>> and allows ""):
>>>
>>> m{<td([^>"]+|"[^"]*")*>(<([^>"]+|"[^"]*")*>)*[a-z]{1,2}(<([^>"]+|"[^"]*")*>)*</td([^>"]+|"[^"]*")*>}i 
>>>
>>>
>>> The other problem with the pattern as written (with no *) is that the 
>>> subpatterns don't match plain <td> or </td>, since they require at 
>>> least one character between the td and the >.
>>>
>> It was late ;)
>>
>> I'm currently running tests on a couple of alternatives:
>>
>> rawbody tblobfu_opttag 
>> /<td(?:[^>'"]|"[^"]*"|'[^']*')*>(?:<(?!\/?td)(?:[^>'"]|"[^"]*"|'[^']*')*>){0,5}(?![oi][ns]|an?|en|of|de|l[ae]|us|no|tm)[a-z]{1,2}(?:<(?!\/?td)(?:[^>'"]|"[^"]*"|'[^']*')*>){0,5}<\/td(?:[^>'"]|"[^"]*"|'[^']*')*>/i 
>>
>> rawbody tblobfu_tag 
>> /<td(?:[^>'"]|"[^"]*"|'[^']*')*>(?:<(?!\/?td)(?:[^>'"]|"[^"]*"|'[^']*')*>){1,5}(?![oi][ns]|an?|en|of|de|l[ae]|us|no|tm)[a-z]{1,2}(?:<(?!\/?td)(?:[^>'"]|"[^"]*"|'[^']*')*>){1,5}<\/td(?:[^>'"]|"[^"]*"|'[^']*')*>/i 
>>
>>
>> Please note that before making this final I will be replacing the 
>> splats (*) with some usable limits, but I want to compare the 
>> number of ham/spam hits first before making the final rules.
>>
>> Jesse
>>
>>
>>
> 
> 
> Does anyone have any updated information regarding the effectiveness of 
> these rules, or possibly any updated alternatives to put in place 
> against messages involving the use of tables?
> 
> Thanks
> Clay
> 
> 

Since I started this thread, I'd love to say that I have tested 
these rules and have some stats on their effectiveness, but the truth is 
I've been so swamped here with other work that I haven't even had a 
chance to try them.  When I do, I'll post my findings.

-Jim
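
Jesse mentions replacing the splats (*) with limits before finalising the rules. The thread does not say which limits he had in mind, so the numbers below (MAX_ITEMS and MAX_QUOTED) are purely hypothetical; the sketch only shows what a bounded variant of tblobfu_opttag might look like and how it differs on an unusually long tag. As before, it is written against Python's re module, not SpamAssassin.

```python
import re

MAX_ITEMS = 200   # hypothetical cap on items inside one tag (not from the thread)
MAX_QUOTED = 100  # hypothetical cap on the length of a quoted string

def attrs(bounded):
    """Tag-contents subpattern, with the splats either kept or bounded."""
    if bounded:
        return (r'''(?:[^>'"]|"[^"]{0,%d}"|'[^']{0,%d}'){0,%d}'''
                % (MAX_QUOTED, MAX_QUOTED, MAX_ITEMS))
    return r'''(?:[^>'"]|"[^"]*"|'[^']*')*'''

def build_opttag(bounded):
    """Assemble tblobfu_opttag from the posted rule, optionally bounded."""
    a = attrs(bounded)
    inner = '(?:<(?!\\/?td)' + a + '>){0,5}'
    return re.compile(
        '<td' + a + '>' + inner
        + r'(?![oi][ns]|an?|en|of|de|l[ae]|us|no|tm)[a-z]{1,2}'
        + inner + r'<\/td' + a + '>',
        re.I,
    )

UNBOUNDED = build_opttag(False)
BOUNDED = build_opttag(True)

short_cell = '<td width="10">v</td>'
long_cell = '<td ' + 'x' * 500 + '>v</td>'  # tag longer than MAX_ITEMS

for name, cell in [('short', short_cell), ('long', long_cell)]:
    print(name,
          'unbounded:', UNBOUNDED.search(cell) is not None,
          'bounded:', BOUNDED.search(cell) is not None)
```

Bounding the repetition caps how far the regex engine can scan inside a single tag, at the cost of missing cells whose tags exceed the chosen limits; any real values would need the kind of ham/spam comparison Jesse describes.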