Posted to dev@spamassassin.apache.org by da...@chaosreigns.com on 2011/04/03 18:27:31 UTC

Lack of sorting in sought Re: svn commit: r1088313 - /spamassassin/trunk/rulesrc/sandbox/jm/20_sought.cf

Over half the changed lines in this commit are due entirely to a lack of sorting.

Number of body rule changes:

$ cat sought.txt | grep "^.body __" | wc -l
455

Number of body rule changes that weren't just removing and re-adding
the same rule:

$ cat sought.txt | grep "^.body __" | awk '{print $2}' | sort | uniq -c | sort -nr | grep -c '1 __'
199
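
A slightly stricter version of the same count, as a rough sketch only. It
assumes sought.txt is the unified diff from the commit mail; the sample diff
and the function names below are made up for illustration. It only looks at
lines starting with "-" or "+", and it tests the count field for exactly 1
rather than matching the text '1 __', which would also match counts such as
11 or 21.

```shell
#!/bin/sh
# Hypothetical sample standing in for sought.txt; rule names are invented.
sample_diff() {
cat <<'EOF'
-body __SEEK_AAAAAA  /foo bar/
+body __SEEK_AAAAAA  /foo bar/
-body __SEEK_BBBBBB  /old text/
+body __SEEK_CCCCCC  /new text/
+body __SEEK_DDDDDD  /another/
 body __SEEK_EEEEEE  /context line/
EOF
}

# Count changed body lines whose rule name shows up exactly once,
# i.e. real additions or removals rather than lines that only moved.
count_real_changes() {
  grep '^[-+]body __' \
    | awk '{print $2}' \
    | sort \
    | uniq -c \
    | awk '$1 == 1 {n++} END {print n+0}'
}

total=$(sample_diff | grep -c '^[-+]body __')
real=$(sample_diff | count_real_changes)
echo "total=$total real=$real"
```

With the sample above this prints "total=5 real=3". Like the original
one-liner, it treats a rule whose pattern changed but whose name stayed the
same as a move, so it slightly undercounts real changes.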


On 04/03, jm@apache.org wrote:
>  score JM_SOUGHT_1  4.0
>  describe JM_SOUGHT_1  Body contains frequently-spammed text patterns

> -body __SEEK_2GEMSF  /United Parcel Service notification /
> +body __SEEK_2GEMSF  /United Parcel Service notification /

> -body __SEEK_2NAEPI  / them pass\?hover\?dancetheirlanguage\? /
> +body __SEEK_2NAEPI  / them pass\?hover\?dancetheirlanguage\? /

> -body __SEEK_3VWDKG  /Limited time offer \x{96} 555USD Bonus/
> +body __SEEK_3VWDKG  /Limited time offer \x{96} 555USD Bonus/

> -body __SEEK_6ZO_TB  /Thank you for attention\. Post Express/
> +body __SEEK_6ZO_TB  /Thank you for attention\. Post Express/

> -body __SEEK_CWKVHY  / the sameinstant:Why,thosesigns\!Yes,thehentracks\! /
> +body __SEEK_CWKVHY  / the sameinstant:Why,thosesigns\!Yes,thehentracks\! /

Etc.
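
One possible way to avoid this churn, as a sketch only: sort the generated
body rules in a fixed locale before they are written out, so an unchanged rule
lands on the same line every run. I haven't looked at how the file is actually
generated, so the function names and sample rules here are hypothetical.

```shell
#!/bin/sh
# Hypothetical unsorted generator output; names and patterns are invented.
unsorted_rules() {
cat <<'EOF'
body __SEEK_CCCCCC  /third/
body __SEEK_AAAAAA  /first/
body __SEEK_BBBBBB  /second/
EOF
}

# Sort in the C locale so the order is byte-wise and does not depend
# on the locale settings of whichever machine runs the generator.
sort_rules() {
  LC_ALL=C sort
}

unsorted_rules | sort_rules
```

Since every line starts with "body ", a plain whole-line sort orders them by
rule name.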

-- 
"Of course there's strength in numbers. But there's strength in sharp
weaponry too. Ironically, this lead to what we call 'civilization'."
- spore
http://www.ChaosReigns.com