Posted to users@spamassassin.apache.org by Per Jessen <pe...@computer.org> on 2004/12/10 13:25:43 UTC

2.64 - SUBJ_HAS_UNIQ_ID - incorrect interpretation of underscores??

Why does SUBJ_HAS_UNIQ_ID fire on this subject:

Subject: =?iso-8859-1?Q?MIGROL_Heiz=F6l-Angebot_mit_Cumulus-Bonuspunkten?=

It looks as if SA mistakenly interprets the underscores as literal underscores,
which they are not in an RFC 2047 encoded string, where "_" stands for a space -
http://rfc.net/rfc2047.html

Is this a bug in the RFC2047 decoding in SA 2.64? 
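
For reference, a minimal sketch of what RFC 2047 decoding should yield for this subject. It uses Python's standard email.header module, not SpamAssassin's own decoder, so it only illustrates the expected result and says nothing about what SA 2.64 does internally. The helper name decode_subject is made up for this example.

```python
from email.header import decode_header

raw = "=?iso-8859-1?Q?MIGROL_Heiz=F6l-Angebot_mit_Cumulus-Bonuspunkten?="

def decode_subject(value):
    """Decode RFC 2047 encoded-words into one Unicode string."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # Encoded-words come back as bytes plus their declared charset.
            parts.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            # Unencoded text may already be a str.
            parts.append(chunk)
    return "".join(parts)

decoded = decode_subject(raw)
print(decoded)  # MIGROL Heizöl-Angebot mit Cumulus-Bonuspunkten
```

In the decoded form the underscores have become spaces, while the two hyphens remain.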


-- 
Per Jessen, Zurich
Let your spam stop here -- http://www.spamchek.com



Re: 2.64 - SUBJ_HAS_UNIQ_ID - incorrect interpretation of underscores??

Posted by Loren Wilton <lw...@earthlink.net>.
I just finally turned this rule off.  For some reason it has started
triggering on a whole lot of my normal mail, which isn't useful and is
creating a bunch of FPs.  I don't think I've ever seen it trigger on spam...
:-)

        Loren


Re: 2.64 - SUBJ_HAS_UNIQ_ID - incorrect interpretation of underscores??

Posted by Matt Kettler <mk...@evi-inc.com>.
At 08:30 PM 12/10/2004, Matt Kettler wrote:
> >The rule doesn't do very well anyway:
> >
> >   1.039   1.1433   0.1190    0.906   0.73    0.90  SUBJ_HAS_UNIQ_ID
> >
> >Hence the <1 score it receives.
>
>Perhaps this is a decent chunk of why the rule doesn't perform well.... It 
>might be worth looking into modifying that regex in the eval to try to get 
>better performance, or splitting them up so you can test each separately...

Never mind. Looking at my most recent 300 spams, only one matched, and that 
one didn't have a UNIQ_ID.


         Subject: {SPAM} 0rder your meds" today`

It doesn't look like spammers often use unique IDs in subject lines 
anymore.

The only one I did find doesn't match the rule:

         Subject: {SPAM} STOP_PAYING_FOR YOUR Cable_Movies e6pgu33335

There are others posing as shipment notices, but the rule tries to skip 
them on purpose..

         Subject: {SPAM} Fedex Ship Notification, Tracking Number : VBN24530946 - 40352TZLP
         Subject: {SPAM} Fedex Delivery Confirmation, Tracking Number : ITZ65070066405343DJCK




Re: 2.64 - SUBJ_HAS_UNIQ_ID - incorrect interpretation of underscores??

Posted by Matt Kettler <mk...@evi-inc.com>.
At 04:06 PM 12/10/2004, Theo Van Dinter wrote:
>It's not simply a hyphenated word.  It looks like two long sets of characters
>with a hyphen in the middle, which is the exact same thing as a unique id.
>
>The rule doesn't do very well anyway:
>
>   1.039   1.1433   0.1190    0.906   0.73    0.90  SUBJ_HAS_UNIQ_ID
>
>Hence the <1 score it receives.

Perhaps this is a decent chunk of why the rule doesn't perform well.... It 
might be worth looking into modifying that regex in the eval to try to get 
better performance, or splitting them up so you can test each separately... 


Re: 2.64 - SUBJ_HAS_UNIQ_ID - incorrect interpretation of underscores??

Posted by Theo Van Dinter <fe...@kluge.net>.
On Fri, Dec 10, 2004 at 08:31:57PM +0100, Per Jessen wrote:
> > No.  The issue is that "cumulus-bonuspunkten" looks like an ID tag.
> 
> Should SUBJ_HAS_UNIQ_ID really fire on that - simply a hyphenated word?  There
> are plenty of those around (although fewer in German than in English).

It's not simply a hyphenated word.  It looks like two long sets of characters
with a hyphen in the middle, which is the exact same thing as a unique id.
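
To illustrate why a hyphenated word can be mistaken for an ID, here is a deliberately simplified sketch. The pattern ID_LIKE and the function looks_like_id are hypothetical and are not the regex used by the SUBJ_HAS_UNIQ_ID eval. They only model "two long runs of characters joined by a hyphen", with an arbitrary minimum run length of six.

```python
import re

# Hypothetical pattern (NOT SpamAssassin's): two runs of six or more
# ASCII letters/digits joined by a single hyphen.
ID_LIKE = re.compile(r"\b[A-Za-z0-9]{6,}-[A-Za-z0-9]{6,}\b")

def looks_like_id(subject):
    """Return True if the subject contains an ID-shaped token."""
    return ID_LIKE.search(subject) is not None

hit = looks_like_id("MIGROL Heizöl-Angebot mit Cumulus-Bonuspunkten")
miss = looks_like_id("Fedex Ship Notification")
print(hit, miss)  # True False
```

Under this toy pattern "Cumulus-Bonuspunkten" matches, because a purely shape-based test cannot tell a compound word from a random token.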

The rule doesn't do very well anyway:

  1.039   1.1433   0.1190    0.906   0.73    0.90  SUBJ_HAS_UNIQ_ID

Hence the <1 score it receives.
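
As a quick sanity check on those figures, assume the columns are overall %, spam %, ham %, S/O, rank and score. That header is not shown in the message, so the column meanings are an assumption here. Under that reading, S/O is the spam hit rate divided by the sum of the spam and ham hit rates:

```python
# Column meanings are assumed; the message does not include a header row.
spam_pct = 1.1433  # assumed: % of spam messages the rule hit
ham_pct = 0.1190   # assumed: % of ham messages the rule hit

def spam_over_overall(spam, ham):
    """Share of the combined hit rate that comes from spam."""
    return spam / (spam + ham)

s_o = spam_over_overall(spam_pct, ham_pct)
print(round(s_o, 3))  # 0.906
```

The result agrees with the 0.906 in the fourth column, which is consistent with the assumed column layout.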

-- 
Randomly Generated Tagline:
"Linux poses a real challenge for those with a taste for late-night
 hacking (and/or conversations with God)."
 (By Matt Welsh)

Re: 2.64 - SUBJ_HAS_UNIQ_ID - incorrect interpretation of underscores??

Posted by Per Jessen <pe...@computer.org>.
Theo Van Dinter wrote:

> On Fri, Dec 10, 2004 at 01:25:43PM +0100, Per Jessen wrote:
>> Why does SUBJ_HAS_UNIQ_ID fire on this subject:
>> 
>> Subject: =?iso-8859-1?Q?MIGROL_Heiz=F6l-Angebot_mit_Cumulus-Bonuspunkten?=
>> 
>> Is this a bug in the RFC2047 decoding in SA 2.64?
> 
> No.  The issue is that "cumulus-bonuspunkten" looks like an ID tag.

Should SUBJ_HAS_UNIQ_ID really fire on that - simply a hyphenated word?  There
are plenty of those around (although fewer in German than in English).
 

-- 
Per Jessen, Zurich
Let your spam stop here -- http://www.spamchek.com



Re: 2.64 - SUBJ_HAS_UNIQ_ID - incorrect interpretation of underscores??

Posted by Theo Van Dinter <fe...@kluge.net>.
On Fri, Dec 10, 2004 at 01:25:43PM +0100, Per Jessen wrote:
> Why does SUBJ_HAS_UNIQ_ID fire on this subject:
> 
> Subject: =?iso-8859-1?Q?MIGROL_Heiz=F6l-Angebot_mit_Cumulus-Bonuspunkten?=
> 
> Is this a bug in the RFC2047 decoding in SA 2.64? 

No.  The issue is that "cumulus-bonuspunkten" looks like an ID tag.

-- 
Randomly Generated Tagline:
"Any similarity to person/persons now living to anyone or thing, dead or 
 undead, is entirely accidental and just one more irrefutable proof of the 
 paranormal."                  - From the 7th Guest