Posted to users@spamassassin.apache.org by Karsten Bräckelmann <gu...@rudersport.de> on 2009/09/18 15:30:56 UTC

Re: bug 4234 - MIME_HTML_ONLY + MPART_ALT_DIFF both firing on html-only email

On Fri, 2009-09-18 at 15:10 +0200, Per Jessen wrote:
> The bug report seems to suggest that this was solved in 3.1.x by
> significantly reducing the score for MIME_HTML_ONLY and MPART_ALT_DIFF,

Comment 5 suggests that the scores were not corrected manually; the
score generation process for 3.1 lowered them -- effectively reflecting
the fact that the rules did appear in ham at that time.

> but they seem to have crept back up in 3.2.5 ? 

The score generation for 3.2 resulted in a different assessment, with
higher scores -- based on the corpora, the rules evidently did not
appear frequently in ham.


So, yes, the scores are higher again -- and have been for over 2 years
now. ;)  However, I wouldn't say "creeping back in", because they have
never been manually fixed or adjusted, but always generated.


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: bug 4234 - MIME_HTML_ONLY + MPART_ALT_DIFF both firing on html-only email

Posted by Per Jessen <pe...@computer.org>.
Per Jessen wrote:

>> That's still ham.  Did you see a FP due to these rules plus others
>> (which ones?), or are you merely about the cumulative score for these
>> on their own?
> 
> I noticed an FP on a mail from networksolutions, and judging from my
> logs, on the 18Sep I had 9 emails that scored just above 5 including
> these two rules.

Of which at least three were FPs (from a Vietnamese bank). 


/Per Jessen, Zürich


Re: bug 4234 - MIME_HTML_ONLY + MPART_ALT_DIFF both firing on html-only email

Posted by Per Jessen <pe...@computer.org>.
Karsten Bräckelmann wrote:

> On Sat, 2009-09-19 at 10:22 +0200, Per Jessen wrote:
>> Karsten Bräckelmann wrote:
>> > So, yes, the scores are higher again -- and have been for over 2
>> > years
>> > now. ;)  However, I wouldn't say "creeping back in", cause it never
>> > has been manually fixed or adjusted, but always generated.
>> 
>> Mea culpa, I didn't concern myself with _how_ the scores had been
>> changed back.  Nonetheless, perfectly legitimate mails are now given
>> 2.8 (1.1+1.7) points purely for consisting of only an HTML part. 
>> Seems a bit excessive.
> 
> That's still ham.  Did you see a FP due to these rules plus others
> (which ones?), or are you merely about the cumulative score for these
> on their own?

I noticed an FP on a mail from networksolutions, and judging from my
logs, on 18 Sep I had 9 emails that scored just above 5 including
these two rules.

> Either way, it'd be interesting to see what the next GA run returns
> for them. Could you keep an eye on this and get back later with the
> scores for 3.3?

I'll try. 


/Per Jessen, Zürich


Re: bug 4234 - MIME_HTML_ONLY + MPART_ALT_DIFF both firing on html-only email

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Sat, 2009-09-19 at 10:22 +0200, Per Jessen wrote:
> Karsten Bräckelmann wrote:
> > So, yes, the scores are higher again -- and have been for over 2 years
> > now. ;)  However, I wouldn't say "creeping back in", cause it never
> > has been manually fixed or adjusted, but always generated.
> 
> Mea culpa, I didn't concern myself with _how_ the scores had been
> changed back.  Nonetheless, perfectly legitimate mails are now given
> 2.8 (1.1+1.7) points purely for consisting of only an HTML part.  Seems
> a bit excessive.

That's still ham.  Did you see an FP due to these rules plus others
(which ones?), or are you merely concerned about the cumulative score
for these on their own?

Either way, it'd be interesting to see what the next GA run returns for
them. Could you keep an eye on this and get back later with the scores
for 3.3?


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: bug 4234 - MIME_HTML_ONLY + MPART_ALT_DIFF both firing on html-only email

Posted by Per Jessen <pe...@computer.org>.
Karsten Bräckelmann wrote:

> On Fri, 2009-09-18 at 15:10 +0200, Per Jessen wrote:
>> The bug report seems to suggest that this was solved in 3.1.x by
>> significantly reducing the score for MIME_HTML_ONLY and
>> MPART_ALT_DIFF,
> 
> Comment 5 suggests, the scores have not been manually corrected, but
> the score generation process for 3.1 did -- effectively reflecting the
> fact they did appear in ham at that time.
> 
>> but they seem to have crept back up in 3.2.5 ?
> 
> The score-generation for 3.2 resulted in a different assessment, with
> higher scores -- based on the corpora, it obviously did not appear
> frequently in ham.
> 
> So, yes, the scores are higher again -- and have been for over 2 years
> now. ;)  However, I wouldn't say "creeping back in", cause it never
> has been manually fixed or adjusted, but always generated.

Mea culpa, I didn't concern myself with _how_ the scores had been
changed back.  Nonetheless, perfectly legitimate mails are now given
2.8 (1.1+1.7) points purely for consisting of only an HTML part.  Seems
a bit excessive.
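
To put the 2.8 figure in context, here is a minimal Python sketch of the
arithmetic. It assumes a spam threshold of 5.0 (the thread only mentions
mails scoring "just above 5"), and the assignment of 1.1 and 1.7 to the
two rule names is arbitrary, since the thread does not say which rule
carries which score. The helper names are hypothetical.

```python
# Assumed threshold; the thread only says mails scored "just above 5".
REQUIRED_SCORE = 5.0

# Scores quoted in the thread. Which rule carries which value is an
# assumption made for illustration only.
HTML_ONLY_SCORES = {"MIME_HTML_ONLY": 1.1, "MPART_ALT_DIFF": 1.7}


def total(scores):
    """Sum the rule scores, rounded to one decimal place."""
    return round(sum(scores.values()), 1)


def headroom(scores, required=REQUIRED_SCORE):
    """Points other rules would have to add to reach the threshold."""
    return round(required - total(scores), 1)


print("combined score:", total(HTML_ONLY_SCORES))
print("left before threshold:", headroom(HTML_ONLY_SCORES))
```

Under these assumptions an HTML-only mail starts more than halfway to
the threshold before any other rule has fired.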


/Per Jessen, Zürich


Re: bug 4234 - MIME_HTML_ONLY + MPART_ALT_DIFF both firing on html-only email

Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
> On Fri, 2009-09-18 at 15:10 +0200, Per Jessen wrote:
> > The bug report seems to suggest that this was solved in 3.1.x by
> > significantly reducing the score for MIME_HTML_ONLY and MPART_ALT_DIFF,

On 18.09.09 15:30, Karsten Bräckelmann wrote:
> Comment 5 suggests, the scores have not been manually corrected, but the
> score generation process for 3.1 did -- effectively reflecting the fact
> they did appear in ham at that time.

> > but they seem to have crept back up in 3.2.5 ? 

> The score-generation for 3.2 resulted in a different assessment, with
> higher scores -- based on the corpora, it obviously did not appear
> frequently in ham.
> 
> So, yes, the scores are higher again -- and have been for over 2 years
> now. ;)  However, I wouldn't say "creeping back in", cause it never has
> been manually fixed or adjusted, but always generated.

I think this is exactly the place where the MetaSVM plugin should give
us a good score adjustment... aren't the SA people willing to test it on
the same corpora they are running mass-checks on now?

-- 
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Due to unexpected conditions Windows 2000 will be released
in first quarter of year 1901