Posted to dev@spamassassin.apache.org by Axb <ax...@gmail.com> on 2012/07/04 09:32:01 UTC
high scores on HDRS_LCASE,MANY_HDRS_LCASE > FPs
from last update's 72_scores.cf
score HDRS_LCASE 3.749 3.999 3.749 3.999
score MANY_HDRS_LCASE 1.251 1.004 1.251 1.004
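For readers unfamiliar with the four-column form: each `score` line carries one value per SpamAssassin score set (Bayes off/on combined with network tests off/on), so the second and fourth columns above are the values used when network tests are enabled. A minimal Python sketch that splits such a line, assuming only the plain numeric form; `parse_score_line` and `SCORE_SETS` are hypothetical names for illustration, not part of SpamAssassin:

```python
# Score-set order as described for the `score` option in the
# Mail::SpamAssassin::Conf documentation (sets 0 through 3).
SCORE_SETS = (
    "bayes off, network off",
    "bayes off, network on",
    "bayes on, network off",
    "bayes on, network on",
)


def parse_score_line(line):
    """Parse 'score NAME v0 [v1 v2 v3]' into (name, {score_set: value}).

    Hypothetical helper: handles only plain numeric scores.
    """
    parts = line.split()
    if len(parts) not in (3, 6) or parts[0] != "score":
        raise ValueError(f"not a score line: {line!r}")
    name = parts[1]
    values = [float(v) for v in parts[2:]]
    if len(values) == 1:
        values = values * 4  # a single value applies to all four score sets
    return name, dict(zip(SCORE_SETS, values))


for line in (
    "score HDRS_LCASE 3.749 3.999 3.749 3.999",
    "score MANY_HDRS_LCASE 1.251 1.004 1.251 1.004",
):
    name, scores = parse_score_line(line)
    for score_set, value in scores.items():
        print(f"{name:16} {score_set:24} {value:.3f}")
```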
Although John manually set low scores in the sandbox file, these are
ignored (per design).
Fixed/forced scores should be set via 73_sandbox_manual_scores.cf and
not in sandbox files.
They have comment:
# observed in UCE 9/2009
As they are hitting lots of ham, can we please lose these?
HDRS_LCASE_IMGONLY may be another candidate to be dropped.
Thanks
Axb.
Re: high scores on HDRS_LCASE,MANY_HDRS_LCASE > FPs
Posted by Axb <ax...@gmail.com>.
On 07/04/2012 04:40 PM, John Hardin wrote:
> On Wed, 4 Jul 2012, Axb wrote:
>
>> from last update's 72_scores.cf
>>
>> score HDRS_LCASE 3.749 3.999 3.749 3.999
>> score MANY_HDRS_LCASE 1.251 1.004 1.251 1.004
>>
>> Although John manually set low scores in the sandbox file, these are
>> ignored (per design).
>
> They are _limits_. The generator should not exceed those scores. The
> newly limited scores may take a bit to show up in an update.
I'll watch those scores closely.
>> Fixed/forced scores should be set via 73_sandbox_manual_scores.cf and
>> not in sandbox files.
>>
>> They have comment:
>> # observed in UCE 9/2009
>>
>> As they are hitting lots of ham, can we please lose these?
>>
>> HDRS_LCASE_IMGONLY may be another candidate to be dropped.
>
> Alex, I don't recall if you're running masschecks; if you are, can you
> include such FPs in your ham corpus? The reason they're being scored so
> highly by the rescorer is they do perform well against the masscheck
> corpus.
I am running masschecks, but I see these hits in the maillog for
messages gatewayed through $dayjob's boxes, not on mail stored locally.
I understand that a lot of rules may perform well in masschecks, but
overall, generic patterns should be dropped if real-world traffic
shows they're dangerous.
IMO, we should be able to trust our traffic and judgement more than
masscheck corpora, which may be highly biased.
Axb
Re: high scores on HDRS_LCASE,MANY_HDRS_LCASE > FPs
Posted by John Hardin <jh...@impsec.org>.
On Wed, 4 Jul 2012, Axb wrote:
> from last update's 72_scores.cf
>
> score HDRS_LCASE 3.749 3.999 3.749 3.999
> score MANY_HDRS_LCASE 1.251 1.004 1.251 1.004
>
> Although John manually set low scores in the sandbox file, these are ignored
> (per design).
They are _limits_. The generator should not exceed those scores. The newly
limited scores may take a bit to show up in an update.
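To make the "limits" idea concrete: the point is that a sandbox score acts as a ceiling on what the generator may assign, not as the final value. A minimal Python sketch of that capping behaviour, assuming a simple per-rule maximum; `apply_limits` and the 0.5 limit are hypothetical and this is not the actual rescorer code:

```python
def apply_limits(generated, limits):
    """Cap each generated score at its per-rule limit, if one is set.

    Hypothetical helper: rules without a limit keep their generated score.
    """
    return {
        rule: min(score, limits[rule]) if rule in limits else score
        for rule, score in generated.items()
    }


# Generated values are the first-column scores quoted above.
generated = {"HDRS_LCASE": 3.749, "MANY_HDRS_LCASE": 1.251}
# The limit value here is made up for illustration only.
limits = {"HDRS_LCASE": 0.5}

capped = apply_limits(generated, limits)
print(capped)
```

Under this reading, a rule that performs well in masscheck can still be scored anywhere up to its limit, which is why a limit alone does not pin the published score.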
> Fixed/forced scores should be set via 73_sandbox_manual_scores.cf and not in
> sandbox files.
>
> They have comment:
> # observed in UCE 9/2009
>
> As they are hitting lots of ham, can we please lose these?
>
> HDRS_LCASE_IMGONLY may be another candidate to be dropped.
Alex, I don't recall if you're running masschecks; if you are, can you
include such FPs in your ham corpus? The reason they're being scored so
highly by the rescorer is they do perform well against the
masscheck corpus.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Ignorance is no excuse for a law.
-----------------------------------------------------------------------
Today: the 236th anniversary of the Declaration of Independence