Posted to users@spamassassin.apache.org by Ned Slider <ne...@unixmail.co.uk> on 2011/12/01 15:47:49 UTC

Re: URIBL_PH_SURBL

On 01/12/11 08:29, Tom Kinghorn wrote:
> Good morning list.
>
> could someone possibly explain how the scoring for ph.surbl.org works?
>
> I see the following in my spam logs
>
> spam-1DSMgl4+-YFV.gz: TO_NO_BRKTS_HTML_ONLY=1.258, URIBL_PH_SURBL=0.001]
> spam-1DSMgl4+-YFV.gz: * 0.0 URIBL_PH_SURBL Contains an URL listed in the PH
> SURBL blocklist
>
>
> Why does the ph.surbl.org score so low?
>
> I see the rule is defined as
>
> urirhssub URIBL_PH_SURBL multi.surbl.org. A 8
> body URIBL_PH_SURBL eval:check_uridnsbl('URIBL_PH_SURBL')
> describe URIBL_PH_SURBL Contains an URL listed in the PH SURBL blocklist
> tflags URIBL_PH_SURBL net
> reuse URIBL_PH_SURBL
>
> how does this work?
>
> Thanks
>
> Tom
>

and the score is defined in 50_scores.cf:

score URIBL_PH_SURBL 0 0.001 0 0.610 # n=0 n=2

These 4 scores apply, in order, to: local tests only, net tests, local 
tests with bayes, and net tests with bayes.

Net means you have network tests enabled; local means you don't.

So because you are showing a score of 0.001, you appear to be using the 
"net" score set - network tests enabled but no bayes. If you were using 
"net" and bayes, then this rule would have scored 0.610.
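
To make the selection concrete, here is a minimal Python sketch of how the four columns map onto the two switches. It is not SpamAssassin's own code (SpamAssassin is written in Perl); the function name pick_score and its arguments are made up for illustration, and the values are the ones from the 50_scores.cf line above.

```python
# Illustrative sketch only, not SpamAssassin's implementation.
# The four values on a "score" line are indexed by score set:
#   0 = no Bayes, no net    1 = no Bayes, net
#   2 = Bayes, no net       3 = Bayes + net

def pick_score(scores, net_enabled, bayes_enabled):
    """Return the score that applies for the given configuration."""
    score_set = (2 if bayes_enabled else 0) + (1 if net_enabled else 0)
    return scores[score_set]

# Values from: score URIBL_PH_SURBL 0 0.001 0 0.610
URIBL_PH_SURBL = [0, 0.001, 0, 0.610]

print(pick_score(URIBL_PH_SURBL, net_enabled=True, bayes_enabled=False))  # 0.001
print(pick_score(URIBL_PH_SURBL, net_enabled=True, bayes_enabled=True))   # 0.61
```

The zeros in the first and third columns are what you would expect for a network rule: with network tests disabled the lookup cannot run, so the rule is effectively switched off in those two score sets.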

You can override scores locally in local.cf if you want.

The scores are automatically generated based on nightly masschecks:

http://wiki.apache.org/spamassassin/NightlyMassCheck

This is obviously dependent upon people contributing data for their spam 
and ham.


Re: URIBL_PH_SURBL

Posted by Jeff Chan <je...@surbl.org>.
On Thursday, December 1, 2011, 10:11:35 AM, Darxus Darxus wrote:
> On 12/01, Jeff Chan wrote:
>> Also keep in mind that PH has a generally low score even for net
>> + bayes since it doesn't hit a large portion of spam in the SA
>> corpus.  

> No.  Scores are not determined by how many spams a rule hits.  Scores are
> automatically generated to correctly flag as many spams as possible
> without exceeding 1 false positive in every 2500 hams (with a
> required_score of 5).

> Stated in
> http://svn.apache.org/repos/asf/spamassassin/trunk/rules/50_scores.cf
> (a file you get via sa-update)

> So it's entirely possible to have a rule that hits a very small percentage
> of spam with a very large score.

Thanks for the correction.  I actually knew that but remembered
incorrectly.  :(

Cheers,

Jeff C.
-- 
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/


Re: URIBL_PH_SURBL

Posted by da...@chaosreigns.com.
On 12/01, Jeff Chan wrote:
> Also keep in mind that PH has a generally low score even for net
> + bayes since it doesn't hit a large portion of spam in the SA
> corpus.  

No.  Scores are not determined by how many spams a rule hits.  Scores are
automatically generated to correctly flag as many spams as possible
without exceeding 1 false positive in every 2500 hams (with a
required_score of 5).

Stated in
http://svn.apache.org/repos/asf/spamassassin/trunk/rules/50_scores.cf
(a file you get via sa-update)

So it's entirely possible to have a rule that hits a very small percentage
of spam with a very large score.
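
As a rough illustration of that constraint, the Python sketch below checks one set of scores against a tiny made-up corpus. It is not the actual score generator, which works out all rule scores together; the rule names, scores, and messages are hypothetical, and only the 1-in-2500 limit and the required_score of 5 are taken from the statement above.

```python
# Hypothetical sketch: evaluate candidate scores against a labelled corpus.
REQUIRED_SCORE = 5.0
MAX_FP_RATE = 1 / 2500  # at most 1 false positive per 2500 hams

def evaluate(scores, corpus):
    """corpus is a list of (is_spam, set_of_rule_names_hit).
    Returns (spam_caught, false_positive_rate)."""
    spam_caught = 0
    false_positives = 0
    ham_total = 0
    for is_spam, hits in corpus:
        total = sum(scores.get(rule, 0.0) for rule in hits)
        flagged = total >= REQUIRED_SCORE
        if is_spam:
            spam_caught += flagged
        else:
            ham_total += 1
            false_positives += flagged
    fp_rate = false_positives / ham_total if ham_total else 0.0
    return spam_caught, fp_rate

# Made-up corpus: RARE_RULE hits only 1 spam in 1000 but never hits ham;
# COMMON_RULE hits the other 999 spams and also 10 of the 5000 hams.
corpus = (
    [(True, {"RARE_RULE"})]
    + [(True, {"COMMON_RULE"})] * 999
    + [(False, {"COMMON_RULE"})] * 10
    + [(False, set())] * 4990
)

scores = {"RARE_RULE": 5.0, "COMMON_RULE": 1.0}
caught, fp_rate = evaluate(scores, corpus)
print(caught, fp_rate <= MAX_FP_RATE)  # 1 True
```

In this toy corpus the rare rule can carry the full 5 points because it never fires on ham, whereas giving COMMON_RULE 5 points would flag 10 of the 5000 hams, well over the limit.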

-- 
"This hurts quite a bit. Very painful."
"Think of the sensation as reassurance that you are not dead yet. What
you are feeling is life in you!" - Johnny The Homicidal Maniac
http://www.ChaosReigns.com

Re: URIBL_PH_SURBL

Posted by Jeff Chan <je...@surbl.org>.
Also keep in mind that PH has a generally low score even for net
+ bayes since it doesn't hit a large portion of spam in the SA
corpus.  (In other words phishing and malware unsolicited
messages are a relatively small subset of unsolicited messages in
general.)  However the unsolicited messages it does hit are
generally going to be phishing or malware, so IMO it should have
a much higher score.  Unless people want to get phishing and
malware.... 

Cheers,

Jeff C.
-- 
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/