Posted to dev@spamassassin.apache.org by Joao Gouveia <jo...@anubisnetworks.com> on 2009/11/21 19:41:37 UTC

Strange ham corpus?

(resending this, used a wrong email account ..)

Hi, 

I was checking for FPs in our RBL, and noticed that most of them are
hitting on a ham corpus that doesn't look very hammy to me:

http://ruleqa.spamassassin.org/20091121-r882858-n/T_RCVD_IN_ANBREP_L3?mclog=ham-net-nbebout

The scores are a bit strange (so are the rules being hit). Is this
really supposed to be ham?


-- 
João Gouveia
AnubisNetworks

Re: Strange ham corpus?

Posted by Matt Kettler <mk...@verizon.net>.

Joao Gouveia wrote:
> (resending this, used a wrong email account ..)
>
> Hi, 
>
> I was checking for FPs in our RBL, and noticed that most of them are
> hitting on a ham corpus that doesn't look very hammy to me:
>
> http://ruleqa.spamassassin.org/20091121-r882858-n/T_RCVD_IN_ANBREP_L3?mclog=ham-net-nbebout
>
> The scores are a bit strange (so are the rules being hit). Is this
> really supposed to be ham?
>
I have to admit, this does look like a spam corpus.

Of 77 messages:
62 hit RAZOR2_CF_RANGE_51_100
49 hit URIBL_BLACK
45 hit T_URIBL_META_SURBL_ANY
26 hit RCVD_IN_XBL
25 hit various JM_SOUGHT rules

Given the broad range of fairly reliable spam indicators all matching
heavily on this mail, this is either a spam corpus, or a corpus of email
from "shady" companies that do a lot of spamming but that the corpus
maintainer actually subscribed to.
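
For scale, the tally above can be expressed as a share of the 77 messages.
This is only a rough sketch: the counts are copied from this message, the
"JM_SOUGHT (any)" label is a made-up grouping for the "various JM_SOUGHT
rules" line, and hit_rates is a hypothetical helper, not part of
SpamAssassin or the ruleqa tooling.

```python
# Counts copied from the tally in this message; TOTAL is the corpus size quoted there.
TOTAL = 77
hits = {
    "RAZOR2_CF_RANGE_51_100": 62,
    "URIBL_BLACK": 49,
    "T_URIBL_META_SURBL_ANY": 45,
    "RCVD_IN_XBL": 26,
    "JM_SOUGHT (any)": 25,  # illustrative label for "various JM_SOUGHT rules"
}


def hit_rates(counts, total):
    """Return each rule's hit rate as a percentage of total, rounded to 0.1."""
    return {rule: round(100.0 * n / total, 1) for rule, n in counts.items()}


rates = hit_rates(hits, TOTAL)
for rule, pct in rates.items():
    print(f"{rule:<26} {pct:5.1f}%")
```

On these numbers, roughly four in five messages hit the Razor rule and close
to two in three hit URIBL_BLACK, which would be unusually high for a ham
corpus.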