Posted to users@spamassassin.apache.org by sarahphia <sk...@gmail.com> on 2006/10/11 16:38:25 UTC

Low accuracy ?

Hi All,

I’m having a problem with SpamAssassin - I can’t get it to correctly
identify more than 20% of the spam emails I'm testing.

I downloaded 3.1.7 from spamassassin.org and ran:
perl Makefile.PL
make
make test
make install

I didn’t have Perl installed before, and downloaded only the required Perl
modules.

I wrote a little test in Perl that calls the Mail::SpamAssassin module. It
looks like this:
my $spamtest = Mail::SpamAssassin->new();
my $status = $spamtest->check_message_text($mail);
return $status->is_spam();

$mail is set to the full contents of the email message, including
headers.

I’m testing it on the SpamAssassin public corpus (
http://spamassassin.apache.org/publiccorpus/ ), so I would expect accuracy
to be high.

Am I missing something that’s hurting my accuracy?

-- 
View this message in context: http://www.nabble.com/Low-accuracy---tf2423878.html#a6757729
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.


Re: Low accuracy ?

Posted by Jeff Chan <je...@surbl.org>.
On Wednesday, October 11, 2006, 9:51:17 AM, sarahphia sarahphia wrote:
> I have also downloaded spam from July 2006, and it is being classified
> with less than 20% accuracy.

What rules are being hit?  Can you show us some scores from the
headers?
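
To make that concrete: once a message has been run through SpamAssassin, the X-Spam-Status header carries the verdict, the score, the threshold, and the names of the rules that hit. Below is a minimal sketch in Python for pulling those fields out of a saved message. The function name parse_spam_status and the sample message are hypothetical, made up for illustration; the sketch assumes the usual 3.x header layout (score=, required=, tests=) and assumes an empty rule list is written as tests=none.

```python
import re
from email import message_from_string

# Hypothetical sample message; the header values are invented for illustration.
SAMPLE = (
    "From: sender@example.com\n"
    "Subject: test\n"
    "X-Spam-Status: No, score=2.1 required=5.0 tests=BAYES_50,HTML_MESSAGE,\n"
    "\tMIME_HTML_ONLY autolearn=no version=3.1.7\n"
    "\n"
    "body text\n"
)


def parse_spam_status(raw_message):
    """Return a dict with the X-Spam-Status fields, or None if the header is absent."""
    msg = message_from_string(raw_message)
    header = msg.get("X-Spam-Status")
    if header is None:
        return None

    # Unfold the header: continuation lines become single spaces.
    flat = " ".join(header.split())

    # The verdict ("Yes" or "No") is everything before the first comma.
    verdict = flat.split(",", 1)[0].strip().lower()

    score_m = re.search(r"score=(-?\d+(?:\.\d+)?)", flat)
    required_m = re.search(r"required=(-?\d+(?:\.\d+)?)", flat)
    # The rule list runs from "tests=" up to the next "key=" field (or end of header).
    tests_m = re.search(r"tests=(.*?)(?:\s+\w+=|$)", flat)

    tests = []
    if tests_m:
        for name in tests_m.group(1).split(","):
            name = name.strip()
            if name and name != "none":  # assumption: "none" means no rules hit
                tests.append(name)

    return {
        "is_spam": verdict == "yes",
        "score": float(score_m.group(1)) if score_m else None,
        "required": float(required_m.group(1)) if required_m else None,
        "tests": tests,
    }


if __name__ == "__main__":
    print(parse_spam_status(SAMPLE))
```

Running this over a handful of the misclassified messages and posting the score and rule names is usually enough to see whether network tests or Bayes are contributing at all.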

Does your config have a trust path problem?

  http://wiki.apache.org/spamassassin/TrustedRelays

If the trust path is wrong, then all of your incoming messages
can be mishandled, as if they are coming from your own internal
network. 
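
For reference, the trust path is set in local.cf with the trusted_networks and internal_networks options. A minimal sketch, assuming a single mail relay network; the 192.0.2.0/24 range is a documentation-only placeholder and not taken from anyone's setup in this thread:

```
# local.cf (sketch; replace the placeholder range with your own relays)
# Hosts that relay mail for you and are under your control:
internal_networks 192.0.2.0/24
# Hosts you trust not to forge Received headers:
trusted_networks  192.0.2.0/24
```

If these are left unset, SpamAssassin tries to infer the trust path on its own, which is where the problems described on the wiki page tend to come from.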

Jeff C.
-- 
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/


Re: Low accuracy ?

Posted by sarahphia <sk...@gmail.com>.


Justin Mason wrote:
> 
> 
> worth noting that the public corpus is 3 years old now -- our rules
> are not designed to catch 3 year old spam ;)
> 
> --j.
> 
> 

I have also downloaded spam from July 2006, and it is being classified
with less than 20% accuracy.


Re: Low accuracy ?

Posted by Martin Hepworth <ma...@solidstatelogic.com>.
sarahphia wrote:
> Hi All,
> 
> I’m having a problem with SpamAssassin - I can’t get it to correctly
> identify more than 20% of the spam emails I'm testing.
> 
> I downloaded 3.1.7 from spamassassin.org and ran:
> perl Makefile.PL
> make
> make test
> make install
> 
> I didn’t have Perl installed before, and downloaded only the required Perl
> modules.
> 
> I wrote a little test in Perl that calls the Mail::SpamAssassin module. It
> looks like this:
> my $spamtest = Mail::SpamAssassin->new();
> my $status = $spamtest->check_message_text($mail);
> return $status->is_spam();
> 
> $mail is set to the full contents of the email message, including
> headers.
> 
> I’m testing it on the SpamAssassin public corpus (
> http://spamassassin.apache.org/publiccorpus/ ), so I would expect accuracy
> to be high.
> 
> Am I missing something that’s hurting my accuracy?
> 

Check the extra rules at www.rulesemporium.com, and make sure you're
running the network tests as well (check with spamassassin -D --lint).

-- 
Martin Hepworth
Senior Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300
