Posted to users@spamassassin.apache.org by Fred T <sp...@freddyt.com> on 2006/03/21 20:57:17 UTC

Question about mass-check screen with --progress enabled

Hello users,

  I'm using the mass-check script to check rules against a corpus, and
  I noticed what may be a bug in it.  Before I create a ticket, I want
  to check whether this really is a bug or whether I am reading the
  results wrong.

  I created a screen-shot to help me explain what's going on.
  http://www.i-is.com/mass-check.gif

  The top half is the HAM run and the bottom half is the SPAM run.
   The numbers don't seem to make sense.

   Why does it always say:  status:  ....  ham:  0
   The output of the two runs looks like:

   
   status: starting run stage
   status: 10% ham: 0      spam: 747
   status: 20% ham: 0      spam: 1494
   ..
   ..
   ..
   status: starting run stage
   status: 10% ham: 0      spam: 10001
   status: 20% ham: 0      spam: 20002
   

   I am thinking that those numbers in the first run should be
   appearing in the HAM column but they are going into the spam
   column.



SpamAssassin version 3.2.0-r386260
  running on Perl version 5.8.7

  I'm running mass-check with these parameters:


  perl ./mass-check -p=$PWD --progress -n -j 2 --loghits --mid $netparm $tailparams --mbox ./corpus.ham/*  >ham.log  # mass-check rules
  perl ./mass-check -p=$PWD --progress -n -j 2 --loghits --mid $netparm $tailparams --mbox ./corpus.spam/* >spam.log  


  So this appears to be a minor bug with the --progress switch on
  mass-check.

  
-- 
Best regards,
 Fred                          mailto:spamassassin@freddyt.com


Re: Question about mass-check screen with --progress enabled

Posted by Theo Van Dinter <fe...@apache.org>.
On Tue, Mar 21, 2006 at 02:57:17PM -0500, Fred T wrote:
>   I created a screen-shot to help me explain what's going on.
>   http://www.i-is.com/mass-check.gif

Hrm.  Why are you doing 2 different runs?

>    I am thinking that those numbers in the first run should be
>    appearing in the HAM column but they are going into the spam
>    column.

Nope.

>   perl ./mass-check -p=$PWD --progress -n -j 2 --loghits --mid $netparm $tailparams --mbox ./corpus.ham/*  >ham.log  # mass-check rules
>   perl ./mass-check -p=$PWD --progress -n -j 2 --loghits --mid $netparm $tailparams --mbox ./corpus.spam/* >spam.log  

Well, the first issue is that you are running mass-check twice for some
reason.  Just pass it the ham and spam and it'll do both at once.

As for why things are appearing in the spam column (second issue)...
You don't tell mass-check what type of mail you're passing it (it won't
try to guess from the path), so it assumes everything is spam.  You want
to do something like:

mass-check [...] ham:mbox:corpus.ham spam:mbox:corpus.spam

mass-check will then know that ham should be read from mbox files in the
corpus.ham directory, and spam should be read from mbox files in the
corpus.spam directory.
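
For illustration, the two invocations from the original message could be
folded into one along the lines below.  This is a sketch, not a tested
command: it reuses the switches from the original message, leaves out -p,
$netparm and $tailparams (their values are not shown in this thread),
replaces --mbox with the class:format:location targets described above, and
only prints the command instead of running it.  The variable names ham_dir,
spam_dir and cmd are made up for the example.

```shell
# Hypothetical variable names; the directories are the ones used in this thread.
ham_dir=corpus.ham
spam_dir=corpus.spam

# Each target names its class (ham or spam), its format (mbox) and its
# location, so mass-check does not have to fall back to treating it as spam.
cmd="perl ./mass-check --progress -n -j 2 --loghits --mid ham:mbox:$ham_dir spam:mbox:$spam_dir"

# Print the command for inspection rather than executing it.
echo "$cmd"
```

With a class given on every target, the ham count in the --progress output
should no longer stay at 0 for the ham corpus.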

Take a look at the files in contrib/ -- those are what I use for development
and the nightly/weekly runs.  Hopefully they'll help clear up any confusion. :)

>   So this appears to be a minor bug with the --progress switch on
>   mass-check.

nope. :)

-- 
Randomly Generated Tagline:
Buffer: A nude hacker.