Posted to commits@spamassassin.apache.org by sp...@incubator.apache.org on 2004/07/13 08:28:37 UTC

[SpamAssassin Wiki] New: RescoreSet01Details

   Date: 2004-07-12T23:28:37
   Editor: JustinMason <jm...@jmason.org>
   Wiki: SpamAssassin Wiki
   Page: RescoreSet01Details
   URL: http://wiki.apache.org/spamassassin/RescoreSet01Details

   mass-check instructions for set0/1

New Page:

= Rescore Mass-checks for Set 0 and Set 1 =

''(THIS IS ONLY A DRAFT RIGHT NOW)''

The mass-check runs for 3.0.0 will be starting shortly.  Here's the procedure
you'll need to follow, if you wish to submit rescoring data for the GA
run:

First, send mail to <submit.at.spamassassin.org>, and ask for a GA submission
account if you haven't already got one.

If you're running nightly mass-checks, you can turn them off for now;
they aren't important while this is going on.

Then run these commands:

{{{
  wget http://SpamAssassin.apache.org/released/Mail-SpamAssassin-3.0.0-pre2.tar.gz
  tar xvfz Mail-SpamAssassin-3.0.0-pre2.tar.gz
  cd Mail-SpamAssassin-3.0.0
  perl Makefile.PL < /dev/null; make

  cd masses
  # -p and -f keep these steps from failing on a fresh or a reused tree
  mkdir -p spamassassin
  rm -f spamassassin/bayes*
  echo "use_bayes 0" > spamassassin/user_prefs
  echo "bayes_auto_learn 0" >> spamassassin/user_prefs
  rm -f ham.log spam.log

  ./mass-check --net -j 4 --all <targets>
}}}

{{{<targets>}}} is the list of directories, mboxes, etc., like
{{{spam:dir:~/Mail/spam}}}.  See the comments at the top of "mass-check" for
details.
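
As a rough illustration of a target list, here is a minimal sketch that
assumes one directory of spam and one mbox of ham.  The two paths and the
variable names are hypothetical, and the {{{mbox}}} format name should be
checked against the comments at the top of "mass-check".  The command is
only printed so it can be reviewed first:

```shell
# Hypothetical corpus locations; substitute your own.
SPAM_DIR="$HOME/Mail/spam"
HAM_MBOX="$HOME/Mail/ham.mbox"

# Each target has the form class:format:location.
TARGETS="spam:dir:$SPAM_DIR ham:mbox:$HAM_MBOX"

# Print the command for review; drop the leading "echo" to run it
# from the masses directory.
echo ./mass-check --net -j 4 --all $TARGETS
```

{{{$TARGETS}}} is left unquoted on purpose so that each target is passed as a
separate argument; this breaks if a path contains spaces.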

This takes *ages* to run.  If you see "out of memory" errors, you may want
to try adding the {{{--restart}}} option.

{{{-j 4}}} controls the number of processes to use; 4 should be OK for a
single-processor machine, since most of the time they'll be waiting for
network results to arrive.

If you have an unusual network layout, you may need to specify
{{{trusted_networks}}} in the {{{spamassassin/user_prefs}}} file, although SA
should be able to infer it in most cases.
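
If you do need to set it, something along these lines could be run from the
{{{masses}}} directory after the {{{user_prefs}}} file has been created and
before starting mass-check.  The address range is only a placeholder for
your own internal relays:

```shell
# Assumption: 192.168.0.0/16 stands in for your own trusted relay range.
mkdir -p spamassassin
echo "trusted_networks 192.168.0.0/16" >> spamassassin/user_prefs
```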

Once it finishes:

{{{
  USER="[whatever your username is]"
  RSYNC_PASSWORD="[whatever your password is]"
  export RSYNC_PASSWORD

  rsync -CPcvuzb ham.log $USER@rsync.spamassassin.org::submit/ham-nobayes-net-$USER.log
  rsync -CPcvuzb spam.log $USER@rsync.spamassassin.org::submit/spam-nobayes-net-$USER.log
}}}
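
Before uploading, it may be worth checking that both logs contain results.
This sketch assumes each log holds one line per message plus comment lines
starting with "#"; {{{count_results}}} is a hypothetical helper, not part of
mass-check:

```shell
# Count the non-comment lines in a mass-check log.
# Assumption: one result line per message, comments start with "#".
count_results() {
  grep -vc '^#' "$1"
}

for f in ham.log spam.log; do
  if [ -s "$f" ]; then
    printf '%s: %s messages\n' "$f" "$(count_results "$f")"
  else
    echo "$f is missing or empty" >&2
  fi
done
```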

That's it!  The bayes+nonet and bayes+net runs will follow later on.

The results for this run will need to be in by Monday July 19th.