Posted to commits@spamassassin.apache.org by Apache Wiki <wi...@apache.org> on 2007/01/18 15:19:19 UTC

[Spamassassin Wiki] Update of "RescoreMassCheck" by JustinMason

The following page has been changed by JustinMason:
http://wiki.apache.org/spamassassin/RescoreMassCheck

The comment on the change is:
update for 3.2.0

------------------------------------------------------------------------------
  = Rescore Mass-Check =
+ 
+ '''(see RescoreMassCheck310 for the 3.1.x historical page)'''
  
  This is the procedure we use to generate new scores.  It takes quite a while and is labour-intensive, so we do it infrequently.
  
@@ -21, +23 @@

  
  = Procedure =
  
- Here's the process for generating the scores as of SpamAssassin 3.1.0:
+ Here's the process for generating the scores as of SpamAssassin 3.2.0:
  
  == 1. heads-up ==
  
@@ -45, +47 @@

  {{{
  To: users
  Cc: dev
- Subject: NOTICE: 3.1.0 rescoring mass-checks
+ Subject: NOTICE: 3.2.0 rescoring mass-checks
  
- OK, if you're planning to send us mass-check logs for the 3.1.0
+ OK, if you're planning to send us mass-check logs for the
- rescoring, now's the time!
+ 3.2.0 rescoring, now's the time!
  
- http://wiki.apache.org/spamassassin/RescoreDetails has all the
+ http://wiki.apache.org/spamassassin/RescoreDetails has all
- details.
+ the details.
  
  cheers!
  
  --j.
  }}}
- 
- We then take the log files rsync'd up to the server, and use those logs for all 4 score sets.  The initial logs are for score set 3 (the fourth), sets 0, 1, and 2 can be generated from set 4 by stripping out the network tests and/or the Bayes tests.
  
  == 3. allow several days to complete (it takes a really long time!) ==
  
@@ -77, +77 @@

  
  That ensures that the data isn't going to change under your feet.
  
+ We then take the log files rsync'd up to the server and use those logs for all 4 score sets.  The initial logs are for score set 3 (the fourth); sets 0, 1, and 2 can be generated from set 3 by stripping out the network tests and/or the Bayes tests.
+ 
+ (TODO: add a filtering step to remove "too-old" spam from the logs!)
+ 
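As a rough illustration of the stripping step described above, here is a minimal Python sketch. It is not the tooling the project actually uses. It assumes each log line has the shape "flag score path RULE,RULE,... key=value,...", that Bayes rules can be recognised by a BAYES_ name prefix, and that the caller supplies the set of network rule names (for example, collected from rules marked `tflags net`). The names SCORE_SETS, derive_line and net_rules are hypothetical.

```python
# Hypothetical sketch: derive a score-set 0/1/2 log line from a set-3 line.
# Assumed line format (an assumption, not taken from the page):
#   <flag> <score> <path> <RULE,RULE,...> [key=value,...]

# score set -> (drop network rules?, drop Bayes rules?)
SCORE_SETS = {
    0: (True, True),    # no network tests, no Bayes
    1: (False, True),   # network tests, no Bayes
    2: (True, False),   # Bayes, no network tests
    3: (False, False),  # both: the logs as submitted
}


def derive_line(line, score_set, net_rules):
    """Return `line` with the rules not used by `score_set` removed."""
    drop_net, drop_bayes = SCORE_SETS[score_set]

    # Pass comments, blank lines and unrecognised lines through unchanged.
    if line.startswith("#") or not line.strip():
        return line
    parts = line.split()
    if len(parts) < 4:
        return line

    kept = [
        rule
        for rule in parts[3].split(",")
        if not (drop_net and rule in net_rules)
        and not (drop_bayes and rule.startswith("BAYES_"))
    ]
    # Note: the score column is left as-is, and a line whose rules are all
    # stripped ends up with an empty rule field; real tooling may differ.
    parts[3] = ",".join(kept)
    return " ".join(parts) + "\n"


if __name__ == "__main__":
    sample = "Y 12 /corpus/spam/1 BAYES_99,RCVD_IN_XBL,HTML_MESSAGE time=1169000000\n"
    for n in range(4):
        print(n, derive_line(sample, n, {"RCVD_IN_XBL"}), end="")
```

Passing the network rule names in explicitly keeps the sketch independent of how those rules are identified in a given checkout.
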
  == 5. generate scores for score sets ==
  
  See RunningPerceptron.
@@ -94, +98 @@

  
  == 6. upload the test logs to zone ==
  
- Since stuff like the STATISTICS cannot ever be regenerated without the (randomised) test logs, these need to be saved, too.   Currently, I think the best bet is to upload the {{{rescore-logs.tgz}}} file somewhere on spamassassin.zones.apache.org; it doesn't have to be in a public place, ASF-committer-account-required is fine.
+ Since files like STATISTICS can never be regenerated without the (randomised) test logs, these need to be saved, too.  Currently, I think the best bet is to upload the {{{rescore-logs.tgz}}} file somewhere on spamassassin.zones.apache.org; it doesn't have to be in a public place, and a location that requires an ASF committer account is fine.  Just mention that path in the rescoring bug's comments.
  
  == 7. upload proposed new scores ==