Posted to commits@spamassassin.apache.org by Apache Wiki <wi...@apache.org> on 2007/01/18 15:19:19 UTC
[Spamassassin Wiki] Update of "RescoreMassCheck" by JustinMason
The following page has been changed by JustinMason:
http://wiki.apache.org/spamassassin/RescoreMassCheck
The comment on the change is:
update for 3.2.0
------------------------------------------------------------------------------
= Rescore Mass-Check =
+
+ '''(see RescoreMassCheck310 for the 3.1.x historical page)'''
This is the procedure we use to generate new scores. It takes quite a while and is labour-intensive, so we do it infrequently.
@@ -21, +23 @@
= Procedure =
- Here's the process for generating the scores as of SpamAssassin 3.1.0:
+ Here's the process for generating the scores as of SpamAssassin 3.2.0:
== 1. heads-up ==
@@ -45, +47 @@
{{{
To: users
Cc: dev
- Subject: NOTICE: 3.1.0 rescoring mass-checks
+ Subject: NOTICE: 3.2.0 rescoring mass-checks
- OK, if you're planning to send us mass-check logs for the 3.1.0
+ OK, if you're planning to send us mass-check logs for the
- rescoring, now's the time!
+ 3.2.0 rescoring, now's the time!
- http://wiki.apache.org/spamassassin/RescoreDetails has all the
+ http://wiki.apache.org/spamassassin/RescoreDetails has all
- details.
+ the details.
cheers!
--j.
}}}
-
- We then take the log files rsync'd up to the server, and use those logs for all 4 score sets. The initial logs are for score set 3 (the fourth); sets 0, 1, and 2 can be generated from set 3 by stripping out the network tests and/or the Bayes tests.
== 3. allow several days to complete (it takes a really long time!) ==
@@ -77, +77 @@
That ensures that the data isn't going to change under your feet.
+ We then take the log files rsync'd up to the server, and use those logs for all 4 score sets. The initial logs are for score set 3 (the fourth); sets 0, 1, and 2 can be generated from set 3 by stripping out the network tests and/or the Bayes tests.
+
+ (TODO: add a filtering step to remove "too-old" spam from the logs!)
+
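As a rough illustration of the stripping described above, here is a minimal Python sketch. It is not the project's actual tooling. It assumes each log line has whitespace-separated fields with the comma-separated rule hits in the fourth field, that Bayes rules are the ones whose names start with BAYES_, and that the caller supplies the names of the network rules. The function name `strip_for_score_set` and the `net_rules` parameter are hypothetical.

```python
# Score-set numbering used by SpamAssassin:
#   0 = no network tests, no Bayes    1 = network tests, no Bayes
#   2 = no network tests, Bayes       3 = network tests and Bayes


def strip_for_score_set(line, score_set, net_rules):
    """Derive a log line for score_set from a set-3 log line.

    ASSUMPTION: fields are whitespace-separated and the fourth field
    holds the comma-separated rule hits. net_rules is a caller-supplied
    collection of network rule names (hypothetical parameter).
    The result is re-joined with single spaces and has no trailing newline.
    """
    if score_set not in (0, 1, 2, 3):
        raise ValueError("score_set must be 0, 1, 2 or 3")

    # Pass comments and blank lines through untouched.
    if line.startswith("#") or not line.strip():
        return line

    fields = line.split()
    if len(fields) < 4:
        return line  # not in the assumed layout, leave it alone

    drop_net = score_set in (0, 2)    # sets without network tests
    drop_bayes = score_set in (0, 1)  # sets without Bayes

    kept = [
        rule
        for rule in fields[3].split(",")
        if rule
        and not (drop_net and rule in net_rules)
        and not (drop_bayes and rule.startswith("BAYES_"))
    ]
    # The other fields (including the score) are left as they were.
    fields[3] = ",".join(kept)
    return " ".join(fields)


if __name__ == "__main__":
    sample = "Y 12 /corpus/spam/1 BAYES_99,RCVD_IN_XBL,HTML_MESSAGE time=1169130000"
    for n in (0, 1, 2, 3):
        print(n, strip_for_score_set(sample, n, {"RCVD_IN_XBL"}))
```

Applying this to every line of a set-3 log, once per target set, would give one derived log per score set. In practice the list of network rules would have to come from the rule definitions rather than being hard-coded.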
== 5. generate scores for score sets ==
See RunningPerceptron.
@@ -94, +98 @@
== 6. upload the test logs to zone ==
- Since stuff like the STATISTICS cannot ever be regenerated without the (randomised) test logs, these need to be saved, too. Currently, I think the best bet is to upload the {{{rescore-logs.tgz}}} file somewhere on spamassassin.zones.apache.org. It doesn't have to be in a public place; requiring an ASF committer account is fine.
+ Since stuff like the STATISTICS cannot ever be regenerated without the (randomised) test logs, these need to be saved, too. Currently, I think the best bet is to upload the {{{rescore-logs.tgz}}} file somewhere on spamassassin.zones.apache.org. It doesn't have to be in a public place; requiring an ASF committer account is fine. Just mention that path in the rescoring bug's comments.
== 7. upload proposed new scores ==