Posted to commits@spamassassin.apache.org by Apache Wiki <wi...@apache.org> on 2009/08/14 22:31:06 UTC
[Spamassassin Wiki] Update of "RescoreMassCheck" by JustinMason
The following page has been changed by JustinMason:
http://wiki.apache.org/spamassassin/RescoreMassCheck
The comment on the change is:
update for 3.3.0
------------------------------------------------------------------------------
= Rescore Mass-Check =
- '''(see RescoreMassCheck310 for the 3.1.x historical page)'''
+ '''(see RescoreMassCheck310 or RescoreMasscheck320 for historical releases)'''
This is the procedure we use to generate new scores. It takes quite a while and is labour-intensive, so we do it infrequently.
@@ -23, +23 @@
= Procedure =
- Here's the process for generating the scores as of SpamAssassin 3.2.0:
+ Here's the process for generating the scores as of SpamAssassin 3.3.0:
== 1. heads-up ==
@@ -50, +50 @@
{{{
ssh spamassassin.zones.apache.org
cd /home/corpus-rsync
- OLDVERSION="3.1"
+ OLDVERSION="3.2"
sudo mv corpus/submit scoregen-$OLDVERSION
sudo mkdir corpus/submit
sudo chown rsync corpus/submit
@@ -67, +67 @@
svn cp \
https://svn.apache.org/repos/asf/spamassassin/trunk \
- https://svn.apache.org/repos/asf/spamassassin/tags/3_2_0_mcsnapshot_1
+ https://svn.apache.org/repos/asf/spamassassin/tags/3_3_0_mcsnapshot_1
}}}
(We can't use the standard build process here any more, since the dist tarball no longer includes "masses". Use a descriptive, unique tag name.)
== 2. announce mass-check ==
- RescoreDetails is the full announcement text (and instructions) for this phase. It's sufficient just to send out a mail something like the one we used in 3.1.0:
+ RescoreDetails is the full announcement text (and instructions) for this phase. It's sufficient just to send out a mail something like the one we used in previous releases:
{{{
To: users
Cc: dev
- Subject: NOTICE: 3.2.0 rescoring mass-checks
+ Subject: NOTICE: 3.3.0 rescoring mass-checks
OK, if you're planning to send us mass-check logs for the
- 3.2.0 rescoring, now's the time!
+ 3.3.0 rescoring, now's the time!
http://wiki.apache.org/spamassassin/RescoreDetails has all
the details.
@@ -122, +122 @@
./log-grep-recent -m 6 /home/corpus-rsync/corpus/submit/spam-*.log > spam-full.log
}}}
- We may have to tweak the number of months specified for each type, if there's too much or too little mail resulting from the grep. but 38 months / 6 months worked well for 3.2.0.
+ We may have to tweak the number of months specified for each type, if there's too much or too little mail resulting from the grep. but 38 months / 6 months worked well for 3.3.0.
== 4.2 tweak rules for evolver ==
- Go through the rulesrc dir, comment out all "score" lines except
- for rules that you think the scores are accurate like carefully-vetted net rules, or 0.001 informational rules.
+ Go through the rulesrc dir, comment out all "score" lines except for rules that you think the scores are accurate like carefully-vetted net rules, or 0.001 informational rules.
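As a rough sketch of step 4.2, something like the following could do the bulk of the commenting-out. It assumes a sed that supports `-E` and `-i.bak` (GNU and BSD sed both do), that rule files end in `.cf`, and that a score line starts with the keyword `score`. The function name `comment_out_scores` and the "keep anything mentioning 0.001" heuristic are illustrative assumptions, not part of the SpamAssassin tooling. Carefully-vetted net rules still have to be un-commented by hand afterwards.

```shell
# Hypothetical helper: comment out every "score" line under a rules
# directory, except lines that carry a 0.001 (informational) score.
# A .bak copy of each edited file is left next to it.
comment_out_scores() {
  dir=$1
  find "$dir" -type f -name '*.cf' -exec \
    sed -E -i.bak \
      '/^[[:space:]]*score[[:space:]]/{ /[[:space:]]0\.001([[:space:]]|$)/!s/^/# /; }' \
      {} +
}

# Example (run from the top of the checkout):
#   comment_out_scores rulesrc
```

The heuristic is deliberately crude: a line is kept if any of its score fields is exactly 0.001, so review the result with `svn diff` before running the evolver.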
== 4.3 resync to mcsnapshot rules list ==
@@ -145, +144 @@
{{{
cd /path/to/checkout/of/trunk
svn co \
- https://svn.apache.org/repos/asf/spamassassin/tags/3_2_0_mcsnapshot_1/rules \
+ https://svn.apache.org/repos/asf/spamassassin/tags/3_3_0_mcsnapshot_1/rules \
rules-mcsnapshot
cp rules-mcsnapshot/active.list rules/active.list
make
@@ -159, +158 @@
== 5. generate scores for score sets ==
- See RunningGa. (in the past we used RunningPerceptron, but it acted up during 3.2.0 generation, so we used the GA again.)
+ See RunningGa. (in the past we used RunningPerceptron, but it acted up during 3.3.0 generation, so we used the GA again.)
Once this is complete, rules/50_scores.cf will have the generated scores, created by runGA. (TODO: I think.)
@@ -185, +184 @@
Since stuff like the STATISTICS cannot ever be regenerated without the (randomised) test logs, these need to be saved, too. Currently, I think the best bet is to upload the {{{rescore-logs.tgz}}} file somewhere on spamassassin.zones.apache.org; it doesn't have to be in a public place, ASF-committer-account-required is fine. Just mention that path in the rescoring bug's comments. Last time, I did this:
{{{
- sudo mkdir /home/corpus-rsync/ARCHIVE/3.2.0
+ sudo mkdir /home/corpus-rsync/ARCHIVE/3.3.0
- sudo mv rescore-logs.tgz /home/corpus-rsync/ARCHIVE/3.2.0/rescore-logs-bug5270.tgz
+ sudo mv rescore-logs.tgz /home/corpus-rsync/ARCHIVE/3.3.0/rescore-logs-bug6155.tgz
}}}
== 6.5. mark evolved-score rules as 'always published' ==