Posted to users@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2007/01/25 18:57:18 UTC
NOTICE: 3.2.0 rescoring mass-checks
hi all --
OK, if you're planning to send us mass-check logs for the 3.2.0 rescoring,
now's the time!
http://wiki.apache.org/spamassassin/RescoreDetails has all the details.
Note that the deadline for result submission is Tuesday, Feb 6 as
described at http://wiki.apache.org/spamassassin/Release320Schedule .
cheers!
--j.
Re: NOTICE: 3.2.0 rescoring mass-checks
Posted by Doc Schneider <ma...@maddoc.net>.
Fred Tarasevicius wrote:
> Hello Justin,
>
> Thursday, January 25, 2007, 12:57:18 PM, you wrote:
>
>> hi all --
>
>> OK, if you're planning to send us mass-check logs for the 3.2.0 rescoring,
>> now's the time!
>
> OK, so we can start running the tests now? To ensure I am correct at
> how to go about this, we just svn update the latest release, start the
> mass-checks as outlined on the wiki page and send away when we are
> done?
>
Nope, you need to go to the wiki page he mentioned. There is a custom tarball
for mass-checking.
See:
http://wiki.apache.org/spamassassin/RescoreDetails
--
-Doc
Penguins: Do it on the ice.
1:04pm up 11 days, 22:02, 15 users, load average: 0.34, 0.50, 0.56
SARE HQ http://www.rulesemporium.com/
Re: NOTICE: 3.2.0 rescoring mass-checks
Posted by Fred Tarasevicius <te...@i-is.com>.
Hello Justin,
Thursday, January 25, 2007, 12:57:18 PM, you wrote:
> hi all --
> OK, if you're planning to send us mass-check logs for the 3.2.0 rescoring,
> now's the time!
OK, so we can start running the tests now? To make sure I have this
right: we just svn update to the latest release, start the
mass-checks as outlined on the wiki page, and send the logs in when we
are done?
--
Best regards,
Fred mailto:tech2@i-is.com
Re: NOTICE: 3.2.0 rescoring mass-checks
Posted by Matthias Leisi <ma...@leisi.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi all,
My mass-check run shows a considerable number of bayes/locking errors:
| bayes: cannot open bayes databases
| /opt/masscheck-3.2.0/mcsnapshot/masses/spamassassin/bayes_* R/W: lock
| failed: Interrupted system call
nohup.out has 118 such entries for 17'912 mails done. Is this something
I should be worried about or even worthy of opening a bug?
It's a bit hard for me to diagnose this in more detail, as I don't want
to break the actual mass-check run. I'm running it with
| nohup ./mass-check --progress --bayes --net -j 4 --restart=400 \
| --learn=35 --reuse --after=1072933200 \
| spam:dir:...
- -- Matthias
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org
iD8DBQFFump+xbHw2nyi/okRAlE1AJ9XeguMdklC2JgjE8NGkTM/g+e9xQCgiKs6
Ow+hx/2QnybFWIxWFmjw8fk=
=4SPo
-----END PGP SIGNATURE-----
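For a rough sense of scale, the figures above can be turned into an error rate. This is a minimal shell sketch, assuming each failure shows up in the log as one line containing "lock failed" (the string from the error quoted above). The function name is hypothetical, and the log file and message total are passed in by hand.

```shell
#!/bin/sh
# Count bayes lock failures in a mass-check log and report them as a
# percentage of the messages processed so far.
# Assumptions: $1 is the log file, $2 is the number of messages done;
# the match string comes from the error text quoted in the message.
lock_error_rate() {
    log=$1
    total=$2
    # grep -c prints the number of matching lines (0 if none)
    errors=$(grep -c 'lock failed' "$log")
    awk -v e="$errors" -v t="$total" \
        'BEGIN { printf "%d of %d (%.2f%%)\n", e, t, 100 * e / t }'
}
```

Called as `lock_error_rate nohup.out 17912`, the 118 entries reported above work out to roughly 0.66% of the mails processed.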
Re: NOTICE: 3.2.0 rescoring mass-checks
Posted by Doc Schneider <ma...@maddoc.net>.
Matthias Leisi wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi all,
>
> As you may know, I'm running the dnswl.org project. Thanks to the rules
> in Theo's sandbox [1], the mass-checks will query dnswl.org.
>
> In order to estimate the performance / bandwidth impact on the server
> side, I would ask you to provide me with the name / IP address of the
> DNS servers you use to run the mass-checks and roughly the date/time
> (incl. timezone) when you started the mass-check.
>
> Only one of the servers is writing detailed logs, and this will not
> influence the actual mass-check results, but it is an opportunity to
> assess the SpamAssassin-related impact.
>
> Thanks for your help,
> - -- Matthias
>
> [1]
> http://svn.apache.org/viewvc/spamassassin/rules/trunk/sandbox/felicity/70_dnswl.cf?view=markup
>
I'm using a machine in the 64.21.208.208/28 netblock. I haven't yet
decided which one I'll be using... am still trying to sort through my
500k spam.
Just curious if this is a new network test and list? If so, you might
see about finding some more DNS mirrors to host it. (Yeah I might be
interested in doing this, contact me privately)
--
-Doc
SA/SARE -- Ninja
10:08am up 12 days, 19:06, 15 users, load average: 0.41, 1.74, 1.52
SARE HQ http://www.rulesemporium.com/
Re: NOTICE: 3.2.0 rescoring mass-checks
Posted by Matthias Leisi <ma...@leisi.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi all,
As you may know, I'm running the dnswl.org project. Thanks to the rules
in Theo's sandbox [1], the mass-checks will query dnswl.org.
In order to estimate the performance / bandwidth impact on the server
side, I would ask you to provide me with the name / IP address of the
DNS servers you use to run the mass-checks and roughly the date/time
(incl. timezone) when you started the mass-check.
Only one of the servers is writing detailed logs, and this will not
influence the actual mass-check results, but it is an opportunity to
assess the SpamAssassin-related impact.
Thanks for your help,
- -- Matthias
[1]
http://svn.apache.org/viewvc/spamassassin/rules/trunk/sandbox/felicity/70_dnswl.cf?view=markup
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org
iD8DBQFFug+QxbHw2nyi/okRAgL3AJ4nbLi65IIMja5GTZZXG8DkTeDSbwCgpu4E
6/OmkA0qHu1p5n22hT6TrYE=
=ru0Y
-----END PGP SIGNATURE-----
Re: NOTICE: 3.2.0 rescoring mass-checks
Posted by Michael Parker <pa...@pobox.com>.
Daryl C. W. O'Shea wrote:
> Justin Mason wrote:
>> hi all --
>>
>> OK, if you're planning to send us mass-check logs for the 3.2.0
>> rescoring,
>> now's the time!
>>
>> http://wiki.apache.org/spamassassin/RescoreDetails has all the details.
>
> Why do the instructions have bayes auto learning and AWL turned off?
>
AWL isn't scored, so no need to slow things down with the DB interaction.
The --learn=35 option does random bayes learning on 35% of the messages,
which simulates human learning, so auto-learn is turned off.
Michael
> echo "bayes_auto_learn 0" > spamassassin/user_prefs
> echo "lock_method flock" >> spamassassin/user_prefs
> echo "bayes_store_module Mail::SpamAssassin::BayesStore::SDBM" >> spamassassin/user_prefs
> echo "use_auto_whitelist 0" >> spamassassin/user_prefs
> echo "whitelist_bounce_relays example.com" >> spamassassin/user_prefs
>
>
> Daryl
>
Re: NOTICE: 3.2.0 rescoring mass-checks
Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
Never mind... it looks like my Meridian phone system isn't the only thing
that lost its memory today.
Daryl C. W. O'Shea wrote:
> Justin Mason wrote:
>> hi all --
>>
>> OK, if you're planning to send us mass-check logs for the 3.2.0
>> rescoring,
>> now's the time!
>>
>> http://wiki.apache.org/spamassassin/RescoreDetails has all the details.
>
> Why do the instructions have bayes auto learning and AWL turned off?
>
> echo "bayes_auto_learn 0" > spamassassin/user_prefs
> echo "lock_method flock" >> spamassassin/user_prefs
> echo "bayes_store_module Mail::SpamAssassin::BayesStore::SDBM" >> spamassassin/user_prefs
> echo "use_auto_whitelist 0" >> spamassassin/user_prefs
> echo "whitelist_bounce_relays example.com" >> spamassassin/user_prefs
>
>
> Daryl
>
RE: NOTICE: 3.2.0 rescoring mass-checks
Posted by Giampaolo Tomassoni <g....@libero.it>.
From: Theo Van Dinter [mailto:felicity@apache.org]
>
> On Mon, Jan 29, 2007 at 10:32:43PM +0100, Giampaolo Tomassoni wrote:
> > > Why do the instructions have bayes auto learning and AWL turned off?
> >
> > I guess because mass-check logs must be based on an absolute
> basis: two copies of the very same e-mail checked at beginning
> and at end of the list shall score the same. This wouldn't hold
> with AWL and bayes auto-learning.
>
> Messages aren't going to score the same at the beginning and end
> with Bayes.
> The idea is that you *want* to learn from mails as they go through.
>
> The reasons are:
>
> a) AWL is meaningless for score runs, so don't bother.
> b) "mass-check --learn" forces autolearning in mass-check on a percentage
>    basis, versus the normal autolearn system which is just based on
>    scores -- which aren't set yet.
Ok, I guess I should try this tool. At least, it would avoid a lot of "guessing" from me... :)
Thanks,
Giampaolo
>
> --
> Randomly Selected Tagline:
> The following two statements are usually both true:
> There's not enough documentation.
> There's too much documentation.
> -- Larry Wall in <19...@wall.org>
>
Re: NOTICE: 3.2.0 rescoring mass-checks
Posted by Theo Van Dinter <fe...@apache.org>.
On Mon, Jan 29, 2007 at 10:32:43PM +0100, Giampaolo Tomassoni wrote:
> > Why do the instructions have bayes auto learning and AWL turned off?
>
> I guess because mass-check logs must be based on an absolute basis: two copies of the very same e-mail checked at beginning and at end of the list shall score the same. This wouldn't hold with AWL and bayes auto-learning.
Messages aren't going to score the same at the beginning and end with Bayes.
The idea is that you *want* to learn from mails as they go through.
The reasons are:
a) AWL is meaningless for score runs, so don't bother.
b) "mass-check --learn" forces autolearning in mass-check on a percentage
basis, versus the normal autolearn system which is just based on
scores -- which aren't set yet.
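To make the difference concrete: score-based autolearn needs final scores, while a percentage-based scheme only needs a random draw per message. Here is a rough shell sketch of that second idea. It is not mass-check's actual code. The function name, the seed argument, and the one-message-per-line input are all assumptions for illustration.

```shell
#!/bin/sh
# Pick roughly PCT percent of the input lines at random, the way a
# percentage-based learner might choose which messages to train on.
# Hypothetical helper, not part of mass-check.
#   $1 = percentage (0-100), $2 = seed for the random generator
pick_for_learning() {
    pct=$1
    seed=$2
    # rand() is uniform in [0,1), so a line passes with probability pct/100
    awk -v pct="$pct" -v seed="$seed" \
        'BEGIN { srand(seed) } rand() * 100 < pct'
}
```

For example, `ls corpus/ | pick_for_learning 35 1` would print about a third of the file names (the corpus/ directory is a placeholder). No score is consulted anywhere, which is the point: the selection works before any scores exist.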
--
Randomly Selected Tagline:
The following two statements are usually both true:
There's not enough documentation.
There's too much documentation.
-- Larry Wall in <19...@wall.org>
RE: NOTICE: 3.2.0 rescoring mass-checks
Posted by Giampaolo Tomassoni <g....@libero.it>.
From: Daryl C. W. O'Shea [mailto:spamassassin@dostech.ca]
>
> Justin Mason wrote:
> > hi all --
> >
> > OK, if you're planning to send us mass-check logs for the 3.2.0
> rescoring,
> > now's the time!
> >
> > http://wiki.apache.org/spamassassin/RescoreDetails has all the details.
>
> Why do the instructions have bayes auto learning and AWL turned off?
I guess it's because mass-check logs must be on an absolute basis: two copies of the very same e-mail checked at the beginning and at the end of the list should score the same. This wouldn't hold with AWL and bayes auto-learning.
Cheers,
Giampaolo
>
> echo "bayes_auto_learn 0" > spamassassin/user_prefs
> echo "lock_method flock" >> spamassassin/user_prefs
> echo "bayes_store_module Mail::SpamAssassin::BayesStore::SDBM" >> spamassassin/user_prefs
> echo "use_auto_whitelist 0" >> spamassassin/user_prefs
> echo "whitelist_bounce_relays example.com" >> spamassassin/user_prefs
>
>
> Daryl
Re: NOTICE: 3.2.0 rescoring mass-checks
Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
Justin Mason wrote:
> hi all --
>
> OK, if you're planning to send us mass-check logs for the 3.2.0 rescoring,
> now's the time!
>
> http://wiki.apache.org/spamassassin/RescoreDetails has all the details.
Why do the instructions have bayes auto learning and AWL turned off?
echo "bayes_auto_learn 0" > spamassassin/user_prefs
echo "lock_method flock" >> spamassassin/user_prefs
echo "bayes_store_module Mail::SpamAssassin::BayesStore::SDBM" >> spamassassin/user_prefs
echo "use_auto_whitelist 0" >> spamassassin/user_prefs
echo "whitelist_bounce_relays example.com" >> spamassassin/user_prefs
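For reference, the same five settings can be written in one step with a here-document. This is only a sketch: the function name is hypothetical, the default path assumes you are in the directory that contains spamassassin/, and the mkdir is added so it also works where that directory does not exist yet.

```shell
#!/bin/sh
# Write the mass-check user_prefs in one step instead of five echo calls.
# Hypothetical helper; $1 optionally overrides the prefs file location.
write_user_prefs() {
    prefs=${1:-spamassassin/user_prefs}
    mkdir -p "$(dirname "$prefs")"
    # Quoted 'EOF' keeps the shell from expanding anything in the body
    cat > "$prefs" <<'EOF'
bayes_auto_learn 0
lock_method flock
bayes_store_module Mail::SpamAssassin::BayesStore::SDBM
use_auto_whitelist 0
whitelist_bounce_relays example.com
EOF
}
```

Calling `write_user_prefs` with no argument overwrites spamassassin/user_prefs, just as the first `>` redirect in the echo version does.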
Daryl