Posted to dev@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2005/07/20 22:22:47 UTC

Re: NOTICE: rescore mass-checks

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Theo Van Dinter writes:
> On Wed, Jul 20, 2005 at 12:44:06PM -0700, Justin Mason wrote:
> > fyi -- we agreed that the submissions rsync for the rescoring mass-checks
> > would be open until the 27th, since Henry won't be able to look at
> > them until then anyway.
> 
> BTW, we're currently at:
> 
> 382291 ham, and 1068797 spam, though some spam may end up going away (no ham
> file associated with the name)

hmm -- do they have to have a ham file?  I don't think there's a need for
that to be a rule.

anyone know what kind of targets Henry is after for corpus size?
I have no idea what works well here. ;)

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFC3rKXMJF5cimLx9ARAozuAJ9P5ViIDyeDzE+CMLZ+k/uWd1bY2ACfTq2i
TEQHN0L83AyElnsuHinxZs8=
=thW0
-----END PGP SIGNATURE-----


Re: NOTICE: rescore mass-checks

Posted by Henry Stern <he...@stern.ca>.
I'm not sure what I'll *need* to make good scores.  Last time around, the 
results were pants (--reuse was broken), so I don't have much to go on as 
far as numbers are concerned.

Cheers,
Henry

On Wed, 20 Jul 2005, Justin Mason wrote:

> Theo Van Dinter writes:
>> On Wed, Jul 20, 2005 at 12:44:06PM -0700, Justin Mason wrote:
>>> fyi -- we agreed that the submissions rsync for the rescoring mass-checks
>>> would be open until the 27th, since Henry won't be able to look at
>>> them until then anyway.
>>
>> BTW, we're currently at:
>>
>> 382291 ham, and 1068797 spam, though some spam may end up going away (no ham
>> file associated with the name)
>
> hmm -- do they have to have a ham file?  I don't think there's a need for
> that to be a rule.
>
> anyone know what kind of targets Henry is after for corpus size?
> I have no idea what works well here. ;)
>
> - --j.

Re: NOTICE: rescore mass-checks

Posted by Duncan Findlay <du...@debian.org>.
On Wed, Jul 20, 2005 at 01:22:47PM -0700, Justin Mason wrote:
> hmm -- do they have to have a ham file?  I don't think there's a need for
> that to be a rule.

I think there is. Without ham, I'd be pretty leery of the quality of
the corpus, and specifically of the quality of the BAYES results.
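
A minimal sketch of the pairing check under discussion, for anyone who wants to see which submissions would be affected. It assumes the logs are named ham-NAME.log and spam-NAME.log; that naming pattern and the helper name spam_without_ham are illustrative assumptions, not something stated in this thread:

```python
import re

# Assumed naming scheme: "ham-NAME.log" / "spam-NAME.log" (hypothetical here).
LOG_NAME = re.compile(r"^(ham|spam)-(.+)\.log$")


def spam_without_ham(filenames):
    """Return the submitter names that have a spam log but no ham log."""
    seen = {"ham": set(), "spam": set()}
    for filename in filenames:
        match = LOG_NAME.match(filename)
        if match:
            kind, submitter = match.groups()
            seen[kind].add(submitter)
    # Names present among the spam logs but absent from the ham logs.
    return sorted(seen["spam"] - seen["ham"])
```

Given the names ham-alice.log, spam-alice.log and spam-bob.log, this would flag only "bob" as a spam-only submission.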

-- 
Duncan Findlay