Posted to users@spamassassin.apache.org by "Johnson, S" <sj...@edina.k12.mn.us> on 2005/05/10 18:04:47 UTC

Weighing spam with sa-learn

 

I'm looking at creating an email address to capture spam, to be used
only for spam.  Since I can be guaranteed that all email I receive at
this address is spam, is there a way to weight it more heavily in
sa-learn?

 





Re: Weighing spam with sa-learn

Posted by Matt Kettler <mk...@comcast.net>.
At 12:04 PM 5/10/2005, Johnson, S wrote:
>I'm looking at creating an email address to capture spam and only be used 
>for spam.  Since I can be guaranteed that all email I received to this 
>address is spam, is there a way to weigh this higher in the sa-learn?

No. Although exhaustive or weighted training are both interesting in some 
situations, SA currently uses a pure single-weight training method.

Really, the only reason you should want to weight them higher is if you're 
training less spam than ham, which is very uncommon.

"Middle of the road" tokens that appear equally in spam and ham will be 
biased by your training ratios. The use of chi-squared combining covers up 
most problems from this, but if your effective training ratio gets way off 
(i.e., greater than 1:20) then you might start having some false-positive 
(FP) problems on ham mail as common words start weighing in with a high 
probability of spam.
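
To make the training-ratio point concrete, here is a toy sketch in Python. It is not SpamAssassin's actual Bayes code: it assumes a deliberately naive estimator that scores a token by raw counts only, and the function names and message counts are made up for illustration. It shows how a token that appears in half of all mail, spam or ham alike, drifts toward "spammy" once the trained corpus is skewed 1:20 toward spam.

```python
def naive_spam_probability(spam_hits: int, ham_hits: int) -> float:
    """Raw-count estimate: fraction of a token's occurrences seen in spam.

    Assumption for illustration only; this is NOT SpamAssassin's formula.
    """
    total = spam_hits + ham_hits
    if total == 0:
        return 0.5  # never-seen token: no evidence either way
    return spam_hits / total


def token_hits(messages_trained: int, appearance_rate: float) -> int:
    """Number of trained messages that contain the token."""
    return round(messages_trained * appearance_rate)


# A "middle of the road" token that shows up in half of all mail.
RATE = 0.5

# Balanced training: 1000 ham and 1000 spam messages (hypothetical counts).
balanced = naive_spam_probability(token_hits(1000, RATE), token_hits(1000, RATE))

# Skewed training: 100 ham and 2000 spam messages, i.e. a 1:20 ratio.
skewed = naive_spam_probability(token_hits(2000, RATE), token_hits(100, RATE))

print(f"1:1 training  -> {balanced:.3f}")
print(f"1:20 training -> {skewed:.3f}")
```

With balanced training the token stays neutral at 0.500; with the 1:20 corpus the same token scores about 0.952 under this naive estimator, even though it is no more typical of spam than of ham.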





Re: Weighing spam with sa-learn

Posted by Craig McLean <cr...@craig.dnsalias.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

jdow wrote:
| From: "Craig McLean" <cr...@craig.dnsalias.com>
|
|>-----BEGIN PGP SIGNED MESSAGE-----
|>Hash: SHA1
|>
|>Johnson, S wrote:
|>
|>>
|>>I'm looking at creating an email address to capture spam and only be
|>>used for spam.  Since I can be guaranteed that all email I received to
|>>this address is spam, is there a way to weigh this higher in the
|>>sa-learn?
|
|>Spam is spam, and ham is ham. AFAIUnderstand that's all there is to it.
|>I have a spam-trap account which I've "advertised" on just about all the
|>crappy pages I can think of, and simply "sa-learn --spam" the whole inbox
|>every 3 hours.
|>I do the ham learning manually on my personal inbox once it's been checked
|>for suspect meat product, or let SA autolearn it.
|>
|>Spam-traps are useful for many reasons, one of which is to see what might
|>be getting through. It's one of the compelling reasons that clamav is now
|>installed here, and why I have a bundle of local rules...
|>
|>Kind Regards,
|>Craig.
|
|
| One also needs guaranteed ham for proper training. I hope he doesn't
| think all he needs to feed sa-learn is spam.
|
| {^_^}
|
|

jdow,
See above where I said:

"I do the ham learning manually on my personal inbox once it's been
checked for suspect meat product, or let SA autolearn it."

Kind Regards,
Craig.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFCgQhBMDDagS2VwJ4RAnLIAJ4/swOZmTyijwGIVgi6KCPTMvkwzACfbsYF
ReezH5XhED9pvCcngnVB2GM=
=Y+wX
-----END PGP SIGNATURE-----

Re: Weighing spam with sa-learn

Posted by jdow <jd...@earthlink.net>.
From: "Craig McLean" <cr...@craig.dnsalias.com>

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Johnson, S wrote:
> >
> >
> > I'm looking at creating an email address to capture spam and only be
> > used for spam.  Since I can be guaranteed that all email I received to
> > this address is spam, is there a way to weigh this higher in the
> > sa-learn?
> >
>
> Spam is spam, and ham is ham. AFAIUnderstand that's all there is to it.
> I have a spam-trap account which I've "advertised" on just about all the
> crappy pages I can think of, and simply "sa-learn --spam" the whole inbox
> every 3 hours.
> I do the ham learning manually on my personal inbox once it's been checked
> for suspect meat product, or let SA autolearn it.
>
> Spam-traps are useful for many reasons, one of which is to see what might
> be getting through. It's one of the compelling reasons that clamav is now
> installed here, and why I have a bundle of local rules...
>
> Kind Regards,
> Craig.

One also needs guaranteed ham for proper training. I hope he doesn't
think all he needs to feed sa-learn is spam.

{^_^}



Re: Weighing spam with sa-learn

Posted by Craig McLean <cr...@craig.dnsalias.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Johnson, S wrote:
>
>
> I’m looking at creating an email address to capture spam and only be
> used for spam.  Since I can be guaranteed that all email I received to
> this address is spam, is there a way to weigh this higher in the sa-learn?
>

Spam is spam, and ham is ham. AFAIUnderstand that's all there is to it.
I have a spam-trap account which I've "advertised" on just about all the
crappy pages I can think of, and simply "sa-learn --spam" the whole inbox
every 3 hours.
I do the ham learning manually on my personal inbox once it's been checked
for suspect meat product, or let SA autolearn it.

Spam-traps are useful for many reasons, one of which is to see what might
be getting through. It's one of the compelling reasons that clamav is now
installed here, and why I have a bundle of local rules...

Kind Regards,
Craig.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (FreeBSD)

iD8DBQFCgN3CMDDagS2VwJ4RAvcvAJ9XfsXztFpqL3mLaBWf4jbJkLOroACfY1Yl
KTkX/C+lNcAZFeOoFBlRtHk=
=PHke
-----END PGP SIGNATURE-----
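
The spam-trap routine described in this thread (running "sa-learn --spam" over the trap inbox on a schedule, and feeding known-good mail as ham) could be wrapped roughly as follows. This is only a sketch: it assumes the mailboxes are in mbox format, the helper names are hypothetical, and the mailbox path shown is a placeholder rather than anything from the original posts.

```python
import subprocess
from typing import List


def build_sa_learn_command(mailbox: str, is_spam: bool, mbox: bool = True) -> List[str]:
    """Build the sa-learn argument list for one mailbox."""
    cmd = ["sa-learn", "--spam" if is_spam else "--ham"]
    if mbox:
        # --mbox tells sa-learn the target is a single mbox-format file.
        cmd.append("--mbox")
    cmd.append(mailbox)
    return cmd


def learn(mailbox: str, is_spam: bool, mbox: bool = True) -> int:
    """Run sa-learn on the mailbox and return its exit status."""
    return subprocess.run(build_sa_learn_command(mailbox, is_spam, mbox)).returncode


# Placeholder path; substitute the real spam-trap mailbox.
example = build_sa_learn_command("/var/mail/spamtrap", is_spam=True)
print(" ".join(example))
```

Calling learn() on the trap mailbox from a scheduler every few hours would mirror the setup Craig describes; the same helper with is_spam=False covers the hand-checked ham that jdow points out is also needed.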