Posted to users@spamassassin.apache.org by Duane Hill <d....@yournetplus.com> on 2006/01/02 04:00:43 UTC

Per-User - Bayes Learning

Hello All,

I have e-mail accounts that have been forwarding spam, as attachments,
to a specific e-mail address for some time now. Until now these were
gone through manually, as I didn't have anything set up on a
per-account basis.

Now that I have SA on our Win2K server storing everything in a MySQL
schema, I would like to automate the process more. I wrote a script
that strips out any attached message and passes it to sa-learn.
However, sa-learn seems to be time consuming (at minimum, 9 seconds
per attached message submitted). Is there anything that can be done
to speed up the process?
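
For readers following along, here is a rough sketch of what the
"strip out any attached message" step might look like. It is not the
script described above (which was not posted); it assumes the spam
arrives as message/rfc822 attachments, and the function name
extract_attached_messages is made up for illustration.

```python
import email
from email import policy


def extract_attached_messages(raw_bytes):
    """Return the raw bytes of every message/rfc822 part found in raw_bytes.

    Hypothetical helper: assumes reporters forward spam as an attached
    message (MIME type message/rfc822), not inline.
    """
    outer = email.message_from_bytes(raw_bytes, policy=policy.default)
    attached = []
    for part in outer.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-item list
            # holding the enclosed message object.
            inner = part.get_payload(0)
            attached.append(inner.as_bytes())
    return attached
```

Each returned item is a complete message that could then be handed to
whatever learning command is in use.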

--

"This message is made of 100% recycled electrons."


Re: Per-User - Bayes Learning

Posted by Duane Hill <d....@yournetplus.com>.
On Monday, January 2, 2006 at 3:56:40 AM, mkettler_sa@comcast.net confabulated:

> At 10:00 PM 1/1/2006, Duane Hill wrote:
>>Hello All,
>>
>>I  have  e-mail  accounts  that  have  been sending Spam to a specific
>>e-mail  address  as  an attachment for some time now. Before they were
>>manually  gone  through as I didn't have anything specific set up on a
>>per-account basis.
>>
>>Now  that  I have SA on our Win2K server storing everything in a MySQL
>>schema,  I  would  like  to automate the process more. I have a script
>>that  I  wrote  that  will take and strip out any attached message and
>>uses  sa-learn.  However,  sa-learn  seems  to  be  time consuming (at
>>minimum,  9 seconds per attached message submitted). Is there anything
>>that can be done to speed up the process?


> Are you using the mysql.pm bayes store module, or the default generic one?

Mail::SpamAssassin::BayesStore::MySQL is what I'm using currently.

> If you're using the generic sql.pm, I'd suggest switching. The learning
> time is cut by more than half.
> http://wiki.apache.org/spamassassin/BayesBenchmarkResults
> (1a and 1b are learning).

> Also, if you're using SA 3.1.0  you can learn using spamc -L, which will
> take advantage of spamd instead of spawning a whole new perl instance. Very
> useful if you do a lot of learning, but I'll warn you this is a newish
> feature and it might have some growing pains (I've not used it)

For this, I will have to install Cygwin (unless I find a spamd port
for Windows). Thanks for the suggestion. I will give it a whirl.

> http://spamassassin.apache.org/full/3.1.x/dist/doc/spamc.html

--

"This message is made of 100% recycled electrons."


Re: Per-User - Bayes Learning

Posted by Matt Kettler <mk...@comcast.net>.
At 10:00 PM 1/1/2006, Duane Hill wrote:
>Hello All,
>
>I  have  e-mail  accounts  that  have  been sending Spam to a specific
>e-mail  address  as  an attachment for some time now. Before they were
>manually  gone  through as I didn't have anything specific set up on a
>per-account basis.
>
>Now  that  I have SA on our Win2K server storing everything in a MySQL
>schema,  I  would  like  to automate the process more. I have a script
>that  I  wrote  that  will take and strip out any attached message and
>uses  sa-learn.  However,  sa-learn  seems  to  be  time consuming (at
>minimum,  9 seconds per attached message submitted). Is there anything
>that can be done to speed up the process?


Are you using the mysql.pm bayes store module, or the default generic one?

If you're using the generic sql.pm, I'd suggest switching. The learning 
time is cut by more than half.
http://wiki.apache.org/spamassassin/BayesBenchmarkResults
(1a and 1b are learning).

Also, if you're using SA 3.1.0  you can learn using spamc -L, which will 
take advantage of spamd instead of spawning a whole new perl instance. Very 
useful if you do a lot of learning, but I'll warn you this is a newish 
feature and it might have some growing pains (I've not used it)

http://spamassassin.apache.org/full/3.1.x/dist/doc/spamc.html
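
As a rough illustration of the spamc -L suggestion above, the sketch
below pipes one message to spamc for learning. It assumes spamc is on
the PATH and that spamd is running with learning requests enabled
(the --allow-tell option). The function names and the per-user
username argument (passed with spamc's -u option) are illustrative
only, not something taken from this thread.

```python
import subprocess

VALID_LEARN_TYPES = ("spam", "ham", "forget")


def build_spamc_learn_cmd(learn_type, username=None, spamc_path="spamc"):
    """Build the spamc argument list for a learning request.

    Hypothetical helper; spamc_path is an assumption about where the
    binary lives on the local system.
    """
    if learn_type not in VALID_LEARN_TYPES:
        raise ValueError("learn_type must be one of %r" % (VALID_LEARN_TYPES,))
    cmd = [spamc_path, "-L", learn_type]
    if username:
        # -u selects the user whose Bayes data spamd should update.
        cmd += ["-u", username]
    return cmd


def learn_via_spamd(message_bytes, learn_type, username=None):
    """Send one message to spamd through spamc and return the completed process."""
    cmd = build_spamc_learn_cmd(learn_type, username)
    return subprocess.run(cmd, input=message_bytes, capture_output=True)
```

The point of going through spamc is that each call starts a small
client rather than a fresh perl interpreter, which is where the
per-message overhead of sa-learn comes from.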