Posted to users@spamassassin.apache.org by James Lay <jl...@slave-tothe-box.net> on 2007/09/26 16:29:27 UTC

Purpose for SpamAssassin using MySQL

Hello all!

I saw a post a couple days ago about converting to MySQL with SpamAssassin
and wondered what the purpose would be for that?  Just reporting?  And if
so, is there a reporting package for use with MySQL and SpamAssassin?
Thanks for the assist.

James



Re: Purpose for SpamAssassin using MySQL

Posted by James Lay <jl...@slave-tothe-box.net>.


On 9/26/07 8:50 AM, "Chris St. Pierre" <st...@NebrWesleyan.edu> wrote:

> On Wed, 26 Sep 2007, James Lay wrote:
> 
>> I saw a post a couple days ago about converting to MySQL with SpamAssassin
>> and wondered what the purpose would be for that?  Just reporting?  And if
>> so, is there a reporting package for use with MySQL and SpamAssassin?
>> Thanks for the assist.
> 
> We have more than one MX.  Using a MySQL backend for Bayes and AWL
> lets me share that data between our MXes.
> 
> Chris St. Pierre
> Unix Systems Administrator
> Nebraska Wesleyan University
> 

Ah...never thought of that....thanks :)

James



Re: Purpose for SpamAssassin using MySQL

Posted by "Chris St. Pierre" <st...@NebrWesleyan.edu>.
On Wed, 26 Sep 2007, James Lay wrote:

> I saw a post a couple days ago about converting to MySQL with SpamAssassin
> and wondered what the purpose would be for that?  Just reporting?  And if
> so, is there a reporting package for use with MySQL and SpamAssassin?
> Thanks for the assist.

We have more than one MX.  Using a MySQL backend for Bayes and AWL
lets me share that data between our MXes.
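
For readers wondering what such a setup might look like, here is a minimal
local.cf sketch. It is not the configuration used by anyone in this thread.
The database name, host, and credentials (spamassassin, db.example.com,
sa_user, changeme) are placeholder assumptions, and the Bayes and AWL tables
are assumed to exist already. Each MX would carry the same settings so that
all of them read and write one shared store.

```
# local.cf (sketch; host name and credentials below are placeholders)

# Store Bayes tokens in MySQL instead of per-host DB files
bayes_store_module  Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn       DBI:mysql:spamassassin:db.example.com
bayes_sql_username  sa_user
bayes_sql_password  changeme

# Store the auto-whitelist (AWL) in the same database
auto_whitelist_factory  Mail::SpamAssassin::SQLBasedAddrList
user_awl_dsn            DBI:mysql:spamassassin:db.example.com
user_awl_sql_username   sa_user
user_awl_sql_password   changeme
```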

Chris St. Pierre
Unix Systems Administrator
Nebraska Wesleyan University


Re: Purpose for SpamAssassin using MySQL

Posted by Rajkumar S <ra...@gmail.com>.
On 10/3/07, Rob Mangiafico <rm...@lexiconn.com> wrote:
> Picking up on the point of one Bayes DB in MySQL vs. individual ones for
> each user, is it more effective in an ISP/host environment where you have
> diverse users to have them all share one Bayes DB with autolearn, or is it
> better if they each have their own Bayes data in MySQL (per user)?

When you are in an ISP environment, at which point does SA run? I.e.,
are you running SA when you receive the mail (e.g. simscan) or when you
deliver the mail (an LDA like procmail)? If I am not mistaken, only the LDA
knows to whom the mail is destined, after taking care of BCC, CC, etc.
But the problem with running SA at the LDA is that it is not possible to
reject the mail if it's spam (speaking from my experience with qmail).
I can bounce the mail, but it's always better if I do not accept a
spam mail in the first place.

raj

Re: Purpose for SpamAssassin using MySQL

Posted by Michal Jeczalik <mi...@jeczalik.com>.
On Wed, 3 Oct 2007, Rob Mangiafico wrote:

> On Tue, 2 Oct 2007, Michał Jęczalik wrote:
>> There are many. It allows you to share data between user accounts (IMHO it
>> doesn't make much sense to have separate bayes databases for each account,
>> at least they are of a 'massive' sort and users are not allowed to feed
>> their own spam/ham etc. - because they share mostly the same data and the
>> bayes is more up-to-date if one single database autolearns from many
>> mailboxes). It allows you to share data among several hosts. It allows
>> you to keep data on a remote host if you don't have enough space. Etc.
>
> Picking up on the point of one Bayes DB in MySQL vs. individual ones for
> each user, is it more effective in an ISP/host environment where you have
> diverse users to have them all share one Bayes DB with autolearn, or is it
> better if they each have their own Bayes data in MySQL (per user)?
>
> We're slowly converting to mysql for bayes, and have not decided yet which
> method would be best for our users and for the servers in general. Thanks.

Sorry for the late answer. Of course it's more effective. This was the major
reason for me to do it. Then you have one bayes db, one autoexpire, and you
need space for only one db. If anything goes wrong (a disk failure or a
db malfunction), you need to recreate only one db.

If you don't have any significant reason to have per-user bayes
databases, then you should probably use the one-for-all method.

And one more advantage: I'm not too much into SQL performance stuff, but
one-for-all is probably faster, because the SQL engine doesn't have to
look through multiple (possibly thousands of) different bayes databases, and
it's probably able to cache at least some of those bayes tokens. Remember
that on a large system it's common to receive the same spam message in
multiple mailboxes at one time.
-- 
Michał Jęczalik, +48.603.64.62.97
INFONAUTIC, +48.33.487.69.04


Re: Purpose for SpamAssassin using MySQL

Posted by bg...@idcomm.com.
Rob Mangiafico wrote:
> On Tue, 2 Oct 2007, Michał Jęczalik wrote:
>> There are many. It allows you to share data between user accounts (IMHO it 
>> doesn't make much sense to have separate bayes databases for each account, 
>> at least they are of a 'massive' sort and users are not allowed to feed 
>> their own spam/ham etc. - because they share mostly the same data and the 
>> bayes is more up-to-date if one single database autolearns from many 
>> mailboxes). It allows you to share data among several hosts. It allows 
>> you to keep data on a remote host if you don't have enough space. Etc.
> 
> Picking up on the point of one Bayes DB in MySQL vs. individual ones for 
> each user, is it more effective in an ISP/host environment where you have 
> diverse users to have them all share one Bayes DB with autolearn, or is it 
> better if they each have their own Bayes data in MySQL (per user)?
> 
> We're slowly converting to mysql for bayes, and have not decided yet which 
> method would be best for our users and for the servers in general. Thanks.
> 
> Rob
> 
> 

Per-user Bayes should be more accurate for each user, assuming the user
can train on false positives/negatives. Also, one user using the spam
button to "unsubscribe" doesn't impact other users' accuracy. However,
there is a significant storage cost to per-user Bayes.

Re: Purpose for SpamAssassin using MySQL

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
Rob Mangiafico wrote:
> Picking up on the point of one Bayes DB in MySQL vs. individual ones for 
> each user, is it more effective in an ISP/host environment where you have 
> diverse users to have them all share one Bayes DB with autolearn, or is it 
> better if they each have their own Bayes data in MySQL (per user)?

When I'm forced to use bayes in a large setup I prefer to go with per 
domain databases for domains with more than a couple of users and use a 
global database for all of the domains with only a few users each.
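
The routing rule described above can be sketched roughly as follows. This is
an illustration only, not Daryl's actual setup: the function name
bayes_db_name, the users_per_domain mapping, the "global" label, and the
threshold of three users (one reading of "more than a couple") are all
assumptions made for the example.

```python
def bayes_db_name(recipient, users_per_domain, min_users=3, global_name="global"):
    """Return the name of the Bayes database to use for a recipient.

    Hypothetical helper: domains with at least `min_users` mailboxes get
    their own database (named after the domain); everyone else shares a
    single global database.
    """
    # Take the part after the last "@" and normalise its case.
    domain = recipient.rsplit("@", 1)[-1].lower()
    if users_per_domain.get(domain, 0) >= min_users:
        return domain
    return global_name


# Example data (made up): one larger domain, one with only two users.
counts = {"big.example": 40, "tiny.example": 2}
big = bayes_db_name("alice@big.example", counts)
tiny = bayes_db_name("bob@tiny.example", counts)
unknown = bayes_db_name("carol@unlisted.example", counts)
```

How the chosen name is handed to SpamAssassin (for instance through the user
name the scan runs as, or the bayes_sql_override_username setting) depends on
the installation and is not covered by this sketch.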

Daryl


Re: Purpose for SpamAssassin using MySQL

Posted by Rob Mangiafico <rm...@lexiconn.com>.
On Tue, 2 Oct 2007, Michał Jęczalik wrote:
> There are many. It allows you to share data between user accounts (IMHO it 
> doesn't make much sense to have separate bayes databases for each account, 
> at least they are of a 'massive' sort and users are not allowed to feed 
> their own spam/ham etc. - because they share mostly the same data and the 
> bayes is more up-to-date if one single database autolearns from many 
> mailboxes). It allows you to share data among several hosts. It allows 
> you to keep data on a remote host if you don't have enough space. Etc.

Picking up on the point of one Bayes DB in MySQL vs. individual ones for 
each user, is it more effective in an ISP/host environment where you have 
diverse users to have them all share one Bayes DB with autolearn, or is it 
better if they each have their own Bayes data in MySQL (per user)?

We're slowly converting to mysql for bayes, and have not decided yet which 
method would be best for our users and for the servers in general. Thanks.

Rob


Re: Purpose for SpamAssassin using MySQL

Posted by Michał Jęczalik <mi...@jeczalik.com>.
On Wed, 26 Sep 2007, James Lay wrote:

> I saw a post a couple days ago about converting to MySQL with SpamAssassin
> and wondered what the purpose would be for that?  Just reporting?  And if

There are many. It allows you to share data between user accounts (IMHO it 
doesn't make much sense to have separate bayes databases for each account, 
at least if they are of a 'massive' sort and users are not allowed to feed 
their own spam/ham etc. - because they share mostly the same data and the 
bayes is more up-to-date if one single database autolearns from many 
mailboxes). It allows you to share data among several hosts. It allows 
you to keep data on a remote host if you don't have enough space. Etc.

Perhaps if you are the only user on your machine, converting to SQL
storage is not worth the time it takes, but in a more complex
environment it simplifies several issues.
-- 
Michał Jęczalik, +48.603.64.62.97
INFONAUTIC, +48.33.487.69.04


Re: Purpose for SpamAssassin using MySQL

Posted by Raquel <ra...@thericehouse.net>.
On Wed, 26 Sep 2007 08:29:27 -0600
James Lay <jl...@slave-tothe-box.net> wrote:

> Hello all!
> 
> I saw a post a couple days ago about converting to MySQL with
> SpamAssassin and wondered what the purpose would be for that? 
> Just reporting?  And if so, is there a reporting package for use
> with MySQL and SpamAssassin? Thanks for the assist.
> 
> James
> 

I can answer only for the reasons I made the change to MySQL.  I was
having trouble getting it to work right on the new server (Debian
Etch/Sendmail).  The permissions weren't right, no matter how hard I
tried or what I did.  After extensive googling I discovered I wasn't
alone.  The server would bog down with CPU and memory usage going
through the roof.  Because of client needs I couldn't wait any
longer.  I switched to using MySQL and all that went away.  It also
gives me centralized configuration for all clients, and it lets me
write a PHP program for clients to tweak their configuration, or use
one that is already available.

-- 
Raquel
============================================================
I hold it, that a little rebellion, now and then, is a good thing,
and as necessary in the political world as storms in the physical.
  --Thomas Jefferson