Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2004/09/11 02:55:57 UTC

[Bug 3771] New: PostgreSQL Specific Bayes Storage Module

http://bugzilla.spamassassin.org/show_bug.cgi?id=3771

           Summary: PostgreSQL Specific Bayes Storage Module
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: normal
          Priority: P5
         Component: Learner
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: parkerm@pobox.com


The current SQL implementation is simply not very PostgreSQL friendly, so there
is a need for a more specific implementation.

Part of the work involved is changing the token column in bayes_token from
char(5) to a bytea type.  This means that we have to convert some of the SQL
calls to use bind_param so that we can specify the proper type, for instance:
$sth->bind_param(2, $token, { pg_type => DBD::Pg::PG_BYTEA });

This change took performance from impossible (i.e. I killed the benchmark after
about 15 hours on its first operation) to somewhat livable.
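
For context, here is a rough sketch of how such a typed bind might sit inside a
complete statement.  The sub name update_token_counts and the column names
(id, spam_count, ham_count) are assumptions for illustration, not taken from
the module; only bind_param with pg_type comes from the report above.  It
expects a handle opened through DBI with a dbi:Pg DSN, which is what loads
DBD::Pg and defines the PG_BYTEA constant.

```perl
use strict;
use warnings;

# Hypothetical helper: adjust the counters for one token.
# $dbh must be a handle from DBI->connect('dbi:Pg:...'), so that DBD::Pg
# is loaded and DBD::Pg::PG_BYTEA() is defined when this sub runs.
sub update_token_counts {
    my ($dbh, $id, $token, $spam_delta, $ham_delta) = @_;

    # Column names here are assumed, not confirmed by the bug report.
    my $sth = $dbh->prepare(
        'UPDATE bayes_token
            SET spam_count = spam_count + ?,
                ham_count  = ham_count  + ?
          WHERE id = ? AND token = ?'
    );

    $sth->bind_param(1, $spam_delta);
    $sth->bind_param(2, $ham_delta);
    $sth->bind_param(3, $id);
    # The token is raw binary, so tell the driver to send it as bytea.
    $sth->bind_param(4, $token, { pg_type => DBD::Pg::PG_BYTEA() });

    return $sth->execute;
}
```

Only the token placeholder needs an explicit type; the integer parameters can
be left to the driver's default handling.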

The next part is figuring out how to make it even faster.  A few ideas batted
around are:

1) Transactions: turn off autocommit and start a transaction when you tie the DB
and commit when you call untie.  PostgreSQL apparently works very well in this
model.

2) Some specific PL/pgSQL code for the _put_token method.  It's pretty
complicated and would probably do well to be implemented a little closer to the
database.
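
The transaction idea in (1) could look roughly like the sketch below.  The sub
names tie_db_writable and untie_db only mirror the tie/untie wording above and
are hypothetical; the DBI calls begin_work, commit and rollback are standard.

```perl
use strict;
use warnings;

# Hypothetical tie step: open one transaction for the whole session.
# begin_work turns AutoCommit off until the next commit or rollback.
sub tie_db_writable {
    my ($dbh) = @_;
    $dbh->begin_work;
    return $dbh;
}

# Hypothetical untie step: commit everything done since the tie, or
# roll it all back if the caller reports a failure.
sub untie_db {
    my ($dbh, $failed) = @_;
    if ($failed) {
        $dbh->rollback;
    }
    else {
        $dbh->commit;
    }
    return;
}
```

One thing to weigh with this design: a transaction held open for a whole
learning session keeps its row locks until untie, so concurrent writers
touching the same tokens may block for that long.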

Something else to consider, although I can't think of a good way to implement
it, is a call to vacuum analyze built into the code.  I was only able to get
sort of decent performance after I set up a cron job to run vacuum analyze once
a minute on the bayes tables, which is pretty sad really.
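
If the vacuum did move into the code, the call itself is small; the open
question is when to trigger it.  The sketch below is only one possible shape,
with a hypothetical sub name and a caller-supplied table list.  It relies on
the fact that PostgreSQL refuses to run VACUUM inside a transaction block, so
the handle must have AutoCommit on at that point.

```perl
use strict;
use warnings;

# Hypothetical helper: run VACUUM ANALYZE on each named table.
# Returns the number of tables processed.
sub vacuum_bayes_tables {
    my ($dbh, @tables) = @_;

    # VACUUM cannot run inside a transaction block.
    die "VACUUM ANALYZE requires AutoCommit to be on\n"
        unless $dbh->{AutoCommit};

    for my $table (@tables) {
        $dbh->do('VACUUM ANALYZE ' . $dbh->quote_identifier($table));
    }
    return scalar @tables;
}
```

A caller might use vacuum_bayes_tables($dbh, 'bayes_token') after a large
batch of updates, outside any open transaction.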


