Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2005/08/06 06:23:12 UTC

[Bug 4519] New: Add multi_tok_count_change to BayesStore API

http://bugzilla.spamassassin.org/show_bug.cgi?id=4519

           Summary: Add multi_tok_count_change to BayesStore API
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: Learner
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: parkerm@pobox.com


Add a multi_tok_count_change method to the BayesStore API that accepts multiple
tokens and changes their ham/spam counts and atime values in a single call.

This allows for a large speedup in the MySQL and PgSQL storage modules.
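
As a rough illustration of why a batched call can be faster, here is a minimal
Python sketch. It is not the SpamAssassin code (which is Perl): ToyBayesStore,
TokenRow, the round_trips counter, the argument order, and the sample tokens
are all hypothetical, and the only thing modelled is that a per-token call
costs one client/server exchange per token while a batched call costs one in
total.

```python
from dataclasses import dataclass, field


@dataclass
class TokenRow:
    spam_count: int = 0
    ham_count: int = 0
    atime: int = 0


@dataclass
class ToyBayesStore:
    """Hypothetical in-memory stand-in for a SQL-backed token store."""
    rows: dict = field(default_factory=dict)
    round_trips: int = 0  # simulated client/server exchanges

    def _apply(self, token, dspam, dham, atime):
        row = self.rows.setdefault(token, TokenRow())
        # Assumption of this sketch: counts are clamped at zero.
        row.spam_count = max(0, row.spam_count + dspam)
        row.ham_count = max(0, row.ham_count + dham)
        # Assumption of this sketch: atime only moves forward.
        row.atime = max(row.atime, atime)

    def tok_count_change(self, dspam, dham, token, atime):
        """One token per call, so one simulated round trip per token."""
        self.round_trips += 1
        self._apply(token, dspam, dham, atime)

    def multi_tok_count_change(self, dspam, dham, tokens, atime):
        """Many tokens per call, so one simulated round trip in total."""
        self.round_trips += 1
        for token in tokens:
            self._apply(token, dspam, dham, atime)


tokens = ["alpha", "beta", "gamma"]  # made-up tokens
now = 1123300000                     # made-up atime value

one_by_one = ToyBayesStore()
for tok in tokens:
    one_by_one.tok_count_change(1, 0, tok, now)

batched = ToyBayesStore()
batched.multi_tok_count_change(1, 0, tokens, now)

print(one_by_one.round_trips, batched.round_trips)
```

Both stores end up with identical rows; only the number of simulated exchanges
differs.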



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4519] [review] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519


parkerm@pobox.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|enhancement                 |critical







[Bug 4519] [review] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519





------- Additional Comments From duncf@debian.org  2005-08-10 22:03 -------
+1




[Bug 4519] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519





------- Additional Comments From parkerm@pobox.com  2005-08-06 00:49 -------
Fixed the PgSQL deadlock by sorting the tokens before passing them into the
stored procedure.  Benchmarks are running now; I'll upload the patch later.
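
The comment does not say why sorting helps. One plausible reading, offered here
as an assumption, is the usual lock-ordering argument: if every writer takes
its row locks in the same global order, two writers can never each hold a lock
the other is waiting for. The Python sketch below illustrates that idea with
threads and hypothetical per-token locks; it is not the stored procedure from
the patch, and the names (locks, counts, update_tokens, run_workers) are
invented for the example.

```python
import threading

# Hypothetical per-token "row locks"; a real database manages these itself.
locks = {tok: threading.Lock() for tok in ("alpha", "beta", "gamma")}
counts = {tok: 0 for tok in locks}


def update_tokens(tokens):
    """Lock every token's row, bump its count, then release the locks."""
    ordered = sorted(tokens)  # same global order in every worker
    for tok in ordered:
        locks[tok].acquire()
    try:
        for tok in ordered:
            counts[tok] += 1
    finally:
        for tok in reversed(ordered):
            locks[tok].release()


def worker(tokens, rounds):
    for _ in range(rounds):
        update_tokens(tokens)


def run_workers(rounds=200):
    # The two workers receive the same tokens in opposite orders, the
    # pattern that can deadlock when locks are taken in arrival order.
    orders = (["alpha", "beta", "gamma"], ["gamma", "beta", "alpha"])
    threads = [
        threading.Thread(target=worker, args=(order, rounds), daemon=True)
        for order in orders
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=10)
    # True when both workers finished, i.e. nothing is stuck on a lock.
    return not any(t.is_alive() for t in threads)


finished = run_workers()
print(finished, counts)
```

Replacing `sorted(tokens)` with `list(tokens)` reintroduces the opposite
acquisition orders and makes a hang possible, which is the failure mode this
sketch assumes the sort was avoiding.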




[Bug 4519] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519


parkerm@pobox.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #3057 is|0                           |1
           obsolete|                            |




------- Additional Comments From parkerm@pobox.com  2005-08-06 23:37 -------
Created an attachment (id=3058)
 --> (http://bugzilla.spamassassin.org/attachment.cgi?id=3058&action=view)
Patch File





[Bug 4519] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519





------- Additional Comments From parkerm@pobox.com  2005-08-05 23:43 -------
I tried pulling out the tok_touch_all code into a stored procedure, looping over
the tokens and updating one at a time, but this did not help.




[Bug 4519] [review] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519


parkerm@pobox.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|Needs 1 more vote for new   |ready to commit
                   |patch                       |







[Bug 4519] [review] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519


parkerm@pobox.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|ready to commit             |Needs 1 more vote for new
                   |                            |patch







[Bug 4519] [review] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519





------- Additional Comments From jm@jmason.org  2005-08-09 00:05 -------
+1

stored SQL procedures are scary ;)




[Bug 4519] [review] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519


parkerm@pobox.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #3058 is|0                           |1
           obsolete|                            |




------- Additional Comments From parkerm@pobox.com  2005-08-07 13:21 -------
Created an attachment (id=3059)
 --> (http://bugzilla.spamassassin.org/attachment.cgi?id=3059&action=view)
Patch File for 3.1 Branch

Committed to HEAD.

Here is the 3.1 branch specific patch.

Don't be scared of the size; a large portion of it comes from copying code
that has been working very well in the MySQL.pm module over to PgSQL.




[Bug 4519] [review] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519


parkerm@pobox.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Add multi_tok_count_change  |[review] Add
                   |to BayesStore API           |multi_tok_count_change to
                   |                            |BayesStore API




------- Additional Comments From parkerm@pobox.com  2005-08-06 23:38 -------
I'm very excited about this change.  It really speeds up BayesSQL by a large
factor: 2x for MySQL and 7x for PostgreSQL.

Please review for inclusion in 3.1.





[Bug 4519] [review] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519


duncf@debian.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|                            |ready to commit




------- Additional Comments From duncf@debian.org  2005-08-10 19:24 -------
+1

I'm not going to claim that any of the SQL stuff works, just that this patch
doesn't appear to break anything that wasn't broken before. :-)




[Bug 4519] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519





------- Additional Comments From parkerm@pobox.com  2005-08-05 21:59 -------
Created an attachment (id=3057)
 --> (http://bugzilla.spamassassin.org/attachment.cgi?id=3057&action=view)
Patch File - Does Not Work

Here is the current patch.  It is broken for the PgSQL case: it deadlocks
in the _put_tokens and tok_touch_all methods.

I need to figure that out before we can do anything else, although the
benchmarks look very good.





[Bug 4519] [review] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519


parkerm@pobox.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




------- Additional Comments From parkerm@pobox.com  2005-08-11 00:02 -------
Sending        lib/Mail/SpamAssassin/Bayes.pm
Sending        lib/Mail/SpamAssassin/BayesStore/DBM.pm
Sending        lib/Mail/SpamAssassin/BayesStore/MySQL.pm
Sending        lib/Mail/SpamAssassin/BayesStore/PgSQL.pm
Sending        lib/Mail/SpamAssassin/BayesStore/SQL.pm
Sending        lib/Mail/SpamAssassin/BayesStore.pm
Sending        sql/bayes_pg.sql
Transmitting file data .......
Committed revision 231411.





[Bug 4519] [review] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519





------- Additional Comments From spamassassin@dostech.ca  2005-08-10 23:58 -------
+1 stored procedures kick ass ;)




[Bug 4519] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519


parkerm@pobox.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|Undefined                   |3.1.0







[Bug 4519] [review] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519





------- Additional Comments From matt.s@aptalaska.net  2005-08-09 10:27 -------
I have this code running on one of my servers and it works 10x better than
3.1.0pre4.  This patch makes bayes/SQL fast enough to move away from the bdb
locking headache.




[Bug 4519] [review] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519





------- Additional Comments From matt.s@aptalaska.net  2005-08-07 20:37 -------
I have been working with Michael on this patch.  Both of us have tested it and
found it to be a great deal faster when running bayes against SQL.  This patch
greatly strengthens bayes support because the SQL modules now perform well
enough to be much more useful at big sites.




[Bug 4519] [review] Add multi_tok_count_change to BayesStore API

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4519


parkerm@pobox.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #3059 is|0                           |1
           obsolete|                            |




------- Additional Comments From parkerm@pobox.com  2005-08-10 22:00 -------
Created an attachment (id=3073)
 --> (http://bugzilla.spamassassin.org/attachment.cgi?id=3073&action=view)
Patch File

This patch fixes an additional bug: SQL.pm had ended up without a
_put_tokens method.

Now it has one.  Otherwise it is the same as the other patch, with just a few
changes in SQL.pm.


