You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2007/09/28 16:38:26 UTC

[Bug 5661] New: speed up SQL queries by utilizing indexes

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5661

           Summary: speed up SQL queries by utilizing indexes
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P5
         Component: spamc/spamd
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: micah@riseup.net


The SQL query in /usr/share/perl5/Mail/SpamAssassin/BayesStore/SQL.pm line
243 calculates the expire delta, but in a way that can't use
an index:

   my $sql = "SELECT count(*)
                FROM bayes_token
               WHERE id = ?
                AND (? - atime) > ?";



Changing this as follows would make it so it can utilize the index, thus
examining less rows (which could be locked). Mysql can't look up $newest_atime -
atime from the index because that value is made up, so it uses the "id" portion
of the index on (id, atime)... if it doesn't have to calculate the value for
$newest_atime + $something for every row and only needs to check the index then
it would be much faster.

Index: lib/Mail/SpamAssassin/BayesStore/SQL.pm
===================================================================
--- lib/Mail/SpamAssassin/BayesStore/SQL.pm     (revision 579950)
+++ lib/Mail/SpamAssassin/BayesStore/SQL.pm     (working copy)
@@ -241,7 +241,7 @@
   my $sql = "SELECT count(*)
                FROM bayes_token
               WHERE id = ?
-                AND (? - atime) > ?";
+                AND atime < ?";

   my $sth = $self->{_dbh}->prepare_cached($sql);

@@ -251,7 +251,7 @@
   }

   for (my $i = 1; $i <= $max_expire_mult; $i<<=1) {
-    my $rc = $sth->execute($self->{_userid}, $newest_atime, $start * $i);
+    my $rc = $sth->execute($self->{_userid}, $newest_atime - $start * $i);

     unless ($rc) {
       dbg("bayes: calculate_expire_delta: SQL error:
".$self->{_dbh}->errstr());

Thanks to Mark Martinec and Nils for the patch and analysis.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5661] [review] speed up SQL queries by utilizing indexes

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5661


Mark.Martinec@ijs.si changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|speed up SQL queries by     |[review] speed up SQL
                   |utilizing indexes           |queries by utilizing indexes
   Target Milestone|Undefined                   |3.2.4




------- Additional Comments From Mark.Martinec@ijs.si  2007-09-28 11:52 -------
Committed to trunk:
  $ svn -m 'simplified SELECT FROM bayes_token to be able
            to use index, see bug 5661' ci
Sending lib/Mail/SpamAssassin/BayesStore/SQL.pm
Committed revision 580455.

This seems simple enough, tentatively setting target to 3.2.4.
Time for a review?



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5661] [review] speed up SQL queries by utilizing indexes

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5661





------- Additional Comments From Mark.Martinec@ijs.si  2007-10-02 10:07 -------
Created an attachment (id=4141)
 --> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=4141&action=view)
proposed patch to 3.2.3




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5661] [review] speed up SQL queries by utilizing indexes

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5661


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




------- Additional Comments From jm@jmason.org  2007-12-16 13:30 -------
committed to 3.2.x: r604713



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5661] [review] speed up SQL queries by utilizing indexes

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5661





------- Additional Comments From jm@jmason.org  2007-10-02 09:52 -------
Mark, could you create a patch against 3.2.4, and attach it?  I'm guessing it's
the one that's inline in the first comment, but it's better to be explicit.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5661] [review] speed up SQL queries by utilizing indexes

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5661


spamassassin@dostech.ca changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|needs 1 vote for 3.2        |go




------- Additional Comments From spamassassin@dostech.ca  2007-11-06 13:30 -------
+1



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5661] [review] speed up SQL queries by utilizing indexes

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5661


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|                            |needs 1 vote for 3.2




------- Additional Comments From jm@jmason.org  2007-10-07 11:23 -------
+1



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.