You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Dave Brosius (Issue Comment Edited) (JIRA)" <ji...@apache.org> on 2012/03/24 19:58:25 UTC
[jira] [Issue Comment Edited] (CASSANDRA-4023) Improve BloomFilter deserialization performance

    [ https://issues.apache.org/jira/browse/CASSANDRA-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237632#comment-13237632 ] 

Dave Brosius edited comment on CASSANDRA-4023 at 3/24/12 6:58 PM:
------------------------------------------------------------------

Seems like this is the fix to me (at least in the simple case where the int read in deserialize is 0) more would need to be done if not.

diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableReader.java b/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
index 9e92220..959a66d 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
@@ -361,7 +361,7 @@ public class SSTableReader extends SSTable
                 int len = ByteBufferUtil.readShortLength(input);
 
                 boolean firstKey = left == null;
-                boolean lastKey = indexPosition + DBConstants.SHORT_SIZE + len + DBConstants.LONG_SIZE == indexSize;
+                boolean lastKey = indexPosition + DBConstants.SHORT_SIZE + len + DBConstants.LONG_SIZE + (descriptor.hasPromotedIndexes ? DBConstants.INT_SIZE : 0) == indexSize;
                 boolean shouldAddEntry = indexSummary.shouldAddEntry();
                 if (shouldAddEntry || cacheLoading || recreatebloom || firstKey || lastKey)
                 {
                
      was (Author: dbrosius@apache.org):
    Seems like this is the fix to me

diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableReader.java b/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
index 9e92220..959a66d 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
@@ -361,7 +361,7 @@ public class SSTableReader extends SSTable
                 int len = ByteBufferUtil.readShortLength(input);
 
                 boolean firstKey = left == null;
-                boolean lastKey = indexPosition + DBConstants.SHORT_SIZE + len + DBConstants.LONG_SIZE == indexSize;
+                boolean lastKey = indexPosition + DBConstants.SHORT_SIZE + len + DBConstants.LONG_SIZE + (descriptor.hasPromotedIndexes ? DBConstants.INT_SIZE : 0) == indexSize;
                 boolean shouldAddEntry = indexSummary.shouldAddEntry();
                 if (shouldAddEntry || cacheLoading || recreatebloom || firstKey || lastKey)
                 {
                  
> Improve BloomFilter deserialization performance
> -----------------------------------------------
>
>                 Key: CASSANDRA-4023
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4023
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.1
>            Reporter: Joaquin Casares
>            Assignee: Yuki Morishita
>            Priority: Minor
>              Labels: datastax_qa
>             Fix For: 1.0.9, 1.1.0
>
>         Attachments: 0001-fix-loading-promoted-row-index.patch, 4023.txt, cassandra-1.0-4023-v2.txt, cassandra-1.0-4023-v3.txt
>
>
> The difference of startup times between a 0.8.7 cluster and 1.0.7 cluster with the same amount of data is 4x greater in 1.0.7.
> It seems as though 1.0.7 loads the BloomFilter through a series of reading longs out in a multithreaded process while 0.8.7 reads the entire object.
> Perhaps we should update the new BloomFilter to do reading in batch as well?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira