Posted to commits@cassandra.apache.org by "Pavel Yaskevich (JIRA)" <ji...@apache.org> on 2011/07/10 23:41:59 UTC

[jira] [Created] (CASSANDRA-2879) Make SSTableWriter.append(...) methods seekless.

Make SSTableWriter.append(...) methods seekless.
------------------------------------------------

                 Key: CASSANDRA-2879
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2879
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Pavel Yaskevich
            Assignee: Pavel Yaskevich
             Fix For: 1.0


Since we already have a CF.serializedSize() method, we don't need to reserve a place for the data size and seek back to fill it in when we write data to an SSTable. Compaction should be seekless too, because we calculate the data size before we write the actual content.
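
To illustrate the idea, here is a minimal sketch contrasting the two approaches for a length-prefixed record. It is not the actual SSTableWriter code: the class and method names are hypothetical, and a plain byte array stands in for the row data. Only the pattern (reserve a size slot and seek back, versus compute the size up front) comes from the description above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;

public class SeeklessAppendSketch {

    // Seek-based approach: reserve 8 bytes, write the data, then seek back to fill in the size.
    static byte[] appendWithSeek(byte[] payload) throws IOException {
        File tmp = File.createTempFile("seek-sketch", ".db");
        try (RandomAccessFile file = new RandomAccessFile(tmp, "rw")) {
            long sizePosition = file.getFilePointer();
            file.writeLong(-1L);                        // placeholder for the data size
            file.write(payload);
            long endPosition = file.getFilePointer();
            file.seek(sizePosition);                    // the seek this ticket wants to avoid
            file.writeLong(endPosition - (sizePosition + 8));
            file.seek(endPosition);
        }
        byte[] result = Files.readAllBytes(tmp.toPath());
        tmp.delete();
        return result;
    }

    // Seekless approach: the size is known up front, so everything is written strictly in order.
    static byte[] appendSeekless(byte[] payload) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(payload.length);              // stands in for a precomputed serializedSize()
            out.write(payload);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "column data".getBytes(StandardCharsets.UTF_8);
        // Both approaches should produce the same on-disk layout.
        System.out.println(Arrays.equals(appendWithSeek(payload), appendSeekless(payload)));
    }
}
```

The seekless variant only works if the precomputed size matches the bytes written exactly, which is why the size methods get so much attention in the comments on this ticket.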

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-2879) Make SSTableWriter.append(...) methods seekless.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-2879:
---------------------------------------

    Attachment: CASSANDRA-2879.patch

Rebased against the latest trunk (last commit 81f1e56062a51e67ebe5a657ba94d3f37a1903e6).


[jira] [Updated] (CASSANDRA-2879) Make SSTableWriter.append(...) methods seekless.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-2879:
---------------------------------------

    Attachment: CASSANDRA-2879-v4.patch

The size computation was counting an int for CF.id() instead of the int required to store the BF size; v4 fixes that.
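
A hedged illustration of how two such 4-byte mistakes can cancel each other out. The layout below is a toy (a length-prefixed filter followed by a body), not the real SSTable row format, and every name in it is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class CompensatingSizeErrorSketch {
    static final int INT_SIZE = 4;

    // What is actually written in this toy layout: a length-prefixed filter, then the body.
    static byte[] write(byte[] filter, byte[] body) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(filter.length);
            out.write(filter);
            out.write(body);
        }
        return bytes.toByteArray();
    }

    // Counts an int for an id that is never written, but forgets the filter length prefix.
    // The two errors cancel, so the total happens to be right.
    static int sizeCountingId(byte[] filter, byte[] body) {
        return INT_SIZE /* id, not on disk */ + filter.length + body.length;
    }

    // Drops the id without adding the length prefix: now 4 bytes short.
    static int sizeWithoutId(byte[] filter, byte[] body) {
        return filter.length + body.length;
    }

    // Counts exactly what write() emits.
    static int sizeCorrected(byte[] filter, byte[] body) {
        return INT_SIZE /* filter length prefix */ + filter.length + body.length;
    }

    public static void main(String[] args) throws IOException {
        byte[] filter = new byte[16];
        byte[] body = new byte[100];
        int actual = write(filter, body).length;
        System.out.println("actual=" + actual
                + " countingId=" + sizeCountingId(filter, body)
                + " withoutId=" + sizeWithoutId(filter, body)
                + " corrected=" + sizeCorrected(filter, body));
    }
}
```

In this toy, removing the id's 4 bytes on its own makes the estimate too small, while replacing them with the length prefix keeps the total correct for the right reason.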


[jira] [Updated] (CASSANDRA-2879) Make SSTableWriter.append(...) methods seekless.

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2879:
--------------------------------------

    Attachment: 2879-v3.txt

v3 attached that cleans up some special cases.

I also "fixed" serializedSizeForSSTable to not include the 4 bytes for the CF id, which is not written to disk.  But, this change breaks the tests spectacularly, so clearly it is now undercounting for some reason I do not understand.


[jira] [Updated] (CASSANDRA-2879) Make SSTableWriter.append(...) methods seekless.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-2879:
---------------------------------------

    Attachment:     (was: CASSANDRA-2879-DBConstants-names-refactoring.patch)


[jira] [Commented] (CASSANDRA-2879) Make SSTableWriter.append(...) methods seekless.

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064251#comment-13064251 ] 

Hudson commented on CASSANDRA-2879:
-----------------------------------

Integrated in Cassandra #955 (See [https://builds.apache.org/job/Cassandra/955/])
    optimize away seek when compacting wide rows
patch by Pavel Yaskevich and jbellis for CASSANDRA-2879

jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1145818
Files : 
* /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamily.java
* /cassandra/trunk/CHANGES.txt
* /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnIndexer.java
* /cassandra/trunk/src/java/org/apache/cassandra/utils/BloomFilterSerializer.java
* /cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilySerializer.java
* /cassandra/trunk/src/java/org/apache/cassandra/utils/BloomFilter.java



[jira] [Commented] (CASSANDRA-2879) Make SSTableWriter.append(...) methods seekless.

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064078#comment-13064078 ] 

Jonathan Ellis commented on CASSANDRA-2879:
-------------------------------------------

It sounds like there's a bug in either RI or CF size computation.  We need to figure out where that's coming from and address it there.


[jira] [Commented] (CASSANDRA-2879) Make SSTableWriter.append(...) methods seekless.

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064243#comment-13064243 ] 

Jonathan Ellis commented on CASSANDRA-2879:
-------------------------------------------

also committed


[jira] [Updated] (CASSANDRA-2879) Make SSTableWriter.append(...) methods seekless.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-2879:
---------------------------------------

    Attachment: CASSANDRA-2879-v2.patch

I had overlooked that append(DecoratedKey, ColumnFamily) does not call serialize(), which is where a boolean is added to the header of the CF. I've now removed the -1 from BFS and added CF.serializedSizeForSSTable().
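
A minimal sketch of the distinction described above, assuming a toy column family whose full serialization writes a one-byte boolean before the body while the SSTable path writes the body only. ToyColumnFamily and its methods are hypothetical stand-ins, not the real ColumnFamily or ColumnFamilySerializer API.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class HeaderFlagSizeSketch {

    // Hypothetical stand-in for a column family with an opaque body.
    static final class ToyColumnFamily {
        final byte[] body;

        ToyColumnFamily(byte[] body) { this.body = body; }

        // Full serialization: a boolean flag in the header, then the body.
        void serialize(DataOutputStream out) throws IOException {
            out.writeBoolean(true);
            serializeForSSTable(out);
        }

        // SSTable path in this toy: the flag is skipped and only the body is written.
        void serializeForSSTable(DataOutputStream out) throws IOException {
            out.write(body);
        }

        int serializedSize() { return 1 + body.length; }        // includes the flag byte
        int serializedSizeForSSTable() { return body.length; }  // no flag byte
    }

    // Measures how many bytes each serialization path really emits.
    static int bytesWritten(ToyColumnFamily cf, boolean withFlag) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (withFlag) cf.serialize(out); else cf.serializeForSSTable(out);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        ToyColumnFamily cf = new ToyColumnFamily(new byte[42]);
        System.out.println(bytesWritten(cf, true) == cf.serializedSize());
        System.out.println(bytesWritten(cf, false) == cf.serializedSizeForSSTable());
    }
}
```

Using the full-serialization size on the path that skips the flag would overestimate by exactly one byte, which is the kind of discrepancy a blanket -1 elsewhere can mask.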


[jira] [Updated] (CASSANDRA-2879) Make SSTableWriter.append(...) methods seekless.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-2879:
---------------------------------------

    Attachment: CASSANDRA-2879-DBConstants-names-refactoring.patch

Removes underscores from DBConstants names; apply after v2.


[jira] [Commented] (CASSANDRA-2879) Make SSTableWriter.append(...) methods seekless.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064052#comment-13064052 ] 

Pavel Yaskevich commented on CASSANDRA-2879:
--------------------------------------------

I double-checked the CF/columns and ColumnIndexer serializedSize() methods before adding the -1 to BF: RowIndex.serializedSize() + cf.serializedSize() always came out one byte larger than the actual data size computed as endPosition - (sizePosition + 8).
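
The comparison described here can be sketched as a small check that measures the bytes written after an 8-byte size slot and reports how far a predicted size is off. This is an illustrative sketch only: sizeDrift and the surrounding class are hypothetical, and an in-memory stream stands in for the SSTable data file.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class SizeCheckSketch {

    // Returns predicted - actual, where actual = endPosition - (sizePosition + 8).
    static long sizeDrift(long predictedSize, byte[] dataActuallyWritten) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF("row-key");                 // arbitrary content preceding the size slot
            long sizePosition = out.size();          // offset where the size slot starts
            out.writeLong(predictedSize);            // the 8-byte size slot
            out.write(dataActuallyWritten);
            long endPosition = out.size();           // offset after the data
            long actualSize = endPosition - (sizePosition + 8);
            return predictedSize - actualSize;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[64];
        System.out.println(sizeDrift(64, data));     // exact prediction
        System.out.println(sizeDrift(65, data));     // prediction one byte too large
    }
}
```

A non-zero drift pinpoints a mismatch between the size methods and the serializers; it does not say which component is miscounting, so each size method still has to be checked against what its serializer writes.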


[jira] [Commented] (CASSANDRA-2879) Make SSTableWriter.append(...) methods seekless.

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064048#comment-13064048 ] 

Jonathan Ellis commented on CASSANDRA-2879:
-------------------------------------------

Where is the -1 coming from in BF.serializedSize?


[jira] [Updated] (CASSANDRA-2879) Make SSTableWriter.append(...) methods seekless.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-2879:
---------------------------------------

    Attachment: CASSANDRA-2879-DBConstants-name-refactoring.patch

DBConstants name refactoring (underscores removed)
