Posted to commits@cassandra.apache.org by "sankalp kohli (JIRA)" <ji...@apache.org> on 2012/10/20 02:34:11 UTC

[jira] [Comment Edited] (CASSANDRA-4784) Create separate sstables for each token range handled by a node

    [ https://issues.apache.org/jira/browse/CASSANDRA-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480267#comment-13480267 ] 

sankalp kohli edited comment on CASSANDRA-4784 at 10/20/12 12:33 AM:
---------------------------------------------------------------------

Vnodes will improve performance, but we still have to go through the application layer to filter out the data in each sstable that needs to be transferred. That costs CPU, churns the page cache, and creates short-lived Java objects. I have another JIRA describing how a new connection is created for each sstable transferred.

My point is that this change would make bootstrapping a node about as fast as it can be, at least in theory. The cost of streaming today is the reason many people restore the data from a backup and then run a repair instead of bootstrapping a node and streaming the data.
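
To make the "skip the application layer" idea concrete, here is a minimal sketch, not Cassandra code, of sending a whole sstable file with Java's FileChannel.transferTo. On many operating systems this call can move bytes from the filesystem cache to the target channel without copying them through user space (on Linux it is typically backed by sendfile), but that is up to the JVM and OS. The class and method names are hypothetical, and the sketch assumes a blocking target channel.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of Cassandra.
final class SSTableFileSender {
    private SSTableFileSender() {}

    /**
     * Sends the entire file to the target channel and returns the number of
     * bytes transferred. Assumes the target is a blocking channel.
     */
    static long sendWholeFile(Path sstableFile, WritableByteChannel target) throws IOException {
        try (FileChannel source = FileChannel.open(sstableFile, StandardOpenOption.READ)) {
            long size = source.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            while (position < size) {
                long sent = source.transferTo(position, size - position, target);
                if (sent <= 0) {
                    // With a blocking target this suggests the file shrank underneath us.
                    throw new IOException("transfer stalled at byte " + position + " of " + size);
                }
                position += sent;
            }
            return position;
        }
    }
}
```

This only works as a plain file copy if every byte in the file belongs to the receiving node, which is what per-range sstables would guarantee. With interleaved ranges the sender has to read and filter rows first.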
                
      was (Author: kohlisankalp):
    vnodes will improve the performance, but still we need to go through application layer to filter out data from each sstable that needs to be transferred. This will affect the CPU and page cache and create short lived java objects. I have another JIRA which states how a new connection is created for each sstable transferred. 

My point is that this change will make the bootstrap of a node theoretically faster than you can get. This is the reason many people restore the data from backup and then run a repair instead of bootstrapping a node and streaming the data. 
                  
> Create separate sstables for each token range handled by a node
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-4784
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4784
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: sankalp kohli
>            Priority: Minor
>              Labels: perfomance
>
> Currently, each sstable holds data for all the ranges that the node is handling. If we change that and instead keep separate sstables for each range the node is handling, it can lead to some improvements.
> Improvements
> 1) Node rebuild will be very fast, as sstables can be copied directly over to the bootstrapping node. This minimizes application-level logic. We can use Linux native methods to transfer sstables directly, using very little CPU and putting less pressure on the serving node. I think in theory it will be the fastest way to transfer data.
> 2) Backup can transfer only those sstables of a node that belong to its primary key range.
> 3) An ETL process can copy just one replica of the data and will be much faster.
> Changes:
> We can split the writes into multiple memtables, one for each range the node is handling. Each sstable flushed from these memtables can record which range of data it holds.
> I think there will be no change for reads, as they work with interleaved data anyway. But maybe we can improve there as well?
> Complexities:
> The change does not look very complicated. I am not yet taking into account how it will work when the ranges assigned to nodes change.
> Vnodes might make this work more complicated. We could also have a bit on each sstable that says whether it holds primary data or not.
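
The "Changes" idea in the description can be sketched roughly as follows. This is an illustration under simplifying assumptions, not how Cassandra's memtables work: tokens are plain long values, each range is (previous end, end] and is identified by its end token, the configured ranges are assumed to cover the whole ring so that tokens past the largest end wrap to the first range, and every class and method name is hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch for illustration; none of these names are Cassandra classes.
final class RangePartitionedMemtables {
    // Key: inclusive end token of a range. Value: that range's rows (token -> value).
    private final TreeMap<Long, TreeMap<Long, String>> memtablesByRangeEnd = new TreeMap<>();

    RangePartitionedMemtables(long... rangeEndTokens) {
        if (rangeEndTokens.length == 0) {
            throw new IllegalArgumentException("at least one range is required");
        }
        for (long end : rangeEndTokens) {
            memtablesByRangeEnd.put(end, new TreeMap<>());
        }
    }

    /** Returns the end token of the range (previousEnd, end] that owns the token. */
    long owningRangeEnd(long token) {
        Long end = memtablesByRangeEnd.ceilingKey(token);
        // Assumption: tokens past the largest end wrap around to the first range.
        return end != null ? end : memtablesByRangeEnd.firstKey();
    }

    /** Routes a write to the memtable of the range that owns its token. */
    void write(long token, String value) {
        memtablesByRangeEnd.get(owningRangeEnd(token)).put(token, value);
    }

    /**
     * Removes and returns the rows of one range. A real flush would write them
     * to an sstable whose metadata records this range.
     */
    Map<Long, String> flush(long rangeEnd) {
        TreeMap<Long, String> memtable = memtablesByRangeEnd.get(rangeEnd);
        if (memtable == null) {
            throw new IllegalArgumentException("unknown range end: " + rangeEnd);
        }
        TreeMap<Long, String> flushed = new TreeMap<>(memtable);
        memtable.clear();
        return flushed;
    }
}
```

Because each flush covers exactly one range, the resulting sstable could be handed to another node as a whole file. The open questions raised under "Complexities", such as range changes and vnodes, are not addressed by this sketch.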
