Posted to commits@cassandra.apache.org by "Zhaojun Zhang (JIRA)" <ji...@apache.org> on 2016/08/09 08:08:20 UTC

[jira] [Created] (CASSANDRA-12416) sstableloader to stream sstables in a sorted order

Zhaojun Zhang created CASSANDRA-12416:
-----------------------------------------

             Summary: sstableloader to stream sstables in a sorted order
                 Key: CASSANDRA-12416
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12416
             Project: Cassandra
          Issue Type: Wish
            Reporter: Zhaojun Zhang


Within each sstable, the data is sorted. However, this is not true across multiple sstables. We have a workflow that creates a read-only cluster by bulk loading data from sstables (written by CQLSSTableWriter) into a Cassandra cluster. We don't want to trigger compaction, and the best way to avoid it is to write the data in sorted order, which requires us to do a global sort across all data sources using an external sort algorithm. If we were able to use sstableloader to load data into the cluster in sorted order, we would not need such a global sort, which would dramatically simplify our implementation and reduce code redundancy.
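
To illustrate the kind of global sort described above, here is a minimal sketch of the merge phase of an external sort. It assumes each data source is already sorted by key, and it models rows as hypothetical (token, payload) tuples. It does not use any Cassandra API; the name merge_sorted_runs is made up for this example.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Hypothetical row shape for illustration: (token, payload).
Row = Tuple[int, str]


def merge_sorted_runs(runs: Iterable[Iterable[Row]]) -> Iterator[Row]:
    """Lazily merge individually sorted runs into one globally sorted stream.

    Each run must already be sorted by token, as the rows inside a single
    sstable are. Only one pending row per run is held in memory at a time.
    """
    return heapq.merge(*runs, key=lambda row: row[0])


# Three sorted runs whose token ranges overlap one another.
runs = [
    [(1, "a"), (5, "e"), (9, "i")],
    [(2, "b"), (3, "c"), (8, "h")],
    [(4, "d"), (6, "f"), (7, "g")],
]

merged = list(merge_sorted_runs(runs))
```

This is the step the workflow currently has to implement itself. If sstableloader streamed sstables in sorted order, a merge like this would not be needed on the client side.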



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)