Posted to user@cassandra.apache.org by Mat Brown <ma...@brewster.com> on 2012/08/09 16:14:19 UTC

Key order check in sstable2json

Hello,

We've noticed that when passing multiple -k arguments to the
sstable2json utility, we pretty much always get an IOException with
"Key out of order!". Looking at this:
https://github.com/apache/cassandra/blob/cassandra-1.0.10/src/java/org/apache/cassandra/tools/SSTableExport.java#L241
it looks like it's iterating over the keys in the order given, and
then enforcing partitioner ordering of the keys. Is that correct? If
so, why? The original patch states that the key ordering check is to
detect corrupt sstables. It does provide that benefit in the
enumerateKeys method, where the keys come from the sstable itself, but
I don't see any advantage to enforcing key order in the export method,
other than, I suppose, making the scan for the next key maximally
efficient.
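
To make the behavior concrete, here is a minimal Python sketch of the check as I understand it. It is not the actual SSTableExport code: `export_keys` and `sort_key` are hypothetical names, and `sort_key` only stands in for whatever ordering the partitioner imposes.

```python
def export_keys(requested, sort_key):
    """Walk the keys in the order given and fail as soon as one
    sorts before its predecessor (a stand-in for the real check)."""
    previous = None
    exported = []
    for key in requested:
        current = sort_key(key)
        if previous is not None and current < previous:
            # Same message text as the exception we see from the tool.
            raise IOError("Key out of order!")
        previous = current
        exported.append(key)
    return exported


# Hypothetical partitioner order: "b" sorts before "a".
order = {"b": 0, "a": 1}

export_keys(["b", "a"], order.get)   # fine, matches partitioner order
# export_keys(["a", "b"], order.get) would raise IOError
```

With a hashing partitioner, the order of the keys on the command line almost never matches the partitioner order, which would explain why we hit the exception nearly every time.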

Anyway, with the current situation, it seems that the only way to pass
multiple key arguments to sstable2json is to use sstablekeys first to
get the key order, grep for the keys I'm interested in, and then pass
those in order to sstable2json. Is this worth it, or would
it be comparably efficient to just call sstable2json on one key at a
time?
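
For reference, the workaround could be sketched in Python roughly as follows. This assumes sstablekeys prints one hex-encoded key per line in on-disk order and that -k accepts keys in that same form; the function names and the sstable path are made up for illustration.

```python
def keys_in_sstable_order(sstablekeys_output, wanted):
    """Return the wanted keys in the order sstablekeys listed them."""
    wanted = {k.strip().lower() for k in wanted}
    ordered = []
    for line in sstablekeys_output.splitlines():
        key = line.strip().lower()
        if key in wanted and key not in ordered:
            ordered.append(key)
    return ordered


def sstable2json_args(sstable_path, ordered_keys):
    """Build an argument list with one -k flag per key, in order."""
    args = ["sstable2json", sstable_path]
    for key in ordered_keys:
        args += ["-k", key]
    return args


# Assumed sstablekeys output: one hex key per line, in sstable order.
listing = "6b32\n6b31\n6b33\n"
keys = keys_in_sstable_order(listing, ["6b33", "6b32"])
args = sstable2json_args("/hypothetical/path/Users-hc-1-Data.db", keys)
```

The resulting list could be handed to subprocess.run, so the sstable is read once for the listing and once for the export, instead of once per key.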

Thanks,
Mat

Re: Key order check in sstable2json

Posted by Tyler Hobbs <ty...@datastax.com>.
Sounds like bad behavior. Can you open a JIRA ticket for that (once JIRA
is back up :)?



-- 
Tyler Hobbs
DataStax <http://datastax.com/>