Posted to user@cassandra.apache.org by Ilya Maykov <iv...@gmail.com> on 2010/07/16 05:19:53 UTC

Seeing very weird results on 0.6.2 when paginating through a ColumnFamily with get_slice()

Hi all,

I'm trying to debug some pretty weird behavior when paginating through
a ColumnFamily with get_slice(). It looks like Cassandra does not
respect the count parameter in the SlicePredicate's SliceRange,
sometimes returning more columns than requested. It also sometimes
silently drops columns. I'm reading at QUORUM, and all data was written
at QUORUM as well. The client is in Ruby.

I am seeing non-deterministic behavior when paginating through a
column family (which is not being written to concurrently). Here is
example output from my pagination method (code below) with some extra
debug prints added:

irb(main):005:0> blah = get_entire_column_family(@cassandra,
"some_key", "some_cf", 100)
get_entire_column_family(@cass, "some_key", "some_cf", 100) ...
100/6648 ... 199/6648 ... 354/6648 ... 453/6648 ... 552/6648 ...
689/6648 ... 788/6648 ... 887/6648 ... 1048/6648 ... 1147/6648 ...
1246/6648 ... 1377/6648 ... 1476/6648 ... 1575/6648 ... 1674/6648 ...
1773/6648 ... 1908/6648 ... 2051/6648 ... 2150/6648 ... 2249/6648 ...
2348/6648 ... <snip> ... 6127/6648 ... 6127/6648 ... 6127/6648 ...
6127/6648 ... 6127/6648 ...

The N/6648 is printing columns retrieved so far / total columns at
each step of the pagination loop. It should go up by 99 on each
iteration after the first, because the start of the next slice is
inclusive and equals the last column of the current slice (page 1
returns columns 1-100, page 2 returns columns 100-199, a net gain of
99). But sometimes it jumps by more, indicating that a single
get_slice() call returned more than 100 columns
(e.g. ... 199/6648 ... 354/6648 ...). And when it gets to the end of
the column family, we end up with fewer than 6648 columns on the
client side and the code gets stuck in an infinite loop.

I've tried this several times from an interactive Ruby session and got
a different number of columns each time:
6536/6648
6127/6648
6514/6648
However, once I set the limit to be greater than num_columns and read
the entire row as a single page, everything worked. And follow-up
paginated reads also returned the entire row successfully. I'm not
sure if that's because the entire row is now in cache, or because
something was wrong and read repair has fixed it. But since all of our
reads and writes are done at QUORUM, read repair shouldn't matter, right?
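
(My reasoning: reads and writes at QUORUM each touch a majority of
replicas, and any two majorities intersect. For example, with a
replication factor of 3, a quorum is 2 of 3 nodes, so every quorum read
overlaps every quorum write in at least one replica and should see the
latest data whether or not read repair has run. The RF = 3 figure is
just for illustration; the overlap argument holds for any RF.)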

Here is the pagination code:

    def get_entire_column_family(cassandra, row_key, column_family, limit_per_slice)
      column_parent = CassandraThrift::ColumnParent.new(
          :column_family => column_family, :super_column => nil)
      num_columns = cassandra.get_count(@keyspace, row_key, column_parent,
          CassandraThrift::ConsistencyLevel::QUORUM)
      predicate = CassandraThrift::SlicePredicate.new
      predicate.slice_range = CassandraThrift::SliceRange.new(
          :start => "", :finish => "", :reversed => false, :count => limit_per_slice)
      result = cassandra.get_slice(@keyspace, row_key, column_parent, predicate,
          CassandraThrift::ConsistencyLevel::QUORUM)
      while result.size < num_columns
        predicate = CassandraThrift::SlicePredicate.new
        predicate.slice_range = CassandraThrift::SliceRange.new(
            :start => result.last.column.name,
            :finish => "", :reversed => false, :count => limit_per_slice)
        slice = cassandra.get_slice(@keyspace, row_key, column_parent, predicate,
            CassandraThrift::ConsistencyLevel::QUORUM)
        # Because the start parameter to get_slice() is inclusive, we should
        # already have the first column of the new slice in our result. We
        # don't want two copies of it, so drop it before concatenating.
        unless slice.nil? || slice.empty?
          if result.last.column.name == slice.first.column.name
            result.concat slice[1..-1]
          else
            result.concat slice
          end
        end
      end
      result
    end
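
For what it's worth, here is an untested sketch of a more defensive
version of the loop: it terminates on a short page (or on lack of
forward progress) instead of trusting get_count(), so even if the count
and the slices disagree it cannot spin forever:

    def get_entire_column_family_defensive(cassandra, row_key, column_family,
                                           limit_per_slice)
      column_parent = CassandraThrift::ColumnParent.new(
          :column_family => column_family, :super_column => nil)
      result = []
      start = ""
      loop do
        predicate = CassandraThrift::SlicePredicate.new
        predicate.slice_range = CassandraThrift::SliceRange.new(
            :start => start, :finish => "", :reversed => false,
            :count => limit_per_slice)
        slice = cassandra.get_slice(@keyspace, row_key, column_parent, predicate,
            CassandraThrift::ConsistencyLevel::QUORUM)
        break if slice.nil? || slice.empty?
        fetched = slice.size
        # The start column is inclusive, so every page after the first should
        # begin with a column we already have; drop the duplicate.
        slice.shift if !result.empty? &&
                       slice.first.column.name == result.last.column.name
        break if slice.empty?               # no forward progress, so we're done
        result.concat(slice)
        break if fetched < limit_per_slice  # short page means end of row
        start = result.last.column.name
      end
      result
    end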

I guess I have several questions:
1) Is this the proper way to paginate through a large column family
for a single row key? If not, what is the proper way? Some of our rows
are very big (hundreds of thousands of columns in the worst case), so
pagination is a must.
2) Could this behavior be expected under some conditions (e.g. in the
presence of tombstones, or hints from when a node was down, or other
weirdness)?
3) Is this a known bug? (Maybe related to
https://issues.apache.org/jira/browse/CASSANDRA-1145 and/or
https://issues.apache.org/jira/browse/CASSANDRA-1042 ?)
4) If this is not a known bug, how should I proceed with investigating it?

Thanks,

-- Ilya

Re: Seeing very weird results on 0.6.2 when paginating through a ColumnFamily with get_slice()

Posted by Ilya Maykov <iv...@gmail.com>.
The column names are arbitrary strings, so it's not obvious what the
"next" value should be at any step. So, I just set the start of the
next page to the end of the last page and eliminate the duplicate
value when joining the 2 pages together.
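
(One way to sidestep the duplicate entirely, assuming the comparator
orders names bytewise as BytesType and UTF8Type do: append a zero byte
to the last name seen, which gives the smallest string that sorts
strictly after it, and use that as the start of the next page:

    # Smallest string strictly greater than the last column name under
    # bytewise ordering; using it as :start excludes the duplicate column.
    next_start = result.last.column.name + "\x00"

That wouldn't explain the over-fetching I'm seeing, though.)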

The paging direction does not matter in my case, as I just want to get
the entire column family. And in the vast majority of cases (probably
99%+) the paginated get method returns the correct results. But
sometimes it does not.

-- Ilya



On Thu, Jul 15, 2010 at 8:35 PM, Paul Brown <pa...@gmail.com> wrote:
>
> You should make sure that your directions and interval endpoints are chosen correctly. I recall the semantics of the call being like an old-school for loop, with the reversed flag acting as a step of +1 or -1.
>
> --
> Spelling by mobile.
>
> On Jul 15, 2010, at 20:19, Ilya Maykov <iv...@gmail.com> wrote:
>
>> <snip: original message quoted in full; see the first post above>
>

Re: Seeing very weird results on 0.6.2 when paginating through a ColumnFamily with get_slice()

Posted by Paul Brown <pa...@gmail.com>.
You should make sure that your directions and interval endpoints are chosen correctly. I recall the semantics of the call being like an old-school for loop, with the reversed flag acting as a step of +1 or -1.
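
Concretely, if I remember the 0.6 Thrift bindings right (a sketch,
untested from here): with :reversed => true the endpoints swap roles,
so :start names the high end of the range and the slice steps downward.

    # Reversed slice: iteration begins at the high end of the range and
    # steps downward; empty strings still mean "unbounded" on either side.
    range = CassandraThrift::SliceRange.new(
        :start => "", :finish => "", :reversed => true, :count => 100)

If the endpoints don't match the direction you've picked, you can get
surprising slices back.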

--
Spelling by mobile.

On Jul 15, 2010, at 20:19, Ilya Maykov <iv...@gmail.com> wrote:

> <snip: original message quoted in full; see the first post above>