Posted to user@cassandra.apache.org by Kevin Burton <bu...@spinn3r.com> on 2014/06/30 20:17:51 UTC

Read 75k live rows in a query that should only return 500 (in queue-like table).

I have a queue-like table where a query is reading 75k live rows… and then
only returning 500.

… I'm trying to figure out why this could be.

Following this:

http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets

I BELIEVE that I'm doing everything right.

Essentially my schema is:

bucket: int
sequence: long
value: text…
primary key( bucket, sequence )

… value is just a big chunk of html.

sequence is a timestamp essentially.

I have 100 buckets… and bucket is the partition key.  So I can spread these
buckets across the token ranges of up to 100 servers.

The query is specified to only read values greater than a sequence value…

I'm avoiding tombstone columns ("When you know where your live columns
begin") by specifying the after parameter and setting gc_grace_seconds=0 …

The query is essentially:

SELECT * FROM content WHERE bucket=98 AND sequence>1403995838000000000
ORDER BY sequence ASC LIMIT 500;

… I got rid of the "ORDER BY sequence ASC" as a test and that didn't
change anything.

The main problem seems to be "Read 75568 live and 0 tombstoned cells"

… and I have NO idea why it's reading so many rows!

I would have hoped that Cassandra would do a binary search of the row, find
the column it needs, then read the next 500 sequentially.

But that doesn't seem to be the case.
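
To make the hoped-for access pattern concrete, here is a toy model in Python. It is only a sketch over an in-memory sorted list and says nothing about how Cassandra actually reads sstables; the function name slice_after and the sample data are hypothetical.

```python
from bisect import bisect_right


def slice_after(sequences, after, limit):
    """Return up to `limit` values strictly greater than `after`.

    `sequences` must be sorted ascending, the way clustering keys
    are ordered within a single partition.
    """
    start = bisect_right(sequences, after)   # O(log n) seek past `after`
    return sequences[start:start + limit]    # then a short sequential read


# Made-up partition holding 75,568 consecutive sequence values.
base = 1403995838000000000
partition = [base + i for i in range(75568)]

# Ask for everything after the 75,068th value, capped at 500 rows.
rows = slice_after(partition, base + 75067, 500)
```

In this model the seek costs about 17 comparisons for 75,568 entries and only the 500 requested values are touched, which is the behaviour the query was expected to have.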

Any thoughts?



 activity                                                                                                        | timestamp    | source      | source_elapsed
-----------------------------------------------------------------------------------------------------------------+--------------+-------------+----------------
                                                                                              execute_cql3_query | 12:17:49,886 | 10.24.23.94 |              0
 Parsing SELECT * FROM content WHERE bucket=98 AND sequence>1403995838000000000 ORDER BY sequence ASC LIMIT 500; | 12:17:49,886 | 10.24.23.94 |             65
                                                                                             Preparing statement | 12:17:49,886 | 10.24.23.94 |            149
                                                                                                  Row cache miss | 12:17:49,886 | 10.24.23.94 |            338
                                                                     Executing single-partition query on content | 12:17:49,887 | 10.24.23.94 |            402
                                                                                    Acquiring sstable references | 12:17:49,887 | 10.24.23.94 |            464
                                                                                     Merging memtable tombstones | 12:17:49,887 | 10.24.23.94 |            514
                                                           Partition index with 35 entries found for sstable 156 | 12:17:49,888 | 10.24.23.94 |           2344
                                                                     Seeking to partition beginning in data file | 12:17:49,888 | 10.24.23.94 |           2360
                                                                                   Key cache hit for sstable 155 | 12:17:49,889 | 10.24.23.94 |           3006
                                                                     Seeking to partition beginning in data file | 12:17:49,889 | 10.24.23.94 |           3020
                                       Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | 12:17:49,890 | 10.24.23.94 |           3577
                                                                      Merging data from memtables and 2 sstables | 12:17:49,890 | 10.24.23.94 |           3602
                                                                          Read 75568 live and 0 tombstoned cells | 12:17:50,777 | 10.24.23.94 |         890823
                                                                            Read 501 live and 0 tombstoned cells | 12:17:51,055 | 10.24.23.94 |        1169184
                                                                                                Request complete | 12:17:51,069 | 10.24.23.94 |        1183383
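
The source_elapsed column in the trace above is cumulative, in microseconds, so the cost of each step is the difference from the previous row. The short Python sketch below does that subtraction for the last four rows; the helper name step_durations is hypothetical and the numbers are copied from the trace.

```python
# (activity, cumulative source_elapsed in microseconds), copied from the trace.
trace = [
    ("Merging data from memtables and 2 sstables", 3602),
    ("Read 75568 live and 0 tombstoned cells", 890823),
    ("Read 501 live and 0 tombstoned cells", 1169184),
    ("Request complete", 1183383),
]


def step_durations(events):
    """Pair each event with the microseconds elapsed since the previous event."""
    return [
        (label, current - previous)
        for (_, previous), (label, current) in zip(events, events[1:])
    ]


durations = step_durations(trace)
for label, micros in durations:
    print(f"{micros:>8} us  {label}")
```

The 75568-cell read accounts for 887221 of the 1183383 microseconds, roughly three quarters of the request.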

-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>