Posted to commits@cassandra.apache.org by "Edward Capriolo (JIRA)" <ji...@apache.org> on 2014/04/04 16:30:15 UTC
[jira] [Comment Edited] (CASSANDRA-6982) start_column in get_paged_slice has odd behavior

    [ https://issues.apache.org/jira/browse/CASSANDRA-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959995#comment-13959995 ] 

Edward Capriolo edited comment on CASSANDRA-6982 at 4/4/14 2:29 PM:
--------------------------------------------------------------------

This is not so much a bug as Thrift allowing users to do something that cannot work. We should reject such requests.

I will explain.
This works:
{code}
      KeyRange kr = new KeyRange();
      kr.setCount(3);
      kr.setStart_key(ByteBufferUtil.bytes("aslice"));
      kr.setEnd_key(ByteBufferUtil.bytes("aslice"));
      List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE);
{code}

When you specify a start key and a start column, you get what you would expect.

This is correct but may not be intuitive:
{code}
      KeyRange kr = new KeyRange();
      kr.setCount(3);
      kr.setStart_key(ByteBufferUtil.bytes(""));
      kr.setEnd_key(ByteBufferUtil.bytes(""));
      List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE);
{code}

Your slice starts before the row in question, so you get back columns a, b, c and not c, d, e.
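To make the two cases above concrete, here is a toy in-memory model. It is only a sketch: the class, the map and the pagedSlice helper are hypothetical names, and it assumes start_column is honoured only in the row whose key equals the start key, which matches the results described above but is not taken from the server code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PagedSliceModel {
    // Rows sorted by key, columns sorted by name (toy stand-in for "Standard1").
    static final TreeMap<String, TreeMap<String, String>> ROWS = new TreeMap<>();

    static {
        TreeMap<String, String> cols = new TreeMap<>();
        for (String c : new String[] {"a", "b", "c", "d", "e"}) {
            cols.put(c, "value-" + c);
        }
        ROWS.put("aslice", cols);
    }

    // ASSUMPTION: startColumn only applies to the row whose key equals startKey.
    static List<String> pagedSlice(String startKey, String startColumn, int count) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, TreeMap<String, String>> row : ROWS.tailMap(startKey, true).entrySet()) {
            boolean isStartRow = row.getKey().equals(startKey);
            for (String col : row.getValue().keySet()) {
                if (isStartRow && col.compareTo(startColumn) < 0) {
                    continue; // skip columns before startColumn in the start row
                }
                if (out.size() == count) {
                    return out;
                }
                out.add(col);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(pagedSlice("aslice", "c", 3)); // start key names the row: [c, d, e]
        System.out.println(pagedSlice("", "c", 3));       // start key precedes the row: [a, b, c]
    }
}
```

With an empty start key the range begins before "aslice", so no row in the toy model counts as the start row and the start column is never applied.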

The problem comes when using tokens. With Murmur3Partitioner and RandomPartitioner the relation between tokens and keys is not one to one. The pig unit tests fire up ByteOrderedPartitioner; the rest of our testing uses Murmur3.
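A small illustration of why a hashed token does not identify a single key. This is a sketch only: token() is a hypothetical stand-in built on String.hashCode, not Murmur3, chosen because it has a well-known collision. Collisions are far rarer with a 64-bit hash, but the mapping from keys to tokens is still many to one in principle.

```java
public class TokenNotOneToOne {
    // Hypothetical stand-in for a hashing partitioner. This is NOT Murmur3.
    static long token(String key) {
        return key.hashCode();
    }

    public static void main(String[] args) {
        // Two distinct keys that land on the same token.
        System.out.println("token(Aa) = " + token("Aa"));
        System.out.println("token(BB) = " + token("BB"));
        System.out.println("same token: " + (token("Aa") == token("BB")));
    }
}
```

Given only the token, there is no way to say whether the caller meant the row "Aa" or the row "BB".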

Here are the things we should not allow (I was not sure whether I am supposed to hex encode the token, so I tried both):
{code}
      KeyRange kr = new KeyRange();
      kr.setCount(3);
      kr.setStart_token(ByteBufferUtil.bytesToHex(ByteBuffer.wrap(l.token.toString().getBytes())));
      kr.setEnd_token(ByteBufferUtil.bytesToHex(ByteBuffer.wrap(l.token.toString().getBytes())));
      List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE);
{code}

{code}
      Murmur3Partitioner m = new Murmur3Partitioner();
      LongToken l = m.getToken(ByteBufferUtil.bytes("aslice"));
      KeyRange kr = new KeyRange();
      kr.setCount(3);
      kr.setStart_token(l.toString());
      kr.setEnd_token(l.toString());
      List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE);
{code}

Because the relationship of token to key is not 1 to 1, there is no way to start at a specific row. Since you cannot start at a specific row, the start_column is meaningless.

I *think* we should reject a KeyRange using a start_token and a start_column when the partitioner does not provide a 1 to 1 mapping between tokens and keys. We should throw an InvalidRequestException.
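A rough sketch of what such a check could look like. Everything here is hypothetical and self-contained: the KeyRange and InvalidRequestException classes are trimmed-down stand-ins for the Thrift-generated ones, and the tokensAreOneToOne flag is an assumption about how the partitioner property would be exposed. It is not the actual validation code.

```java
public class PagedSliceValidation {
    /** Hypothetical stand-in for the Thrift InvalidRequestException. */
    static class InvalidRequestException extends Exception {
        InvalidRequestException(String why) {
            super(why);
        }
    }

    /** Hypothetical, trimmed-down view of the KeyRange fields that matter here. */
    static class KeyRange {
        String startToken; // null when the range is expressed with keys
        byte[] startKey;   // null when the range is expressed with tokens
    }

    // Reject start_token + start_column unless tokens map 1 to 1 onto keys.
    static void validate(KeyRange range, byte[] startColumn, boolean tokensAreOneToOne)
            throws InvalidRequestException {
        boolean usesToken = range.startToken != null;
        boolean hasStartColumn = startColumn != null && startColumn.length > 0;
        if (usesToken && hasStartColumn && !tokensAreOneToOne) {
            throw new InvalidRequestException(
                "start_column cannot be combined with start_token for this partitioner");
        }
    }

    public static void main(String[] args) {
        KeyRange byToken = new KeyRange();
        byToken.startToken = "12345"; // arbitrary illustrative token value
        try {
            validate(byToken, new byte[] {'c'}, false);
            System.out.println("accepted");
        } catch (InvalidRequestException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Key-based ranges, and token ranges without a start column, pass through this check unchanged.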


> start_column in get_paged_slice has odd behavior
> ------------------------------------------------
>
>                 Key: CASSANDRA-6982
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6982
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>            Priority: Critical
>
> get_paged_slice is described as follows:
> {code}
>  /**
>    returns a range of columns, wrapping to the next rows if necessary to collect max_results.
>   */
>   list<KeySlice> get_paged_slice(1:required string column_family,
>                                  2:required KeyRange range,
>                                  3:required binary start_column,
>                                  4:required ConsistencyLevel consistency_level=ConsistencyLevel.ONE)
>                  throws (1:InvalidRequestException ire, 2:UnavailableException ue, 3:TimedOutException te),
> {code}
> The term max_results is not defined; I take it to mean key_range.count.
> The larger issue I have found is that start_column seems to be ignored in some cases.
> testNormal() produces this error:
> junit.framework.ComparisonFailure: null expected:<[c]> but was:<[a]>
> The problem seems to be KeyRanges that use tokens and not keys.
> {code}
> KeyRange kr = new KeyRange();
> kr.setCount(3);
> kr.setStart_token("");
> kr.setEnd_token("");
> {code}
> A failing test is here:
> https://github.com/edwardcapriolo/cassandra/compare/pg?expand=1
> Is this a bug? It feels like one, or is this just undefined behaviour? If it is a bug, I would like to fix it.


