Posted to commits@cassandra.apache.org by "Mark Robson (JIRA)" <ji...@apache.org> on 2009/08/06 12:36:15 UTC

[jira] Created: (CASSANDRA-348) Range scan over two nodes returns wrong data

Range scan over two nodes returns wrong data
--------------------------------------------

                 Key: CASSANDRA-348
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 0.4
            Reporter: Mark Robson


I've got two nodes with tokens 00000000 and 88888888. I add 16 rows, which are spread over them, then do a key range scan.

I can scan part of the range successfully, but if I try to scan the entire range of keys (0-f), I get unexpected results:

./Cassandra-remote -h localhost:9160 get_key_range Keyspace1 Standard1 0 31 1000
['00', '01', '10', '11', '20', '21', '30', '31']

./Cassandra-remote -h localhost:9160 get_key_range Keyspace1 Standard1 3 81 1000
['30', '31', '40', '41', '50', '51', '60', '61', '70', '71', '80', '81']

./Cassandra-remote -h localhost:9160 get_key_range Keyspace1 Standard1 7 b1 1000
['70', '71', '80', '81', '90', '91', 'a0', 'a1', 'b0', 'b1']

./Cassandra-remote -h localhost:9160 get_key_range Keyspace1 Standard1 a g 1000
['a0', 'a1', 'b0', 'b1', 'c0', 'c1', 'd0', 'd1', 'e0', 'e1', 'f0', 'f1']

All of these returned what I expected.

But when I range scan the whole lot (0-g) then I get:

./Cassandra-remote -h localhost:9160 get_key_range Keyspace1 Standard1 0 g 1000
[ '00',
  '90',
  '91',
  'a0',
  'a1',
  'b0',
  'b1',
  'c0',
  'c1',
  'd0',
  'd1',
  'e0',
  'e1',
  'f0',
  'f1']

Where have 01-81 gone?

I'll attach the data loading script.
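
For reference, the successful scans above are consistent with a simple model: return every key between the start and stop arguments in string order, both ends inclusive, capped at the maximum count. The sketch below is only that model, not the attached LoadAndScan.py. The key set is inferred from the outputs shown in this report, and the name expected_key_range is hypothetical.

```python
# Reference model of what a correct scan should return (an assumption based on
# the outputs in the report, not code taken from Cassandra or LoadAndScan.py).
HEX_DIGITS = "0123456789abcdef"

# Keys as they appear in the scan outputs: '00', '01', '10', '11', ... 'f0', 'f1'.
KEYS = sorted(d + s for d in HEX_DIGITS for s in "01")


def expected_key_range(start, stop, max_results):
    """Keys k with start <= k <= stop in string order, capped at max_results."""
    return [k for k in KEYS if start <= k <= stop][:max_results]


print(expected_key_range("0", "31", 1000))
print(expected_key_range("0", "g", 1000))
```

Under this model the full scan (0-g) should list every key from '00' to 'f1', which is what the last command fails to do.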

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-348:
-------------------------------------

    Attachment: 348-2.patch

This patch fixes the main problem.  There are two things going on in this patch:

 - We switch from finding the next endpoint by increasing an offset to asking tokenMetadata for "the next one."  This is always correct, whereas the offset approach is not (usually you want the offset to be just 1, but sometimes you have to keep increasing it if no results are found and the range is still not finished).
 - We merge results differently when the endpoint responsible for the point where the ring wraps is involved, since that endpoint can hold keys from both the beginning and the end of the range.
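
As a rough illustration of those two points (a sketch under stated assumptions, not the patch itself): the Python below walks a two-token ring by asking for each token's successor instead of counting offsets, sends the original range to every node, and sorts the combined result before truncating. The names owner, next_token and scan are hypothetical, the key set is inferred from the scan outputs in the report, and the plain sort stands in for the patch's merge logic, which is not shown in this thread.

```python
from bisect import bisect_left, bisect_right

TOKENS = ["00000000", "88888888"]  # sorted ring tokens from the report


def owner(key):
    """Round-up placement: first token >= key, wrapping to the smallest token."""
    return TOKENS[bisect_left(TOKENS, key) % len(TOKENS)]


def next_token(token):
    """Successor of `token` on the ring (stand-in for asking the token metadata)."""
    return TOKENS[bisect_right(TOKENS, token) % len(TOKENS)]


def scan(start, stop, max_results, keys):
    # Partition the keys by owner to play the role of per-node storage.
    stored = {t: sorted(k for k in keys if owner(k) == t) for t in TOKENS}
    merged = []
    first = token = owner(start)
    while True:
        # Each node is asked for the ORIGINAL range, capped at max_results.
        merged += [k for k in stored[token] if start <= k <= stop][:max_results]
        token = next_token(token)
        if token == first:  # walked the whole ring once
            break
    # The wrap node contributes keys from both ends of the range,
    # so the combined list is sorted before it is truncated.
    return sorted(merged)[:max_results]


KEYS = [d + s for d in "0123456789abcdef" for s in "01"]
print(scan("0", "g", 1000, KEYS))
```

Visiting every node is the simplest way to stay correct in a sketch; a real implementation would presumably stop early once the range is known to be finished.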

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348-2.patch, 348.diff, LoadAndScan.py, setup.cas


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740100#action_12740100 ] 

Mark Robson commented on CASSANDRA-348:
---------------------------------------

cassandra/Cassandra-remote -h localhost:9160 get_key_range Keyspace1 Standard1 '' g 100
[ '00',
  '90',
..
  'f1']

Debug logs:

NODE 1:

DEBUG - get_key_range
DEBUG - reading RangeCommand(table='Keyspace1', columnFamily=Standard1, startWith='', stopAt='g', maxResults=100) from 2208@127.0.0.1:7000
DEBUG - Sending RangeReply(keys=[00, 90, 91, a0, a1, b0, b1, c0, c1, d0, d1, e0, e1, f0, f1], completed=false) to 2208@127.0.0.1:7000
DEBUG - Processing response on an async result from 2208@127.0.0.1:7000
DEBUG - reading RangeCommand(table='Keyspace1', columnFamily=Standard1, startWith='f1', stopAt='g', maxResults=85) from 2209@127.0.0.2:7000
DEBUG - Processing response on an async result from 2209@127.0.0.2:7000

NODE 2:
DEBUG - Sending RangeReply(keys=[], completed=false) to 2209@127.0.0.1:7000
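
The log lines above suggest how the scan is continued: node 1's 15 keys are accepted, and node 2 is then asked for startWith='f1' with maxResults=85, so it has nothing to return. The Python below is a simplified model of that behaviour, inferred only from these logs and not taken from StorageProxy. It assumes keys are stored on the first node whose token is >= the key (wrapping around), that each node is visited once, and that the key set is the one seen in the scan outputs. The names owner_index and old_scan are hypothetical.

```python
from bisect import bisect_left

TOKENS = ["00000000", "88888888"]  # node 1 and node 2 tokens from the report
KEYS = [d + s for d in "0123456789abcdef" for s in "01"]  # '00', '01', ... 'f1'


def owner_index(key):
    """Index of the first token >= key, wrapping to index 0 past the last token."""
    return bisect_left(TOKENS, key) % len(TOKENS)


def old_scan(start, stop, max_results):
    # Per-node storage, modelled by partitioning the keys by owner.
    stored = [sorted(k for k in KEYS if owner_index(k) == i)
              for i in range(len(TOKENS))]
    results = []
    first = owner_index(start)
    for offset in range(len(TOKENS)):
        remaining = max_results - len(results)
        if remaining <= 0:
            break
        node = stored[(first + offset) % len(TOKENS)]
        results += [k for k in node if start <= k <= stop][:remaining]
        if results:
            # The next node is asked to resume from the last key seen so far.
            start = results[-1]
    return results


print(old_scan("", "g", 100))
```

In this model the first node answers with '00' plus everything from '90' upwards, the scan resumes at 'f1', and the keys 01-81 held by the other node are never requested.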


> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, setup.cas


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740142#action_12740142 ] 

Jonathan Ellis commented on CASSANDRA-348:
------------------------------------------

Node selection code is working as designed but it is not quite what getKeyRange expects.

The node selection is "pick the node whose token is nearest to the decorated key, _always rounding up_," so what you end up with here is 3 range sections:

["", 00000000] node A (the one with token 00000000)
(00000000, 88888888] node B (the one with token 88888888)
(88888888, infinity) node A again

So key 00 goes on node A, but 01-88 go on node B.  Then 90-ff go on node A again.
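
The three sections can be reproduced with a small Python sketch of the round-up rule. This is only an illustration: the owner function and the A/B labels are hypothetical stand-ins, keys are compared as plain strings, and the key set is inferred from the scan outputs in the report.

```python
from bisect import bisect_left
from itertools import groupby

TOKENS = ["00000000", "88888888"]
NAMES = {"00000000": "A", "88888888": "B"}


def owner(key):
    """Round up to the first token >= key; past the last token, wrap to the first."""
    return NAMES[TOKENS[bisect_left(TOKENS, key) % len(TOKENS)]]


KEYS = sorted(d + s for d in "0123456789abcdef" for s in "01")

# Group the sorted keys into contiguous runs that live on the same node.
sections = [(node, list(run)) for node, run in groupby(KEYS, key=owner)]
for node, run in sections:
    print(node, run[0], "..", run[-1], f"({len(run)} keys)")
```

Node A shows up twice in the grouped output, once for '00' alone and once for '90' through 'f1', which is the split that getKeyRange does not expect.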

we could hack around this in getKeyRange but it seems like the Right Fix is to make it so A has ['', 88888888) and B has [88888888, inf), no?

what do you think, Jun?  is there any inherent advantage to "round up" instead of "round down" that I have forgotten?

[yeah, we're ignoring RackAware for now]

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, setup.cas


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739980#action_12739980 ] 

Mark Robson commented on CASSANDRA-348:
---------------------------------------

Relevant config:

node1:

    <Partitioner>org.apache.cassandra.dht.OrderPreservingPartitioner</Partitioner>
    <InitialToken>00000000</InitialToken>

node2:

    <Partitioner>org.apache.cassandra.dht.OrderPreservingPartitioner</Partitioner>
    <InitialToken>88888888</InitialToken>

Most of the rest is as shipped.

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: setup.cas


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740098#action_12740098 ] 

Mark Robson commented on CASSANDRA-348:
---------------------------------------

Even more weird:

cassandra/Cassandra-remote -h localhost:9160 get_key_range Keyspace1 Standard1 00000000 g 100
[ '90',
  '91',
  'a0',
  'a1',
  'b0',
  'b1',
  'c0',
  'c1',
  'd0',
  'd1',
  'e0',
  'e1',
  'f0',
  'f1']

Now it misses everything from 0 to 81.

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, setup.cas


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740455#action_12740455 ] 

Mark Robson commented on CASSANDRA-348:
---------------------------------------

The LoadAndScan.py script succeeds when there is a single node, but various cases fail when there are more nodes with tokens that overlap the range 0000-ffff.

I have tried it with 1, 2 and 4 nodes; the more nodes, the more failure cases.

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, LoadAndScan.py, setup.cas


[jira] Updated: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-348:
-------------------------------------

    Fix Version/s: 0.4

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>            Assignee: Jonathan Ellis
>             Fix For: 0.4
>
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348.diff, LoadAndScan.py, setup.cas


[jira] Assigned: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-348:
----------------------------------------

    Assignee: Jonathan Ellis

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>            Assignee: Jonathan Ellis
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348.diff, LoadAndScan.py, setup.cas


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740685#action_12740685 ] 

Mark Robson commented on CASSANDRA-348:
---------------------------------------

With 348-2.patch I get:

DEBUG - get_key_range
DEBUG - reading RangeCommand(table='Keyspace1', columnFamily=Standard1, startWith='0', stopAt='1', maxResults=1000) from 593@127.0.0.1:7000
DEBUG - Sending RangeReply(keys=[0000], completed=false) to 593@127.0.0.1:7000
DEBUG - Processing response on an async result from 593@127.0.0.1:7000
DEBUG - reading RangeCommand(table='Keyspace1', columnFamily=Standard1, startWith='0', stopAt='1', maxResults=999) from 594@127.0.0.2:7000
DEBUG - Processing response on an async result from 594@127.0.0.2:7000
ERROR - Internal error processing get_key_range
java.lang.UnsupportedOperationException
        at java.util.Collections$UnmodifiableCollection.addAll(Collections.java:1044)
        at org.apache.cassandra.service.StorageProxy.getKeyRange(StorageProxy.java:673)
        at org.apache.cassandra.service.CassandraServer.get_key_range(CassandraServer.java:557)
        at org.apache.cassandra.service.Cassandra$Processor$get_key_range.process(Cassandra.java:1095)
        at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:758)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)

This happens when attempting a range scan that crosses over nodes.

I also get this warning at compile time:

    [javac] Note: /home/mark/cassandra/cassandra-trunk/src/java/org/apache/cassandra/tools/KeyChecker.java uses or overrides a deprecated API.


> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348-2.patch, 348.diff, LoadAndScan.py, setup.cas


[jira] Updated: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Robson updated CASSANDRA-348:
----------------------------------

    Comment: was deleted

(was: This patch goes on top of 348-2.patch to fix the exception I described earlier. My Java is not good so this may not be the right way of doing it, but it works for me.)

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348-2-fixup.patch, 348-2.patch, 348.diff, LoadAndScan.py, setup.cas


[jira] Updated: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Robson updated CASSANDRA-348:
----------------------------------

    Attachment: 348-2-fixup.patch

This patch goes on top of 348-2.patch and fixes the exception I described.

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348-2-fixup.patch, 348-2.patch, 348.diff, LoadAndScan.py, setup.cas


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740091#action_12740091 ] 

Mark Robson commented on CASSANDRA-348:
---------------------------------------

I am aware that the keys are strings :)

Keys should presumably not HAVE to be within the range of two tokens in the ring - keys outside the range will be stored anyway?

I tried the above, same result:

cassandra/Cassandra-remote -h localhost:9160 get_key_range Keyspace1 Standard1 '' g 100
[ '00',
  '90',
  '91',
  'a0',
  'a1',
  'b0',
  'b1',
  'c0',
  'c1',
  'd0',
  'd1',
  'e0',
  'e1',
  'f0',
  'f1']


> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, setup.cas


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740881#action_12740881 ] 

Hudson commented on CASSANDRA-348:
----------------------------------

Integrated in Cassandra #161 (See [http://hudson.zones.apache.org/hudson/job/Cassandra/161/])
    - switch from trying to get the next endpoint by increasing offset to asking tokenMetadata for "the next one." this will always be correct where the offset approach will not (usually you want offset to just be 1, but sometimes you have to keep increasing it if no results are found but the range is still not finished)
    - merge results differently when the endpoint responsible for where the ring wraps is involved, since that endpoint can hold keys from both the beginning and end of the range.

patch by jbellis; tested by Mark Robson for 
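
The two changes described in this commit message can be sketched in a few lines of Python. This is only an illustration, not the Cassandra code: `next_token` and `merge_with_wrap` are hypothetical names, the tokens are the two from the report, and the merge assumes ReplicationFactor=1, so that every key held by the wrap endpoint sorts either below the first token or above every other node's keys.

```python
import bisect

def next_token(sorted_tokens, token):
    """Token that follows `token` on the ring, wrapping past the end."""
    i = bisect.bisect_right(sorted_tokens, token)
    return sorted_tokens[i % len(sorted_tokens)]

def merge_with_wrap(first_token, wrap_keys, other_keys):
    """Merge results when the wrap endpoint took part in the scan.

    Keys from the wrap endpoint that sort below the first token go to the
    front; its remaining keys go after everyone else's (assumes RF=1)."""
    head = sorted(k for k in wrap_keys if k < first_token)
    tail = sorted(k for k in wrap_keys if k >= first_token)
    return head + sorted(other_keys) + tail

tokens = ["00000000", "88888888"]
print(next_token(tokens, "00000000"))  # 88888888
print(next_token(tokens, "88888888"))  # 00000000 (wrapped around the ring)
print(merge_with_wrap("00000000", ["00", "90", "91"], ["01", "10"]))
```

Asking for the successor directly avoids guessing an offset, and splitting the wrap endpoint's keys keeps the merged result in key order.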


> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>            Assignee: Jonathan Ellis
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348.diff, LoadAndScan.py, setup.cas
>
>


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740751#action_12740751 ] 

Mark Robson commented on CASSANDRA-348:
---------------------------------------

Technically this is fixed as I can't reproduce it in a two-node setup any more. On the other hand, some range scans still return missing results in a three-node setup.

So either close this and open a new issue for the three-node case, or continue working on a solution here.

I fired up three nodes with tokens 0,4,8 then used the attached LoadAndScan.py.

This gives 3 errors out of 120 get_key_range commands. When run on two nodes (0,8) it passes, as it does on a single node.

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348.diff, LoadAndScan.py, setup.cas
>
>


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740758#action_12740758 ] 

Jonathan Ellis commented on CASSANDRA-348:
------------------------------------------

can you post the debug logs from a 3-node failure as before?

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348.diff, LoadAndScan.py, setup.cas
>
>


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742569#action_12742569 ] 

Mark Robson commented on CASSANDRA-348:
---------------------------------------

With the -3 patch, LoadAndScan.py now passes with 3 and 4 nodes if ReplicationFactor=1.

Unfortunately, setting ReplicationFactor=2 now breaks it.


> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>            Assignee: Jonathan Ellis
>             Fix For: 0.4
>
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348-3.patch, 348.diff, LoadAndScan.py, setup.cas
>
>


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742820#action_12742820 ] 

Hudson commented on CASSANDRA-348:
----------------------------------

Integrated in Cassandra #166 (See [http://hudson.zones.apache.org/hudson/job/Cassandra/166/])
    give up on trying to optimize startWith -- it's basically impossible when replication factor > 1 b/c of the range wrap point.
patch by jbellis; tested by Mark Robson for 


> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>            Assignee: Jonathan Ellis
>             Fix For: 0.4
>
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348-3-v2.patch, 348-3.patch, 348.diff, LoadAndScan.py, setup.cas
>
>


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740159#action_12740159 ] 

Jun Rao commented on CASSANDRA-348:
-----------------------------------

Interesting. It seems the problem is that you started with a key that falls between two adjacent tokens, so you need to go back to the very first node to complete the full scan. The current code seems to stop as soon as it hits the first node again. It seems this will happen whether you round up or round down. So maybe we should let the first node be scanned twice, once at the beginning and once at the end.
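
To illustrate why one node may need to be consulted at both ends of a scan, here is a small Python sketch. It is a toy model, not the Cassandra implementation: it assumes each node owns the half-open range [its token, next token) and that the node with the highest token also owns the keys that sort below the lowest token. The function names are made up for the example.

```python
TOKENS = ["00000000", "88888888"]  # node tokens from the report

def segments_in_key_order(tokens):
    """List (owner_token, low, high) in ascending key order.

    low is inclusive, high is exclusive; None means unbounded."""
    # Keys below the first token belong to the last node (ring wrap).
    segs = [(tokens[-1], None, tokens[0])]
    for i, tok in enumerate(tokens):
        high = tokens[i + 1] if i + 1 < len(tokens) else None
        segs.append((tok, tok, high))
    return segs

def visit_order(start, end, tokens=TOKENS):
    """Owners to contact, in key order, for a scan of [start, end]."""
    order = []
    for owner, low, high in segments_in_key_order(tokens):
        starts_before_end = low is None or low <= end
        ends_after_start = high is None or high > start
        if starts_before_end and ends_after_start:
            order.append(owner)
    return order

print(visit_order("0", "g"))   # the wrap owner appears first and last
print(visit_order("3", "81"))  # a range inside one node's segment
```

In this model the full scan from '0' to 'g' touches node 88888888 twice, once for the keys below the first token and once for the keys above the last one, which matches the suggestion of scanning the first node again at the end.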


> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, setup.cas
>
>


[jira] Updated: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-348:
-------------------------------------

    Attachment: 348.diff

this patch fixes a minor bug (probably not the cause of your problems) and adds debug logging.  can you try with this patch and post the debug statements involving RangeCommand and RangeReply?

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, setup.cas
>
>


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740531#action_12740531 ] 

Hudson commented on CASSANDRA-348:
----------------------------------

Integrated in Cassandra #160 (See [http://hudson.zones.apache.org/hudson/job/Cassandra/160/])
    fix range query buglet; add debug logging
patch by jbellis; tested by Mark Robson for 


> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, LoadAndScan.py, setup.cas
>
>


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742732#action_12742732 ] 

Mark Robson commented on CASSANDRA-348:
---------------------------------------

Jonathan, 

The latest patch passes every range scan I have thrown at it, including with replication > 1.

So it all looks good to me.

I will hopefully incorporate my tests into the suite soon.

Mark

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>            Assignee: Jonathan Ellis
>             Fix For: 0.4
>
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348-3-v2.patch, 348-3.patch, 348.diff, LoadAndScan.py, setup.cas
>
>


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740077#action_12740077 ] 

Jonathan Ellis commented on CASSANDRA-348:
------------------------------------------

remember these are string keys, not really numeric.  '0' is not part of the ['00000000', '88888888') range.  (neither is the key '00' of course.)

i bet you get all the keys if you query for '', 'g' instead of '0', 'g'.
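
The string-ordering point can be checked with a short Python sketch. It assumes, following the half-open range written above, that each node owns [its token, next token) and that keys sorting below the lowest token wrap around to the node with the highest token; `owner` is a hypothetical helper, not a Cassandra function.

```python
import bisect

TOKENS = ["00000000", "88888888"]  # the two node tokens from the report

def owner(key, tokens=TOKENS):
    """Token of the node owning `key` under the assumed range convention."""
    i = bisect.bisect_right(tokens, key) - 1
    return tokens[i]  # i == -1 selects the last token (ring wrap)

# Strings compare character by character, and a prefix sorts first,
# so '0' and '00' both sort below the token '00000000'.
print("0" < "00000000", "00" < "00000000")
for key in ["0", "00", "01", "81", "90", "f1"]:
    print(key, "->", owner(key))
```

Under these assumptions '00' lands on the same node as '90' through 'f1', while '01' through '81' land on the other node.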

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, setup.cas
>
>


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740115#action_12740115 ] 

Jun Rao commented on CASSANDRA-348:
-----------------------------------

The node selection code seems to be designed only for RackUnaware.

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, setup.cas
>
>


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740723#action_12740723 ] 

Jonathan Ellis commented on CASSANDRA-348:
------------------------------------------

Committed -2 with a simpler fix for the readonly list.

If you can find a way to reproduce the remaining bug in a 2-node setup that will make it easier to debug.

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348.diff, LoadAndScan.py, setup.cas
>
>


[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740711#action_12740711 ] 

Mark Robson commented on CASSANDRA-348:
---------------------------------------

With the combined efforts of your patch and my patch, range scans are a lot *closer* to working correctly. My system test program runs successfully with two nodes 0 and 8, but still fails a few test cases when there are four nodes 0,4,8,c.

However, the remaining test cases which are failing are ones where a range covers at least three nodes. These are unlikely to happen to anyone in production unless their nodes are very close together or their keys very sparse and they're doing massive range scans.

But it would be nice if we covered all cases.

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348.diff, LoadAndScan.py, setup.cas
>
>



[jira] Updated: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Robson updated CASSANDRA-348:
----------------------------------

    Attachment: LoadAndScan.py

I have attached a Python script, LoadAndScan.py, which uses the Thrift interface to load a bunch of test data and then runs lots of range scans to check that the results are right.

This can be made into an automated system test; you are free to use it.
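
The attached script itself is not reproduced here. As a rough illustration of the kind of check described, the sketch below compares each scan against the sorted slice of the loaded keys. `scan` is a hypothetical callable standing in for the Thrift `get_key_range` call, and both bounds are treated as inclusive, which matches the outputs quoted in the issue.

```python
from itertools import combinations


def expected_range(keys, start, stop, limit):
    """Keys that a correct scan should return: sorted, inclusive bounds."""
    return sorted(k for k in keys if start <= k <= stop)[:limit]


def failing_scans(scan, keys, bounds, limit=1000):
    """Run scan(start, stop, limit) for every pair of bounds.

    `scan` is a stand-in for the real client call (an assumption here).
    Returns a list of (start, stop, got, want) for each mismatch.
    """
    failures = []
    for start, stop in combinations(sorted(bounds), 2):
        got = scan(start, stop, limit)
        want = expected_range(keys, start, stop, limit)
        if got != want:
            failures.append((start, stop, got, want))
    return failures
```

An empty list from `failing_scans` means every scanned range matched. Each entry otherwise records the bounds together with the returned and expected keys, which is enough to see which part of the range went missing.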

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, LoadAndScan.py, setup.cas
>
>



[jira] Updated: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Robson updated CASSANDRA-348:
----------------------------------

    Attachment: setup.cas

This is a cassandra-cli script that loads the test data used to produce the results above.



> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: setup.cas
>
>



[jira] Updated: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Robson updated CASSANDRA-348:
----------------------------------

    Attachment: 348-2-fixup-2.patch

This patch, 348-2-fixup-2.patch, supersedes the previous one and fixes another case where the code was trying to modify a read-only list.

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348.diff, LoadAndScan.py, setup.cas
>
>



[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Mark Robson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740065#action_12740065 ] 

Mark Robson commented on CASSANDRA-348:
---------------------------------------

I've applied the patch and the bug is still there. Here is the debug output:

NODE 1 debug output:

DEBUG - get_key_range
DEBUG - reading RangeCommand(table='Keyspace1', columnFamily=Standard1, startWith='0', stopAt='g', maxResults=100) from 58@127.0.0.1:7000
DEBUG - Sending RangeReply(keys=[00, 90, 91, a0, a1, b0, b1, c0, c1, d0, d1, e0, e1, f0, f1], completed=false) to 58@127.0.0.1:7000
DEBUG - Processing response on an async result from 58@127.0.0.1:7000
DEBUG - reading RangeCommand(table='Keyspace1', columnFamily=Standard1, startWith='f1', stopAt='g', maxResults=85) from 59@127.0.0.2:7000
DEBUG - Processing response on an async result from 59@127.0.0.2:7000

NODE 2 debug output:

DEBUG - Sending RangeReply(keys=[], completed=false) to 59@127.0.0.1:7000

bin/nodeprobe -host localhost ring
DEBUG - Loading settings from bin/../conf/storage-conf.xml
Token(00000000)                                 1 127.0.0.2      |<--|
Token(88888888)                                 1 127.0.0.1      |-->|
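
One reading of the log above: the first node answers with every key it holds in the range, including the wrapped key 00, and the scan then resumes on the other node from the last key returned (f1), so nothing between 01 and 81 is ever requested. The toy Python model below reproduces that behaviour. The key placement and the `naive_scan` helper are assumptions made for illustration, not Cassandra's actual code.

```python
# Toy model of the two-node ring from the log; NOT Cassandra's implementation.
KEYS = [h + d for h in "0123456789abcdef" for d in "01"]


def on_high_node(key):
    # Assumed placement: the node with token 88888888 holds keys at or above
    # its token plus the wrapped keys that sort before 00000000 (such as '00').
    return key >= "88888888" or key < "00000000"


NODES = [
    sorted(k for k in KEYS if on_high_node(k)),      # 127.0.0.1, token 88888888
    sorted(k for k in KEYS if not on_high_node(k)),  # 127.0.0.2, token 00000000
]


def local_range(keys, start, stop, limit):
    """Keys one node would return for an inclusive start..stop scan."""
    return [k for k in keys if start <= k <= stop][:limit]


def naive_scan(start, stop, limit):
    """Ask each node in turn, resuming from the last key already returned."""
    found = []
    for keys in NODES:
        remaining = limit - len(found)
        if remaining <= 0:
            break
        found += local_range(keys, start, stop, remaining)
        if found:
            start = found[-1]  # second node is asked for 'f1'..stop only
    return found


print(naive_scan("0", "g", 100))
```

In this model the second request carries a limit of 85 and a start key of f1, the same values that appear in the second RangeCommand in the log, and the second node has nothing to return.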


> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, setup.cas
>
>



[jira] Updated: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Michael Greene (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Greene updated CASSANDRA-348:
-------------------------------------

    Component/s: Core

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>            Assignee: Jonathan Ellis
>             Fix For: 0.4
>
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348.diff, LoadAndScan.py, setup.cas
>
>



[jira] Updated: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-348:
-------------------------------------

    Attachment: 348-3-v2.patch

Fixes bugs that appear with replication > 1.

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>            Assignee: Jonathan Ellis
>             Fix For: 0.4
>
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348-3-v2.patch, 348-3.patch, 348.diff, LoadAndScan.py, setup.cas
>
>



[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740110#action_12740110 ] 

Jonathan Ellis commented on CASSANDRA-348:
------------------------------------------

Looks like a bug in the node selection code.

I'll commit the bugfix+logging patch for now.

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, setup.cas
>
>



[jira] Commented: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740216#action_12740216 ] 

Jonathan Ellis commented on CASSANDRA-348:
------------------------------------------

You are right, we're going to have this problem whichever way we round the keys to tokens. I was wrong about how the tokens would work; in this example the ranges would be:

[00000000, 88888888) A
[88888888, inf) and ['', 00000000) B

So either way, starting from '' you're going to have to re-scan part of the same range when you wrap.
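
A small Python sketch of that mapping, assuming keys and tokens are compared as plain strings and each node owns the half-open range from its token up to the next one. The `owner` helper and the A/B labels on the dictionary are illustrative only.

```python
from bisect import bisect_right

# Hypothetical ring for the example above: token -> node label.
RING = {"00000000": "A", "88888888": "B"}


def owner(key, ring=RING):
    """Return the node whose half-open range [token, next_token) holds key."""
    tokens = sorted(ring)
    # Index of the last token <= key; -1 means key sorts before every token.
    i = bisect_right(tokens, key) - 1
    # Negative indexing wraps -1 round to the last token, i.e. node B here.
    return ring[tokens[i]]


for key in ["", "00", "01", "81", "90"]:
    print(repr(key), owner(key))
```

Note that the key '00' sorts before the token 00000000, so it lands in B's wrapped range. A scan starting from '' therefore begins on B, moves to A, and then has to come back to B for keys at or above 88888888.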

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>         Attachments: 348.diff, setup.cas
>
>



[jira] Updated: (CASSANDRA-348) Range scan over two nodes returns wrong data

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-348:
-------------------------------------

    Attachment: 348-3.patch

With the -3 patch, LoadAndScan.py passes all tests on 3 nodes for me.

> Range scan over two nodes returns wrong data
> --------------------------------------------
>
>                 Key: CASSANDRA-348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-348
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.4
>            Reporter: Mark Robson
>            Assignee: Jonathan Ellis
>             Fix For: 0.4
>
>         Attachments: 348-2-fixup-2.patch, 348-2-fixup.patch, 348-2.patch, 348-3.patch, 348.diff, LoadAndScan.py, setup.cas
>
>
