Posted to commits@cassandra.apache.org by "Zhongxiang Zheng (JIRA)" <ji...@apache.org> on 2017/01/16 12:29:26 UTC
[jira] [Created] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
Zhongxiang Zheng created CASSANDRA-13125:
--------------------------------------------
Summary: Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
Key: CASSANDRA-13125
URL: https://issues.apache.org/jira/browse/CASSANDRA-13125
Project: Cassandra
Issue Type: Bug
Reporter: Zhongxiang Zheng
I found that rows are split and duplicated after upgrading a cluster from 2.1.x to 3.0.x.
The problem can be reproduced as follows.
{code}
$ ccm create test -v 2.1.16 -n 3 -s
Current cluster is now: test
$ ccm node1 cqlsh -e "CREATE KEYSPACE test WITH replication = {'class':'SimpleStrategy', 'replication_factor':3}"
$ ccm node1 cqlsh -e "CREATE TABLE test.test (id text PRIMARY KEY, value1 set<text>, value2 set<text>);"
# Upgrade node1
$ for i in 1; do ccm node${i} stop; ccm node${i} setdir -v3.0.10; ccm node${i} start;ccm node${i} nodetool upgradesstables; done
# Insert a row through node1(3.0.10)
$ ccm node1 cqlsh -e "INSERT INTO test.test (id, value1, value2) values ('aaa', {'aaa', 'bbb'}, {'ccc', 'ddd'});"
# Insert a row through node2(2.1.16)
$ ccm node2 cqlsh -e "INSERT INTO test.test (id, value1, value2) values ('bbb', {'aaa', 'bbb'}, {'ccc', 'ddd'});"
# The row inserted through node1 is split into two rows
$ ccm node1 cqlsh -e "SELECT * FROM test.test ;"
id | value1 | value2
-----+----------------+----------------
aaa | null | null
aaa | {'aaa', 'bbb'} | {'ccc', 'ddd'}
bbb | {'aaa', 'bbb'} | {'ccc', 'ddd'}
$ for i in 1 2; do ccm node${i} nodetool flush; done
# Results of sstable2json of node2. The row inserted from node1(3.0.10) is different from the row inserted from node2(2.1.16).
$ ccm node2 json -k test -c test
running
['/home/zzheng/.ccm/test/node2/data0/test/test-5406ee80dbdb11e6a175f57c4c7c85f3/test-test-ka-1-Data.db']
-- test-test-ka-1-Data.db -----
[
{"key": "aaa",
"cells": [["","",1484564624769577],
["value1","value2:!",1484564624769576,"t",1484564624],
["value1:616161","",1484564624769577],
["value1:626262","",1484564624769577],
["value2:636363","",1484564624769577],
["value2:646464","",1484564624769577]]},
{"key": "bbb",
"cells": [["","",1484564634508029],
["value1:_","value1:!",1484564634508028,"t",1484564634],
["value1:616161","",1484564634508029],
["value1:626262","",1484564634508029],
["value2:_","value2:!",1484564634508028,"t",1484564634],
["value2:636363","",1484564634508029],
["value2:646464","",1484564634508029]]}
]
# Upgrade node2,3
$ for i in `seq 2 3`; do ccm node${i} stop; ccm node${i} setdir -v3.0.10; ccm node${i} start;ccm node${i} nodetool upgradesstables; done
# After upgrading node2 and node3, the row inserted through node1 is split on node2 and node3
$ ccm node2 cqlsh -e "SELECT * FROM test.test ;"
id | value1 | value2
-----+----------------+----------------
aaa | null | null
aaa | {'aaa', 'bbb'} | {'ccc', 'ddd'}
bbb | {'aaa', 'bbb'} | {'ccc', 'ddd'}
(3 rows)
# Results of sstabledump
# node1
[
{
"partition" : {
"key" : [ "aaa" ],
"position" : 0
},
"rows" : [
{
"type" : "row",
"position" : 17,
"liveness_info" : { "tstamp" : "2017-01-16T11:03:44.769577Z" },
"cells" : [
{ "name" : "value1", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:44.769576Z", "local_delete_time" : "2017-01-16T11:03:44Z" } },
{ "name" : "value1", "path" : [ "aaa" ], "value" : "" },
{ "name" : "value1", "path" : [ "bbb" ], "value" : "" },
{ "name" : "value2", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:44.769576Z", "local_delete_time" : "2017-01-16T11:03:44Z" } },
{ "name" : "value2", "path" : [ "ccc" ], "value" : "" },
{ "name" : "value2", "path" : [ "ddd" ], "value" : "" }
]
}
]
},
{
"partition" : {
"key" : [ "bbb" ],
"position" : 48
},
"rows" : [
{
"type" : "row",
"position" : 65,
"liveness_info" : { "tstamp" : "2017-01-16T11:03:54.508029Z" },
"cells" : [
{ "name" : "value1", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:54.508028Z", "local_delete_time" : "2017-01-16T11:03:54Z" } },
{ "name" : "value1", "path" : [ "aaa" ], "value" : "" },
{ "name" : "value1", "path" : [ "bbb" ], "value" : "" },
{ "name" : "value2", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:54.508028Z", "local_delete_time" : "2017-01-16T11:03:54Z" } },
{ "name" : "value2", "path" : [ "ccc" ], "value" : "" },
{ "name" : "value2", "path" : [ "ddd" ], "value" : "" }
]
}
]
}
]
# node2
[
{
"partition" : {
"key" : [ "aaa" ],
"position" : 0
},
"rows" : [
{
"type" : "row",
"position" : 17,
"liveness_info" : { "tstamp" : "2017-01-16T11:03:44.769577Z" },
"cells" : [ ]
},
{
"type" : "row",
"position" : 22,
"deletion_info" : { "marked_deleted" : "2017-01-16T11:03:44.769576Z", "local_delete_time" : "2017-01-16T11:03:44Z" },
"cells" : [
{ "name" : "value1", "path" : [ "aaa" ], "value" : "", "tstamp" : "2017-01-16T11:03:44.769577Z" },
{ "name" : "value1", "path" : [ "bbb" ], "value" : "", "tstamp" : "2017-01-16T11:03:44.769577Z" },
{ "name" : "value2", "path" : [ "ccc" ], "value" : "", "tstamp" : "2017-01-16T11:03:44.769577Z" },
{ "name" : "value2", "path" : [ "ddd" ], "value" : "", "tstamp" : "2017-01-16T11:03:44.769577Z" }
]
}
]
},
{
"partition" : {
"key" : [ "bbb" ],
"position" : 57
},
"rows" : [
{
"type" : "row",
"position" : 74,
"liveness_info" : { "tstamp" : "2017-01-16T11:03:54.508029Z" },
"cells" : [
{ "name" : "value1", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:54.508028Z", "local_delete_time" : "2017-01-16T11:03:54Z" } },
{ "name" : "value1", "path" : [ "aaa" ], "value" : "" },
{ "name" : "value1", "path" : [ "bbb" ], "value" : "" },
{ "name" : "value2", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:54.508028Z", "local_delete_time" : "2017-01-16T11:03:54Z" } },
{ "name" : "value2", "path" : [ "ccc" ], "value" : "" },
{ "name" : "value2", "path" : [ "ddd" ], "value" : "" }
]
}
]
}
]
{code}
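In the sstable2json output of node2 above, the two partitions differ in their range tombstone cells (the five-element cells flagged "t"). For 'bbb', each tombstone starts and ends within one collection column (value1:_ to value1:!). For 'aaa', a single tombstone runs from value1 to value2:!. The following is a minimal Python sketch of a check for that pattern. The function name is hypothetical, and it assumes that the column of a cell name is the text before the first ':', which holds for this table but is not a general rule of the sstable format.

```python
import json

def cross_column_tombstones(sstable2json_text):
    """Return (key, start, end) for range tombstones whose bounds name different columns."""
    suspicious = []
    for partition in json.loads(sstable2json_text):
        for cell in partition["cells"]:
            # In the output above, range tombstones are 5-element cells:
            # [start, end, timestamp, "t", local_deletion_time]
            if len(cell) == 5 and cell[3] == "t":
                # Assumption: the column name is the part before the first ':'
                start_col = cell[0].split(":", 1)[0]
                end_col = cell[1].split(":", 1)[0]
                if start_col != end_col:
                    suspicious.append((partition["key"], cell[0], cell[1]))
    return suspicious

# Abbreviated copy of the node2 output shown above
sample = """[
{"key": "aaa",
 "cells": [["","",1484564624769577],
           ["value1","value2:!",1484564624769576,"t",1484564624],
           ["value1:616161","",1484564624769577]]},
{"key": "bbb",
 "cells": [["","",1484564634508029],
           ["value1:_","value1:!",1484564634508028,"t",1484564634],
           ["value1:616161","",1484564634508029]]}
]"""

print(cross_column_tombstones(sample))  # [('aaa', 'value1', 'value2:!')]
```

Only the row written through the 3.0.10 node is reported, which matches the row that is later split.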
Another example of row splitting is as follows.
{code}
$ ccm create test2 -v 2.1.16 -n 3 -s
Current cluster is now: test2
$ ccm node1 cqlsh -e "CREATE KEYSPACE test WITH replication = {'class':'SimpleStrategy', 'replication_factor':3}"
$ ccm node1 cqlsh -e "CREATE TABLE test.text_set_set (id text PRIMARY KEY, value1 text, value2 set<text>, value3 set<text>);"
$ for i in `seq 1`; do ccm node${i} stop; ccm node${i} setdir -v3.0.10; ccm node${i} start;ccm node${i} nodetool upgradesstables; done
$ ccm node1 cqlsh -e "INSERT INTO test.text_set_set (id, value1, value2, value3) values ('aaa', 'aaa', {'aaa', 'bbb'}, {'ccc', 'ddd'});"
$ ccm node1 cqlsh -e "SELECT * FROM test.text_set_set;"
id | value1 | value2 | value3
-----+--------+----------------+----------------
aaa | aaa | null | null
aaa | null | {'aaa', 'bbb'} | {'ccc', 'ddd'}
(2 rows)
{code}
As far as I have investigated, the problem occurs under the following conditions.
* The table schema contains multiple collection columns.
* A row with non-null collection values is inserted through a 3.x node while the cluster contains both 2.1 and 3.x nodes.
* The row is split in the sstables of any node that was still running 2.1 when the row was inserted, once that node is upgraded to 3.x.
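One way to look for affected partitions after the upgrade is to scan sstabledump output for partitions that hold more than one row. The tables in this report have no clustering columns, so each partition should hold at most one row. The following is a minimal Python sketch under that assumption. The function name is hypothetical, and the check does not apply to tables with clustering columns, where multiple rows per partition are normal.

```python
import json

def partitions_with_multiple_rows(sstabledump_text):
    """Return {partition key: row count} for partitions holding more than one row."""
    split = {}
    for entry in json.loads(sstabledump_text):
        # Count only regular rows, as marked by "type" : "row" in the dump
        rows = [r for r in entry.get("rows", []) if r.get("type") == "row"]
        if len(rows) > 1:
            split[tuple(entry["partition"]["key"])] = len(rows)
    return split

# Abbreviated copy of the node2 sstabledump output from this report
sample = """[
  {"partition": {"key": ["aaa"], "position": 0},
   "rows": [
     {"type": "row", "position": 17, "cells": []},
     {"type": "row", "position": 22, "cells": [
        {"name": "value1", "path": ["aaa"], "value": ""}]}
   ]},
  {"partition": {"key": ["bbb"], "position": 57},
   "rows": [
     {"type": "row", "position": 74, "cells": [
        {"name": "value1", "path": ["aaa"], "value": ""}]}
   ]}
]"""

print(partitions_with_multiple_rows(sample))  # {('aaa',): 2}
```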
Thanks.