Posted to user@cassandra.apache.org by "olek.stasiak@gmail.com" <ol...@gmail.com> on 2014/03/11 13:30:44 UTC
Problems with adding datacenter and schema version disagreement
Hi All,
I've run into an issue with Cassandra 2.0.5.
I have a 6-node cluster with the random partitioner, still using single
tokens instead of vnodes.
Because we're changing hardware, we decided to migrate the cluster to 6 new
machines and switch from token-based partitioning to vnodes.
I've followed instruction on site:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html
and started Cassandra on the 6 new nodes in a new DC. Everything seemed to
work correctly; each node saw all the others as up and normal.
Then I ran nodetool repair -pr on the first of the new nodes.
But the process fell into an infinite loop, sending and receiving Merkle
trees over and over. It hung on one very small keyspace with no sign it
would ever stop (the process ran all night).
So I decided to stop the repair and restart Cassandra on that particular
new node. After the restart I tried the repair once more with another
small keyspace, but it also fell into an infinite loop.
So I decided to abort the procedure of adding a datacenter, remove the
nodes from the new DC, and start again from scratch.
After running removenode on all the new nodes, I wiped the data dir and
started Cassandra on a new node once again. During startup, messages like
"org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find
cfId=98bb99a2-42f2-3fcd-af67-208a4faae5fa"
appeared in the logs. Google said they may indicate problems with schema
version consistency, so I ran describe cluster in cassandra-cli
and got:
Cluster Information:
Name: Metadata Cluster
Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
76198f8b-663f-3434-8860-251ebc6f50c4: [150.254.164.4]
f48d3512-e299-3508-a29d-0844a0293f3a: [150.254.164.3]
16ad2e35-1eef-32f0-995c-e2cbd4c18abf: [150.254.164.6]
72352017-9b0d-3b29-8c55-ed86f30363c5: [150.254.164.1]
7f1faa84-0821-3311-9232-9407500591cc: [150.254.164.5]
85cd0ebc-5d33-3bec-a682-8c5880ee2fa1: [150.254.164.2]
So now I have 6 different schema versions in the cluster. How could this
happen? How can I bring my cluster back to a consistent state?
What did I do wrong while extending the cluster that made nodetool repair fall into an infinite loop?
At first sight the data looks OK: I can read from the cluster and I get
the expected output.
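For anyone comparing notes: this is a rough sketch of how to ask each node directly which schema version it holds (the host list below is ours, substitute your own):

```shell
# Ask every node which schema version it currently reports.
for host in 150.254.164.1 150.254.164.2 150.254.164.3 \
            150.254.164.4 150.254.164.5 150.254.164.6; do
  echo "== $host =="
  nodetool -h "$host" describecluster | sed -n '/Schema versions:/,$p'
done
```

A healthy cluster shows a single version UUID listing every node.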
best regards
Aleksander
Re: Problems with adding datacenter and schema version disagreement
Posted by Umut Kocasaraç <uk...@gmail.com>.
We have upgraded our Cassandra version to 2.0.7 and the problem has been solved.
Thanks Russ,
Re: Problems with adding datacenter and schema version disagreement
Posted by Russell Hatch <rh...@datastax.com>.
I think you might be seeing the issue reported in
https://issues.apache.org/jira/browse/CASSANDRA-6971
If that's the case, it looks like a fix will be in 2.0.7
Thanks,
Russ
Re: Problems with adding datacenter and schema version disagreement
Posted by Umut Kocasaraç <uk...@gmail.com>.
Hi Olek,
Could you solve the problem? We are experiencing exactly the same
issue. We have 4 nodes and all of them are on different schema versions. Our
Cassandra version is 2.0.6.
Umut
Re: Problems with adding datacenter and schema version disagreement
Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Mar 20, 2014 at 2:23 PM, olek.stasiak@gmail.com <
olek.stasiak@gmail.com> wrote:
> Bump one more time, could anybody help me?
>
Unfortunately, while I can advise you how to resolve "my cluster has
multiple schema versions", I have no assistance to offer for "my cluster
ends up with split schema versions when I do normal operations."
I suggest opening a ticket in the Apache Cassandra JIRA describing what
you did and how the result differed from what you expected.
=Rob
Re: Problems with adding datacenter and schema version disagreement
Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
Bump one more time, could anybody help me?
regards
Olek
Re: Problems with adding datacenter and schema version disagreement
Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
Bump, could anyone comment on this behaviour? Is it correct, or should I
create a JIRA task for these problems?
regards
Olek
Re: Problems with adding datacenter and schema version disagreement
Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
Oh, one more question: what should the configuration for the
system_traces keyspace be? Should it be replicated or stored locally?
Regards
Olek
Re: Problems with adding datacenter and schema version disagreement
Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
OK, I've dropped all system keyspaces, rebuilt the cluster and recovered
the schema; now everything looks OK.
But the main goal of the operation was to add a new datacenter to the cluster.
After starting a node in the new datacenter, two schema versions appear:
one version is held by the 6 nodes of the first datacenter, the second one
by the newly added node in the new datacenter. Something like this:
nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens  Owns   Host ID                               Rack
UN  192.168.1.1  50.19 GB   1       0,5%   c9323f38-d9c4-4a69-96e3-76cd4e1a204e  rack1
UN  192.168.1.2  54.83 GB   1       0,3%   ad1de2a9-2149-4f4a-aec6-5087d9d3acbb  rack1
UN  192.168.1.3  51.14 GB   1       0,6%   0ceef523-93fe-4684-ba4b-4383106fe3d1  rack1
UN  192.168.1.4  54.31 GB   1       0,7%   39d15471-456d-44da-bdc8-221f3c212c78  rack1
UN  192.168.1.5  53.36 GB   1       0,3%   7fed25a5-e018-43df-b234-47c2f118879b  rack1
UN  192.168.1.6  39.89 GB   1       0,1%   9f54fad6-949a-4fa9-80da-87efd62f3260  rack1
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens  Owns   Host ID                               Rack
UN  192.168.1.7  100.77 KB  256     97,4%  ddb1f913-d075-4840-9665-3ba64eda0558  RAC1
describe cluster;
Cluster Information:
Name: Metadata Cluster
Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
8fe34841-4f2a-3c05-97f2-15dd413d71dc: [192.168.1.7]
4ad381b6-df5a-3cbc-ba5a-0234b74d2383: [192.168.1.1, 192.168.1.2,
192.168.1.3, 192.168.1.4, 192.168.1.5, 192.168.1.6]
All keyspaces are currently configured to keep their data in datacenter1.
I assume that this is not correct behaviour, is it?
Could you help me: how can I safely add a new DC to the cluster?
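As I understand the DataStax procedure linked earlier, the remaining steps would be roughly the following sketch (the keyspace name and replica counts here are placeholders, not our real ones):

```shell
# 1) Make each keyspace place replicas in the new DC as well
#    ("my_ks" and the counts are placeholders):
cqlsh -e "ALTER KEYSPACE my_ks WITH replication =
          {'class': 'NetworkTopologyStrategy', 'datacenter1': 3, 'DC1': 3};"

# 2) Then, on every node in the new DC, stream the existing data
#    over from the old DC:
nodetool rebuild -- datacenter1
```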
Regards
Aleksander
Re: Problems with adding datacenter and schema version disagreement
Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
OK, I'll do this over the weekend and give you feedback on Monday.
Regards
Aleksander
Re: Problems with adding datacenter and schema version disagreement
Posted by Robert Coli <rc...@eventbrite.com>.
On Fri, Mar 14, 2014 at 12:40 AM, olek.stasiak@gmail.com <
olek.stasiak@gmail.com> wrote:
> OK, I see, so the data files stay in place, i have to just stop
> cassandra on whole cluster, remove system schema and then start
> cluster and recreate all keyspaces with all column families? Data will
> be than loaded automatically from existing ssstables, right?
>
Right. If you have clients reading while loading the schema, they may get
exceptions.
> So one more question: what about KS system_traces? should it be
> removed and recreted? What data it's holding?
>
It's holding data about tracing, a profiling feature. It's safe to nuke.
=Rob
Re: Problems with adding datacenter and schema version disagreement
Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
OK, I see, so the data files stay in place: I just have to stop
Cassandra on the whole cluster, remove the system schema, then start the
cluster and recreate all keyspaces with all column families? Data will
then be loaded automatically from the existing SSTables, right?
One more question: what about the system_traces keyspace? Should it be
removed and recreated? What data is it holding?
best regards
Aleksander
Re: Problems with adding datacenter and schema version disagreement
Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Mar 13, 2014 at 1:20 PM, olek.stasiak@gmail.com <
olek.stasiak@gmail.com> wrote:
> Huh,
> you mean json dump?
>
If you're using cassandra-cli, I mean the output of "show schema;"
If you're using CQLsh, there is an analogous way to show all schema.
1) dump schema to a file via one of the above tools
2) stop cassandra and nuke system keyspaces everywhere
3) start cassandra, coalesce cluster
4) load schema
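Spelled out as a sketch (the service commands and data path below are typical defaults, not a definitive recipe; adjust for your install):

```shell
# 1) dump the schema to a file (cassandra-cli shown; with cqlsh
#    the analogue is DESCRIBE SCHEMA)
echo 'show schema;' | cassandra-cli -h 127.0.0.1 > schema_dump.txt

# 2) on EVERY node: stop Cassandra and remove the schema system tables
sudo service cassandra stop
rm -rf /var/lib/cassandra/data/system/schema_*

# 3) start Cassandra on all nodes and let the ring settle
sudo service cassandra start

# 4) replay the saved schema from any one node
cassandra-cli -h 127.0.0.1 -f schema_dump.txt
```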
=Rob
Re: Problems with adding datacenter and schema version disagreement
Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
Huh,
you mean a JSON dump?
Regards
Aleksander
Re: Problems with adding datacenter and schema version disagreement
Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Mar 13, 2014 at 2:05 AM, olek.stasiak@gmail.com <
olek.stasiak@gmail.com> wrote:
> Bump, are there any solutions to bring my cluster back to schema
> consistency?
> I've 6 node cluster with exactly six versions of schema, how to deal with
> it?
>
The simplest way, which is most likely to actually work, is to down all
nodes, nuke schema, and reload it from a dump.
=Rob
Re: Problems with adding datacenter and schema version disagreement
Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
Bump, are there any solutions to bring my cluster back to schema consistency?
I have a 6-node cluster with exactly six schema versions; how do I deal with it?
regards
Aleksander
Re: Problems with adding datacenter and schema version disagreement
Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
Didn't help :)
thanks and regards
Aleksander
Re: Problems with adding datacenter and schema version disagreement
Posted by Duncan Sands <du...@gmail.com>.
On 11/03/14 14:00, olek.stasiak@gmail.com wrote:
> I plan to install 2.0.6 as soon as it will be available in datastax rpm repo.
> But how to deal with schema inconsistency on such scale?
Does it get better if you restart all the nodes? In my case restarting just
some of the nodes didn't help, but restarting all nodes did.
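For us that amounted to something like the following (host names and the service command are whatever your install uses):

```shell
# Rolling restart of every node, one at a time.
for host in node1 node2 node3 node4 node5 node6; do
  ssh "$host" 'sudo service cassandra restart'
  sleep 60   # crude: give the node time to rejoin gossip before the next one
done
```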
Ciao, Duncan.
Re: Problems with adding datacenter and schema version disagreement
Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
I plan to install 2.0.6 as soon as it's available in the DataStax RPM repo.
But how do I deal with schema inconsistency on this scale?
best regards
Aleksander
Re: Problems with adding datacenter and schema version disagreement
Posted by Duncan Sands <du...@gmail.com>.
Hi Aleksander, this may be related to CASSANDRA-6799 and CASSANDRA-6700 (if it
is caused by CASSANDRA-6700 then you are in luck: it is fixed in 2.0.6).
Best wishes, Duncan.