Posted to user@cassandra.apache.org by "olek.stasiak@gmail.com" <ol...@gmail.com> on 2014/03/11 13:30:44 UTC

Problems with adding datacenter and schema version disagreement

Hi All,
I've run into an issue with Cassandra 2.0.5.
I have a 6-node cluster with the random partitioner, still using single
tokens instead of vnodes.
Because we're changing hardware, we decided to migrate the cluster to 6
new machines and to switch from token-based partitioning to vnodes.
I followed the instructions at:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html
and started Cassandra on 6 new nodes in a new DC. Everything seemed to
work correctly; the nodes were seen by all the others as up and normal.
Then I ran nodetool repair -pr on the first of the new nodes.
But the process fell into an infinite loop, sending and receiving merkle
trees over and over. It hung on one very small keyspace and there was no
hope it would ever finish (the process ran the whole night).
So I decided to stop the repair and restart Cassandra on that particular
new node. After the restart I tried the repair one more time with another
small keyspace, but it also fell into an infinite loop.
So I decided to abort the procedure of adding the datacenter, remove the
nodes from the new DC, and start again from scratch.
After running removenode for all the new nodes, I wiped the data dir and
started Cassandra on a new node once again. During startup, messages like
"org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find
cfId=98bb99a2-42f2-3fcd-af67-208a4faae5fa"
appeared in the logs. Google says these can mean problems with schema
version consistency, so I ran describe cluster in cassandra-cli
and got:
Cluster Information:
   Name: Metadata Cluster
   Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
76198f8b-663f-3434-8860-251ebc6f50c4: [150.254.164.4]

f48d3512-e299-3508-a29d-0844a0293f3a: [150.254.164.3]

16ad2e35-1eef-32f0-995c-e2cbd4c18abf: [150.254.164.6]

72352017-9b0d-3b29-8c55-ed86f30363c5: [150.254.164.1]

7f1faa84-0821-3311-9232-9407500591cc: [150.254.164.5]

85cd0ebc-5d33-3bec-a682-8c5880ee2fa1: [150.254.164.2]

So now I have 6 different schema versions in the cluster. How could this
have happened? How can I bring my cluster back to a consistent state?
And what did I do wrong while extending the cluster that made nodetool
repair fall into an infinite loop?
At first sight the data looks OK: I can read from the cluster and I'm
getting the expected output.
best regards
Aleksander
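
For reference, schema agreement can also be checked from the shell with
nodetool: a cluster in agreement reports exactly one UUID under "Schema
versions". A minimal check, assuming nodetool can reach one of the nodes:

# any node will do; six different UUIDs here confirms the disagreement
$ nodetool -h 150.254.164.1 describecluster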

Re: Problems with adding datacenter and schema version disagreement

Posted by Umut Kocasaraç <uk...@gmail.com>.
We have upgraded our Cassandra version to 2.0.7 and the problem has been solved.

Thanks Russ,
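
For completeness, the release each node is actually running can be
confirmed with nodetool; a quick check, assuming shell access to a node
(the output line shown is illustrative):

$ nodetool -h 150.254.164.1 version
ReleaseVersion: 2.0.7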


On Wed, Apr 16, 2014 at 6:46 PM, Russell Hatch <rh...@datastax.com> wrote:

> I think you might be seeing the issue reported in
> https://issues.apache.org/jira/browse/CASSANDRA-6971
>
> If that's the case, it looks like a fix will be in 2.0.7
>
> Thanks,
>
> Russ
>
>
> On Tue, Apr 15, 2014 at 11:48 PM, Umut Kocasaraç <uk...@gmail.com>wrote:
>
>> Hi Olek,
>>
>> Could you solve the problem. Because we are experiencing exactly same
>> issue. We have 4 nodes and all of them are in different schema. Our
>> cassandra version is 2.0.6.
>>
>> Umut
>>
>>
>> On Fri, Mar 21, 2014 at 12:26 AM, Robert Coli <rc...@eventbrite.com>wrote:
>>
>>> On Thu, Mar 20, 2014 at 2:23 PM, olek.stasiak@gmail.com <
>>> olek.stasiak@gmail.com> wrote:
>>>
>>>> Bump one more time, could anybody help me?
>>>>
>>>
>>> Unfortunately, while I can advise you how to resolve "my cluster has
>>> multiple schema versions", I have no assistance to offer for "my cluster
>>> ends up with split schema versions when I do normal operations."
>>>
>>> I suggest opening up a JIRA in the apache cassandra JIRA describing what
>>> you did and how the result differed from what you expected.
>>>
>>> =Rob
>>>
>>>
>>
>>
>

Re: Problems with adding datacenter and schema version disagreement

Posted by Russell Hatch <rh...@datastax.com>.
I think you might be seeing the issue reported in
https://issues.apache.org/jira/browse/CASSANDRA-6971

If that's the case, it looks like a fix will be in 2.0.7

Thanks,

Russ



Re: Problems with adding datacenter and schema version disagreement

Posted by Umut Kocasaraç <uk...@gmail.com>.
Hi Olek,

Were you able to solve the problem? We are experiencing exactly the same
issue: we have 4 nodes and all of them are on different schema versions.
Our Cassandra version is 2.0.6.

Umut



Re: Problems with adding datacenter and schema version disagreement

Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Mar 20, 2014 at 2:23 PM, olek.stasiak@gmail.com <
olek.stasiak@gmail.com> wrote:

> Bump one more time, could anybody help me?
>

Unfortunately, while I can advise you how to resolve "my cluster has
multiple schema versions", I have no assistance to offer for "my cluster
ends up with split schema versions when I do normal operations."

I suggest opening a ticket in the Apache Cassandra JIRA describing what
you did and how the result differed from what you expected.

=Rob

Re: Problems with adding datacenter and schema version disagreement

Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
Bump one more time, could anybody help me?
regards
Olek


Re: Problems with adding datacenter and schema version disagreement

Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
Bump, could anyone comment on this behaviour? Is it correct, or should I
create a Jira task for these problems?
regards
Olek


Re: Problems with adding datacenter and schema version disagreement

Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
Oh, one more question: what should the configuration be for the
system_traces keyspace? Should it be replicated or stored locally?
Regards
Olek


Re: Problems with adding datacenter and schema version disagreement

Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
OK, I've dropped all the system keyspaces, rebuilt the cluster, and
recovered the schema; now everything looks OK.
But the main goal of the operation was to add a new datacenter to the
cluster. After starting a node in the new datacenter, two schema versions
appear: one version is held by the 6 nodes of the first datacenter, the
second by the newly added node in the new datacenter. Something like this:
nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens  Owns   Host ID                               Rack
UN  192.168.1.1  50.19 GB   1       0,5%   c9323f38-d9c4-4a69-96e3-76cd4e1a204e  rack1
UN  192.168.1.2  54.83 GB   1       0,3%   ad1de2a9-2149-4f4a-aec6-5087d9d3acbb  rack1
UN  192.168.1.3  51.14 GB   1       0,6%   0ceef523-93fe-4684-ba4b-4383106fe3d1  rack1
UN  192.168.1.4  54.31 GB   1       0,7%   39d15471-456d-44da-bdc8-221f3c212c78  rack1
UN  192.168.1.5  53.36 GB   1       0,3%   7fed25a5-e018-43df-b234-47c2f118879b  rack1
UN  192.168.1.6  39.89 GB   1       0,1%   9f54fad6-949a-4fa9-80da-87efd62f3260  rack1
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens  Owns   Host ID                               Rack
UN  192.168.1.7  100.77 KB  256     97,4%  ddb1f913-d075-4840-9665-3ba64eda0558  RAC1

describe cluster;
Cluster Information:
   Name: Metadata Cluster
   Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
8fe34841-4f2a-3c05-97f2-15dd413d71dc: [192.168.1.7]

4ad381b6-df5a-3cbc-ba5a-0234b74d2383: [192.168.1.1, 192.168.1.2,
192.168.1.3, 192.168.1.4, 192.168.1.5, 192.168.1.6]

All the keyspaces are currently configured to keep data in datacenter1 only.
I assume that's not correct behaviour, is it?
Could you help me: how can I safely add a new DC to the cluster?

Regards
Aleksander
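
For reference, the DataStax procedure linked in the first message alters
each keyspace to replicate into the new DC before any data is streamed,
and then populates the new nodes with nodetool rebuild rather than
repair. A rough sketch, where my_ks is a hypothetical keyspace name and
the replication counts are placeholders:

-- in cqlsh, for every keyspace that should span both datacenters
ALTER KEYSPACE my_ks WITH replication =
  {'class': 'NetworkTopologyStrategy', 'datacenter1': 3, 'DC1': 3};

# then, on each node in the new DC, stream its data from the old DC
$ nodetool rebuild datacenter1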



Re: Problems with adding datacenter and schema version disagreement

Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
OK, I'll do this over the weekend and give you feedback on Monday.
Regards
Aleksander

Re: Problems with adding datacenter and schema version disagreement

Posted by Robert Coli <rc...@eventbrite.com>.
On Fri, Mar 14, 2014 at 12:40 AM, olek.stasiak@gmail.com <
olek.stasiak@gmail.com> wrote:

> OK, I see: so the data files stay in place, and I just have to stop
> Cassandra on the whole cluster, remove the system schema, then start the
> cluster and recreate all keyspaces with all column families? The data
> will then be loaded automatically from the existing SSTables, right?
>

Right. If you have clients reading while loading the schema, they may get
exceptions.


> So one more question: what about the system_traces keyspace? Should it
> be removed and recreated? What data is it holding?
>

It's holding data about tracing, a profiling feature. It's safe to nuke.

=Rob
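
On disk, nuking system_traces amounts to removing its data directory
while the node is down; a sketch, assuming the default data path (the
keyspace is recreated automatically at startup):

$ rm -rf /var/lib/cassandra/data/system_traces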

Re: Problems with adding datacenter and schema version disagreement

Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
OK, I see: so the data files stay in place, and I just have to stop
Cassandra on the whole cluster, remove the system schema, then start the
cluster and recreate all keyspaces with all column families? The data
will then be loaded automatically from the existing SSTables, right?
So one more question: what about the system_traces keyspace? Should it
be removed and recreated? What data is it holding?
best regards
Aleksander


Re: Problems with adding datacenter and schema version disagreement

Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Mar 13, 2014 at 1:20 PM, olek.stasiak@gmail.com <
olek.stasiak@gmail.com> wrote:

> Huh,
> you mean a JSON dump?
>

If you're using cassandra-cli, I mean the output of "show schema;"

If you're using CQLsh, there is an analogous way to show all schema.

1) dump schema to a file via one of the above tools
2) stop cassandra and nuke system keyspaces everywhere
3) start cassandra, coalesce cluster
4) load schema

=Rob
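
A compact sketch of those four steps via cassandra-cli, assuming default
paths and service management; host names and file names are placeholders:

# 1) on one node, dump the full schema to a file
#    (trim any connection banner or prompt lines from the output)
$ echo "show schema;" | cassandra-cli -h 192.168.1.1 > schema.txt

# 2) on every node: stop Cassandra and remove the schema tables
$ nodetool drain && sudo service cassandra stop
$ rm -rf /var/lib/cassandra/data/system/schema_*

# 3) start Cassandra on all nodes and let the cluster coalesce
$ sudo service cassandra start

# 4) from one node, replay the dumped schema
$ cassandra-cli -h 192.168.1.1 -f schema.txt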

Re: Problems with adding datacenter and schema version disagreement

Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
Huh,
you mean a JSON dump?
Regards
Aleksander


Re: Problems with adding datacenter and schema version disagreement

Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Mar 13, 2014 at 2:05 AM, olek.stasiak@gmail.com <
olek.stasiak@gmail.com> wrote:

> Bump, are there any solutions to bring my cluster back to schema
> consistency?
> I have a 6-node cluster with exactly six schema versions; how do I deal
> with it?
>

The simplest way, which is most likely to actually work, is to down all
nodes, nuke schema, and reload it from a dump.

=Rob

Re: Problems with adding datacenter and schema version disagreement

Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
Bump, are there any solutions to bring my cluster back to schema consistency?
I have a 6-node cluster with exactly six schema versions; how do I deal with it?
regards
Aleksander


Re: Problems with adding datacenter and schema version disagreement

Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
Didn't help :)
thanks and regards
Aleksander


Re: Problems with adding datacenter and schema version disagreement

Posted by Duncan Sands <du...@gmail.com>.
On 11/03/14 14:00, olek.stasiak@gmail.com wrote:
> I plan to install 2.0.6 as soon as it is available in the DataStax rpm repo.
> But how do I deal with schema inconsistency on such a scale?

Does it get better if you restart all the nodes?  In my case restarting just 
some of the nodes didn't help, but restarting all nodes did.

Ciao, Duncan.
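
A sketch of the all-nodes restart Duncan describes, run from an admin
host; the host list, ssh access, and service name are assumptions:

# drain, then restart Cassandra on every node in turn
for h in 150.254.164.1 150.254.164.2 150.254.164.3 \
         150.254.164.4 150.254.164.5 150.254.164.6; do
  ssh "$h" 'nodetool drain && sudo service cassandra restart'
done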

Re: Problems with adding datacenter and schema version disagreement

Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
I plan to install 2.0.6 as soon as it is available in the DataStax rpm repo.
But how do I deal with schema inconsistency on such a scale?
best regards
Aleksander


Re: Problems with adding datacenter and schema version disagreement

Posted by Duncan Sands <du...@gmail.com>.
Hi Aleksander, this may be related to CASSANDRA-6799 and CASSANDRA-6700 (if it 
is caused by CASSANDRA-6700 then you are in luck: it is fixed in 2.0.6).

Best wishes, Duncan.

>