Posted to commits@cassandra.apache.org by "Anubhav Kale (JIRA)" <ji...@apache.org> on 2016/02/10 23:12:18 UTC

[jira] [Updated] (CASSANDRA-11143) Schema changes don't propagate correctly if nodes are down

     [ https://issues.apache.org/jira/browse/CASSANDRA-11143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anubhav Kale updated CASSANDRA-11143:
-------------------------------------
    Description: 
We saw a problem similar to what I describe below in our PROD environment a few times. Below is a consistent repro. We can change the priority to Minor since there is a workaround, though.

Using the steps from http://stackoverflow.com/questions/22513979/setting-up-cassandra-multi-node-cluster-on-a-single-ubuntu-server/25348301#25348301, set up a two-node cluster locally.

. Bring up both nodes
. Create a table, and ensure cqlsh is correctly showing it on both nodes.
. Bring down one node
. Drop and re-create the same table, or change some part of the table's schema.
. Bring up the down node.

You will see exceptions like the one below (because of the schema mismatch), and the new schema never propagates to the node that was down (meaning a select * via cqlsh on that node continues to show the old schema for the table). I let the cluster run for an hour to see whether gossip would somehow catch up.

However, the interesting part is that if you restart the node that was down when the schema changes were made, the exception below goes away and the node picks up the new schema correctly.

What is it caching that makes a second restart necessary for it to behave correctly?

ERROR 00:23:33 Configuration exception merging remote schema
org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found 7208d260-cf8c-11e5-a13b-fb6871b443fb; expected e2839010-cf7e-11e5-a13b-fb6871b443fb)
	at org.apache.cassandra.config.CFMetaData.validateCompatibility(CFMetaData.java:783) ~[main/:na]
	at org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:743) ~[main/:na]
	at org.apache.cassandra.config.Schema.updateTable(Schema.java:626) ~[main/:na]
	at org.apach
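
A note on reading the error above: both IDs in the message carry version digit 1, so they appear to be time-based UUIDs, and the timestamp embedded in each one hints at when that table ID was generated. The following Python sketch is not part of the original report; the helper names parse_mismatch and created_at are hypothetical, and it assumes the IDs really are standard version-1 UUIDs. It extracts the two IDs from the log line and compares their embedded timestamps.

```python
import re
import uuid
from datetime import datetime, timedelta, timezone

# Version-1 UUID timestamps count 100 ns ticks since 1582-10-15 (RFC 4122).
UUID_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)

# Log line copied from the report above.
LOG_LINE = (
    "org.apache.cassandra.exceptions.ConfigurationException: "
    "Column family ID mismatch (found 7208d260-cf8c-11e5-a13b-fb6871b443fb; "
    "expected e2839010-cf7e-11e5-a13b-fb6871b443fb)"
)

MISMATCH = re.compile(
    r"found (?P<found>[0-9a-f-]{36}); expected (?P<expected>[0-9a-f-]{36})"
)


def parse_mismatch(line):
    """Return the (found, expected) IDs from an ID-mismatch log line."""
    match = MISMATCH.search(line)
    if match is None:
        raise ValueError("line does not contain an ID mismatch")
    return uuid.UUID(match.group("found")), uuid.UUID(match.group("expected"))


def created_at(table_id):
    """Decode the timestamp embedded in a version-1 (time-based) UUID."""
    if table_id.version != 1:
        raise ValueError("not a time-based UUID")
    # table_id.time is in 100 ns ticks; convert to whole microseconds.
    return UUID_EPOCH + timedelta(microseconds=table_id.time // 10)


found, expected = parse_mismatch(LOG_LINE)
print("found   ", found, created_at(found))
print("expected", expected, created_at(expected))
print("found is newer:", created_at(found) > created_at(expected))
```

If the "found" ID turns out to be the newer of the two, that would be consistent with the table having been dropped and re-created while the node was down, with the restarted node still holding the older ID.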


  was:
We saw a problem similar to what I describe below in our PROD environment a few times. Below is a consistent repro. We can change the priority to Minor since there is a workaround, though.

Using the steps from http://stackoverflow.com/questions/22513979/setting-up-cassandra-multi-node-cluster-on-a-single-ubuntu-server/25348301#25348301, set up a two-node cluster locally.

. Bring up both nodes
. Create a table, and ensure cqlsh is correctly showing it on both nodes.
. Bring down one node
. Drop and re-create the same table, or change some part of the table's schema.
. Bring up the down node.

You will see exceptions like the one below (because of the schema mismatch), and the new schema never propagates to the node that was down (meaning cqlsh on that node continues to show the old schema for the table). I let the cluster run for an hour to see whether gossip would somehow catch up.

However, the interesting part is that if you restart the node that was down when the schema changes were made, the exception below goes away and the node picks up the new schema correctly.

What is it caching that makes a second restart necessary for it to behave correctly?

ERROR 00:23:33 Configuration exception merging remote schema
org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found 7208d260-cf8c-11e5-a13b-fb6871b443fb; expected e2839010-cf7e-11e5-a13b-fb6871b443fb)
	at org.apache.cassandra.config.CFMetaData.validateCompatibility(CFMetaData.java:783) ~[main/:na]
	at org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:743) ~[main/:na]
	at org.apache.cassandra.config.Schema.updateTable(Schema.java:626) ~[main/:na]
	at org.apach



> Schema changes don't propagate correctly if nodes are down
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-11143
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11143
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: PROD
>            Reporter: Anubhav Kale
>
> We saw a problem similar to what I describe below in our PROD environment a few times. Below is a consistent repro. We can change the priority to Minor since there is a workaround, though.
> Using the steps from http://stackoverflow.com/questions/22513979/setting-up-cassandra-multi-node-cluster-on-a-single-ubuntu-server/25348301#25348301, set up a two-node cluster locally.
> . Bring up both nodes
> . Create a table, and ensure cqlsh is correctly showing it on both nodes.
> . Bring down one node
> . Drop and re-create the same table, or change some part of the table's schema.
> . Bring up the down node.
> You will see exceptions like the one below (because of the schema mismatch), and the new schema never propagates to the node that was down (meaning a select * via cqlsh on that node continues to show the old schema for the table). I let the cluster run for an hour to see whether gossip would somehow catch up.
> However, the interesting part is that if you restart the node that was down when the schema changes were made, the exception below goes away and the node picks up the new schema correctly.
> What is it caching that makes a second restart necessary for it to behave correctly?
> ERROR 00:23:33 Configuration exception merging remote schema
> org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found 7208d260-cf8c-11e5-a13b-fb6871b443fb; expected e2839010-cf7e-11e5-a13b-fb6871b443fb)
> 	at org.apache.cassandra.config.CFMetaData.validateCompatibility(CFMetaData.java:783) ~[main/:na]
> 	at org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:743) ~[main/:na]
> 	at org.apache.cassandra.config.Schema.updateTable(Schema.java:626) ~[main/:na]
> 	at org.apach



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)