Posted to user@cassandra.apache.org by Josh Smith <js...@ionicsecurity.com> on 2016/11/15 18:04:25 UTC

Schema Changes

Would someone please explain how schema changes happen?
Here are some of the ring details:
We have 5 nodes in one DC and 5 nodes in another DC across the country.
Here is our problem: we have a tool that automates our schema creation. Our schema consists of 7 keyspaces with 21 tables in each keyspace, so a total of 147 tables are created at initial provisioning. During this schema creation we end up with corruption in the system_schema keyspace, which we have traced to schema version disagreement. To combat this, we set up a wait until there is only one schema version across both the system.local and system.peers tables.
The way I understand it, schema changes are made on the local node only and are then propagated through either Thrift or Gossip; I could not find a definitive answer online as to which one is the carrier. So if I make all of the schema changes on one node, it should propagate the changes to the other nodes one at a time. That is how I thought schema changes were propagated, but we still get schema disagreement when changing the schema on only one node. Is the only option to introduce a wait after every table creation? Should we be looking at another table besides system.local and system.peers? Any help would be appreciated.
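
The "only one version in both system.local and system.peers" check described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: it assumes the schema_version values have already been read from the two tables, and the class and method names (SchemaVersionCheck, inAgreement) are hypothetical. Skipping peers with a null version is also an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.UUID;

// Hypothetical helper: decides agreement from schema_version values that were
// already read from system.local (one row) and system.peers (one row per peer).
final class SchemaVersionCheck {

    /**
     * Returns true when the local version and every non-null peer version
     * are identical. Null peer entries are skipped (assumption: such a peer
     * has not reported a version and should not block the check).
     */
    static boolean inAgreement(UUID localVersion, Collection<UUID> peerVersions) {
        Objects.requireNonNull(localVersion, "localVersion");
        Set<UUID> distinct = new HashSet<>();
        distinct.add(localVersion);
        for (UUID version : peerVersions) {
            if (version != null) {
                distinct.add(version);
            }
        }
        return distinct.size() == 1;
    }

    public static void main(String[] args) {
        // Made-up version values, for illustration only.
        UUID v1 = UUID.fromString("00000000-0000-0000-0000-000000000001");
        UUID v2 = UUID.fromString("00000000-0000-0000-0000-000000000002");
        System.out.println(inAgreement(v1, Arrays.asList(v1, v1)));
        System.out.println(inAgreement(v1, Arrays.asList(v1, v2)));
    }
}
```

With 10 nodes, the coordinator contributes the single system.local row and the other 9 nodes appear in system.peers, so the check only passes when all 10 report the same value.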

Josh Smith

Re: Schema Changes

Posted by Matija Gobec <ma...@gmail.com>.
We used the Cassandra migration tool for schema versioning and schema
agreement. Check it out here
<https://github.com/smartcat-labs/cassandra-migration-tool-java>.

In short:
When executing schema-altering statements, use these to check that the
schema has propagated before continuing:
resultSet.getExecutionInfo().isSchemaInAgreement()
and
session.getCluster().getMetadata().checkSchemaAgreement()

For detailed info, check the driver documentation. This solution is based on this
fix <https://datastax-oss.atlassian.net/browse/JAVA-669>.
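
As a rough sketch of how those checks might be used, the loop below applies DDL statements one at a time and polls for agreement after each. To stay self-contained it does not call the driver: the agreement check and the statement executor are passed in as plain functional interfaces, and the names SerialDdlRunner, awaitAgreement and applySerially are hypothetical. With the driver, the check could presumably be supplied as () -> session.getCluster().getMetadata().checkSchemaAgreement().

```java
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

// Hypothetical helper, independent of any driver classes.
final class SerialDdlRunner {

    /**
     * Polls the supplied check until it returns true or the timeout elapses.
     * Returns false on timeout or if the thread is interrupted while sleeping.
     */
    static boolean awaitAgreement(BooleanSupplier inAgreement, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!inAgreement.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    /**
     * Executes statements strictly one after another and stops at the first
     * statement after which agreement is not reached in time.
     * Returns the number of statements that were followed by agreement.
     */
    static int applySerially(List<String> ddl, Consumer<String> execute,
                             BooleanSupplier inAgreement, long timeoutMillis, long pollMillis) {
        int confirmed = 0;
        for (String statement : ddl) {
            execute.accept(statement);
            if (!awaitAgreement(inAgreement, timeoutMillis, pollMillis)) {
                break; // the statement ran, but the cluster has not converged yet
            }
            confirmed++;
        }
        return confirmed;
    }
}
```

Stopping at the first unconfirmed statement is a design choice of this sketch: it avoids issuing further schema changes while versions still differ, and leaves the decision to retry or abort to the caller.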

Matija

On Tue, Nov 15, 2016 at 7:32 PM, Edward Capriolo <ed...@gmail.com>
wrote:

> You can start here:
>
> https://issues.apache.org/jira/browse/CASSANDRA-10699
>
> And here:
>
> http://stackoverflow.com/questions/20293897/cassandra-resolution-of-concurrent-schema-changes
>
> In a nutshell, schema changes work best when issued serially, when all
> nodes are up, and when all nodes are reachable. When these three conditions
> are not met, a variety of behaviors can be observed.

Re: Schema Changes

Posted by Edward Capriolo <ed...@gmail.com>.
You can start here:

https://issues.apache.org/jira/browse/CASSANDRA-10699

And here:

http://stackoverflow.com/questions/20293897/cassandra-resolution-of-concurrent-schema-changes

In a nutshell, schema changes work best when issued serially, when all
nodes are up, and when all nodes are reachable. When these three conditions
are not met, a variety of behaviors can be observed.

Re: Schema Changes

Posted by Fabrice Facorat <fa...@gmail.com>.
Schema changes are propagated by gossip.

You can check schema propagation cluster-wide with nodetool describecluster
or "nodetool gossipinfo | grep SCHEMA | cut -f3 -d: | sort | uniq -c".

You had better send your DDL statements to only one node (for example, by
using the whitelist load balancing policy with only one host specified); this
way your schema changes will be serialized and you will avoid issues and
race conditions.
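
For tooling that cannot shell out, the gossipinfo pipeline above could be approximated in Java along these lines. This is only a sketch: it assumes the nodetool gossipinfo output has already been captured as lines of text and that the schema version is the third colon-separated field on SCHEMA lines, as the cut -f3 -d: in the command implies. The class name GossipSchemaCount is hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical equivalent of: grep SCHEMA | cut -f3 -d: | sort | uniq -c
final class GossipSchemaCount {

    /**
     * Counts how many SCHEMA lines report each schema version.
     * Lines with fewer than three colon-separated fields are skipped,
     * and the extracted field is trimmed (both are choices of this sketch).
     */
    static Map<String, Integer> countSchemaVersions(List<String> gossipInfoLines) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted keys, like `sort`
        for (String line : gossipInfoLines) {
            if (!line.contains("SCHEMA")) {
                continue;
            }
            String[] fields = line.split(":");
            if (fields.length < 3) {
                continue;
            }
            counts.merge(fields[2].trim(), 1, Integer::sum);
        }
        return counts;
    }

    /** The cluster is in agreement when exactly one distinct version is reported. */
    static boolean singleVersion(Map<String, Integer> counts) {
        return counts.size() == 1;
    }
}
```

A result map with more than one key corresponds to more than one line in the uniq -c output, that is, a schema disagreement.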



-- 
Close the World, Open the Net
http://www.linux-wizard.net