Posted to user@cassandra.apache.org by "Max C." <mc...@core43.com> on 2021/06/01 03:19:47 UTC

Re: multiple clients making schema changes at once

In our case we have a shared dev cluster with (for example) a keyspace for each developer, a keyspace for each CI runner, etc. As part of initializing our test suite we set up the schema to match the code that is about to be tested. This can mean multiple CI runners each adding/dropping tables at the same time, but in different keyspaces.

Our experience is that even though the schema changes do not conflict, we still run into schema mismatch problems. Our solution was an external lock (outside Cassandra) that ensures only a single schema change operation is issued at a time.

People assume schema changes in Cassandra work the same way as in MySQL, or as multiple users editing files on disk: as long as you’re not editing the same file (or the same MySQL table), there’s no problem. This is NOT the case. Cassandra schema changes are more like “git push”ing a commit to the same branch: at most one change can be outstanding at a time (across all tables and all keyspaces). Otherwise you will run into trouble.

Hope that helps. Best of luck.

- Max

Hello,

I have a more general question about that, and I cannot find a clear answer.

In my use case I have many tables (around 10k new tables created per month). They are created only dynamically, from many clients, with several clients creating the same tables simultaneously.

What is the recommended way of creating tables dynamically? If I run "if not exists" queries and wait for schema agreement before and after each create statement, will that work correctly in Cassandra?

Sébastien.

Re: multiple clients making schema changes at once

Posted by Erick Ramirez <er...@datastax.com>.

Having said that, I'm still not a fan of making schema changes
programmatically. I spend way too much time helping users unscramble their
schema after they've hit multiple disagreements. I do understand the need
for it, but avoid it if you can, particularly in production.

On Fri, 4 Jun 2021 at 09:41, Erick Ramirez <er...@datastax.com>
wrote:

>> I wonder if there’s a way to query the driver to see if your schema change
>> has fully propagated.  I haven’t looked into this.
>
> Yes, the drivers have APIs for this. For example, the Java driver has
> isSchemaInAgreement() and checkSchemaAgreement().
>
> See
> https://docs.datastax.com/en/developer/java-driver/latest/manual/core/metadata/schema/.
> Cheers!
>
>

Re: multiple clients making schema changes at once

Posted by Erick Ramirez <er...@datastax.com>.

>
> I wonder if there’s a way to query the driver to see if your schema change
> has fully propagated.  I haven’t looked into this.
>

Yes, the drivers have APIs for this. For example, the Java driver has
isSchemaInAgreement() and checkSchemaAgreement().

See
https://docs.datastax.com/en/developer/java-driver/latest/manual/core/metadata/schema/.
Cheers!
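
As a rough illustration of what "schema agreement" means, the sketch below treats the cluster as agreed when every node that reports a schema version reports the same one. It assumes the versions have already been read from the `schema_version` column of `system.local` and `system.peers`. The function name and the sample UUIDs are hypothetical, and this is not the driver's own implementation.

```python
from typing import Iterable, Optional
from uuid import UUID


def schema_in_agreement(local_version: UUID,
                        peer_versions: Iterable[Optional[UUID]]) -> bool:
    """Return True when every reported schema version matches the local one.

    Peers with no reported version (None) are skipped. Whether to skip
    them or to count them as disagreement is a policy choice.
    """
    versions = {local_version}
    versions.update(v for v in peer_versions if v is not None)
    return len(versions) == 1


# Hypothetical placeholder versions, for illustration only.
v1 = UUID("11111111-1111-1111-1111-111111111111")
v2 = UUID("22222222-2222-2222-2222-222222222222")

print(schema_in_agreement(v1, [v1, v1]))  # every node on the same version
print(schema_in_agreement(v1, [v1, v2]))  # one peer still on another version
```

A check like this only tells you the change has propagated. It does not make concurrent schema changes safe.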

Re: multiple clients making schema changes at once

Posted by "Max C." <mc...@core43.com>.

Hi Joe,

In our case we only do this in the test environment, and there can be several seconds or even minutes between a schema change and the test that depends on it.  Perhaps we have been lucky thus far.  :-)

I wonder if there’s a way to query the driver to see if your schema change has fully propagated.  I haven’t looked into this.

- Max

> On Jun 3, 2021, at 8:23 am, Joe Obernberger <jo...@gmail.com> wrote:
> 
> How does this work?  I have a program that runs a series of alter table statements, and then does inserts.  In some cases, the insert happens immediately after the alter table statement and the insert fails because the schema (apparently) has not had time to propagate.  I get an Undefined column name error.
> 
> The alter statements run single threaded, but the inserts run in multiple threads.  The alter statement is run in a synchronized block (Java).  Should I put an artificial delay after the alter statement?
> 
> -Joe

Re: multiple clients making schema changes at once

Posted by Jeff Jirsa <jj...@gmail.com>.

CFID mismatch is not "schema not propagated". It means you created the
table twice at the same time, and you have an inconsistent view of the
table within your cluster.

This is bad. Really bad. Worse than you expect. It's a bug in Cassandra,
but until it's fixed, you should stop doing concurrent schema modifications.



On Thu, Jun 3, 2021 at 8:37 AM Sébastien Rebecchi <sr...@kameleoon.com>
wrote:

> Sometimes even waiting for hours does not help. I have a cluster where I
> did as you do, synchronizing the create table statements, and I even tried
> waiting for schema agreement in a loop until success, but sometimes success
> never comes. I keep getting this error in the logs of one node. It seems
> we must restart nodes really often :(
>
> Sébastien
>
> ERROR [InternalResponseStage:1117] 2021-06-03 17:32:34,937
> MigrationCoordinator.java:408 - Unable to merge schema from /135.181.222.100
> org.apache.cassandra.exceptions.ConfigurationException: Column family ID
> mismatch (found a991bb50-c475-11eb-83cb-df35fc5a9bea; expected
> 994bee02-c475-11eb-beff-6d70d473832f)

Re: multiple clients making schema changes at once

Posted by Sébastien Rebecchi <sr...@kameleoon.com>.

Sometimes even waiting for hours does not help. I have a cluster where I did
as you do, synchronizing the create table statements, and I even tried
waiting for schema agreement in a loop until success, but sometimes success
never comes. I keep getting the error below in the logs of one node. It
seems we must restart nodes really often :(

Sébastien

ERROR [InternalResponseStage:1117] 2021-06-03 17:32:34,937 MigrationCoordinator.java:408 - Unable to merge schema from /135.181.222.100
org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found a991bb50-c475-11eb-83cb-df35fc5a9bea; expected 994bee02-c475-11eb-beff-6d70d473832f)
    at org.apache.cassandra.config.CFMetaData.validateCompatibility(CFMetaData.java:984) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:938) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.config.Schema.updateTable(Schema.java:687) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.schema.SchemaKeyspace.updateKeyspace(SchemaKeyspace.java:1478) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1434) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1403) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1380) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.service.MigrationCoordinator.mergeSchemaFrom(MigrationCoordinator.java:367) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.service.MigrationCoordinator$Callback.response(MigrationCoordinator.java:404) [apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.service.MigrationCoordinator$Callback.response(MigrationCoordinator.java:393) [apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:53) [apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) [apache-cassandra-3.11.10.jar:3.11.10]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_292]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_292]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_292]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_292]
    at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [apache-cassandra-3.11.10.jar:3.11.10]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_292]
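
The two table IDs in the "Column family ID mismatch" message are version-1 (time-based) UUIDs, so each one embeds the time at which it was generated and a node field. The sketch below decodes both with Python's standard `uuid` module. Reading the result as "two separate CREATE statements for the same table" is an assumption, not something the log states. The helper name `uuid1_datetime` is hypothetical.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version-1 UUID timestamps count 100 ns intervals since 1582-10-15 UTC.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def uuid1_datetime(u: uuid.UUID) -> datetime:
    """Return the generation time embedded in a version-1 UUID."""
    if u.version != 1:
        raise ValueError("not a time-based (version 1) UUID: %s" % u)
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


# The two IDs copied from the error message above.
found = uuid.UUID("a991bb50-c475-11eb-83cb-df35fc5a9bea")
expected = uuid.UUID("994bee02-c475-11eb-beff-6d70d473832f")

delta = uuid1_datetime(found) - uuid1_datetime(expected)
print("found    generated at", uuid1_datetime(found))
print("expected generated at", uuid1_datetime(expected))
print("difference in seconds:", delta.total_seconds())
print("same node field:", found.node == expected.node)
```

For the IDs in this log the embedded timestamps are roughly 27 seconds apart and the node fields differ, which suggests the two IDs were generated by different processes.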

On Thu, Jun 3, 2021 at 17:23, Joe Obernberger <jo...@gmail.com>
wrote:

> How does this work?  I have a program that runs a series of alter table
> statements, and then does inserts.  In some cases, the insert happens
> immediately after the alter table statement and the insert fails because
> the schema (apparently) has not had time to propagate.  I get an Undefined
> column name error.
>
> The alter statements run single threaded, but the inserts run in multiple
> threads.  The alter statement is run in a synchronized block (Java).
> Should I put an artificial delay after the alter statement?
>
> -Joe

Re: multiple clients making schema changes at once

Posted by Joe Obernberger <jo...@gmail.com>.

How does this work?  I have a program that runs a series of alter table 
statements, and then does inserts.  In some cases, the insert happens 
immediately after the alter table statement and the insert fails because 
the schema (apparently) has not had time to propagate.  I get an 
Undefined column name error.

The alter statements run single threaded, but the inserts run in 
multiple threads.  The alter statement is run in a synchronized block 
(Java).  Should I put an artificial delay after the alter statement?

-Joe
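
One alternative to a fixed artificial delay is to poll a schema-agreement check with a timeout, and start the inserts only once it succeeds. The sketch below assumes hypothetical callables `run_alter`, `is_agreed`, and `start_inserts` supplied by the application. `is_agreed` stands in for whatever agreement check your driver offers.

```python
import time
from typing import Callable


def wait_for_schema_agreement(is_agreed: Callable[[], bool],
                              timeout_s: float = 30.0,
                              interval_s: float = 0.5) -> bool:
    """Poll is_agreed() until it returns True or timeout_s has elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_agreed():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def alter_then_insert(run_alter: Callable[[], None],
                      is_agreed: Callable[[], bool],
                      start_inserts: Callable[[], None],
                      timeout_s: float = 30.0,
                      interval_s: float = 0.5) -> None:
    """Run the ALTER, then start inserts only once the schema has settled."""
    run_alter()
    if not wait_for_schema_agreement(is_agreed, timeout_s, interval_s):
        # Do not insert against a schema the cluster has not agreed on.
        raise TimeoutError("schema did not reach agreement; not inserting")
    start_inserts()
```

Unlike a fixed sleep, this returns as soon as agreement is observed and fails loudly when it is not reached in time.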

On 6/1/2021 2:59 PM, Max C. wrote:
> We use ZooKeeper + kazoo’s lock implementation.  Kazoo is a Python 
> client library for ZooKeeper.
>
> - Max

Re: multiple clients making schema changes at once

Posted by "Max C." <mc...@core43.com>.

We use ZooKeeper + kazoo’s lock implementation.  Kazoo is a Python client library for ZooKeeper.

- Max
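
A minimal sketch of the lock pattern, assuming the lock is any object usable in a `with` statement. The names `run_schema_change`, `execute_ddl`, and `wait_for_agreement` are hypothetical, and `threading.Lock` is only an in-process stand-in for a distributed lock such as the one kazoo provides.

```python
import threading
from typing import Callable, ContextManager


def run_schema_change(lock: ContextManager,
                      execute_ddl: Callable[[str], None],
                      wait_for_agreement: Callable[[], bool],
                      statement: str) -> None:
    """Issue one DDL statement while holding a cluster-wide lock."""
    with lock:
        execute_ddl(statement)
        # Keep holding the lock until the change has propagated, so the
        # next client never starts while this change is still outstanding.
        if not wait_for_agreement():
            raise RuntimeError("schema agreement not reached: " + statement)


# In-process stand-in for the distributed lock, for demonstration only.
demo_lock = threading.Lock()
executed = []
run_schema_change(demo_lock,
                  executed.append,   # stand-in for a driver call
                  lambda: True,      # stand-in for an agreement check
                  "CREATE TABLE IF NOT EXISTS ks.t (id int PRIMARY KEY)")
print(executed)
```

With kazoo, the lock object would come from something like `KazooClient(hosts=...).Lock(path, identifier)`, which also works in a `with` statement. The ZooKeeper path and the identifier are up to you.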

> Yes, this is quite annoying. How did you implement that "external lock"? I also thought of building an external service dedicated to that. Cassandra client apps would send create instructions to that service, which would receive them and run the creates one by one, and the client app would wait for its response before starting to insert.
> 
> Best,
> 
> Sébastien.

Re: multiple clients making schema changes at once

Posted by Sébastien Rebecchi <sr...@kameleoon.com>.

Hello,

Yes, this is quite annoying. How did you implement that "external lock"? I
also thought of building an external service dedicated to that. Cassandra
client apps would send create instructions to that service, which would
receive them and run the creates one by one, and the client app would wait
for its response before starting to insert.

Best,

Sébastien.
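
The dedicated-service idea can be sketched in-process as a single worker thread draining a queue, so that statements are applied strictly one at a time while callers block on a future. The class name `SchemaChangeService` and the `apply_ddl` callable are hypothetical. A real deployment would put this behind a network endpoint and would probably also wait for schema agreement after each statement.

```python
import queue
import threading
from concurrent.futures import Future
from typing import Callable


class SchemaChangeService:
    """Applies submitted DDL statements one at a time, in arrival order."""

    def __init__(self, apply_ddl: Callable[[str], None]) -> None:
        self._apply_ddl = apply_ddl
        self._requests: queue.Queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, statement: str) -> Future:
        """Queue a statement; the Future completes once it has been applied."""
        done: Future = Future()
        self._requests.put((statement, done))
        return done

    def close(self) -> None:
        """Stop the worker after the statements already queued are applied."""
        self._requests.put(None)
        self._worker.join()

    def _run(self) -> None:
        while True:
            item = self._requests.get()
            if item is None:
                return
            statement, done = item
            try:
                self._apply_ddl(statement)
                done.set_result(None)
            except Exception as exc:
                # Report the failure to the caller waiting on this Future.
                done.set_exception(exc)
```

A client would call `service.submit(ddl).result(timeout=...)` and start inserting only after that call returns.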

On Tue, Jun 1, 2021 at 05:21, Max C. <mc...@core43.com> wrote:

> In our case we have a shared dev cluster with (for example) a keyspace
> for each developer, a keyspace for each CI runner, etc.  As part of
> initializing our test suite we set up the schema to match the code that is
> about to be tested.  This can mean multiple CI runners each adding/dropping
> tables at the same time, but in different keyspaces.
>
> Our experience is that even though the schema changes do not conflict, we
> still run into schema mismatch problems.  Our solution was an external
> lock (outside Cassandra) that ensures only a single schema change
> operation is issued at a time.
>
> People assume schema changes in Cassandra work the same way as in MySQL, or
> as multiple users editing files on disk: as long as you’re not editing the
> same file (or the same MySQL table), there’s no problem.  *This is NOT the
> case.*  Cassandra schema changes are more like “git push”ing a commit to
> the same branch: at most one change can be outstanding at a time (across
> all tables and all keyspaces). Otherwise you will run into trouble.
>
> Hope that helps.  Best of luck.
>
> - Max