Posted to user@cassandra.apache.org by Dimo Velev <di...@gmail.com> on 2019/12/23 16:55:32 UTC

Create table concurrently

Hi,

We have microservices that use Cassandra. Each instance, when started, creates the required DB schema (and keeps a changelog). As instances can be started at the same time, we use a row in a table as a lock - INSERT ... IF NOT EXISTS USING TTL. That all works without any issues.

The problem is that the table that contains the lock is also created by the applications, using CREATE TABLE IF NOT EXISTS. Despite the name, Cassandra seems to have a race condition when this statement is called concurrently - it ends up with multiple definitions of the table with different table ids. Any DDL statements after that fail with configuration exceptions.

How does one clean up after that has happened?

As a workaround we're creating the table with an explicit table id set (computed from the keyspace and table name so that all nodes generate the same id). This kind of works but feels like an ugly hack. Are there other options that you can think of that only rely on Cassandra?

Cheers,
Dimo
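For readers following along, the row-as-lock approach described above might look roughly like the sketch below. It is not the poster's actual code: the table name `schema_lock`, its columns, and the 60-second TTL are made up for illustration, and `session` is assumed to behave like a DataStax Python driver session whose `execute()` result exposes `was_applied` for conditional statements.

```python
# Hypothetical lock table and columns; the thread does not name them.
ACQUIRE_LOCK_CQL = (
    "INSERT INTO schema_lock (name, owner) VALUES (%s, %s) "
    "IF NOT EXISTS USING TTL %s"
)


def try_acquire_lock(session, lock_name, owner, ttl_seconds=60):
    """Try to take the lock row; return True only if this caller got it.

    IF NOT EXISTS makes the insert a lightweight transaction, so only one
    concurrent caller has it applied. The TTL lets the row expire on its
    own if the holder dies before releasing it.
    """
    result = session.execute(ACQUIRE_LOCK_CQL, (lock_name, owner, ttl_seconds))
    return bool(result.was_applied)
```

Note that this only protects the schema changes made after the lock table exists; it does nothing for the creation of the lock table itself, which is the problem raised in this thread.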

Re: Create table concurrently

Posted by Jeff Jirsa <jj...@gmail.com>.


> On Dec 23, 2019, at 5:02 PM, Dimo Velev <di...@gmail.com> wrote:
> 
> Hi, 
> 
> We have microservices that use Cassandra. Each instance, when started, creates the required DB schema (and keeps a changelog). As instances can be started at the same time, we use a row in a table as lock - insert into if not exists using ttl. That all works without any issues.
> 
> The problem is that the table that contains the lock is also created by the applications using create if not exists. Despite the name, Cassandra seems to have a race condition when this statement is called concurrently - it ends up with multiple definitions of a table with different table id. Any DDLs after that fail with configuration exceptions. 

The IF NOT EXISTS in DDL doesn’t use Paxos and probably has multiple races. It may be better in 4.0, but until then do not programmatically create tables in a way that can race. 

> 
> How does one clean up after that has happened?

Not easily. You have to figure out which cfid is “right” and bounce each host, copying the real data to the right folder as you do it. It’s really really bad and painful. 


> 
> As a workaround we're creating table with explicit table id set (computed from the key space and table name so that all nodes generate the same id). This kind of works but feels like an ugly hack. Are there other options that you can think of that only rely on Cassandra?

This is a reasonable workaround. The other alternative is external locking (e.g. ZooKeeper). Ugly. 
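
A minimal sketch of the explicit-table-id workaround discussed above, for illustration only. The thread does not say how the id is computed; a name-based (version 5) UUID over "keyspace.table" is one assumption that gives every instance the same value. The `NAMESPACE_OID` choice, the function names, and the quoted `WITH ID = '...'` form are likewise assumptions to check against your Cassandra version.

```python
import uuid


def deterministic_table_id(keyspace, table):
    """Derive the same UUID on every instance from the keyspace and table name.

    uuid5 is a name-based UUID, so identical input always yields an
    identical id. The namespace used here is an arbitrary choice.
    """
    return uuid.uuid5(uuid.NAMESPACE_OID, f"{keyspace}.{table}")


def create_table_cql(keyspace, table, columns_cql):
    """Build a CREATE TABLE statement that pins the table id explicitly."""
    table_id = deterministic_table_id(keyspace, table)
    return (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} ({columns_cql}) "
        f"WITH ID = '{table_id}'"
    )


if __name__ == "__main__":
    # Hypothetical keyspace, table, and columns.
    print(create_table_cql("app", "schema_lock", "name text PRIMARY KEY, owner text"))
```

The idea is that instances racing to create the table all ask for the same id, so they cannot end up with divergent definitions of it.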