Posted to user@cassandra.apache.org by Kevin Burton <bu...@spinn3r.com> on 2016/01/23 03:16:27 UTC

automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Not sure if this is a bug or just kind of a *fuzzy* area.

In 2.0 this worked fine.

We have a bunch of automated scripts that go through and create tables...
one per day.

At midnight UTC our entire CQL went offline... it took down our whole app.  ;-/

The resolution was a full CQL shutdown and then a DROP TABLE to remove the
bad tables...

Pretty sure the issue was schema disagreement.

All our CREATE TABLE statements use IF NOT EXISTS... but I think the IF NOT
EXISTS only checks locally?

My workaround is going to be to use ZooKeeper to create a mutex lock around
this operation.
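
Roughly what I have in mind is something like the sketch below, using Apache
Curator's InterProcessMutex so only one client at a time issues the DDL. The
ZooKeeper connect string, lock path, and table definition are placeholders,
not our real ones:

    import com.datastax.driver.core.Session;
    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.locks.InterProcessMutex;
    import org.apache.curator.retry.ExponentialBackoffRetry;

    public class SchemaLockSketch {
        // Sketch only: serialize schema changes behind a ZooKeeper mutex.
        // 'session' is an already-connected DataStax Java driver Session.
        public static void createDailyTable(Session session) throws Exception {
            CuratorFramework zk = CuratorFrameworkFactory.newClient(
                    "zk1:2181,zk2:2181,zk3:2181",
                    new ExponentialBackoffRetry(1000, 3));
            zk.start();
            InterProcessMutex lock = new InterProcessMutex(zk, "/locks/cassandra-schema");
            lock.acquire();
            try {
                // Only the lock holder issues the CREATE TABLE; everyone else waits.
                session.execute("CREATE TABLE IF NOT EXISTS content_20160123 ("
                        + "id bigint PRIMARY KEY, body text)");
            } finally {
                lock.release();
                zk.close();
            }
        }
    }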

Any other things I should avoid?


-- 

We’re hiring if you know of any awesome Java Devops or Linux Operations
Engineers!

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Sebastian Estevez <se...@datastax.com>.
CASSANDRA-9424 <https://issues.apache.org/jira/browse/CASSANDRA-9424>

All the best,



Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.estevez@datastax.com


DataStax is the fastest, most scalable distributed database technology,
delivering Apache Cassandra to the world’s most innovative enterprises.
Datastax is built to be agile, always-on, and predictably scalable to any
size. With more than 500 customers in 45 countries, DataStax is the
database technology and transactional backbone of choice for the world's
most innovative companies such as Netflix, Adobe, Intuit, and eBay.

On Sat, Jan 23, 2016 at 12:22 AM, Jack Krupansky <ja...@gmail.com>
wrote:

> I recall that there was some discussion last year about this issue of how
> risky it is to do an automated CREATE TABLE IF NOT EXISTS due to the
> unpredictable amount of time it takes for the table creation to fully
> propagate around the full cluster. I think it was recognized as a real
> problem, but without an immediate solution, so the recommended practice for
> now is to only manually perform the operation (sure, it can be scripted,
> but only under manual control) to assure that the operation completes and
> that only one attempt is made to create the table. I don't recall if there
> was a specific Jira assigned, and the antipattern doc doesn't appear to
> reference this scenario. Maybe a committer can shed some more light.
>
> -- Jack Krupansky
>
> On Fri, Jan 22, 2016 at 10:29 PM, Kevin Burton <bu...@spinn3r.com> wrote:
>
>> I sort of agree.. but we are also considering migrating to hourly
>> tables.. and what if the single script doesn't run.
>>
>> I like having N nodes make changes like this because in my experience
>> that central / single box will usually fail at the wrong time :-/
>>
>>
>>
>> On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad <jo...@jonhaddad.com>
>> wrote:
>>
>>> Instead of using ZK, why not solve your concurrency problem by removing
>>> it?  By that, I mean simply have 1 process that creates all your tables
>>> instead of creating a race condition intentionally?
>>>
>>> On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton <bu...@spinn3r.com> wrote:
>>>
>>>> Not sure if this is a bug or not or kind of a *fuzzy* area.
>>>>
>>>> In 2.0 this worked fine.
>>>>
>>>> We have a bunch of automated scripts that go through and create
>>>> tables... one per day.
>>>>
>>>> at midnight UTC our entire CQL went offline.. .took down our whole app.
>>>>  ;-/
>>>>
>>>> The resolution was a full CQL shut down and then a drop table to remove
>>>> the bad tables...
>>>>
>>>> pretty sure the issue was with schema disagreement.
>>>>
>>>> All our CREATE TABLE use IF NOT EXISTS.... but I think the IF NOT
>>>> EXISTS only checks locally?
>>>>
>>>> My work around is going to be to use zookeeper to create a mutex lock
>>>> during this operation.
>>>>
>>>> Any other things I should avoid?
>>>>
>>>>
>>>> --
>>>>
>>>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>>>> Engineers!
>>>>
>>>> Founder/CEO Spinn3r.com
>>>> Location: *San Francisco, CA*
>>>> blog: http://burtonator.wordpress.com
>>>> … or check out my Google+ profile
>>>> <https://plus.google.com/102718274791889610666/posts>
>>>>
>>>>
>>
>>
>> --
>>
>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>> Engineers!
>>
>> Founder/CEO Spinn3r.com
>> Location: *San Francisco, CA*
>> blog: http://burtonator.wordpress.com
>> … or check out my Google+ profile
>> <https://plus.google.com/102718274791889610666/posts>
>>
>>
>

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Sebastian Estevez <se...@datastax.com>.
You have to wait for schema agreement, which most drivers should do by
default. At the least they have a check-schema-agreement method you can use.

https://datastax.github.io/java-driver/2.1.9/features/metadata/

The new cqlsh uses the Python driver, so the same should apply:

https://datastax.github.io/python-driver/api/cassandra/cluster.html

But check 'nodetool describecluster' to confirm that all nodes have the
same schema version.
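
For example, with the 2.1 Java driver the wait could look roughly like this
(just a sketch; the already-built Cluster/Session, the keyspace and table
names, and the 30 second timeout are all placeholders):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class SchemaAgreementSketch {
        // Sketch: issue the DDL, then poll the driver's schema agreement check
        // before doing anything that depends on the new table existing.
        public static void createAndWaitForAgreement(Cluster cluster, Session session)
                throws InterruptedException {
            session.execute("CREATE TABLE IF NOT EXISTS myks.content_20160123 ("
                    + "id bigint PRIMARY KEY, body text)");

            long deadline = System.currentTimeMillis() + 30000L;
            while (!cluster.getMetadata().checkSchemaAgreement()
                    && System.currentTimeMillis() < deadline) {
                Thread.sleep(500);
            }
            if (!cluster.getMetadata().checkSchemaAgreement()) {
                throw new IllegalStateException("Schema still not in agreement");
            }
        }
    }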

Note: This will not help you in the concurrency / multiple writers
scenario.

all the best,

Sebastián
On Jan 23, 2016 7:29 PM, "Kevin Burton" <bu...@spinn3r.com> wrote:

> Once the CREATE TABLE returns in cqlsh (or programatically) is it safe to
> assume it's on all nodes at that point?
>
> If not I'll have to put in even more logic to handle this case..
>
> On Fri, Jan 22, 2016 at 9:22 PM, Jack Krupansky <ja...@gmail.com>
> wrote:
>
>> I recall that there was some discussion last year about this issue of how
>> risky it is to do an automated CREATE TABLE IF NOT EXISTS due to the
>> unpredictable amount of time it takes for the table creation to fully
>> propagate around the full cluster. I think it was recognized as a real
>> problem, but without an immediate solution, so the recommended practice for
>> now is to only manually perform the operation (sure, it can be scripted,
>> but only under manual control) to assure that the operation completes and
>> that only one attempt is made to create the table. I don't recall if there
>> was a specific Jira assigned, and the antipattern doc doesn't appear to
>> reference this scenario. Maybe a committer can shed some more light.
>>
>> -- Jack Krupansky
>>
>> On Fri, Jan 22, 2016 at 10:29 PM, Kevin Burton <bu...@spinn3r.com>
>> wrote:
>>
>>> I sort of agree.. but we are also considering migrating to hourly
>>> tables.. and what if the single script doesn't run.
>>>
>>> I like having N nodes make changes like this because in my experience
>>> that central / single box will usually fail at the wrong time :-/
>>>
>>>
>>>
>>> On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad <jo...@jonhaddad.com>
>>> wrote:
>>>
>>>> Instead of using ZK, why not solve your concurrency problem by removing
>>>> it?  By that, I mean simply have 1 process that creates all your tables
>>>> instead of creating a race condition intentionally?
>>>>
>>>> On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton <bu...@spinn3r.com>
>>>> wrote:
>>>>
>>>>> Not sure if this is a bug or not or kind of a *fuzzy* area.
>>>>>
>>>>> In 2.0 this worked fine.
>>>>>
>>>>> We have a bunch of automated scripts that go through and create
>>>>> tables... one per day.
>>>>>
>>>>> at midnight UTC our entire CQL went offline.. .took down our whole
>>>>> app.  ;-/
>>>>>
>>>>> The resolution was a full CQL shut down and then a drop table to
>>>>> remove the bad tables...
>>>>>
>>>>> pretty sure the issue was with schema disagreement.
>>>>>
>>>>> All our CREATE TABLE use IF NOT EXISTS.... but I think the IF NOT
>>>>> EXISTS only checks locally?
>>>>>
>>>>> My work around is going to be to use zookeeper to create a mutex lock
>>>>> during this operation.
>>>>>
>>>>> Any other things I should avoid?
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> We’re hiring if you know of any awesome Java Devops or Linux
>>>>> Operations Engineers!
>>>>>
>>>>> Founder/CEO Spinn3r.com
>>>>> Location: *San Francisco, CA*
>>>>> blog: http://burtonator.wordpress.com
>>>>> … or check out my Google+ profile
>>>>> <https://plus.google.com/102718274791889610666/posts>
>>>>>
>>>>>
>>>
>>>
>>> --
>>>
>>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>>> Engineers!
>>>
>>> Founder/CEO Spinn3r.com
>>> Location: *San Francisco, CA*
>>> blog: http://burtonator.wordpress.com
>>> … or check out my Google+ profile
>>> <https://plus.google.com/102718274791889610666/posts>
>>>
>>>
>>
>
>
> --
>
> We’re hiring if you know of any awesome Java Devops or Linux Operations
> Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> <https://plus.google.com/102718274791889610666/posts>
>
>

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Jack Krupansky <ja...@gmail.com>.
+1 for a doc update. I added some comments to the seemingly most relevant
Jira ticket to confirm the best practice, which we can then forward to the
doc team:
https://issues.apache.org/jira/browse/CASSANDRA-10699


-- Jack Krupansky

On Mon, Jan 25, 2016 at 1:12 PM, Eric Stevens <mi...@gmail.com> wrote:

> It seems like this exact problem pops up every few weeks on this list.  I
> think the documentation does a dangerously bad job of describing the
> limitations of CREATE TABLE...IF NOT EXISTS.
>
> CREATE TABLE...IF NOT EXISTS is a dangerous construct because it seems to
> advertise atomicity and isolation, neither of which it actually possesses.
> Worse, the failure mode isn't just unpredictable output, but outright
> failure of cluster stability.  The documentation on this does not do an
> adequate job of describing what it actually does, and its characteristics
> are inconsistent with other forms of IF NOT EXISTS.
>
> > Cassandra 2.1.1 and later supports the IF NOT EXISTS syntax for
> creating a trigger [SIC]. Attempting to create an existing table returns an
> error unless the IF NOT EXISTS option is used. If the option is used, the
> statement if a no-op if the table already exists.
>
> I would strongly suggest this documentation be updated to indicate that it
> is NOT SAFE to rely on atomicity and isolation of this statement, and that
> it cannot be used like relational databases to coordinate schema changes.
>
>
> On Sat, Jan 23, 2016 at 5:29 PM Kevin Burton <bu...@spinn3r.com> wrote:
>
>> Once the CREATE TABLE returns in cqlsh (or programatically) is it safe to
>> assume it's on all nodes at that point?
>>
>> If not I'll have to put in even more logic to handle this case..
>>
>> On Fri, Jan 22, 2016 at 9:22 PM, Jack Krupansky <jack.krupansky@gmail.com
>> > wrote:
>>
>>> I recall that there was some discussion last year about this issue of
>>> how risky it is to do an automated CREATE TABLE IF NOT EXISTS due to the
>>> unpredictable amount of time it takes for the table creation to fully
>>> propagate around the full cluster. I think it was recognized as a real
>>> problem, but without an immediate solution, so the recommended practice for
>>> now is to only manually perform the operation (sure, it can be scripted,
>>> but only under manual control) to assure that the operation completes and
>>> that only one attempt is made to create the table. I don't recall if there
>>> was a specific Jira assigned, and the antipattern doc doesn't appear to
>>> reference this scenario. Maybe a committer can shed some more light.
>>>
>>> -- Jack Krupansky
>>>
>>> On Fri, Jan 22, 2016 at 10:29 PM, Kevin Burton <bu...@spinn3r.com>
>>> wrote:
>>>
>>>> I sort of agree.. but we are also considering migrating to hourly
>>>> tables.. and what if the single script doesn't run.
>>>>
>>>> I like having N nodes make changes like this because in my experience
>>>> that central / single box will usually fail at the wrong time :-/
>>>>
>>>>
>>>>
>>>> On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad <jo...@jonhaddad.com>
>>>> wrote:
>>>>
>>>>> Instead of using ZK, why not solve your concurrency problem by
>>>>> removing it?  By that, I mean simply have 1 process that creates all your
>>>>> tables instead of creating a race condition intentionally?
>>>>>
>>>>> On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton <bu...@spinn3r.com>
>>>>> wrote:
>>>>>
>>>>>> Not sure if this is a bug or not or kind of a *fuzzy* area.
>>>>>>
>>>>>> In 2.0 this worked fine.
>>>>>>
>>>>>> We have a bunch of automated scripts that go through and create
>>>>>> tables... one per day.
>>>>>>
>>>>>> at midnight UTC our entire CQL went offline.. .took down our whole
>>>>>> app.  ;-/
>>>>>>
>>>>>> The resolution was a full CQL shut down and then a drop table to
>>>>>> remove the bad tables...
>>>>>>
>>>>>> pretty sure the issue was with schema disagreement.
>>>>>>
>>>>>> All our CREATE TABLE use IF NOT EXISTS.... but I think the IF NOT
>>>>>> EXISTS only checks locally?
>>>>>>
>>>>>> My work around is going to be to use zookeeper to create a mutex lock
>>>>>> during this operation.
>>>>>>
>>>>>> Any other things I should avoid?
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> We’re hiring if you know of any awesome Java Devops or Linux
>>>>>> Operations Engineers!
>>>>>>
>>>>>> Founder/CEO Spinn3r.com
>>>>>> Location: *San Francisco, CA*
>>>>>> blog: http://burtonator.wordpress.com
>>>>>> … or check out my Google+ profile
>>>>>> <https://plus.google.com/102718274791889610666/posts>
>>>>>>
>>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>>>> Engineers!
>>>>
>>>> Founder/CEO Spinn3r.com
>>>> Location: *San Francisco, CA*
>>>> blog: http://burtonator.wordpress.com
>>>> … or check out my Google+ profile
>>>> <https://plus.google.com/102718274791889610666/posts>
>>>>
>>>>
>>>
>>
>>
>> --
>>
>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>> Engineers!
>>
>> Founder/CEO Spinn3r.com
>> Location: *San Francisco, CA*
>> blog: http://burtonator.wordpress.com
>> … or check out my Google+ profile
>> <https://plus.google.com/102718274791889610666/posts>
>>
>>

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Eric Stevens <mi...@gmail.com>.
It seems like this exact problem pops up every few weeks on this list.  I
think the documentation does a dangerously bad job of describing the
limitations of CREATE TABLE...IF NOT EXISTS.

CREATE TABLE...IF NOT EXISTS is a dangerous construct because it seems to
advertise atomicity and isolation, neither of which it actually possesses.
Worse, the failure mode isn't just unpredictable output, but outright
failure of cluster stability.  The documentation on this does not do an
adequate job of describing what it actually does, and its characteristics
are inconsistent with other forms of IF NOT EXISTS.

> Cassandra 2.1.1 and later supports the IF NOT EXISTS syntax for creating
a trigger [SIC]. Attempting to create an existing table returns an error
unless the IF NOT EXISTS option is used. If the option is used, the
statement if a no-op if the table already exists.

I would strongly suggest this documentation be updated to indicate that it
is NOT SAFE to rely on atomicity and isolation of this statement, and that
it cannot be used like relational databases to coordinate schema changes.


On Sat, Jan 23, 2016 at 5:29 PM Kevin Burton <bu...@spinn3r.com> wrote:

> Once the CREATE TABLE returns in cqlsh (or programatically) is it safe to
> assume it's on all nodes at that point?
>
> If not I'll have to put in even more logic to handle this case..
>
> On Fri, Jan 22, 2016 at 9:22 PM, Jack Krupansky <ja...@gmail.com>
> wrote:
>
>> I recall that there was some discussion last year about this issue of how
>> risky it is to do an automated CREATE TABLE IF NOT EXISTS due to the
>> unpredictable amount of time it takes for the table creation to fully
>> propagate around the full cluster. I think it was recognized as a real
>> problem, but without an immediate solution, so the recommended practice for
>> now is to only manually perform the operation (sure, it can be scripted,
>> but only under manual control) to assure that the operation completes and
>> that only one attempt is made to create the table. I don't recall if there
>> was a specific Jira assigned, and the antipattern doc doesn't appear to
>> reference this scenario. Maybe a committer can shed some more light.
>>
>> -- Jack Krupansky
>>
>> On Fri, Jan 22, 2016 at 10:29 PM, Kevin Burton <bu...@spinn3r.com>
>> wrote:
>>
>>> I sort of agree.. but we are also considering migrating to hourly
>>> tables.. and what if the single script doesn't run.
>>>
>>> I like having N nodes make changes like this because in my experience
>>> that central / single box will usually fail at the wrong time :-/
>>>
>>>
>>>
>>> On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad <jo...@jonhaddad.com>
>>> wrote:
>>>
>>>> Instead of using ZK, why not solve your concurrency problem by removing
>>>> it?  By that, I mean simply have 1 process that creates all your tables
>>>> instead of creating a race condition intentionally?
>>>>
>>>> On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton <bu...@spinn3r.com>
>>>> wrote:
>>>>
>>>>> Not sure if this is a bug or not or kind of a *fuzzy* area.
>>>>>
>>>>> In 2.0 this worked fine.
>>>>>
>>>>> We have a bunch of automated scripts that go through and create
>>>>> tables... one per day.
>>>>>
>>>>> at midnight UTC our entire CQL went offline.. .took down our whole
>>>>> app.  ;-/
>>>>>
>>>>> The resolution was a full CQL shut down and then a drop table to
>>>>> remove the bad tables...
>>>>>
>>>>> pretty sure the issue was with schema disagreement.
>>>>>
>>>>> All our CREATE TABLE use IF NOT EXISTS.... but I think the IF NOT
>>>>> EXISTS only checks locally?
>>>>>
>>>>> My work around is going to be to use zookeeper to create a mutex lock
>>>>> during this operation.
>>>>>
>>>>> Any other things I should avoid?
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> We’re hiring if you know of any awesome Java Devops or Linux
>>>>> Operations Engineers!
>>>>>
>>>>> Founder/CEO Spinn3r.com
>>>>> Location: *San Francisco, CA*
>>>>> blog: http://burtonator.wordpress.com
>>>>> … or check out my Google+ profile
>>>>> <https://plus.google.com/102718274791889610666/posts>
>>>>>
>>>>>
>>>
>>>
>>> --
>>>
>>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>>> Engineers!
>>>
>>> Founder/CEO Spinn3r.com
>>> Location: *San Francisco, CA*
>>> blog: http://burtonator.wordpress.com
>>> … or check out my Google+ profile
>>> <https://plus.google.com/102718274791889610666/posts>
>>>
>>>
>>
>
>
> --
>
> We’re hiring if you know of any awesome Java Devops or Linux Operations
> Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> <https://plus.google.com/102718274791889610666/posts>
>
>

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Kevin Burton <bu...@spinn3r.com>.
Once the CREATE TABLE returns in cqlsh (or programmatically), is it safe to
assume it's on all nodes at that point?

If not, I'll have to put in even more logic to handle this case...

On Fri, Jan 22, 2016 at 9:22 PM, Jack Krupansky <ja...@gmail.com>
wrote:

> I recall that there was some discussion last year about this issue of how
> risky it is to do an automated CREATE TABLE IF NOT EXISTS due to the
> unpredictable amount of time it takes for the table creation to fully
> propagate around the full cluster. I think it was recognized as a real
> problem, but without an immediate solution, so the recommended practice for
> now is to only manually perform the operation (sure, it can be scripted,
> but only under manual control) to assure that the operation completes and
> that only one attempt is made to create the table. I don't recall if there
> was a specific Jira assigned, and the antipattern doc doesn't appear to
> reference this scenario. Maybe a committer can shed some more light.
>
> -- Jack Krupansky
>
> On Fri, Jan 22, 2016 at 10:29 PM, Kevin Burton <bu...@spinn3r.com> wrote:
>
>> I sort of agree.. but we are also considering migrating to hourly
>> tables.. and what if the single script doesn't run.
>>
>> I like having N nodes make changes like this because in my experience
>> that central / single box will usually fail at the wrong time :-/
>>
>>
>>
>> On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad <jo...@jonhaddad.com>
>> wrote:
>>
>>> Instead of using ZK, why not solve your concurrency problem by removing
>>> it?  By that, I mean simply have 1 process that creates all your tables
>>> instead of creating a race condition intentionally?
>>>
>>> On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton <bu...@spinn3r.com> wrote:
>>>
>>>> Not sure if this is a bug or not or kind of a *fuzzy* area.
>>>>
>>>> In 2.0 this worked fine.
>>>>
>>>> We have a bunch of automated scripts that go through and create
>>>> tables... one per day.
>>>>
>>>> at midnight UTC our entire CQL went offline.. .took down our whole app.
>>>>  ;-/
>>>>
>>>> The resolution was a full CQL shut down and then a drop table to remove
>>>> the bad tables...
>>>>
>>>> pretty sure the issue was with schema disagreement.
>>>>
>>>> All our CREATE TABLE use IF NOT EXISTS.... but I think the IF NOT
>>>> EXISTS only checks locally?
>>>>
>>>> My work around is going to be to use zookeeper to create a mutex lock
>>>> during this operation.
>>>>
>>>> Any other things I should avoid?
>>>>
>>>>
>>>> --
>>>>
>>>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>>>> Engineers!
>>>>
>>>> Founder/CEO Spinn3r.com
>>>> Location: *San Francisco, CA*
>>>> blog: http://burtonator.wordpress.com
>>>> … or check out my Google+ profile
>>>> <https://plus.google.com/102718274791889610666/posts>
>>>>
>>>>
>>
>>
>> --
>>
>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>> Engineers!
>>
>> Founder/CEO Spinn3r.com
>> Location: *San Francisco, CA*
>> blog: http://burtonator.wordpress.com
>> … or check out my Google+ profile
>> <https://plus.google.com/102718274791889610666/posts>
>>
>>
>


-- 

We’re hiring if you know of any awesome Java Devops or Linux Operations
Engineers!

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Jack Krupansky <ja...@gmail.com>.
I recall that there was some discussion last year about this issue of how
risky it is to do an automated CREATE TABLE IF NOT EXISTS due to the
unpredictable amount of time it takes for the table creation to fully
propagate around the full cluster. I think it was recognized as a real
problem, but without an immediate solution, so the recommended practice for
now is to only manually perform the operation (sure, it can be scripted,
but only under manual control) to assure that the operation completes and
that only one attempt is made to create the table. I don't recall if there
was a specific Jira assigned, and the antipattern doc doesn't appear to
reference this scenario. Maybe a committer can shed some more light.

-- Jack Krupansky

On Fri, Jan 22, 2016 at 10:29 PM, Kevin Burton <bu...@spinn3r.com> wrote:

> I sort of agree.. but we are also considering migrating to hourly tables..
> and what if the single script doesn't run.
>
> I like having N nodes make changes like this because in my experience that
> central / single box will usually fail at the wrong time :-/
>
>
>
> On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad <jo...@jonhaddad.com>
> wrote:
>
>> Instead of using ZK, why not solve your concurrency problem by removing
>> it?  By that, I mean simply have 1 process that creates all your tables
>> instead of creating a race condition intentionally?
>>
>> On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton <bu...@spinn3r.com> wrote:
>>
>>> Not sure if this is a bug or not or kind of a *fuzzy* area.
>>>
>>> In 2.0 this worked fine.
>>>
>>> We have a bunch of automated scripts that go through and create
>>> tables... one per day.
>>>
>>> at midnight UTC our entire CQL went offline.. .took down our whole app.
>>>  ;-/
>>>
>>> The resolution was a full CQL shut down and then a drop table to remove
>>> the bad tables...
>>>
>>> pretty sure the issue was with schema disagreement.
>>>
>>> All our CREATE TABLE use IF NOT EXISTS.... but I think the IF NOT EXISTS
>>> only checks locally?
>>>
>>> My work around is going to be to use zookeeper to create a mutex lock
>>> during this operation.
>>>
>>> Any other things I should avoid?
>>>
>>>
>>> --
>>>
>>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>>> Engineers!
>>>
>>> Founder/CEO Spinn3r.com
>>> Location: *San Francisco, CA*
>>> blog: http://burtonator.wordpress.com
>>> … or check out my Google+ profile
>>> <https://plus.google.com/102718274791889610666/posts>
>>>
>>>
>
>
> --
>
> We’re hiring if you know of any awesome Java Devops or Linux Operations
> Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> <https://plus.google.com/102718274791889610666/posts>
>
>

Re: Rename Keyspace offline

Posted by Jack Krupansky <ja...@gmail.com>.
If you are doing this full bulk reload a lot, it may make more sense to use
a separate cluster to bring up the new data, atomically switch your
clients/apps to the IP address of the new cluster once you've validated it,
and then decommission and recycle the machines of the old cluster. This
would maximize performance of the production cluster and of the staging
process as well. You would also need less hardware for each node/cluster,
since you won't need to support two copies of the data on a single
node/cluster. And it will make it a lot easier to upgrade the cluster
without worrying about impact on production during the upgrade, since the
client/app would only ever see a fully consistent cluster. (I lost count of
how many wins this approach would give you!)

-- Jack Krupansky

On Wed, Jan 27, 2016 at 10:53 AM, Jonathan Haddad <jo...@jonhaddad.com> wrote:

> Why rename the keyspace? If it was me I'd just give it a name that
> includes the date or some identifier and include that logic in my app.
> That's way easier.
> On Wed, Jan 27, 2016 at 6:49 AM Jean Tremblay <
> jean.tremblay@zen-innovations.com> wrote:
>
>> Hi,
>>
>> I have a huge set of data, which takes about 2 days to bulk load on a
>> Cassandra 3.0 cluster of 5 nodes. That is about 13 billion rows.
>>
>> Quite often I need to reload this data, new structure, or data is
>> reorganise. There are clients reading from a given keyspace (KS-X).
>>
>> Since it takes me 2 days to load my data, I was planning to load the new
>> set on a new keyspace (KS-Y), and when loaded drop KS-X and rename KS-Y to
>> KS-X.
>>
>> Now I know "renaming keyspace" is a functionality which was removed.
>>
>> Would this procedure work to destroy an old keyspace KS-X and rename a
>> new keyspace KS-Y to KS-X:
>>
>> 1) nodetool drain each node.
>> 2) stop cassandra on each node.
>> 3) on each node:
>>         3.1) rm -r data/KS-X
>>         3.2) mv data/KS-Y data/KS-X
>> 4) restart each node.
>>
>> Could someone please confirm this? I guess it would work, but I’m just
>> afraid that there could be in some system table some information that would
>> not allow this.
>>
>> Thanks for your help.
>>
>> Cheers
>>
>> Jean
>
>

Re: Rename Keyspace offline

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
Why rename the keyspace? If it were me, I'd just give it a name that includes
the date or some identifier and put that logic in my app. That's way
easier.
On Wed, Jan 27, 2016 at 6:49 AM Jean Tremblay <
jean.tremblay@zen-innovations.com> wrote:

> Hi,
>
> I have a huge set of data, which takes about 2 days to bulk load on a
> Cassandra 3.0 cluster of 5 nodes. That is about 13 billion rows.
>
> Quite often I need to reload this data, new structure, or data is
> reorganise. There are clients reading from a given keyspace (KS-X).
>
> Since it takes me 2 days to load my data, I was planning to load the new
> set on a new keyspace (KS-Y), and when loaded drop KS-X and rename KS-Y to
> KS-X.
>
> Now I know "renaming keyspace" is a functionality which was removed.
>
> Would this procedure work to destroy an old keyspace KS-X and rename a new
> keyspace KS-Y to KS-X:
>
> 1) nodetool drain each node.
> 2) stop cassandra on each node.
> 3) on each node:
>         3.1) rm -r data/KS-X
>         3.2) mv data/KS-Y data/KS-X
> 4) restart each node.
>
> Could someone please confirm this? I guess it would work, but I’m just
> afraid that there could be in some system table some information that would
> not allow this.
>
> Thanks for your help.
>
> Cheers
>
> Jean

Re: Rename Keyspace offline

Posted by Jean Tremblay <je...@zen-innovations.com>.
Thank you all for your replies.
My main objective was not to change my client.
After your answers it makes a lot of sense to modify my client so that it accepts a different keyspace name. This way I will no longer need to rename a keyspace; I simply need to develop a way to tell my client that there is a new keyspace.

Thanks again for your feedback
Jean

On 27 Jan,2016, at 19:58, Robert Coli <rc...@eventbrite.com>> wrote:

On Wed, Jan 27, 2016 at 6:49 AM, Jean Tremblay <je...@zen-innovations.com>> wrote:
Since it takes me 2 days to load my data, I was planning to load the new set on a new keyspace (KS-Y), and when loaded drop KS-X and rename KS-Y to KS-X.

Why bother with the rename? Just have two keyspaces, foo and foo_, and alternate your bulk loads between truncating them?

Would this procedure work to destroy an old keyspace KS-X and rename a new keyspace KS-Y to KS-X:

Yes, if you include :

0) Load schema for KS-Y into KS-X

1) nodetool drain each node.
2) stop cassandra on each node.
3) on each node:
        3.1) rm -r data/KS-X
        3.2) mv data/KS-Y data/KS-X
4) restart each node.

Note also that in step 3.2, the uuid component of file and/or directory names will have to be changed.

=Rob


Re: Rename Keyspace offline

Posted by Robert Coli <rc...@eventbrite.com>.
On Wed, Jan 27, 2016 at 6:49 AM, Jean Tremblay <
jean.tremblay@zen-innovations.com> wrote:

> Since it takes me 2 days to load my data, I was planning to load the new
> set on a new keyspace (KS-Y), and when loaded drop KS-X and rename KS-Y to
> KS-X.
>

Why bother with the rename? Just have two keyspaces, foo and foo_, and
alternate your bulk loads between truncating them?


> Would this procedure work to destroy an old keyspace KS-X and rename a new
> keyspace KS-Y to KS-X:
>

Yes, if you include:

0) Load schema for KS-Y into KS-X

1) nodetool drain each node.
> 2) stop cassandra on each node.
> 3) on each node:
>         3.1) rm -r data/KS-X
>         3.2) mv data/KS-Y data/KS-X
> 4) restart each node.
>

Note also that in step 3.2, the uuid component of file and/or directory
names will have to be changed.

=Rob

Re: Rename Keyspace offline

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
>
>  3.1) rm -r data/KS-X
>  3.2) mv data/KS-Y data/KS-X


This won't work: sstable names contain the keyspace name.

I had this issue too (I wanted to split a keyspace into multiple ones, and
to use the occasion to rename tables, etc.).

I finally ended up writing a small Python script, here:
https://github.com/arodrime/cassandra-tools/blob/master/operations/move_table.py.
It allowed me to mv any ks.cf to ks.cf2, ks2.cf2, or ks2.cf.

I used it in production at my previous job and it worked like a charm; we
were really happy with it. Yet I won't assume any responsibility, I just
hope it will be useful to you.

One last warning: it was written to be compatible with my environment, so
you might need to adjust a few things or improve the code to expose anything
you need as an option.

Anyway, you have the logic in there at least.

C*heers,

-----------------
Alain Rodriguez
France

The Last Pickle
http://www.thelastpickle.com

2016-01-27 15:49 GMT+01:00 Jean Tremblay <je...@zen-innovations.com>
:

> Hi,
>
> I have a huge set of data, which takes about 2 days to bulk load on a
> Cassandra 3.0 cluster of 5 nodes. That is about 13 billion rows.
>
> Quite often I need to reload this data, new structure, or data is
> reorganise. There are clients reading from a given keyspace (KS-X).
>
> Since it takes me 2 days to load my data, I was planning to load the new
> set on a new keyspace (KS-Y), and when loaded drop KS-X and rename KS-Y to
> KS-X.
>
> Now I know "renaming keyspace" is a functionality which was removed.
>
> Would this procedure work to destroy an old keyspace KS-X and rename a new
> keyspace KS-Y to KS-X:
>
> 1) nodetool drain each node.
> 2) stop cassandra on each node.
> 3) on each node:
>         3.1) rm -r data/KS-X
>         3.2) mv data/KS-Y data/KS-X
> 4) restart each node.
>
> Could someone please confirm this? I guess it would work, but I’m just
> afraid that there could be in some system table some information that would
> not allow this.
>
> Thanks for your help.
>
> Cheers
>
> Jean

Rename Keyspace offline

Posted by Jean Tremblay <je...@zen-innovations.com>.
Hi,

I have a huge set of data, which takes about 2 days to bulk load on a Cassandra 3.0 cluster of 5 nodes. That is about 13 billion rows.

Quite often I need to reload this data: there is a new structure, or the data is reorganised. There are clients reading from a given keyspace (KS-X).

Since it takes me 2 days to load my data, I was planning to load the new set on a new keyspace (KS-Y), and when loaded drop KS-X and rename KS-Y to KS-X.

Now I know "renaming keyspace" is a functionality which was removed.

Would this procedure work to destroy an old keyspace KS-X and rename a new keyspace KS-Y to KS-X:

1) nodetool drain each node.
2) stop cassandra on each node.
3) on each node:
	3.1) rm -r data/KS-X
	3.2) mv data/KS-Y data/KS-X
4) restart each node.

Could someone please confirm this? I guess it would work, but I’m just afraid that some system table could contain information that would not allow this.

Thanks for your help.

Cheers

Jean

RE: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Jacques-Henri Berthemet <ja...@genesys.com>.
You will have the same problem without IF NOT EXISTS; at least, I had Cassandra 2.1 complaining about tables with the same name but different UUIDs. In the end, in our case, we have a single application node that is responsible for schema upgrades; that’s OK for us as we don’t plan to upgrade the schema that much.

--
Jacques-Henri Berthemet

From: Ken Hancock [mailto:ken.hancock@schange.com]
Sent: mardi 2 février 2016 17:14
To: user@cassandra.apache.org
Subject: Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Just to close the loop on this, but am I correct that the IF NOT EXITS isn't the real problem?  Even multiple calls to CREATE TABLE cause the same schema mismatch if done concurrently?  Normally, a CREATE TABLE call will return an exception that the table already exists.

On Tue, Feb 2, 2016 at 11:06 AM, Jack Krupansky <ja...@gmail.com>> wrote:
And CASSANDRA-10699  seems to be the sub-issue of CASSANDRA-9424 to do that:
https://issues.apache.org/jira/browse/CASSANDRA-10699


-- Jack Krupansky

On Tue, Feb 2, 2016 at 9:59 AM, Sebastian Estevez <se...@datastax.com>> wrote:

Hi Ken,

Earlier in this thread I posted a link to https://issues.apache.org/jira/browse/CASSANDRA-9424

That is the fix for these schema disagreement issues and as ay commented, the plan is to use CAS. Until then we have to treat schema delicately.

all the best,

Sebastián
On Feb 2, 2016 9:48 AM, "Ken Hancock" <ke...@schange.com>> wrote:
So this rings odd to me.  If you can accomplish the same thing by using a CAS operation, why not fix create table if not exist so that if your are writing an application that creates the table on startup, that the application is safe to run on multiple nodes and uses CAS to safeguard multiple concurrent creations?

On Tue, Jan 26, 2016 at 12:32 PM, Eric Stevens <mi...@gmail.com>> wrote:
There's still a race condition there, because two clients could SELECT at the same time as each other, then both INSERT.

You'd be better served with a CAS operation, and let Paxos guarantee at-most-once execution.

On Tue, Jan 26, 2016 at 9:06 AM Francisco Reyes <li...@natserv.net>> wrote:
On 01/22/2016 10:29 PM, Kevin Burton wrote:
I sort of agree.. but we are also considering migrating to hourly tables.. and what if the single script doesn't run.

I like having N nodes make changes like this because in my experience that central / single box will usually fail at the wrong time :-/



On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad <jo...@jonhaddad.com>> wrote:
Instead of using ZK, why not solve your concurrency problem by removing it?  By that, I mean simply have 1 process that creates all your tables instead of creating a race condition intentionally?

On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton <bu...@spinn3r.com>> wrote:
Not sure if this is a bug or not or kind of a *fuzzy* area.

In 2.0 this worked fine.

We have a bunch of automated scripts that go through and create tables... one per day.

at midnight UTC our entire CQL went offline.. .took down our whole app.  ;-/

The resolution was a full CQL shut down and then a drop table to remove the bad tables...

pretty sure the issue was with schema disagreement.

All our CREATE TABLE use IF NOT EXISTS.... but I think the IF NOT EXISTS only checks locally?

My work around is going to be to use zookeeper to create a mutex lock during this operation.

Any other things I should avoid?


--
We’re hiring if you know of any awesome Java Devops or Linux Operations Engineers!

Founder/CEO Spinn3r.com<http://Spinn3r.com>
Location: San Francisco, CA
blog: http://burtonator.wordpress.com
… or check out my Google+ profile<https://plus.google.com/102718274791889610666/posts>



--
We’re hiring if you know of any awesome Java Devops or Linux Operations Engineers!

Founder/CEO Spinn3r.com<http://Spinn3r.com>
Location: San Francisco, CA
blog: http://burtonator.wordpress.com
… or check out my Google+ profile<https://plus.google.com/102718274791889610666/posts>

One way to accomplish both (a single process doing the work, while multiple machines are able to do it) is to have a control table.

You can have a table that lists which tables have been created, and force consistency ALL on it. In this table you list the names of the tables created. If a table name is in there, it doesn't need to be created again.



--
Ken Hancock | System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hancock@schange.com | www.schange.com | NASDAQ:SEAC <http://www.schange.com/en-US/Company/InvestorRelations.aspx>
Office: +1 (978) 889-3329 | Google Talk: ken.hancock@schange.com | Skype: hancockks | Yahoo IM: hancockks
LinkedIn: <http://www.linkedin.com/in/kenhancock>

SeaChange International <http://www.schange.com/>

This e-mail and any attachments may contain information which is SeaChange International confidential. The information enclosed is intended only for the addressees herein and may not be copied or forwarded without permission from SeaChange International.





--
Ken Hancock | System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hancock@schange.com | www.schange.com | NASDAQ:SEAC <http://www.schange.com/en-US/Company/InvestorRelations.aspx>
Office: +1 (978) 889-3329 | Google Talk: ken.hancock@schange.com | Skype: hancockks | Yahoo IM: hancockks
LinkedIn: <http://www.linkedin.com/in/kenhancock>

SeaChange International <http://www.schange.com/>

This e-mail and any attachments may contain information which is SeaChange International confidential. The information enclosed is intended only for the addressees herein and may not be copied or forwarded without permission from SeaChange International.



Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Ken Hancock <ke...@schange.com>.
Just to close the loop on this, but am I correct that the IF NOT EXISTS
isn't the real problem?  Even multiple calls to CREATE TABLE cause the same
schema mismatch if done concurrently?  Normally, a CREATE TABLE call will
return an exception saying that the table already exists.

On Tue, Feb 2, 2016 at 11:06 AM, Jack Krupansky <ja...@gmail.com>
wrote:

> And CASSANDRA-10699  seems to be the sub-issue of CASSANDRA-9424 to do
> that:
> https://issues.apache.org/jira/browse/CASSANDRA-10699
>
>
> -- Jack Krupansky
>
> On Tue, Feb 2, 2016 at 9:59 AM, Sebastian Estevez <
> sebastian.estevez@datastax.com> wrote:
>
>> Hi Ken,
>>
>> Earlier in this thread I posted a link to
>> https://issues.apache.org/jira/browse/CASSANDRA-9424
>>
>> That is the fix for these schema disagreement issues and as ay commented,
>> the plan is to use CAS. Until then we have to treat schema delicately.
>>
>> all the best,
>>
>> Sebastián
>> On Feb 2, 2016 9:48 AM, "Ken Hancock" <ke...@schange.com> wrote:
>>
>>> So this rings odd to me.  If you can accomplish the same thing by using
>>> a CAS operation, why not fix create table if not exist so that if your are
>>> writing an application that creates the table on startup, that the
>>> application is safe to run on multiple nodes and uses CAS to safeguard
>>> multiple concurrent creations?
>>>
>>>
>>> On Tue, Jan 26, 2016 at 12:32 PM, Eric Stevens <mi...@gmail.com>
>>> wrote:
>>>
>>>> There's still a race condition there, because two clients could SELECT
>>>> at the same time as each other, then both INSERT.
>>>>
>>>> You'd be better served with a CAS operation, and let Paxos guarantee
>>>> at-most-once execution.
>>>>
>>>> On Tue, Jan 26, 2016 at 9:06 AM Francisco Reyes <li...@natserv.net>
>>>> wrote:
>>>>
>>>>> On 01/22/2016 10:29 PM, Kevin Burton wrote:
>>>>>
>>>>> I sort of agree.. but we are also considering migrating to hourly
>>>>> tables.. and what if the single script doesn't run.
>>>>>
>>>>> I like having N nodes make changes like this because in my experience
>>>>> that central / single box will usually fail at the wrong time :-/
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad <jo...@jonhaddad.com>
>>>>> wrote:
>>>>>
>>>>>> Instead of using ZK, why not solve your concurrency problem by
>>>>>> removing it?  By that, I mean simply have 1 process that creates all your
>>>>>> tables instead of creating a race condition intentionally?
>>>>>>
>>>>>> On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton <bu...@spinn3r.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Not sure if this is a bug or not or kind of a *fuzzy* area.
>>>>>>>
>>>>>>> In 2.0 this worked fine.
>>>>>>>
>>>>>>> We have a bunch of automated scripts that go through and create
>>>>>>> tables... one per day.
>>>>>>>
>>>>>>> at midnight UTC our entire CQL went offline.. .took down our whole
>>>>>>> app.  ;-/
>>>>>>>
>>>>>>> The resolution was a full CQL shut down and then a drop table to
>>>>>>> remove the bad tables...
>>>>>>>
>>>>>>> pretty sure the issue was with schema disagreement.
>>>>>>>
>>>>>>> All our CREATE TABLE use IF NOT EXISTS.... but I think the IF NOT
>>>>>>> EXISTS only checks locally?
>>>>>>>
>>>>>>> My work around is going to be to use zookeeper to create a mutex
>>>>>>> lock during this operation.
>>>>>>>
>>>>>>> Any other things I should avoid?
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> We’re hiring if you know of any awesome Java Devops or Linux
>>>>>>> Operations Engineers!
>>>>>>>
>>>>>>> Founder/CEO Spinn3r.com
>>>>>>> Location: *San Francisco, CA*
>>>>>>> blog:  <http://burtonator.wordpress.com>
>>>>>>> http://burtonator.wordpress.com
>>>>>>> … or check out my Google+ profile
>>>>>>> <https://plus.google.com/102718274791889610666/posts>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> We’re hiring if you know of any awesome Java Devops or Linux
>>>>> Operations Engineers!
>>>>>
>>>>> Founder/CEO Spinn3r.com
>>>>> Location: *San Francisco, CA*
>>>>> blog:  <http://burtonator.wordpress.com>
>>>>> http://burtonator.wordpress.com
>>>>> … or check out my Google+ profile
>>>>> <https://plus.google.com/102718274791889610666/posts>
>>>>>
>>>>>
>>>>> One way to accomplish both, a single process doing the work and having
>>>>> multiple machines be able to do it, is to have a control table.
>>>>>
>>>>> You can have a table that lists what tables have been created and
>>>>> force concistency all. In this table you list the names of tables created.
>>>>> If a table name is in there, it doesn't need to be created again.
>>>>>
>>>>
>>>
>>>
>>> --
>>> *Ken Hancock *| System Architect, Advanced Advertising
>>> SeaChange International
>>> 50 Nagog Park
>>> Acton, Massachusetts 01720
>>> ken.hancock@schange.com | www.schange.com | NASDAQ:SEAC
>>> <http://www.schange.com/en-US/Company/InvestorRelations.aspx>
>>> Office: +1 (978) 889-3329 | [image: Google Talk:]
>>> ken.hancock@schange.com | [image: Skype:]hancockks | [image: Yahoo IM:]
>>> hancockks [image: LinkedIn] <http://www.linkedin.com/in/kenhancock>
>>>
>>> [image: SeaChange International]
>>> <http://www.schange.com/>
>>> This e-mail and any attachments may contain information which is
>>> SeaChange International confidential. The information enclosed is intended
>>> only for the addressees herein and may not be copied or forwarded without
>>> permission from SeaChange International.
>>>
>>
>


-- 
*Ken Hancock *| System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hancock@schange.com | www.schange.com | NASDAQ:SEAC
<http://www.schange.com/en-US/Company/InvestorRelations.aspx>
Office: +1 (978) 889-3329 | Google Talk: ken.hancock@schange.com | Skype: hancockks | Yahoo IM: hancockks
LinkedIn: <http://www.linkedin.com/in/kenhancock>

SeaChange International <http://www.schange.com/>
This e-mail and any attachments may contain information which is SeaChange
International confidential. The information enclosed is intended only for
the addressees herein and may not be copied or forwarded without permission
from SeaChange International.

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Jack Krupansky <ja...@gmail.com>.
And CASSANDRA-10699  seems to be the sub-issue of CASSANDRA-9424 to do that:
https://issues.apache.org/jira/browse/CASSANDRA-10699


-- Jack Krupansky

On Tue, Feb 2, 2016 at 9:59 AM, Sebastian Estevez <
sebastian.estevez@datastax.com> wrote:

> Hi Ken,
>
> Earlier in this thread I posted a link to
> https://issues.apache.org/jira/browse/CASSANDRA-9424
>
> That is the fix for these schema disagreement issues and as ay commented,
> the plan is to use CAS. Until then we have to treat schema delicately.
>
> all the best,
>
> Sebastián
> On Feb 2, 2016 9:48 AM, "Ken Hancock" <ke...@schange.com> wrote:
>
>> So this rings odd to me.  If you can accomplish the same thing by using a
>> CAS operation, why not fix create table if not exist so that if your are
>> writing an application that creates the table on startup, that the
>> application is safe to run on multiple nodes and uses CAS to safeguard
>> multiple concurrent creations?
>>
>>
>> On Tue, Jan 26, 2016 at 12:32 PM, Eric Stevens <mi...@gmail.com> wrote:
>>
>>> There's still a race condition there, because two clients could SELECT
>>> at the same time as each other, then both INSERT.
>>>
>>> You'd be better served with a CAS operation, and let Paxos guarantee
>>> at-most-once execution.
>>>
>>> On Tue, Jan 26, 2016 at 9:06 AM Francisco Reyes <li...@natserv.net>
>>> wrote:
>>>
>>>> On 01/22/2016 10:29 PM, Kevin Burton wrote:
>>>>
>>>> I sort of agree.. but we are also considering migrating to hourly
>>>> tables.. and what if the single script doesn't run.
>>>>
>>>> I like having N nodes make changes like this because in my experience
>>>> that central / single box will usually fail at the wrong time :-/
>>>>
>>>>
>>>>
>>>> On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad <jo...@jonhaddad.com>
>>>> wrote:
>>>>
>>>>> Instead of using ZK, why not solve your concurrency problem by
>>>>> removing it?  By that, I mean simply have 1 process that creates all your
>>>>> tables instead of creating a race condition intentionally?
>>>>>
>>>>> On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton <bu...@spinn3r.com>
>>>>> wrote:
>>>>>
>>>>>> Not sure if this is a bug or not or kind of a *fuzzy* area.
>>>>>>
>>>>>> In 2.0 this worked fine.
>>>>>>
>>>>>> We have a bunch of automated scripts that go through and create
>>>>>> tables... one per day.
>>>>>>
>>>>>> at midnight UTC our entire CQL went offline.. .took down our whole
>>>>>> app.  ;-/
>>>>>>
>>>>>> The resolution was a full CQL shut down and then a drop table to
>>>>>> remove the bad tables...
>>>>>>
>>>>>> pretty sure the issue was with schema disagreement.
>>>>>>
>>>>>> All our CREATE TABLE use IF NOT EXISTS.... but I think the IF NOT
>>>>>> EXISTS only checks locally?
>>>>>>
>>>>>> My work around is going to be to use zookeeper to create a mutex lock
>>>>>> during this operation.
>>>>>>
>>>>>> Any other things I should avoid?
>>>>>>
>>>>>>
>>>>>> --
>>>>>> We’re hiring if you know of any awesome Java Devops or Linux
>>>>>> Operations Engineers!
>>>>>>
>>>>>> Founder/CEO Spinn3r.com
>>>>>> Location: *San Francisco, CA*
>>>>>> blog:  <http://burtonator.wordpress.com>
>>>>>> http://burtonator.wordpress.com
>>>>>> … or check out my Google+ profile
>>>>>> <https://plus.google.com/102718274791889610666/posts>
>>>>>>
>>>>>>
>>>>
>>>>
>>>> --
>>>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>>>> Engineers!
>>>>
>>>> Founder/CEO Spinn3r.com
>>>> Location: *San Francisco, CA*
>>>> blog:  <http://burtonator.wordpress.com>http://burtonator.wordpress.com
>>>> … or check out my Google+ profile
>>>> <https://plus.google.com/102718274791889610666/posts>
>>>>
>>>>
>>>> One way to accomplish both, a single process doing the work and having
>>>> multiple machines be able to do it, is to have a control table.
>>>>
>>>> You can have a table that lists what tables have been created and force
>>>> concistency all. In this table you list the names of tables created. If a
>>>> table name is in there, it doesn't need to be created again.
>>>>
>>>
>>
>>
>> --
>> *Ken Hancock *| System Architect, Advanced Advertising
>> SeaChange International
>> 50 Nagog Park
>> Acton, Massachusetts 01720
>> ken.hancock@schange.com | www.schange.com | NASDAQ:SEAC
>> <http://www.schange.com/en-US/Company/InvestorRelations.aspx>
>> Office: +1 (978) 889-3329 | [image: Google Talk:] ken.hancock@schange.com
>>  | [image: Skype:]hancockks | [image: Yahoo IM:]hancockks [image:
>> LinkedIn] <http://www.linkedin.com/in/kenhancock>
>>
>> [image: SeaChange International]
>> <http://www.schange.com/>
>> This e-mail and any attachments may contain information which is
>> SeaChange International confidential. The information enclosed is intended
>> only for the addressees herein and may not be copied or forwarded without
>> permission from SeaChange International.
>>
>

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Sebastian Estevez <se...@datastax.com>.
Hi Ken,

Earlier in this thread I posted a link to
https://issues.apache.org/jira/browse/CASSANDRA-9424

That is the fix for these schema disagreement issues and as ay commented,
the plan is to use CAS. Until then we have to treat schema delicately.

all the best,

Sebastián
On Feb 2, 2016 9:48 AM, "Ken Hancock" <ke...@schange.com> wrote:

> So this rings odd to me.  If you can accomplish the same thing by using a
> CAS operation, why not fix create table if not exist so that if your are
> writing an application that creates the table on startup, that the
> application is safe to run on multiple nodes and uses CAS to safeguard
> multiple concurrent creations?
>
>

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Ken Hancock <ke...@schange.com>.
So this rings odd to me.  If you can accomplish the same thing by using a
CAS operation, why not fix CREATE TABLE IF NOT EXISTS itself, so that if you
are writing an application that creates its tables on startup, the
application is safe to run on multiple nodes, with CAS safeguarding multiple
concurrent creations?
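
In the meantime, one application-side safeguard is to issue the DDL and then
block until every reachable node reports the same schema version. A rough
sketch, assuming the DataStax Java driver 2.1 (recent enough to expose
checkSchemaAgreement(); the keyspace and table names are invented for
illustration):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class CreateAndWait {
    public static void main(String[] args) throws InterruptedException {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        session.execute("CREATE TABLE IF NOT EXISTS content.entries_20160126 "
                + "(id uuid PRIMARY KEY, body text)");

        // Poll until all reachable nodes report the same schema version before
        // the application starts reading or writing the new table.
        while (!cluster.getMetadata().checkSchemaAgreement()) {
            Thread.sleep(1000);
        }

        cluster.close();
    }
}

That doesn't remove the concurrent-DDL race, but it does keep a client from
using a table before the schema change has settled across the cluster.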


On Tue, Jan 26, 2016 at 12:32 PM, Eric Stevens <mi...@gmail.com> wrote:

> There's still a race condition there, because two clients could SELECT at
> the same time as each other, then both INSERT.
>
> You'd be better served with a CAS operation, and let Paxos guarantee
> at-most-once execution.


-- 
*Ken Hancock *| System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hancock@schange.com | www.schange.com | NASDAQ:SEAC

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Eric Stevens <mi...@gmail.com>.
There's still a race condition there, because two clients could SELECT at
the same time as each other, then both INSERT.

You'd be better served with a CAS operation, and let Paxos guarantee
at-most-once execution.
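
A rough sketch of the CAS approach, assuming the DataStax Java driver 2.1
(the keyspace, the created_tables coordination table, and the daily table
name are invented for illustration, and the coordination table has to exist
up front):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;

public class CreateOnce {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("content");

        // Lightweight transaction: Paxos lets exactly one client win this row.
        ResultSet rs = session.execute(
                "INSERT INTO created_tables (table_name) VALUES ('entries_20160126') IF NOT EXISTS");

        // Only the winner goes on to issue the DDL, so at most one CREATE TABLE
        // is attempted per day.
        if (rs.wasApplied()) {
            session.execute("CREATE TABLE IF NOT EXISTS entries_20160126 "
                    + "(id uuid PRIMARY KEY, body text)");
        }

        cluster.close();
    }
}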

On Tue, Jan 26, 2016 at 9:06 AM Francisco Reyes <li...@natserv.net> wrote:

>
> One way to accomplish both, a single process doing the work while still
> letting multiple machines be able to do it, is to have a control table.
>
> You can have a table that lists which tables have been created, and read and
> write it at consistency ALL. In this table you list the names of the tables
> created. If a table name is in there, it doesn't need to be created again.
>

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Francisco Reyes <li...@natserv.net>.
On 01/22/2016 10:29 PM, Kevin Burton wrote:
> I sort of agree.. but we are also considering migrating to hourly 
> tables.. and what if the single script doesn't run.
>
> I like having N nodes make changes like this because in my experience 
> that central / single box will usually fail at the wrong time :-/

One way to accomplish both, a single process doing the work while still
letting multiple machines be able to do it, is to have a control table.

You can have a table that lists which tables have been created, and read and
write it at consistency ALL. In this table you list the names of the tables
created. If a table name is in there, it doesn't need to be created again.
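
A rough sketch of that check, assuming the DataStax Java driver 2.1 (the
keyspace, table names, and schema are invented for illustration; the control
table is created once ahead of time):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class ControlTableCheck {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("content");

        // Read the control table at ALL so every replica agrees on what has
        // already been created.
        Statement check = new SimpleStatement(
                "SELECT table_name FROM created_tables WHERE table_name = 'entries_20160126'")
                .setConsistencyLevel(ConsistencyLevel.ALL);

        if (session.execute(check).one() == null) {
            session.execute("CREATE TABLE IF NOT EXISTS entries_20160126 "
                    + "(id uuid PRIMARY KEY, body text)");
            session.execute("INSERT INTO created_tables (table_name) "
                    + "VALUES ('entries_20160126')");
        }

        cluster.close();
    }
}

Note that two clients can still run the read-then-create step at the same
moment, which is why a conditional (IF NOT EXISTS) insert into the control
table is the safer variant.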

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Kevin Burton <bu...@spinn3r.com>.
I sort of agree.. but we are also considering migrating to hourly tables..
and what if the single script doesn't run.

I like having N nodes make changes like this because in my experience that
central / single box will usually fail at the wrong time :-/
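
If we do end up going the ZooKeeper route, a rough sketch with Apache
Curator would look something like the following (the connection string, lock
path, and timeout are placeholders):

import java.util.concurrent.TimeUnit;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class SchemaLock {
    public static void main(String[] args) throws Exception {
        CuratorFramework zk = CuratorFrameworkFactory.newClient(
                "zk1:2181,zk2:2181,zk3:2181", new ExponentialBackoffRetry(1000, 3));
        zk.start();

        InterProcessMutex lock = new InterProcessMutex(zk, "/locks/schema/entries_20160126");
        if (lock.acquire(30, TimeUnit.SECONDS)) {
            try {
                // Only the lock holder issues the CREATE TABLE, and it should
                // still wait for schema agreement before releasing the lock.
            } finally {
                lock.release();
            }
        }

        zk.close();
    }
}

The lock only serializes our own clients, though; it doesn't make the schema
change itself propagate across the cluster any faster.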



On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:

> Instead of using ZK, why not solve your concurrency problem by removing
> it?  By that, I mean simply have 1 process that creates all your tables
> instead of creating a race condition intentionally?


-- 

We’re hiring if you know of any awesome Java Devops or Linux Operations
Engineers!

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
Instead of using ZK, why not solve your concurrency problem by removing
it?  By that, I mean simply have 1 process that creates all your tables
instead of creating a race condition intentionally?
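
A rough sketch of that single creator, run from one box on a schedule (the
contact point, keyspace, and schema are invented for illustration; it
pre-creates a few days ahead so a single missed run isn't fatal):

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class DailyTableCreator {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("content");
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyyMMdd");

        // Create today's table plus the next two days' tables in advance.
        for (int i = 0; i < 3; i++) {
            String table = "entries_" + LocalDate.now().plusDays(i).format(fmt);
            session.execute("CREATE TABLE IF NOT EXISTS " + table
                    + " (id uuid PRIMARY KEY, body text)");
        }

        cluster.close();
    }
}

Creating tables a few days in advance also means the midnight boundary is no
longer the moment everything depends on.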

On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton <bu...@spinn3r.com> wrote:

> My work around is going to be to use zookeeper to create a mutex lock
> during this operation.