Posted to user@cassandra.apache.org by Conan Cook <co...@amee.com> on 2012/05/08 18:51:29 UTC

Keyspace lost after restart

Hi Cassandra Folk,

We've experienced a problem a couple of times where Cassandra nodes lose a
keyspace after a restart.  We've restarted 2 out of 3 nodes, and they have
both experienced this problem; clearly we're doing something wrong, but
don't know what.  The data files are all still there, as before, but the
node can't see the keyspace (we only have one).  The nodetool still says
that each one is responsible for 33% of the keys, but the disk usage has
dropped to a tiny amount on the nodes that we've restarted.  I saw this:

http://mail-archives.apache.org/mod_mbox/cassandra-user/201202.mbox/%3C4F3582E7.20907@conga.com%3E

Seems to be exactly our problem, but we have not modified the
cassandra.yaml - we have overwritten it through an automated process, and
that happened just before restarting, but the contents did not change.

Any ideas as to what might cause this, or how the keyspace can be restored
(like I say, the data is all still in the data directory)?

We're running in AWS.

Thanks,


Conan
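
A minimal sketch of one way to check the symptom described above: keyspace directories that are still on disk but that the node no longer lists. It assumes the data directory holds one subdirectory per keyspace (the path /var/lib/cassandra/data used in the note below is an assumption about the install) and that the list of keyspaces the node can see is gathered separately, for example from the cli. The function names are hypothetical.

```python
import os


def orphaned_keyspace_dirs(data_dir, known_keyspaces):
    """Return keyspace directories on disk that are not in known_keyspaces.

    data_dir is assumed to contain one subdirectory per keyspace.
    known_keyspaces is whatever the node currently reports.
    """
    on_disk = {
        name
        for name in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, name))
    }
    return sorted(on_disk - set(known_keyspaces))


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for fname in files:
            total += os.path.getsize(os.path.join(root, fname))
    return total
```

For instance, calling orphaned_keyspace_dirs("/var/lib/cassandra/data", ["system"]) on an affected node would be expected to return the missing keyspace's directory name, and dir_size_bytes on that directory shows how much data is still there even though the node reports very little.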

Re: Keyspace lost after restart

Posted by Conan Cook <co...@amee.com>.

Hi Jeff,

Great!  We'll roll back for now, thanks for letting me know.

Conan

On 11 May 2012 10:18, Jeff Williams <je...@wherethebitsroam.com> wrote:

> Conan,
>
> Good to see I'm not alone in this! I just set up a fresh test cluster. I
> first did a fresh install of 1.1.0 and was able to replicate the issue. I
> then did a fresh install using 1.0.10 and didn't see the issue. So it looks
> like rolling back to 1.0.10 could be the answer for now.
>
> Jeff
>
> On May 11, 2012, at 10:40 AM, Conan Cook wrote:
>
> Hi,
>
> OK we're pretty sure we dropped and re-created the keyspace before
> restarting the Cassandra nodes during some testing (we've been migrating to
> a new cluster).  The keyspace was created via the cli:
>
>
> create keyspace m7
>   with placement_strategy = 'NetworkTopologyStrategy'
>   and strategy_options = {us-east: 3}
>   and durable_writes = true;
>
>
> I'm pretty confident that it's a result of the issue I spotted before:
>
> https://issues.apache.org/jira/browse/CASSANDRA-4219
>
> Does anyone know whether this also affected versions before 1.1.0?  If not
> then we can just roll back until there's a fix; we're not using our cluster
> in production so we can afford to just bin it all and load it again.  +1
> for this being a major issue though, the fact that you can't see it until
> you restart a node makes it quite dangerous, and that node is lost when it
> occurs (I also haven't been able to restore the schema in any way).
>
> Thanks very much,
>
>
> Conan
>
>
>
> On 10 May 2012 17:15, Conan Cook <co...@amee.com> wrote:
>
>> Hi Aaron,
>>
>> Thanks for getting back to me!  Yes, I believe our keyspace was created
>> prior to 1.1, and I think I also understand why you're asking that, having
>> found this:
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-4219
>>
>> Here's our startup log:
>>
>> https://gist.github.com/2654155
>>
>> There isn't much in there of interest however.  It may well be the case
>> that we created our keyspace, dropped it, then created it again.  The dev
>> responsible for setting it up is ill today, but I'll get back to you
>> tomorrow with exact details of how it was originally created and whether we
>> did definitely drop and re-create it.
>>
>> Ta,
>>
>> Conan
>>
>>
>> On 10 May 2012 11:43, aaron morton <aa...@thelastpickle.com> wrote:
>>
>>> Was this a schema that was created prior to 1.1 ?
>>>
>>> What process are you using to create the schema ?
>>>
>>> Can you share the logs from system startup ? Up until it logs "Listening
>>> for thrift clients". (if they are long please link to them)
>>>
>>> Cheers
>>>
>>>   -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 10/05/2012, at 1:04 AM, Conan Cook wrote:
>>>
>>> Sorry, forgot to mention we're running Cassandra 1.1.
>>>
>>> Conan
>>>
>>> On 8 May 2012 17:51, Conan Cook <co...@amee.com> wrote:
>>>
>>>> Hi Cassandra Folk,
>>>>
>>>> We've experienced a problem a couple of times where Cassandra nodes
>>>> lose a keyspace after a restart.  We've restarted 2 out of 3 nodes, and
>>>> they have both experienced this problem; clearly we're doing something
>>>> wrong, but don't know what.  The data files are all still there, as before,
>>>> but the node can't see the keyspace (we only have one).  The nodetool
>>>> still says that each one is responsible for 33% of the keys, but the disk
>>>> usage has dropped to a tiny amount on the nodes that we've restarted.  I
>>>> saw this:
>>>>
>>>>
>>>> http://mail-archives.apache.org/mod_mbox/cassandra-user/201202.mbox/%3C4F3582E7.20907@conga.com%3E
>>>>
>>>> Seems to be exactly our problem, but we have not modified the
>>>> cassandra.yaml - we have overwritten it through an automated process, and
>>>> that happened just before restarting, but the contents did not change.
>>>>
>>>> Any ideas as to what might cause this, or how the keyspace can be
>>>> restored (like I say, the data is all still in the data directory).
>>>>
>>>> We're running in AWS.
>>>>
>>>> Thanks,
>>>>
>>>>
>>>> Conan
>>>>
>>>
>>>
>>>
>>
>
>

Re: Keyspace lost after restart

Posted by Jeff Williams <je...@wherethebitsroam.com>.

Conan,

Good to see I'm not alone in this! I just set up a fresh test cluster. I first did a fresh install of 1.1.0 and was able to replicate the issue. I then did a fresh install using 1.0.10 and didn't see the issue. So it looks like rolling back to 1.0.10 could be the answer for now.

Jeff

On May 11, 2012, at 10:40 AM, Conan Cook wrote:

> Hi,
> 
> OK we're pretty sure we dropped and re-created the keyspace before restarting the Cassandra nodes during some testing (we've been migrating to a new cluster).  The keyspace was created via the cli:
> 
> 
> create keyspace m7
>   with placement_strategy = 'NetworkTopologyStrategy'
>   and strategy_options = {us-east: 3}
>   and durable_writes = true;
> 
> I'm pretty confident that it's a result of the issue I spotted before:
> 
> https://issues.apache.org/jira/browse/CASSANDRA-4219 
> 
> Does anyone know whether this also affected versions before 1.1.0?  If not then we can just roll back until there's a fix; we're not using our cluster in production so we can afford to just bin it all and load it again.  +1 for this being a major issue though, the fact that you can't see it until you restart a node makes it quite dangerous, and that node is lost when it occurs (I also haven't been able to restore the schema in any way).
> 
> Thanks very much,
> 
> 
> Conan
> 
> 
> 
> On 10 May 2012 17:15, Conan Cook <co...@amee.com> wrote:
> Hi Aaron,
> 
> Thanks for getting back to me!  Yes, I believe our keyspace was created prior to 1.1, and I think I also understand why you're asking that, having found this:
> 
> https://issues.apache.org/jira/browse/CASSANDRA-4219 
> 
> Here's our startup log:
> 
> https://gist.github.com/2654155
> 
> There isn't much in there of interest however.  It may well be the case that we created our keyspace, dropped it, then created it again.  The dev responsible for setting it up is ill today, but I'll get back to you tomorrow with exact details of how it was originally created and whether we did definitely drop and re-create it.
> 
> Ta,
> 
> Conan
> 
> 
> On 10 May 2012 11:43, aaron morton <aa...@thelastpickle.com> wrote:
> Was this a schema that was created prior to 1.1 ?
> 
> What process are you using to create the schema ? 
> 
> Can you share the logs from system startup ? Up until it logs "Listening for thrift clients". (if they are long please link to them)
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 10/05/2012, at 1:04 AM, Conan Cook wrote:
> 
>> Sorry, forgot to mention we're running Cassandra 1.1.
>> 
>> Conan
>> 
>> On 8 May 2012 17:51, Conan Cook <co...@amee.com> wrote:
>> Hi Cassandra Folk,
>> 
>> We've experienced a problem a couple of times where Cassandra nodes lose a keyspace after a restart.  We've restarted 2 out of 3 nodes, and they have both experienced this problem; clearly we're doing something wrong, but don't know what.  The data files are all still there, as before, but the node can't see the keyspace (we only have one).  The nodetool still says that each one is responsible for 33% of the keys, but the disk usage has dropped to a tiny amount on the nodes that we've restarted.  I saw this:
>> 
>> http://mail-archives.apache.org/mod_mbox/cassandra-user/201202.mbox/%3C4F3582E7.20907@conga.com%3E
>> 
>> Seems to be exactly our problem, but we have not modified the cassandra.yaml - we have overwritten it through an automated process, and that happened just before restarting, but the contents did not change.
>> 
>> Any ideas as to what might cause this, or how the keyspace can be restored (like I say, the data is all still in the data directory).
>> 
>> We're running in AWS.
>> 
>> Thanks,
>> 
>> 
>> Conan
>> 
> 
> 
>

Re: Keyspace lost after restart

Posted by Conan Cook <co...@amee.com>.

Hi,

OK, we're pretty sure we dropped and re-created the keyspace before
restarting the Cassandra nodes during some testing (we've been migrating to
a new cluster).  The keyspace was created via the cli:

create keyspace m7
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {us-east: 3}
  and durable_writes = true;


I'm pretty confident that it's a result of the issue I spotted before:

https://issues.apache.org/jira/browse/CASSANDRA-4219

Does anyone know whether this also affected versions before 1.1.0?  If not,
then we can just roll back until there's a fix; we're not using our cluster
in production, so we can afford to just bin it all and load it again.  +1
for this being a major issue though: the fact that you can't see it until
you restart a node makes it quite dangerous, and that node is lost when it
occurs (I also haven't been able to restore the schema in any way).

Thanks very much,


Conan



On 10 May 2012 17:15, Conan Cook <co...@amee.com> wrote:

> Hi Aaron,
>
> Thanks for getting back to me!  Yes, I believe our keyspace was created
> prior to 1.1, and I think I also understand why you're asking that, having
> found this:
>
> https://issues.apache.org/jira/browse/CASSANDRA-4219
>
> Here's our startup log:
>
> https://gist.github.com/2654155
>
> There isn't much in there of interest however.  It may well be the case
> that we created our keyspace, dropped it, then created it again.  The dev
> responsible for setting it up is ill today, but I'll get back to you
> tomorrow with exact details of how it was originally created and whether we
> did definitely drop and re-create it.
>
> Ta,
>
> Conan
>
>
> On 10 May 2012 11:43, aaron morton <aa...@thelastpickle.com> wrote:
>
>> Was this a schema that was created prior to 1.1 ?
>>
>> What process are you using to create the schema ?
>>
>> Can you share the logs from system startup ? Up until it logs "Listening
>> for thrift clients". (if they are long please link to them)
>>
>> Cheers
>>
>>   -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 10/05/2012, at 1:04 AM, Conan Cook wrote:
>>
>> Sorry, forgot to mention we're running Cassandra 1.1.
>>
>> Conan
>>
>> On 8 May 2012 17:51, Conan Cook <co...@amee.com> wrote:
>>
>>> Hi Cassandra Folk,
>>>
>>> We've experienced a problem a couple of times where Cassandra nodes lose
>>> a keyspace after a restart.  We've restarted 2 out of 3 nodes, and they
>>> have both experienced this problem; clearly we're doing something wrong,
>>> but don't know what.  The data files are all still there, as before, but
>>> the node can't see the keyspace (we only have one).  The nodetool still
>>> says that each one is responsible for 33% of the keys, but the disk usage
>>> has dropped to a tiny amount on the nodes that we've restarted.  I saw this:
>>>
>>>
>>> http://mail-archives.apache.org/mod_mbox/cassandra-user/201202.mbox/%3C4F3582E7.20907@conga.com%3E
>>>
>>> Seems to be exactly our problem, but we have not modified the
>>> cassandra.yaml - we have overwritten it through an automated process, and
>>> that happened just before restarting, but the contents did not change.
>>>
>>> Any ideas as to what might cause this, or how the keyspace can be
>>> restored (like I say, the data is all still in the data directory).
>>>
>>> We're running in AWS.
>>>
>>> Thanks,
>>>
>>>
>>> Conan
>>>
>>
>>
>>
>

Re: Keyspace lost after restart

Posted by Conan Cook <co...@amee.com>.

Hi Aaron,

Thanks for getting back to me!  Yes, I believe our keyspace was created
prior to 1.1, and I think I also understand why you're asking that, having
found this:

https://issues.apache.org/jira/browse/CASSANDRA-4219

Here's our startup log:

https://gist.github.com/2654155

There isn't much in there of interest however.  It may well be the case
that we created our keyspace, dropped it, then created it again.  The dev
responsible for setting it up is ill today, but I'll get back to you
tomorrow with exact details of how it was originally created and whether we
did definitely drop and re-create it.

Ta,

Conan


On 10 May 2012 11:43, aaron morton <aa...@thelastpickle.com> wrote:

> Was this a schema that was created prior to 1.1 ?
>
> What process are you using to create the schema ?
>
> Can you share the logs from system startup ? Up until it logs "Listening
> for thrift clients". (if they are long please link to them)
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 10/05/2012, at 1:04 AM, Conan Cook wrote:
>
> Sorry, forgot to mention we're running Cassandra 1.1.
>
> Conan
>
> On 8 May 2012 17:51, Conan Cook <co...@amee.com> wrote:
>
>> Hi Cassandra Folk,
>>
>> We've experienced a problem a couple of times where Cassandra nodes lose
>> a keyspace after a restart.  We've restarted 2 out of 3 nodes, and they
>> have both experienced this problem; clearly we're doing something wrong,
>> but don't know what.  The data files are all still there, as before, but
>> the node can't see the keyspace (we only have one).  The nodetool still
>> says that each one is responsible for 33% of the keys, but the disk usage
>> has dropped to a tiny amount on the nodes that we've restarted.  I saw this:
>>
>>
>> http://mail-archives.apache.org/mod_mbox/cassandra-user/201202.mbox/%3C4F3582E7.20907@conga.com%3E
>>
>> Seems to be exactly our problem, but we have not modified the
>> cassandra.yaml - we have overwritten it through an automated process, and
>> that happened just before restarting, but the contents did not change.
>>
>> Any ideas as to what might cause this, or how the keyspace can be
>> restored (like I say, the data is all still in the data directory).
>>
>> We're running in AWS.
>>
>> Thanks,
>>
>>
>> Conan
>>
>
>
>

Re: Keyspace lost after restart

Posted by aaron morton <aa...@thelastpickle.com>.

Was this a schema that was created prior to 1.1?

What process are you using to create the schema?

Can you share the logs from system startup, up until it logs "Listening for thrift clients"? (If they are long, please link to them.)

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 10/05/2012, at 1:04 AM, Conan Cook wrote:

> Sorry, forgot to mention we're running Cassandra 1.1.
> 
> Conan
> 
> On 8 May 2012 17:51, Conan Cook <co...@amee.com> wrote:
> Hi Cassandra Folk,
> 
> We've experienced a problem a couple of times where Cassandra nodes lose a keyspace after a restart.  We've restarted 2 out of 3 nodes, and they have both experienced this problem; clearly we're doing something wrong, but don't know what.  The data files are all still there, as before, but the node can't see the keyspace (we only have one).  The nodetool still says that each one is responsible for 33% of the keys, but the disk usage has dropped to a tiny amount on the nodes that we've restarted.  I saw this:
> 
> http://mail-archives.apache.org/mod_mbox/cassandra-user/201202.mbox/%3C4F3582E7.20907@conga.com%3E
> 
> Seems to be exactly our problem, but we have not modified the cassandra.yaml - we have overwritten it through an automated process, and that happened just before restarting, but the contents did not change.
> 
> Any ideas as to what might cause this, or how the keyspace can be restored (like I say, the data is all still in the data directory).
> 
> We're running in AWS.
> 
> Thanks,
> 
> 
> Conan
>

Re: Keyspace lost after restart

Posted by Conan Cook <co...@amee.com>.

Sorry, forgot to mention we're running Cassandra 1.1.

Conan

On 8 May 2012 17:51, Conan Cook <co...@amee.com> wrote:

> Hi Cassandra Folk,
>
> We've experienced a problem a couple of times where Cassandra nodes lose a
> keyspace after a restart.  We've restarted 2 out of 3 nodes, and they have
> both experienced this problem; clearly we're doing something wrong, but
> don't know what.  The data files are all still there, as before, but the
> node can't see the keyspace (we only have one).  The nodetool still says
> that each one is responsible for 33% of the keys, but the disk usage has
> dropped to a tiny amount on the nodes that we've restarted.  I saw this:
>
>
> http://mail-archives.apache.org/mod_mbox/cassandra-user/201202.mbox/%3C4F3582E7.20907@conga.com%3E
>
> Seems to be exactly our problem, but we have not modified the
> cassandra.yaml - we have overwritten it through an automated process, and
> that happened just before restarting, but the contents did not change.
>
> Any ideas as to what might cause this, or how the keyspace can be restored
> (like I say, the data is all still in the data directory).
>
> We're running in AWS.
>
> Thanks,
>
>
> Conan
>