Posted to user@cassandra.apache.org by Chip Salzenberg <re...@gmail.com> on 2010/11/06 02:03:46 UTC

node won't leave

In the "nodetool ring" output below, machine 18 was told to loadbalance over
an hour ago.  It won't actually leave the ring.  When I first told it to
loadbalance, the cluster was under heavy write load; I've since turned off
the write load, but the node still won't leave.  Help?

(It also collected 3.6G of load even though automatic bootstrapping is
disabled -- but this node had belonged to the cluster before, so maybe
cleaning out /var/lib/cassandra/* wasn't enough to prevent the node from
rejoining and taking data responsibility?)

$ nt ring
Address    Status State   Load       Token
                                     170141183460469231731687303715884105727
X.22       Up     Normal  561.98 GB  42535295865117307932921825928971026431
X.18       Up     Leaving 3.68 GB    57974296897141728657026715086733697169
X.21       Up     Normal  523.78 GB  85070591730234615865843651857942052862
X.20       Up     Normal  539.24 GB  127605887595351923798765477786913079293
X.19       Up     Normal  505.12 GB  170141183460469231731687303715884105727

$ nodetool -h X.18 netstats
Mode: Leaving: streaming data to other nodes
Nothing streaming to /X.21
Nothing streaming to /X.20
Not receiving any streams.
Pool Name                    Active   Pending      Completed
Commands                        n/a         0          12246
Responses                       n/a         0       22595947
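
A minimal watch loop for this situation, with X.18 standing in for the
node's real address as elsewhere in this message: poll the ring until the
node drops out of Leaving, dumping stream activity on each pass.

$ while nodetool -h X.18 ring | grep -q Leaving; do
      nodetool -h X.18 netstats    # show what, if anything, is streaming
      sleep 60
  done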

Re: node won't leave

Posted by Chip Salzenberg <re...@gmail.com>.
On Sun, Nov 7, 2010 at 11:58 PM, Reverend Chip <re...@gmail.com> wrote:

> Is there an existing tool to read everything from every node, just to
> force a read repair on everything?
>

"nodetool repair", of course.  me-- for getting FAQ and mailing list out of
order.
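
A sketch of running it cluster-wide, one node at a time so only one node
carries the repair load at once (host names are the placeholders used
earlier in the thread):

$ for h in X.18 X.19 X.20 X.21 X.22; do
      nodetool -h $h repair    # anti-entropy repair on $h
  done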

Re: node won't leave

Posted by Reverend Chip <re...@gmail.com>.
On 11/6/2010 8:26 PM, Jonathan Ellis wrote:
> On Sat, Nov 6, 2010 at 4:51 PM, Reverend Chip <re...@gmail.com> wrote:
>>  Am I to understand that
>> ring maintenance requests can just fail when partially complete, in the
>> same manner as a regular insert might fail, perhaps due to inter-node
>> RPC overflow?
> Yes, in beta3 this can happen.  This was fixed in CASSANDRA-1676.

Interesting.  I'll upgrade.  Meanwhile, I waited until the next day (!),
and the cluster re-formed itself at some point in the interim.  I'm now
seeing streaming to the new node.

Is there an existing tool to read everything from every node, just to
force a read repair on everything?


>> It would appear, then, that Cassandra isn't designed to be operated and
>> understood without constant log watching of all nodes.
> Not in beta, it's not. :)
>
> (In fact I would recommend running beta nodes at debug log level so
> when something goes wrong you have a better picture of what happened.)

Point taken :-)


Re: node won't leave

Posted by Jonathan Ellis <jb...@gmail.com>.
On Sat, Nov 6, 2010 at 4:51 PM, Reverend Chip <re...@gmail.com> wrote:
> On 11/6/2010 1:48 PM, Jonathan Ellis wrote:
>>   Did any of the nodes log any dropped messages?
>
> I didn't keep timestamps of the maintenance steps, so I can't be sure
> which log entries correspond to which failure state.  I did find
> dropped-message log entries on node X.22, though.  Here's the batch
> from around the time things went wrong:
>
>  WARN [ScheduledTasks:1] 2010-11-05 17:15:03,294 MessagingService.java
> (line 515) Dropped 9122 messages in the last 1000ms

> Am I to understand that
> ring maintenance requests can just fail when partially complete, in the
> same manner as a regular insert might fail, perhaps due to inter-node
> RPC overflow?

Yes, in beta3 this can happen.  This was fixed in CASSANDRA-1676.

> It would appear, then, that Cassandra isn't designed to be operated and
> understood without constant log watching of all nodes.

Not in beta, it's not. :)

(In fact I would recommend running beta nodes at debug log level so
when something goes wrong you have a better picture of what happened.)
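
A sketch of that change, assuming the stock 0.7
conf/log4j-server.properties and its default appender names:

# conf/log4j-server.properties: raise the root logger from the default
# INFO to DEBUG, then restart the node so it takes effect.
log4j.rootLogger=DEBUG,stdout,R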

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: node won't leave

Posted by Reverend Chip <re...@gmail.com>.
On 11/6/2010 1:48 PM, Jonathan Ellis wrote:
> On Fri, Nov 5, 2010 at 8:03 PM, Chip Salzenberg <re...@gmail.com> wrote:
> In the "nodetool ring" output below, machine 18 was told to loadbalance over
> an hour ago.  It won't actually leave the ring.  When I first told it to
> loadbalance, the cluster was under heavy write load; I've since turned off
> the write load, but the node still won't leave.  Help?
> What version is the cluster on?

You mean, the Cassandra version?  0.7 beta3.

>   Did any of the nodes log any dropped messages?

I didn't keep timestamps of the maintenance steps, so I can't be sure
which log entries correspond to which failure state.  I did find
dropped-message log entries on node X.22, though.  Here's the batch
from around the time things went wrong:

 WARN [ScheduledTasks:1] 2010-11-05 17:15:03,294 MessagingService.java
(line 515) Dropped 9122 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:05,434 MessagingService.java
(line 515) Dropped 16658 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:07,084 MessagingService.java
(line 515) Dropped 2167 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:09,371 MessagingService.java
(line 515) Dropped 28011 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:11,111 MessagingService.java
(line 515) Dropped 1139 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:13,330 MessagingService.java
(line 515) Dropped 1203 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:15,241 MessagingService.java
(line 515) Dropped 4494 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:16,925 MessagingService.java
(line 515) Dropped 2277 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:18,839 MessagingService.java
(line 515) Dropped 17376 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:23,385 MessagingService.java
(line 515) Dropped 18714 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:25,261 MessagingService.java
(line 515) Dropped 18952 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:29,006 MessagingService.java
(line 515) Dropped 25137 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:30,859 MessagingService.java
(line 515) Dropped 1 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:34,418 MessagingService.java
(line 515) Dropped 2580 messages in the last 1000ms
 WARN [ScheduledTasks:1] 2010-11-05 17:15:35,816 MessagingService.java
(line 515) Dropped 4317 messages in the last 1000ms

I looked for similar messages on node X.21 but didn't find any.
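
A quick way to scan the whole cluster for these is to count them in each
node's system log; the log path here is an assumption (a common packaged
layout) and may differ per install:

$ for h in X.18 X.19 X.20 X.21 X.22; do
      echo "== $h =="
      ssh $h 'grep -c "Dropped .* messages" /var/log/cassandra/system.log'
  done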

It seems that node states can become weird or wedged -- bordering on
internally inconsistent -- and cleanup operations on the order of
"shutdown the node manually and force-remove it from the ring" are
commonplace.  I hope I'm missing something.  Am I to understand that
ring maintenance requests can just fail when partially complete, in the
same manner as a regular insert might fail, perhaps due to inter-node
RPC overflow?
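
For reference, that force-removal amounts to stopping the stuck node and
then telling any live node to drop its token; a sketch, assuming 0.7's
removetoken subcommand and using X.18's token from the ring output above:

$ nodetool -h X.21 removetoken 57974296897141728657026715086733697169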

> Any other error or warning messages?

"Cannot provide an optimal BloomFilter" several times, and "Schema
definitions were defined both locally and in cassandra.yaml" on startup.

>> (It also collected 3.6G of load even though automatic bootstrapping is
>> disabled -- but this node had belonged to the cluster before, so maybe
>> cleaning out /var/lib/cassandra/* wasn't enough to prevent the node from
>> rejoining and taking data responsibility?)
> Assuming that contains both the commitlog and data directories, that
> should do it.  You can tell from what it logs when it first starts up
> whether it's asking other nodes to send it data.

It would appear, then, that Cassandra isn't designed to be operated and
understood without constant log watching of all nodes.


Re: node won't leave

Posted by Jonathan Ellis <jb...@gmail.com>.
On Fri, Nov 5, 2010 at 8:03 PM, Chip Salzenberg <re...@gmail.com> wrote:
> In the "nodetool ring" output below, machine 18 was told to loadbalance over
> an hour ago.  It won't actually leave the ring.  When I first told it to
> loadbalance, the cluster was under heavy write load; I've since turned off
> the write load, but the node still won't leave.  Help?

What version is the cluster on?  Did any of the nodes log any dropped
messages?  Any other error or warning messages?

> (It also collected 3.6G of load even though automatic bootstrapping is
> disabled -- but this node had belonged to the cluster before, so maybe
> cleaning out /var/lib/cassandra/* wasn't enough to prevent the node from
> rejoining and taking data responsibility?)

Assuming that contains both the commitlog and data directories, that
should do it.  You can tell from what it logs when it first starts up
whether it's asking other nodes to send it data.
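
A sketch of a full wipe, assuming the stock 0.7 cassandra.yaml locations
(data_file_directories, commitlog_directory, saved_caches_directory);
adjust to match the node's actual configuration:

$ rm -rf /var/lib/cassandra/data \
         /var/lib/cassandra/commitlog \
         /var/lib/cassandra/saved_caches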

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com