Posted to commits@cassandra.apache.org by "Keith Wright (JIRA)" <ji...@apache.org> on 2014/02/04 19:20:13 UTC

[jira] [Commented] (CASSANDRA-6156) Poor resilience and recovery for bootstrapping node - "unable to fetch range"

    [ https://issues.apache.org/jira/browse/CASSANDRA-6156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890964#comment-13890964 ] 

Keith Wright commented on CASSANDRA-6156:
-----------------------------------------

We are still experiencing this issue, and it's blocking us from growing our cluster.  We are running C* 1.2.13 with vnodes, roughly 1 TB per node, and the DataStax client.  When we attempt to bootstrap a new node, we have seen existing nodes become unresponsive (we are still investigating the cause), which causes the receiving node to give up on its pending streams.  It appears that for a bootstrap to succeed, every node involved must remain stable for the entire time it takes to stream the data; with vnodes in a large cluster that window is long, so the probability of at least one failure goes up accordingly.
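
For context, the cassandra.yaml knobs that look most relevant on the 1.2 line are the failure-detector threshold and the streaming socket timeout.  A rough sketch is below; the values are purely illustrative and the comments paraphrase the stock cassandra.yaml documentation, not anything validated on this cluster:

    # Failure detector sensitivity.  Higher values make peers slower to be
    # declared dead during long GC pauses (8 is the default; values above 12
    # are generally discouraged).
    phi_convict_threshold: 12

    # Socket timeout for streaming.  The default of 0 never times out, so a
    # stream stuck on an unresponsive peer is never retried; a non-zero value
    # causes the current file to be re-streamed after the timeout.
    streaming_socket_timeout_in_ms: 3600000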

Is this addressed in 2.0, and if so, how could we apply that fix to our production 1.2.13 environment?  Is there a way to force the bootstrapping node to re-attempt the failed streams?  After all, we can still see them in netstats; a rough sketch of what we mean follows.
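
To make the ask concrete, the sketch below shows what restarting a failed bootstrap from scratch typically looks like on 1.2; the service name and directory paths are placeholders for a default-style install:

    # On the joining node, check whether the pending streams are still
    # making progress.
    nodetool netstats

    # If the bootstrap has already given up, the usual 1.2-era recovery seems
    # to be the blunt one: stop the node, wipe its state, and bootstrap again
    # from scratch (auto_bootstrap defaults to true).
    sudo service cassandra stop
    sudo rm -rf /var/lib/cassandra/data/* \
                /var/lib/cassandra/commitlog/* \
                /var/lib/cassandra/saved_caches/*
    sudo service cassandra start

What we are hoping for is something less drastic: a way to make the already-joining node retry just the ranges whose streams failed, rather than repeating the wipe-and-rejoin cycle.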

> Poor resilience and recovery for bootstrapping node - "unable to fetch range"
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6156
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6156
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Alyssa Kwan
>            Priority: Minor
>
> We have an 8 node cluster on 1.2.8 using vnodes.  One of our nodes failed and we are having lots of trouble bootstrapping it back.  On each attempt, bootstrapping eventually fails with a RuntimeException "Unable to fetch range".  As far as we can tell, long GC pauses on the sender side cause heartbeat drops or delays, which leads the gossip controller to convict the connection and mark the sender dead.  We've done significant GC tuning to minimize the duration of pauses and raised phi_convict to its max.  It merely lets the bootstrap process take longer to fail.
> The inability to reliably add nodes significantly affects our ability to scale.
> We're not the only ones:  http://stackoverflow.com/questions/19199349/cassandra-bootstrap-fails-with-unable-to-fetch-range
> What can we do in the immediate term to bring this node in?  And what's the long term solution?
> One possible solution would be to allow bootstrapping to be an incremental process with individual transfers of vnode ownership instead of attempting to transfer the whole set of vnodes transactionally.  (I assume that's what's happening now.)  I don't know what would have to change on the gossip and token-aware client side to support this.
> Another solution would be to partition sstable files by vnode and allow those files to be transferred directly, with some form of checkpointing and incremental transfer of the writes that arrive after each sstable is transferred.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)