Posted to user@cassandra.apache.org by Phil Burress <ph...@gmail.com> on 2014/04/16 20:10:14 UTC

Bootstrap Timing

Greetings,

How long does bootstrapping typically take? I have 3 existing nodes in our
cluster with about 40GB each. I've added three new nodes to the cluster.
They have been in bootstrap mode for a little over 3 days now. Should I be
concerned? Is there a way to tell how long it will take to finish?

Running Cassandra version 2.0.6 on Ubuntu 12.04.

Thanks very much!

Phil
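
As a rough sanity check on the sizes quoted above, the time needed just to
stream the data can be estimated from the data size and the streaming
throttle. The sketch below is only back-of-the-envelope arithmetic. The
function name is hypothetical, and the 200 Mbit/s default is an assumption
modeled on the stream_throughput_outbound_megabits_per_sec setting in
cassandra.yaml; check the value on your own nodes. It ignores compaction,
secondary index builds, and parallel streams from several source nodes.

```python
def streaming_time_hours(data_gb: float, throughput_mbit_s: float = 200.0) -> float:
    """Hours needed to move data_gb gigabytes at throughput_mbit_s megabits per second."""
    if throughput_mbit_s <= 0:
        raise ValueError("throughput must be positive")
    # Decimal units: 1 GB = 1000 MB, 1 byte = 8 bits.
    megabits = data_gb * 1000 * 8
    seconds = megabits / throughput_mbit_s
    return seconds / 3600


# 40 GB is the per-node size quoted in the thread; 200 Mbit/s is the assumed throttle.
print(f"{streaming_time_hours(40):.2f} hours")  # prints "0.44 hours"
```

By this estimate, streaming alone accounts for well under an hour, so a
bootstrap that runs for days is probably stuck somewhere other than the raw
data transfer.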

Re: Bootstrap Timing

Posted by Steven A Robenalt <sr...@stanford.edu>.

Interesting. I did our 2.0.3 -> 2.0.5 upgrade by bootstrapping/joining each
node into our cluster, one at a time, then retiring the old nodes one at a
time. Maybe something specific to the 2.0.6 release?

Good to hear that you've gotten through it anyway.

Steve



On Fri, Apr 25, 2014 at 7:49 AM, Phil Burress <ph...@gmail.com> wrote:

> Cassandra 2.0.6


-- 
Steve Robenalt
Software Architect
HighWire | Stanford University
425 Broadway St, Redwood City, CA 94063

srobenal@stanford.edu
http://highwire.stanford.edu

Re: Bootstrap Timing

Posted by Phil Burress <ph...@gmail.com>.

Cassandra 2.0.6


On Fri, Apr 25, 2014 at 10:31 AM, James Rothering <jr...@codojo.me> wrote:

> What version of C* is this?

Re: Bootstrap Timing

Posted by James Rothering <jr...@codojo.me>.

What version of C* is this?


On Fri, Apr 25, 2014 at 6:55 AM, Phil Burress <ph...@gmail.com> wrote:

> Just a follow-up on this for any interested parties. Ultimately we've
> determined that the bootstrap/join process is broken in Cassandra. We ended
> up creating an entirely new cluster and migrating the data.

Re: Bootstrap Timing

Posted by Phil Burress <ph...@gmail.com>.

Just a follow-up on this for any interested parties. Ultimately we've
determined that the bootstrap/join process is broken in Cassandra. We ended
up creating an entirely new cluster and migrating the data.


On Mon, Apr 21, 2014 at 10:32 AM, Phil Burress <ph...@gmail.com> wrote:

> The new node has managed to stay up without dying for about 24 hours
> now... but it still is in JOINING state. A new concern has popped up. Disk
> usage is at 500GB on the new node. The three original nodes have about 40GB
> each. Any ideas why this is happening?

Re: Bootstrap Timing

Posted by Phil Burress <ph...@gmail.com>.

The new node has managed to stay up without dying for about 24 hours now...
but it still is in JOINING state. A new concern has popped up. Disk usage
is at 500GB on the new node. The three original nodes have about 40GB each.
Any ideas why this is happening?


On Sat, Apr 19, 2014 at 9:19 PM, Phil Burress <ph...@gmail.com> wrote:

> Thank you all for your advice and good info. The node has died a couple of
> times with out-of-memory errors. I've restarted each time, but it starts
> re-running compaction and then dies again.
>
> Is there a better way to do this?

Re: Bootstrap Timing

Posted by Phil Burress <ph...@gmail.com>.

Thank you all for your advice and good info. The node has died a couple of
times with out-of-memory errors. I've restarted each time, but it starts
re-running compaction and then dies again.

Is there a better way to do this?

On Apr 18, 2014 6:06 PM, "Steven A Robenalt" <sr...@stanford.edu> wrote:

> That's what I'd be doing, but I wouldn't expect it to run for 3 days this
> time. My guess is that whatever was going wrong with the bootstrap when you
> had 3 nodes starting at once was interfering with the completion of the 1
> remaining node of those 3. A clean bootstrap of a single node should
> complete eventually, and I would think it'll be a lot less than 3 days. Our
> database is much smaller than yours at the moment, so I can't really guide
> you on how long it should take, but I'd think that others on the list with
> similar database sizes might be able to give you a better idea.
>
> Steve

Re: Bootstrap Timing

Posted by Steven A Robenalt <sr...@stanford.edu>.

That's what I'd be doing, but I wouldn't expect it to run for 3 days this
time. My guess is that whatever was going wrong with the bootstrap when you
had 3 nodes starting at once was interfering with the completion of the 1
remaining node of those 3. A clean bootstrap of a single node should
complete eventually, and I would think it'll be a lot less than 3 days. Our
database is much smaller than yours at the moment, so I can't really guide
you on how long it should take, but I'd think that others on the list with
similar database sizes might be able to give you a better idea.

Steve



On Fri, Apr 18, 2014 at 1:43 PM, Phil Burress <ph...@gmail.com> wrote:

> First, I just stopped 2 of the nodes and left one running. But this
> morning, I stopped that third node, cleared out the data, restarted it, and
> let it rejoin. It appears streaming is done (according to netstats); right
> now it appears to be running compaction and a secondary index build
> (according to compactionstats). Just sit and wait, I guess?



Re: Bootstrap Timing

Posted by Phil Burress <ph...@gmail.com>.

First, I just stopped 2 of the nodes and left one running. But this
morning, I stopped that third node, cleared out the data, restarted it, and
let it rejoin. It appears streaming is done (according to netstats); right
now it appears to be running compaction and a secondary index build
(according to compactionstats). Just sit and wait, I guess?
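
For anyone who wants to keep an eye on the same two commands while waiting,
here is a minimal sketch that collects their raw output in one place. It
assumes nodetool is on the PATH of the joining node; the helper names
run_nodetool and snapshot are hypothetical, and no attempt is made to parse
the output, since its layout varies between Cassandra versions.

```python
import subprocess
from typing import Callable, Dict, Sequence

# The two nodetool subcommands mentioned in this message.
SUBCOMMANDS = ("netstats", "compactionstats")


def run_nodetool(subcommand: str) -> str:
    """Run `nodetool <subcommand>` and return its standard output as text."""
    result = subprocess.run(
        ["nodetool", subcommand],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if nodetool exits non-zero
    )
    return result.stdout


def snapshot(
    runner: Callable[[str], str] = run_nodetool,
    subcommands: Sequence[str] = SUBCOMMANDS,
) -> Dict[str, str]:
    """Return a mapping of subcommand name to its raw output."""
    return {name: runner(name) for name in subcommands}


# Example use on a node (needs a running Cassandra instance):
#     for name, output in snapshot().items():
#         print(f"== nodetool {name} ==")
#         print(output)
```

The runner argument exists only so the collection logic can be exercised
without a live cluster. Calling snapshot() periodically and comparing the
results shows whether streaming or compaction is still making progress.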


On Fri, Apr 18, 2014 at 2:23 PM, Steven A Robenalt <sr...@stanford.edu> wrote:

> Looking back through this email chain, it looks like Phil said he wasn't
> using vnodes.
>
> For the record, we have been using vnodes since we brought up our first cluster,
> and have not seen any issues with bootstrapping new nodes either to replace
> existing nodes, or to grow/shrink the cluster. We did adhere to the caveats
> that new nodes should not be seed nodes, and that we should allow each node
> to join the cluster completely before making any other changes.
>
> Phil, when you dropped to adding just the single node to your cluster, did
> you start over with the newly added node (blowing away the database created
> on the previous startup), or did you shut down the other 2 added nodes and
> leave the remaining one in progress to continue?
>
> Steve
>
>
> On Fri, Apr 18, 2014 at 10:40 AM, Robert Coli <rc...@eventbrite.com>wrote:
>
>> On Fri, Apr 18, 2014 at 5:05 AM, Phil Burress <ph...@gmail.com>wrote:
>>
>>> nodetool netstats shows 84 files. They are all at 100%. Nothing showing
>>> in Pending or Active for Read Repair Stats.
>>>
>>> I'm assuming this means it's done. But it still shows "JOINING". Is
>>> there an undocumented step I'm missing here? This whole process seems
>>> broken to me.
>>>
>>
>> Lately it seems like a lot more people than usual are :
>>
>> 1) using vnodes
>> 2) unable to bootstrap new nodes
>>
>> If I were you, I would likely file a JIRA detailing your negative
>> experience with this core functionality.
>>
>> =Rob
>>
>>
>>

Re: Bootstrap Timing

Posted by Steven A Robenalt <sr...@stanford.edu>.

Looking back through this email chain, it looks like Phil said he wasn't
using vnodes.

For the record, we have been using vnodes since we brought up our first
cluster, and have not seen any issues with bootstrapping new nodes, whether
to replace existing nodes or to grow/shrink the cluster. We did adhere to
the caveats that new nodes should not be seed nodes, and that each node
should be allowed to join the cluster completely before any other changes
are made.

Phil, when you dropped to adding just the single node to your cluster, did
you start over with the newly added node (blowing away the database created
on the previous startup), or did you shut down the other 2 added nodes and
leave the remaining one in progress to continue?

Steve

On Fri, Apr 18, 2014 at 10:40 AM, Robert Coli <rc...@eventbrite.com> wrote:

> On Fri, Apr 18, 2014 at 5:05 AM, Phil Burress <ph...@gmail.com> wrote:
>
>> nodetool netstats shows 84 files. They are all at 100%. Nothing showing
>> in Pending or Active for Read Repair Stats.
>>
>> I'm assuming this means it's done. But it still shows "JOINING". Is there
>> an undocumented step I'm missing here? This whole process seems broken to
>> me.
>>
>
> Lately it seems like a lot more people than usual are :
>
> 1) using vnodes
> 2) unable to bootstrap new nodes
>
> If I were you, I would likely file a JIRA detailing your negative
> experience with this core functionality.
>
> =Rob
>
>
>

Re: Bootstrap Timing

Posted by Robert Coli <rc...@eventbrite.com>.

On Fri, Apr 18, 2014 at 5:05 AM, Phil Burress <ph...@gmail.com> wrote:

> nodetool netstats shows 84 files. They are all at 100%. Nothing showing in
> Pending or Active for Read Repair Stats.
>
> I'm assuming this means it's done. But it still shows "JOINING". Is there
> an undocumented step I'm missing here? This whole process seems broken to
> me.
>

Lately it seems like a lot more people than usual are:

1) using vnodes
2) unable to bootstrap new nodes

If I were you, I would likely file a JIRA detailing your negative
experience with this core functionality.

=Rob

Re: Bootstrap Timing

Posted by Mark Reddy <ma...@boxever.com>.

If all streams have completed, the node could still be rebuilding secondary
indexes. Try checking 'nodetool compactionstats' for this.
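
A small sketch of how those two checks might be combined in a script. It
assumes that `nodetool netstats` prints a line beginning `Mode:` and that
`nodetool compactionstats` prints a line beginning `pending tasks:`; both
formats are assumptions here and should be verified against the output of
your own version. The function names are hypothetical.

```python
import subprocess


def run_nodetool(subcommand):
    """Run a nodetool subcommand and return its stdout as text."""
    result = subprocess.run(["nodetool", subcommand],
                            capture_output=True, text=True, check=True)
    return result.stdout


def parse_field(output, label):
    """Return the text after 'label:' on the first matching line, or None."""
    prefix = label.lower() + ":"
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith(prefix):
            return stripped[len(prefix):].strip()
    return None


def describe_join_state(netstats_output, compactionstats_output):
    """Summarize whether a node is still joining and has queued compactions."""
    mode = parse_field(netstats_output, "Mode")              # assumed line format
    pending = parse_field(compactionstats_output, "pending tasks")
    pending_count = int(pending) if pending and pending.isdigit() else None
    return {"mode": mode, "pending_compactions": pending_count}


# Usage on a node where nodetool is on the PATH:
#   describe_join_state(run_nodetool("netstats"),
#                       run_nodetool("compactionstats"))
```

A node that still reports a joining mode with pending tasks above zero is
probably still compacting or building indexes rather than stuck, though
this check alone cannot prove it.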


On Fri, Apr 18, 2014 at 1:05 PM, Phil Burress <ph...@gmail.com> wrote:

> nodetool netstats shows 84 files. They are all at 100%. Nothing showing in
> Pending or Active for Read Repair Stats.
>
> I'm assuming this means it's done. But it still shows "JOINING". Is there
> an undocumented step I'm missing here? This whole process seems broken to
> me.
>
>
> On Thu, Apr 17, 2014 at 4:32 PM, Robert Coli <rc...@eventbrite.com> wrote:
>
>> On Wed, Apr 16, 2014 at 1:56 PM, Phil Burress <ph...@gmail.com> wrote:
>>
>>> I've shut down two of the nodes and am bootstrapping one right now. Is
>>> there any way to tell when it will finish bootstrapping?
>>>
>>
>> "nodetool netstats" will show the progress of the streams involved, which
>> could help you estimate.
>>
>> =Rob
>>
>>
>

Re: Bootstrap Timing

Posted by Phil Burress <ph...@gmail.com>.

nodetool netstats shows 84 files. They are all at 100%. Nothing showing in
Pending or Active for Read Repair Stats.

I'm assuming this means it's done. But it still shows "JOINING". Is there
an undocumented step I'm missing here? This whole process seems broken to
me.

On Thu, Apr 17, 2014 at 4:32 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Wed, Apr 16, 2014 at 1:56 PM, Phil Burress <ph...@gmail.com> wrote:
>
>> I've shut down two of the nodes and am bootstrapping one right now. Is
>> there any way to tell when it will finish bootstrapping?
>>
>
> "nodetool netstats" will show the progress of the streams involved, which
> could help you estimate.
>
> =Rob
>
>

Re: Bootstrap Timing

Posted by Robert Coli <rc...@eventbrite.com>.

On Wed, Apr 16, 2014 at 1:56 PM, Phil Burress <ph...@gmail.com> wrote:

> I've shut down two of the nodes and am bootstrapping one right now. Is
> there any way to tell when it will finish bootstrapping?
>

"nodetool netstats" will show the progress of the streams involved, which
could help you estimate.

=Rob
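
As a rough illustration of the kind of estimate described above, the sketch
below extrapolates a finish time from two progress readings taken some
minutes apart. The function name, the byte counts, and the constant-rate
assumption are all hypothetical; the readings themselves would have to be
taken from `nodetool netstats`, and streaming is only one phase of a
bootstrap (compaction and secondary index builds can follow).

```python
def estimate_seconds_remaining(bytes_done_then, bytes_done_now,
                               bytes_total, seconds_between):
    """Extrapolate remaining streaming time from two progress samples.

    Assumes a constant transfer rate, which real streams rarely sustain,
    so treat the result as a rough guide only.
    """
    if seconds_between <= 0:
        raise ValueError("samples must be taken at different times")
    transferred = bytes_done_now - bytes_done_then
    if transferred <= 0:
        return None  # no progress between samples; cannot extrapolate
    rate = transferred / seconds_between          # bytes per second
    remaining = max(bytes_total - bytes_done_now, 0)
    return remaining / rate


# Hypothetical readings: 10 GB done, then 12 GB done five minutes later,
# out of 40 GB expected in total.
GB = 1024 ** 3
eta = estimate_seconds_remaining(10 * GB, 12 * GB, 40 * GB, 300)
print(f"about {eta / 3600:.1f} hours of streaming left")
```

A `None` result (no progress between samples) is itself useful: it suggests
the streams have stalled or finished, and the logs deserve a look.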

Re: Bootstrap Timing

Posted by Phil Burress <ph...@gmail.com>.

I've shut down two of the nodes and am bootstrapping one right now. Is
there any way to tell when it will finish bootstrapping?


On Wed, Apr 16, 2014 at 2:56 PM, Phil Burress <ph...@gmail.com> wrote:

> Thanks!
>
>
> On Wed, Apr 16, 2014 at 2:50 PM, Robert Coli <rc...@eventbrite.com> wrote:
>
>> On Wed, Apr 16, 2014 at 11:16 AM, Phil Burress <ph...@gmail.com> wrote:
>>
>>> Thanks very much for the response. I'm not using vnodes, does that
>>> matter?
>>>
>>
>> Not in your case. In some cases it is safe to bootstrap multiple nodes
>> into a cluster at once AT SPECIFIC TOKENS, because there is more than one
>> replica set to bootstrap them into safely. Even in this case, it is not
>> recommended.
>>
>>
>>> For the new nodes, do I add all three existing nodes as seeds? Or just
>>> add one?
>>
>>
>> One should be sufficient, but all three could not hurt.
>>
>> =Rob
>>
>>
>

Re: Bootstrap Timing

Posted by Phil Burress <ph...@gmail.com>.

Thanks!


On Wed, Apr 16, 2014 at 2:50 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Wed, Apr 16, 2014 at 11:16 AM, Phil Burress <ph...@gmail.com> wrote:
>
>> Thanks very much for the response. I'm not using vnodes, does that
>> matter?
>>
>
> Not in your case. In some cases it is safe to bootstrap multiple nodes
> into a cluster at once AT SPECIFIC TOKENS, because there is more than one
> replica set to bootstrap them into safely. Even in this case, it is not
> recommended.
>
>
>> For the new nodes, do I add all three existing nodes as seeds? Or just
>> add one?
>
>
> One should be sufficient, but all three could not hurt.
>
> =Rob
>
>

Re: Bootstrap Timing

Posted by Robert Coli <rc...@eventbrite.com>.

On Wed, Apr 16, 2014 at 11:16 AM, Phil Burress <ph...@gmail.com> wrote:

> Thanks very much for the response. I'm not using vnodes, does that matter?
>

Not in your case. In some cases it is safe to bootstrap multiple nodes into
a cluster at once AT SPECIFIC TOKENS, because there is more than one
replica set to bootstrap them into safely. Even in this case, it is not
recommended.

> For the new nodes, do I add all three existing nodes as seeds? Or just add
> one?

One should be sufficient, but listing all three would not hurt.

=Rob

Re: Bootstrap Timing

Posted by Ken Hancock <ke...@schange.com>.

Seed nodes don't bootstrap.

https://issues.apache.org/jira/browse/CASSANDRA-5836
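
One way to guard against this is to check, before first start, that a new
node's own address is not in its seed list. A minimal sketch, assuming the
seeds are given as the comma-separated string used in cassandra.yaml; the
function name and the addresses are hypothetical.

```python
def is_own_seed(node_address, seeds):
    """Return True if node_address appears in the comma-separated seeds string."""
    seed_list = [s.strip() for s in seeds.split(",") if s.strip()]
    return node_address.strip() in seed_list


# Hypothetical addresses: three existing nodes act as seeds for the new node.
existing_seeds = "10.0.0.1, 10.0.0.2, 10.0.0.3"
new_node = "10.0.0.4"

if is_own_seed(new_node, existing_seeds):
    print("remove this node from its own seed list before bootstrapping")
else:
    print("seed list does not include the new node")
```

The comparison is a plain string match, so it will not notice the same host
listed once by name and once by IP address.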




On Wed, Apr 16, 2014 at 2:17 PM, Phil Burress <ph...@gmail.com> wrote:

> Also, one more quick question. For the new nodes, do I add all three
> existing nodes as seeds? Or just add one?
>
>
> On Wed, Apr 16, 2014 at 2:16 PM, Phil Burress <ph...@gmail.com> wrote:
>
>> Thanks very much for the response. I'm not using vnodes, does that matter?
>>
>>
>> On Wed, Apr 16, 2014 at 2:13 PM, Robert Coli <rc...@eventbrite.com> wrote:
>>
>>> On Wed, Apr 16, 2014 at 11:10 AM, Phil Burress <philburresseme@gmail.com> wrote:
>>>
>>>> How long does bootstrapping typically take? I have 3 existing nodes in
>>>> our cluster with about 40GB each. I've added three new nodes to the
>>>> cluster. They have been in bootstrap mode for a little over 3 days now.
>>>> Should I be concerned? Is there a way to tell how long it will take to
>>>> finish?
>>>>
>>>
>>> Adding more than one node at a time to a cluster (especially with
>>> vnodes) is Not Supported. If I were you, I would stop all 3 bootstraps and
>>> then do one at a time.
>>>
>>>  =Rob
>>>
>>>
>>
>>
>


-- 
Ken Hancock | System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hancock@schange.com | www.schange.com

Office: +1 (978) 889-3329

Re: Bootstrap Timing

Posted by Phil Burress <ph...@gmail.com>.

Also, one more quick question. For the new nodes, do I add all three
existing nodes as seeds? Or just add one?


On Wed, Apr 16, 2014 at 2:16 PM, Phil Burress <ph...@gmail.com> wrote:

> Thanks very much for the response. I'm not using vnodes, does that matter?
>
>
> On Wed, Apr 16, 2014 at 2:13 PM, Robert Coli <rc...@eventbrite.com> wrote:
>
>> On Wed, Apr 16, 2014 at 11:10 AM, Phil Burress <ph...@gmail.com> wrote:
>>
>>> How long does bootstrapping typically take? I have 3 existing nodes in
>>> our cluster with about 40GB each. I've added three new nodes to the
>>> cluster. They have been in bootstrap mode for a little over 3 days now.
>>> Should I be concerned? Is there a way to tell how long it will take to
>>> finish?
>>>
>>
>> Adding more than one node at a time to a cluster (especially with vnodes)
>> is Not Supported. If I were you, I would stop all 3 bootstraps and then do
>> one at a time.
>>
>>  =Rob
>>
>>
>
>

Re: Bootstrap Timing

Posted by Phil Burress <ph...@gmail.com>.

Thanks very much for the response. I'm not using vnodes, does that matter?


On Wed, Apr 16, 2014 at 2:13 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Wed, Apr 16, 2014 at 11:10 AM, Phil Burress <ph...@gmail.com> wrote:
>
>> How long does bootstrapping typically take? I have 3 existing nodes in
>> our cluster with about 40GB each. I've added three new nodes to the
>> cluster. They have been in bootstrap mode for a little over 3 days now.
>> Should I be concerned? Is there a way to tell how long it will take to
>> finish?
>>
>
> Adding more than one node at a time to a cluster (especially with vnodes)
> is Not Supported. If I were you, I would stop all 3 bootstraps and then do
> one at a time.
>
> =Rob
>
>

Re: Bootstrap Timing

Posted by Robert Coli <rc...@eventbrite.com>.

On Wed, Apr 16, 2014 at 11:10 AM, Phil Burress <ph...@gmail.com> wrote:

> How long does bootstrapping typically take? I have 3 existing nodes in our
> cluster with about 40GB each. I've added three new nodes to the cluster.
> They have been in bootstrap mode for a little over 3 days now. Should I be
> concerned? Is there a way to tell how long it will take to finish?
>

Adding more than one node at a time to a cluster (especially with vnodes)
is Not Supported. If I were you, I would stop all 3 bootstraps and then do
one at a time.

=Rob
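
The one-at-a-time procedure can be sketched as a loop that starts a node
and waits for it to finish joining before touching the next. Everything
here is a hypothetical outline: `start_node` and `get_mode` are
caller-supplied functions (how you start Cassandra and read a node's mode
depends on your setup), and the string "NORMAL" is assumed to be the mode a
node reports once it has joined.

```python
import time


def bootstrap_sequentially(nodes, start_node, get_mode,
                           poll_seconds=60, sleep=time.sleep):
    """Start each node in turn, waiting until it reports NORMAL mode.

    start_node(node) and get_mode(node) are supplied by the caller.
    Returns the nodes in the order they finished joining.
    """
    joined = []
    for node in nodes:
        start_node(node)
        # Do not begin the next bootstrap until this node has fully joined.
        while get_mode(node) != "NORMAL":
            sleep(poll_seconds)
        joined.append(node)
    return joined
```

The loop has no timeout, so a node that never leaves the joining state will
block it forever; a real script would want a deadline and an alert.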