Posted to user@cassandra.apache.org by Mateusz Korniak <ma...@ant.gliwice.pl> on 2012/01/09 08:20:25 UTC
[0.8.x] Node join stuck with all network transfers done
Hi!
I have a problem with a 0.8.7 node joining a cluster of two 0.8.9 nodes (RF=2).
It seems all transfers were done, but the joining node (.17) does not change its
state [3].
What is strange is the "Nothing streaming from /192.168.3.8" netstats result [2]
and the still rising number of pending tasks [1], while .8 is not transferring
anything [4].
I tried restarting each of the nodes. It didn't help, except that the joining
process started again, transferred all the data again, and got stuck at the same
point, but with doubled "load" on the joining node.
Any hints? Thanks in advance, regards.
[1]:
@192.168.3.17 ~]$ nodetool -h localhost compactionstats
pending tasks: 836
[2]:
@192.168.3.17 ~]$ nodetool -h localhost netstats
Mode: Bootstrapping
Not sending any streams.
Nothing streaming from /192.168.3.8
Pool Name                    Active   Pending      Completed
Commands                        n/a         0            171
Responses                       n/a         0        9387965
[3]:
@192.168.3.17 ~]$ nodetool -h localhost ring
Address         DC          Rack   Status State    Load       Owns    Token
                                                                      113427455640312821154458202477256070485
192.168.3.8     datacenter1 rack1  Up     Normal   128.51 GB  33.33%  0
192.168.3.7     datacenter1 rack1  Up     Normal   137.65 GB  50.00%  85070591730234615865843651857942052864
192.168.3.17    datacenter1 rack1  Up     Joining  127.02 GB  16.67%  113427455640312821154458202477256070485
[4]:
@192.168.3.8 ~]$ nodetool -h localhost netstats
Mode: Normal
Not sending any streams.
Not receiving any streams.
Pool Name                    Active   Pending      Completed
Commands                        n/a         0        5261062
Responses                       n/a         0        2963742
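
The "Owns" percentages in the ring output [3] can be reproduced from the tokens alone, which is a quick way to confirm that the joining node's token is where it was meant to be. The sketch below is only an illustration: it assumes the cluster uses RandomPartitioner with a token space of 2**127 (the thread does not state the partitioner), and the function name `ownership` is made up for this example. Each node is taken to own the range from the previous token (exclusive) up to its own token (inclusive), wrapping around the ring.

```python
from fractions import Fraction

# Assumption: RandomPartitioner, whose tokens live in a ring of size 2**127.
RING_SIZE = 2 ** 127


def ownership(tokens):
    """Map each token to the fraction of the ring it owns.

    A node owns (previous token, its own token]; the smallest token
    wraps around and takes the range after the largest token.
    """
    ordered = sorted(set(tokens))
    shares = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]  # for i == 0 this is the largest token
        # A single-node ring yields a span of 0, which means the whole ring.
        span = (tok - prev) % RING_SIZE or RING_SIZE
        shares[tok] = Fraction(span, RING_SIZE)
    return shares


# Addresses and tokens copied from the ring output [3].
tokens = {
    "192.168.3.8": 0,
    "192.168.3.7": 85070591730234615865843651857942052864,
    "192.168.3.17": 113427455640312821154458202477256070485,
}

shares = ownership(tokens.values())
for addr, tok in tokens.items():
    print(f"{addr:<14}{float(shares[tok]) * 100:6.2f}%")
```

Under that assumption the computed shares agree with the 33.33%, 50.00% and 16.67% shown by nodetool, so the token placement itself does not look like the cause of the stuck join.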
--
Mateusz Korniak
Re: [0.8.x] Node join stuck with all network transfers done
Posted by Mateusz Korniak <ma...@ant.gliwice.pl>.
On Monday 09 of January 2012, aaron morton wrote:
> (...) Is there a reason you are not adding 0.8.9 ?
Just my mistake. I repeated the procedure with a 0.8.9 node joining, and this
time, after the network transfers finished, the node was busy compacting and
finally switched to the "Normal" state.
> Check the logs on .17 for errors.
No exceptions, just "Finished streaming session". Netstats was empty and the node
was idle (net, CPU, I/O), with the number of pending tasks rising without any
work being done.
Big thanks for caring to answer, regards !
--
Mateusz Korniak
Re: [0.8.x] Node join stuck with all network transfers done
Posted by aaron morton <aa...@thelastpickle.com>.
Check the logs on .17 for errors. Also see what the most recent messages are. There should be messages about streams completing and messages about compaction running.
The bootstrapping node needs to build the tables for the data it has received. This is done in the compaction manager, but 836 seems like a lot of pending tasks there.
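
(Editor's sketch, not part of the original reply.) One way to tell "busy building tables" apart from "stuck" is to sample the two nodetool commands from the original post over time and see whether the pending count ever drops while the node is still bootstrapping. The helpers below are hypothetical and only parse the two lines visible in the posted output ("pending tasks: N" and "Mode: X"); the stall rule is a rough heuristic chosen for illustration, not something Cassandra itself defines.

```python
import re


def parse_pending_tasks(compactionstats_output):
    """Return the number after 'pending tasks:', or None if the line is missing."""
    match = re.search(r"^pending tasks:\s*(\d+)", compactionstats_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def parse_mode(netstats_output):
    """Return the word after 'Mode:', or None if the line is missing."""
    match = re.search(r"^Mode:\s*(\S+)", netstats_output, re.MULTILINE)
    return match.group(1) if match else None


def looks_stalled(samples):
    """Heuristic (an assumption of this sketch): given (mode, pending) pairs,
    oldest first, report a stall when every sample is still 'Bootstrapping'
    and the pending count never decreases between samples."""
    if len(samples) < 2:
        return False
    pendings = [pending for _, pending in samples]
    still_joining = all(mode == "Bootstrapping" for mode, _ in samples)
    never_drops = all(later >= earlier for earlier, later in zip(pendings, pendings[1:]))
    return still_joining and never_drops


# Output copied from [1] and [2] in the original post.
compactionstats = "pending tasks: 836\n"
netstats = (
    "Mode: Bootstrapping\n"
    "Not sending any streams.\n"
    "Nothing streaming from /192.168.3.8\n"
)

print(parse_mode(netstats), parse_pending_tasks(compactionstats))
```

In practice the two strings would come from running `nodetool -h localhost netstats` and `nodetool -h localhost compactionstats` every few minutes and collecting the parsed pairs into the list passed to `looks_stalled`.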
I've not looked into any code issues around adding a 0.8.7 node to a 0.8.9 cluster. Is there a reason you are not adding a 0.8.9 node?
Cheers
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com