Posted to user@cassandra.apache.org by Mateusz Korniak <ma...@ant.gliwice.pl> on 2012/01/09 08:20:25 UTC

[0.8.x] Node join stuck with all network transfers done

Hi!
I have a problem with a 0.8.7 node joining a cluster of two 0.8.9 nodes (RF=2).
It seems all transfers were done, but the joining node (.17) does not change its
state [3].
What is strange is the "Nothing streaming from /192.168.3.8" netstats result [2]
together with a still rising number of pending tasks [1], while .8 is not
transferring anything [4].
I tried restarting each of the nodes; it didn't help. The joining process just
started again, transferred all the data again, and got stuck at the same point,
but with double the "load" on the joining node.

Any hints? Thanks in advance, regards.

[1]:
@192.168.3.17 ~]$ nodetool -h localhost compactionstats
pending tasks: 836

[2]:
@192.168.3.17 ~]$ nodetool -h localhost netstats
Mode: Bootstrapping
Not sending any streams.
 Nothing streaming from /192.168.3.8
Pool Name                    Active   Pending      Completed
Commands                        n/a         0            171
Responses                       n/a         0        9387965

[3]:
@192.168.3.17 ~]$ nodetool -h localhost ring
Address         DC          Rack        Status State   Load            Owns    Token
                                                                               113427455640312821154458202477256070485
192.168.3.8     datacenter1 rack1       Up     Normal  128.51 GB       33.33%  0
192.168.3.7     datacenter1 rack1       Up     Normal  137.65 GB       50.00%  85070591730234615865843651857942052864
192.168.3.17    datacenter1 rack1       Up     Joining 127.02 GB       16.67%  113427455640312821154458202477256070485

[4]:                                                                           
@192.168.3.8 ~]$ nodetool -h localhost netstats
Mode: Normal
Not sending any streams.
Not receiving any streams.
Pool Name                    Active   Pending      Completed
Commands                        n/a         0        5261062
Responses                       n/a         0        2963742


-- 
Mateusz Korniak

Re: [0.8.x] Node join stuck with all network transfers done

Posted by Mateusz Korniak <ma...@ant.gliwice.pl>.
On Monday 09 of January 2012, aaron morton wrote:
> (...) Is there a reason you are not adding 0.8.9 ?
Only my mistake. I repeated the procedure with a 0.8.9 node joining, and this
time, after finishing the network transfers, the node was busy compacting and
finally switched to the "Normal" state.

> Check the logs on .17 for errors. 
No exceptions, just "Finished streaming session". netstats was empty and the
node was idle (net, cpu, io), with the number of pending tasks rising while
nothing was actually being done.
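
For anyone watching a joining node in a similar situation, the State column of the ring output can be checked from a script instead of by eye. The following is only a minimal sketch: the function names are made up for illustration, and the parsing assumes IPv4 addresses and the 0.8.x column layout exactly as pasted in this thread (address, DC, rack, status, state).

```python
# Sample text copied from the `nodetool -h localhost ring` output in this thread.
RING_SAMPLE = """\
Address         DC          Rack        Status State   Load            Owns
Token
    113427455640312821154458202477256070485
192.168.3.8     datacenter1 rack1       Up     Normal  128.51 GB       33.33%
0
192.168.3.7     datacenter1 rack1       Up     Normal  137.65 GB       50.00%
85070591730234615865843651857942052864
192.168.3.17    datacenter1 rack1       Up     Joining 127.02 GB       16.67%
113427455640312821154458202477256070485
"""


def parse_ring(ring_output):
    """Map node address -> State column, for the layout shown above.

    Hypothetical helper: it treats any line with at least five
    whitespace-separated fields whose first field looks like an IPv4
    address as a node row, and skips everything else (header, wrapped
    token lines, blank lines).
    """
    states = {}
    for line in ring_output.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[0][0].isdigit() and "." in fields[0]:
            states[fields[0]] = fields[4]
    return states


def nodes_not_normal(ring_output):
    """Return only the nodes whose state is something other than Normal."""
    return {
        address: state
        for address, state in parse_ring(ring_output).items()
        if state != "Normal"
    }


if __name__ == "__main__":
    print(nodes_not_normal(RING_SAMPLE))
```

To run this against a live node, the text could be captured with something like subprocess.run(["nodetool", "-h", "localhost", "ring"], capture_output=True, text=True).stdout and passed to nodes_not_normal().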

Many thanks for taking the time to answer, regards!
 


-- 
Mateusz Korniak

Re: [0.8.x] Node join stuck with all network transfers done

Posted by aaron morton <aa...@thelastpickle.com>.
Check the logs on .17 for errors.  Also see what the most recent messages are. There should be messages about streams completing and messages about compaction running.

The bootstrapping node needs to build the tables for the data it has received. This is done in the compaction manager, but 836 seems like a lot of pending tasks there. 
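
The two readings described above (streams finished, compaction manager still has work queued) can be pulled out of the nodetool text with a few lines of Python. This is a rough sketch, not a diagnostic tool: the helper names are invented for illustration, and the parsing only recognises the exact phrases that appear in the outputs pasted in this thread.

```python
import re

# Sample text copied from outputs [1] and [2] in the original post.
COMPACTIONSTATS_SAMPLE = "pending tasks: 836\n"
NETSTATS_SAMPLE = """\
Mode: Bootstrapping
Not sending any streams.
 Nothing streaming from /192.168.3.8
Pool Name                    Active   Pending      Completed
Commands                        n/a         0            171
Responses                       n/a         0        9387965
"""


def parse_pending_tasks(compactionstats_output):
    """Return the number after 'pending tasks:', or None if it is missing."""
    match = re.search(r"^pending tasks:\s*(\d+)", compactionstats_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def parse_mode(netstats_output):
    """Return the value of the 'Mode:' line, or None if it is missing."""
    match = re.search(r"^Mode:\s*(\S+)", netstats_output, re.MULTILINE)
    return match.group(1) if match else None


def streams_idle(netstats_output):
    """Heuristic (assumption): idle means both an idle 'sending' line and an
    idle 'receiving' line, using only the phrases seen in this thread."""
    lines = [line.strip() for line in netstats_output.splitlines()]
    sending_idle = any(l.startswith("Not sending any streams") for l in lines)
    receiving_idle = any(
        l.startswith("Not receiving any streams") or l.startswith("Nothing streaming from")
        for l in lines
    )
    return sending_idle and receiving_idle


def summarize(netstats_output, compactionstats_output):
    """Produce a one-line summary of the two outputs."""
    mode = parse_mode(netstats_output)
    pending = parse_pending_tasks(compactionstats_output)
    idle = streams_idle(netstats_output)
    if mode == "Bootstrapping" and idle and pending:
        return "streams idle, %d compaction-manager tasks pending" % pending
    return "mode=%s, streams_idle=%s, pending=%s" % (mode, idle, pending)


if __name__ == "__main__":
    print(summarize(NETSTATS_SAMPLE, COMPACTIONSTATS_SAMPLE))
```

Sampling this every minute or so would show whether the pending count is going down (the node is working through the received data) or only going up, which is the symptom reported here.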

I've not looked into any code issues around adding a 0.8.7 node to a 0.8.9 cluster. Is there a reason you are not adding a 0.8.9 node?

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
