Posted to user@cassandra.apache.org by Yan Chunlu <sp...@gmail.com> on 2011/08/05 05:45:01 UTC

move one node for load re-balancing then it status stuck at "Leaving"

I have 3 nodes running Cassandra 0.7.4. The RF used to be 2; after a while
I changed it to 3.
I tried nodetool move but got the following error:
node3:~# nodetool -h node3 move 0
Exception in thread "main" java.lang.IllegalStateException:
replication factor (3) exceeds number of endpoints (2)
at org.apache.cassandra.locator.SimpleStrategy.calculateNaturalEndpoints(SimpleStrategy.java:60)
at org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:930)
at org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:896)
at org.apache.cassandra.service.StorageService.startLeaving(StorageService.java:1596)
at org.apache.cassandra.service.StorageService.move(StorageService.java:1734)
at org.apache.cassandra.service.StorageService.move(StorageService.java:1709)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
at sun.reflect.GeneratedMethodAccessor108.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
at sun.rmi.transport.Transport$1.run(Transport.java:159)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)



After that, nodetool shows the node as Leaving:
nodetool -h node3 ring
Address         Status State   Load            Owns    Token
                                                       84944475733633104818662955375549269696
node1           Up     Normal  13.18 GB        81.09%  52773518586096316348543097376923124102
node2           Up     Normal  22.85 GB        10.48%  70597222385644499881390884416714081360
node3           Up     Leaving 25.44 GB        8.43%   84944475733633104818662955375549269696
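
For reference, the Owns column in the ring output can be reproduced from the tokens alone. The sketch below assumes RandomPartitioner (token space 0 to 2^127) and that each node owns the range from the previous token up to its own token. RingMath and its methods are made-up names for illustration, not Cassandra APIs. It also computes evenly spaced tokens, which is the usual target when moving nodes to rebalance.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

// Illustrative helper only; not part of Cassandra.
final class RingMath {
    // Assumption: RandomPartitioner, so tokens live in [0, 2^127).
    static final BigInteger RING_SIZE = BigInteger.ONE.shiftLeft(127);

    // Evenly spaced token for node i of n: floor(i * 2^127 / n).
    static BigInteger balancedToken(int i, int n) {
        return RING_SIZE.multiply(BigInteger.valueOf(i)).divide(BigInteger.valueOf(n));
    }

    // Percentage of the ring covered by (previous, token], wrapping past zero.
    static BigDecimal ownedPercent(BigInteger previous, BigInteger token) {
        BigInteger width = token.subtract(previous).mod(RING_SIZE);
        if (width.signum() == 0) {
            width = RING_SIZE; // a single node owns the whole ring
        }
        return new BigDecimal(width.multiply(BigInteger.valueOf(100)))
                .divide(new BigDecimal(RING_SIZE), 2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // Tokens copied from the ring output in this thread.
        BigInteger node1 = new BigInteger("52773518586096316348543097376923124102");
        BigInteger node2 = new BigInteger("70597222385644499881390884416714081360");
        BigInteger node3 = new BigInteger("84944475733633104818662955375549269696");
        System.out.println("node1 owns " + ownedPercent(node3, node1) + "%");
        System.out.println("node2 owns " + ownedPercent(node1, node2) + "%");
        System.out.println("node3 owns " + ownedPercent(node2, node3) + "%");
        for (int i = 0; i < 3; i++) {
            System.out.println("balanced token " + i + ": " + balancedToken(i, 3));
        }
    }
}
```

For this ring the sketch gives 81.09%, 10.48% and 8.43%, matching the Owns column, and balanced tokens 0, 56713727820156410577229101238628035242 and 113427455640312821154458202477256070485 for three nodes.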


After going through the code I found the following:
    /**
     * iterator over the Tokens in the given ring, starting with the token for the node owning start
     * (which does not have to be a Token in the ring)
     * @param includeMin True if the minimum token should be returned in the ring even if it has no owner.
     */
    public static Iterator<Token> ringIterator(final ArrayList<Token> ring, Token start, boolean includeMin)



Does "starting with the token for the node owning start" mean I need
to move node1 first? What should I do now? Restart node3 and
start over?

Why is it stuck at "Leaving" anyway? It is supposed to either complete
the move or not do it at all, not get stuck halfway.
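
Read literally, the ringIterator javadoc only describes where a traversal of the ring begins: at the token of whichever node owns "start", then around the ring in order. It does not seem to say anything about the order in which nodes must be moved. Below is a minimal sketch of that reading, assuming the ring is a sorted list of tokens and ignoring includeMin. RingWalk and its methods are hypothetical names, not the Cassandra implementation.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only; not Cassandra code.
final class RingWalk {
    // Index of the first token >= start, wrapping to 0 past the last token.
    // Assumption: that token belongs to the node "owning" start.
    static int firstTokenIndex(List<BigInteger> sortedRing, BigInteger start) {
        int i = Collections.binarySearch(sortedRing, start);
        if (i < 0) {
            i = -(i + 1);                 // insertion point when start is not a ring token
            if (i >= sortedRing.size()) {
                i = 0;                    // wrap around the ring
            }
        }
        return i;
    }

    // All tokens in ring order, beginning at the owner of start.
    static List<BigInteger> ringOrder(List<BigInteger> sortedRing, BigInteger start) {
        List<BigInteger> out = new ArrayList<>();
        if (sortedRing.isEmpty()) {
            return out;
        }
        int first = firstTokenIndex(sortedRing, start);
        for (int k = 0; k < sortedRing.size(); k++) {
            out.add(sortedRing.get((first + k) % sortedRing.size()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Tokens of node1, node2, node3 from the ring output in this thread.
        List<BigInteger> ring = Arrays.asList(
                new BigInteger("52773518586096316348543097376923124102"),
                new BigInteger("70597222385644499881390884416714081360"),
                new BigInteger("84944475733633104818662955375549269696"));
        // Token 0 is not in the ring; the walk starts at node1's token.
        System.out.println(ringOrder(ring, BigInteger.ZERO));
    }
}
```

With start = 0 the walk begins at node1's token simply because it is the first token at or after 0, which is consistent with "start" not having to be a token in the ring.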

Re: move one node for load re-balancing then it status stuck at "Leaving"

Posted by Yan Chunlu <sp...@gmail.com>.

thanks for the help!

On Sun, Aug 7, 2011 at 2:10 PM, Dikang Gu <di...@gmail.com> wrote:

> Yes, I think you are right.
>
> The "nodetool move" will move the keys on the node to the other two nodes.
> The required replication is 3, but only 2 nodes remain in the ring while
> node3 is leaving, so you get the exception.

Re: move one node for load re-balancing then it status stuck at "Leaving"

Posted by Dikang Gu <di...@gmail.com>.

Yes, I think you are right.

The "nodetool move" will move the keys on the node to the other two nodes.
The required replication is 3, but only 2 nodes remain in the ring while
node3 is leaving, so you get the exception.
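
A simplified sketch of the check that appears to be failing, assuming a SimpleStrategy-like placement that picks the next RF nodes clockwise on the ring. ReplicaCheck and its naturalEndpoints method are illustrative and not Cassandra code; only the exception message is copied from the stack trace in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; not the Cassandra implementation.
final class ReplicaCheck {
    // Picks rf replicas clockwise from startIndex (assumed to be a valid index),
    // refusing if the ring has fewer nodes than rf.
    static List<String> naturalEndpoints(List<String> ringEndpoints, int startIndex, int rf) {
        if (rf > ringEndpoints.size()) {
            throw new IllegalStateException("replication factor (" + rf
                    + ") exceeds number of endpoints (" + ringEndpoints.size() + ")");
        }
        List<String> replicas = new ArrayList<>();
        for (int k = 0; k < rf; k++) {
            replicas.add(ringEndpoints.get((startIndex + k) % ringEndpoints.size()));
        }
        return replicas;
    }

    public static void main(String[] args) {
        // Full ring: RF=3 fits exactly.
        System.out.println(naturalEndpoints(Arrays.asList("node1", "node2", "node3"), 0, 3));
        // Ring as it would look with node3 taken out while it is leaving.
        try {
            naturalEndpoints(Arrays.asList("node1", "node2"), 0, 3);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions, any operation that has to recompute ranges without node3 cannot satisfy RF=3 on a three-node cluster.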


On Sun, Aug 7, 2011 at 2:03 PM, Yan Chunlu <sp...@gmail.com> wrote:

> Is it possible that the Cassandra implementation only counts live
> nodes?
>
> For example:
> "nodetool move" on node3 puts node3 into "Leaving", then Cassandra iterates
> over the endpoints and finds only node1 and node2. So the number of
> endpoints is 2, but RF=3, and the exception is raised.
>
> Is that true?


-- 
Dikang Gu

0086 - 18611140205

Re: move one node for load re-balancing then it status stuck at "Leaving"

Posted by Yan Chunlu <sp...@gmail.com>.

Is it possible that the Cassandra implementation only counts live nodes?

For example:
"nodetool move" on node3 puts node3 into "Leaving", then Cassandra iterates
over the endpoints and finds only node1 and node2. So the number of
endpoints is 2, but RF=3, and the exception is raised.

Is that true?



On Fri, Aug 5, 2011 at 3:20 PM, Yan Chunlu <sp...@gmail.com> wrote:

> nothing...
>
> nodetool -h node3 netstats
> Mode: Normal
> Not sending any streams.
>  Nothing streaming from /10.28.53.11
> Pool Name                    Active   Pending      Completed
> Commands                        n/a         0      186669475
> Responses                       n/a         0      117986130
>
>
> nodetool -h node3 compactionstats
> compaction type: n/a
> column family: n/a
> bytes compacted: n/a
> bytes total in progress: n/a
> pending tasks: 0

Re: move one node for load re-balancing then it status stuck at "Leaving"

Posted by Yan Chunlu <sp...@gmail.com>.

nothing...

nodetool -h node3 netstats
Mode: Normal
Not sending any streams.
 Nothing streaming from /10.28.53.11
Pool Name                    Active   Pending      Completed
Commands                        n/a         0      186669475
Responses                       n/a         0      117986130

nodetool -h node3 compactionstats
compaction type: n/a
column family: n/a
bytes compacted: n/a
bytes total in progress: n/a
pending tasks: 0

On Fri, Aug 5, 2011 at 1:47 PM, mcasandra <mo...@gmail.com> wrote:
> Check things like netstats, disk space etc to see why it's in Leaving state.
> Anything in the logs that shows Leaving?

Re: move one node for load re-balancing then it status stuck at "Leaving"

Posted by mcasandra <mo...@gmail.com>.

Check things like netstats, disk space etc to see why it's in Leaving state.
Anything in the logs that shows Leaving?

--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/move-one-node-for-load-re-balancing-then-it-status-stuck-at-Leaving-tp6655168p6655326.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.