Posted to user@cassandra.apache.org by laxmikanth sadula <la...@gmail.com> on 2016/09/22 20:57:23 UTC

Nodetool rebuild exception on c*-2.0.17

Hi,

We have a C* 2.0.17 cluster with 2 DCs: DC1 and DC2. We tried to add a new
data center, DC3, and ran "nodetool rebuild 'DC1'". We hit the exception below
on a few nodes after some data had been streamed to the new nodes in DC3.


Exception in thread "main" java.lang.RuntimeException: Error while rebuilding node: Stream failed
    at org.apache.cassandra.service.StorageService.rebuild(StorageService.java:936)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
    at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
    at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
    at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
    at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
    at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
    at sun.rmi.transport.Transport$2.run(Transport.java:202)
    at sun.rmi.transport.Transport$2.run(Transport.java:199)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:198)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:567)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:828)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.access$400(TCPTransport.java:619)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(TCPTransport.java:684)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(TCPTransport.java:681)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:681)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)


We have 4 user keyspaces, so we altered each of them as shown below before
issuing the rebuild.

ALTER KEYSPACE keyspace_name WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};


*Output of describecluster*

./nodetool describecluster
Cluster Information:
Name: Ss Cluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
3b688e54-47be-39e8-ae45-e71ba98d69e2: [xxx.xxx.198.75, xxx.xxx.198.132,
xxx.xxx.198.133, xxx.xxx.12.115, xxx.xxx.198.78, xxx.xxx.12.123,
xxx.xxx.98.205, xxx.xxx.98.219, xxx.xxx.98.220, xxx.xxx.198.167,
xxx.xxx.98.172, xxx.xxx.98.173, xxx.xxx.98.170, xxx.xxx.198.168,
xxx.xxx.98.171, xxx.xxx.198.169, xxx.xxx.12.146, xxx.xxx.98.168,
xxx.xxx.12.145, xxx.xxx.98.169, xxx.xxx.12.144, xxx.xxx.12.143,
xxx.xxx.12.140, xxx.xxx.12.139, xxx.xxx.198.126, xxx.xxx.12.136,
xxx.xxx.12.135, xxx.xxx.198.191, xxx.xxx.12.133, xxx.xxx.12.131]



*nodetool status output:*

./nodetool status
Note: Ownership information does not include topology; for complete
information, specify a keyspace
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns   Host ID                               Rack
UN  xxx.xxx.198.75   639.41 GB  256     3.2%   fdd32c67-3cea-4174-b59b-c1ea14e1a334  GRP1
UN  xxx.xxx.198.132  581.94 GB  256     3.4%   6a465101-29e7-4792-8269-851200a70023  GRP2
UN  xxx.xxx.198.133  618.22 GB  256     3.6%   751ce15a-10f1-44cf-9357-04da7e21b511  GRP2
UN  xxx.xxx.198.167  566.83 GB  256     3.5%   45f6684f-d6a0-4cba-875c-9db459646545  GRP2
UN  xxx.xxx.198.78   661.98 GB  256     3.3%   a8332f22-a75f-4d7c-8b71-7284f6fe208f  GRP3
UN  xxx.xxx.198.126  603.37 GB  256     3.6%   71be90d8-97db-4155-b4fc-da59d78331ef  GRP1
UN  xxx.xxx.198.191  571.8 GB   256     3.2%   a9023df8-a8b3-484b-a03d-0fdea35007bd  GRP3
UN  xxx.xxx.198.168  631.98 GB  256     3.5%   2302491a-a8b5-4aa6-bda7-f1544064c4e3  GRP3
UN  xxx.xxx.198.169  631.12 GB  256     3.4%   d5e5bc3d-38de-4043-abca-08ac09f29a46  GRP1
Datacenter: DC2
==============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns   Host ID                               Rack
UN  xxx.xxx.98.172   630.36 GB  256     3.3%   4709fe54-2793-481a-95f0-a780ea55cdf1  GRP1
UN  xxx.xxx.98.219   606.85 GB  256     3.2%   792f3185-d5f2-4f4f-a60e-e0c888871cbb  GRP3
UN  xxx.xxx.98.173   654.75 GB  256     3.5%   345cdedc-d6d4-4b46-9cf3-5dc726af9b99  GRP3
UN  xxx.xxx.98.170   577.26 GB  256     3.6%   cfaf5519-4849-49f9-a1c2-6ca13eb19d3c  GRP3
UN  xxx.xxx.98.205   603.29 GB  256     2.9%   efafe9fd-65b1-41ec-b285-fa00279410ce  GRP2
UN  xxx.xxx.98.171   659.41 GB  256     3.4%   9585c52a-94ba-4b94-af22-97f80ab8c102  GRP2
UN  xxx.xxx.98.220   634.14 GB  256     3.1%   20a5f0ec-12b3-4db0-9483-f5ce984b362a  GRP1
UN  xxx.xxx.98.168   617.16 GB  256     3.3%   3ee473fc-6415-45c5-9466-4aaad9181e5a  GRP1
UN  xxx.xxx.98.169   614.25 GB  256     3.4%   dc1f04d7-0c84-4f8c-95bb-97005baaa328  GRP2
Datacenter: DC3
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns   Host ID                               Rack
UN  xxx.xxx.12.143   4.76 GB    256     3.5%   626facfb-8346-4f17-aa06-38424d9575ce  GRP2
UN  xxx.xxx.12.140   4.88 GB    256     3.9%   88517bf9-efa6-4214-b503-943c4d5d91ba  GRP1
UN  xxx.xxx.12.115   4.48 GB    256     3.3%   3878532d-3980-40a8-9276-0983eebdc9c6  GRP1
UN  xxx.xxx.12.139   86.72 GB   256     3.2%   2ad980fc-ffec-489e-850d-3b11824bb7c6  GRP3
UN  xxx.xxx.12.136   4.78 GB    256     3.0%   0b25753f-e0c4-448d-8459-6248b3072b79  GRP3
UN  xxx.xxx.12.135   4.92 GB    256     3.1%   29ac1861-da54-421f-b0b7-0ad1230460ac  GRP2
UN  xxx.xxx.12.133   4.15 GB    256     3.2%   fd048371-4994-486d-8de5-365e56ec912f  GRP1
UN  xxx.xxx.12.123   4.78 GB    256     3.3%   cbbb8973-4a33-4ce9-9610-8a7a1ddf975d  GRP3
UN  xxx.xxx.12.131   4.44 GB    256     3.1%   163961b1-96aa-4ef1-8517-ec0f24f0be39  GRP2
UN  xxx.xxx.12.146   4.46 GB    256     3.6%   23cc92fc-2bd8-453c-ae3c-341cfba172a9  GRP2
UN  xxx.xxx.12.145   5.18 GB    256     3.2%   7a929498-29ab-4921-9676-cce41714d8c3  GRP1
UN  xxx.xxx.12.144   62.52 GB   256     3.3%   1631630d-a6ed-437e-9368-30fccce4f12c  GRP3

We are using GossipingPropertyFileSnitch, so we modified only
cassandra-rackdc.properties and made NO changes to
cassandra-topology.properties.

If anybody has faced this problem or knows how to fix it, please let us know.

Thanks in advance...!!!

Re: Nodetool rebuild exception on c*-2.0.17

Posted by Yabin Meng <ya...@gmail.com>.
Hi,

From the "nodetool status" output, the cluster appears to be running fine. The
exception itself only says that data streaming failed during nodetool
rebuild. A network hiccup is one possible cause, but it is hard to say.

You need to investigate further. For example, run "nodetool netstats" and
check the log file on the target node for more information about the failed
streaming sessions, such as which source nodes they came from, and then check
the log files on those nodes.
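
As a rough sketch of that log check (not a definitive procedure): the helper
name stream_errors is made up for illustration, and the log path is only an
assumption based on a common package default, so adjust it to your install.

```shell
#!/bin/sh
# Sketch only. LOG_FILE is an assumed default location, not something
# confirmed for this cluster; override it in the environment if needed.
LOG_FILE="${LOG_FILE:-/var/log/cassandra/system.log}"

# Hypothetical helper: print ERROR or WARN lines that mention streaming.
stream_errors() {
    grep -E 'ERROR|WARN' "$1" | grep -i 'stream'
}

# On the rebuilding (target) node one might run:
#   nodetool netstats            # lists streaming sessions and their peers
#   stream_errors "$LOG_FILE"    # then repeat on each source node involved
```

The timestamps of the matching lines on the target node can then be compared
with the log on each source node to see which side dropped the session first.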

Yabin
