Posted to commits@cassandra.apache.org by "Sasha Dolgy (JIRA)" <ji...@apache.org> on 2011/09/10 23:25:08 UTC
[jira] [Created] (CASSANDRA-3175) Ec2Snitch nodetool issue after upgrade to 0.8.5
Ec2Snitch nodetool issue after upgrade to 0.8.5
-----------------------------------------------
Key: CASSANDRA-3175
URL: https://issues.apache.org/jira/browse/CASSANDRA-3175
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.8.5
Environment: Amazon Ec2 with 4 nodes in region ap-southeast. 2 in 1a, 2 in 1b
Reporter: Sasha Dolgy
Upgraded one ring that has four nodes from 0.8.0 to 0.8.5 with only one minor problem. It relates to Ec2Snitch when running 'nodetool ring' from two of the four nodes; the rest are all working fine:
Address       DC            Rack  Status  State   Load     Owns    Token
                                                                   148362247927262972740864614603570725035
172.16.12.11  ap-southeast  1a    Up      Normal  1.58 MB  24.02%  19095547144942516281182777765338228798
172.16.12.10  ap-southeast  1a    Up      Normal  1.63 MB  22.11%  56713727820156410577229101238628035242
172.16.14.10  ap-southeast  1b    Up      Normal  1.85 MB  33.33%  113427455640312821154458202477256070484
172.16.14.12  ap-southeast  1b    Up      Normal  1.36 MB  20.53%  14836224792726297274086461460357072503
This works on the two nodes that happen to be on the 172.16.14.0/24 network. The nodes where the error appears are on the 172.16.12.0/24 network, and this is what is shown when 'nodetool ring' is run:
Address       DC            Rack  Status  State   Load     Owns    Token
                                                                   148362247927262972740864614603570725035
172.16.12.11  ap-southeast  1a    Up      Normal  1.58 MB  24.02%  19095547144942516281182777765338228798
172.16.12.10  ap-southeast  1a    Up      Normal  1.62 MB  22.11%  56713727820156410577229101238628035242
Exception in thread "main" java.lang.NullPointerException
at org.apache.cassandra.locator.Ec2Snitch.getDatacenter(Ec2Snitch.java:93)
at org.apache.cassandra.locator.DynamicEndpointSnitch.getDatacenter(DynamicEndpointSnitch.java:122)
at org.apache.cassandra.locator.EndpointSnitchInfo.getDatacenter(EndpointSnitchInfo.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
at sun.rmi.transport.Transport$1.run(Transport.java:159)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
I've stopped and started the node; it doesn't make a difference. I've also shut down all nodes in the ring so that it was fully offline, and then brought them all back up; the issue still persists on two of the nodes.
There are no firewall rules restricting traffic between these nodes. For example, on a node where 'nodetool ring' throws the exception, I can still get netstats for the two hosts that don't show up:
nodetool -h 172.16.12.11 -p 9090 netstats 172.16.14.10
Mode: Normal
Nothing streaming to /172.16.14.10
Nothing streaming from /172.16.14.10
Pool Name   Active  Pending  Completed
Commands       n/a        0          3
Responses      n/a        1       1483
nodetool -h 172.16.12.11 -p 9090 netstats 172.16.14.12
Mode: Normal
Nothing streaming to /172.16.14.12
Nothing streaming from /172.16.14.12
Pool Name   Active  Pending  Completed
Commands       n/a        0          3
Responses      n/a        0       1784
The problem only appears via nodetool and doesn't seem to be impacting anything else.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3175) Ec2Snitch nodetool issue after upgrade to 0.8.5
Posted by "Vijay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102366#comment-13102366 ]
Vijay commented on CASSANDRA-3175:
----------------------------------
I couldn't replicate this with US-East, maybe partly because I am not able to set CIDR notation... can you please change listen_address to a real IP and see if the issue goes away?
> Ec2Snitch nodetool issue after upgrade to 0.8.5
> -----------------------------------------------
>
> Key: CASSANDRA-3175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3175
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.8.5
> Environment: Amazon Ec2 with 4 nodes in region ap-southeast. 2 in 1a, 2 in 1b
> Reporter: Sasha Dolgy
> Assignee: Vijay
> Priority: Minor
> Labels: ec2snitch, nodetool
> Fix For: 0.8.6
[jira] [Assigned] (CASSANDRA-3175) Ec2Snitch nodetool issue after upgrade to 0.8.5
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis reassigned CASSANDRA-3175:
-----------------------------------------
Assignee: Brandon Williams
[jira] [Commented] (CASSANDRA-3175) Ec2Snitch nodetool issue after upgrade to 0.8.5
Posted by "Sasha Dolgy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103025#comment-13103025 ]
Sasha Dolgy commented on CASSANDRA-3175:
----------------------------------------
Resolved the issue that was causing this by running, against the system keyspace: del LocationInfo[52696e67]; and then proceeding to shut down each node. Once all were down, brought them all back up.
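For anyone hitting the same thing, the workaround above can be sketched as a script. This is a dry run that only prints the commands (the host list and JMX-less ports are from this ticket; the cassandra-cli and ssh/service invocations are assumptions for a typical 0.8-era install, not taken from the thread):

```shell
#!/bin/sh
# Dry-run sketch of the workaround: clear the cached ring state row
# (LocationInfo['Ring'], whose key is 52696e67 in hex) on every node,
# then fully stop the whole ring before starting it again. Remove the
# leading 'echo' on each line to actually execute the commands.
NODES="172.16.12.10 172.16.12.11 172.16.14.10 172.16.14.12"

for n in $NODES; do
  echo "echo 'use system; del LocationInfo[52696e67];' | cassandra-cli -h $n -p 9160"
done
for n in $NODES; do echo "ssh $n 'sudo service cassandra stop'"; done
for n in $NODES; do echo "ssh $n 'sudo service cassandra start'"; done
```

The full stop/start (rather than a rolling restart) matters here: the reporter notes a per-node restart alone did not clear the stale state.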
[jira] [Commented] (CASSANDRA-3175) Ec2Snitch nodetool issue after upgrade to 0.8.5
Posted by "Sasha Dolgy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102247#comment-13102247 ]
Sasha Dolgy commented on CASSANDRA-3175:
----------------------------------------
executing command on 172.16.12.10: nodetool -h 172.16.12.10 -p 9090 ring (error message)
executing command on 172.16.12.11: nodetool -h 172.16.12.11 -p 9090 ring (error message)
executing command on 172.16.14.10: nodetool -h 172.16.14.10 -p 9090 ring
executing command on 172.16.14.12: nodetool -h 172.16.14.12 -p 9090 ring
No errors in the system.log on any node.
nodetool -h 172.16.12.11 -p 9090 version
ReleaseVersion: 0.8.5
nodetool -h 172.16.12.10 -p 9090 version
ReleaseVersion: 0.8.5
nodetool -h 172.16.14.10 -p 9090 version
ReleaseVersion: 0.8.5
nodetool -h 172.16.14.12 -p 9090 version
ReleaseVersion: 0.8.5
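The per-node checks above can be driven from a single loop. A dry-run sketch that just prints the nodetool invocations (host list and port 9090 as used in this ticket; drop the leading 'echo' to execute them):

```shell
#!/bin/sh
# Print the 'nodetool version' and 'nodetool ring' invocation for every
# node in the ring, so the same checks run consistently everywhere.
NODES="172.16.12.10 172.16.12.11 172.16.14.10 172.16.14.12"
for n in $NODES; do
  echo "nodetool -h $n -p 9090 version"
  echo "nodetool -h $n -p 9090 ring"
done
```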
netstat -an | grep 7000 (on 172.16.14.12):
tcp 0 0 172.16.14.12:45116 172.16.14.10:7000 ESTABLISHED
tcp 0 0 172.16.14.12:57148 172.16.12.10:7000 ESTABLISHED
tcp 0 0 172.16.14.12:7000 172.16.12.11:42336 ESTABLISHED
tcp 0 0 172.16.14.12:42728 172.16.12.11:7000 ESTABLISHED
tcp 0 0 172.16.14.12:7000 172.16.12.10:52259 ESTABLISHED
tcp 0 0 172.16.14.12:43229 172.16.12.11:7000 ESTABLISHED
tcp 0 0 172.16.14.12:7000 172.16.14.10:57429 ESTABLISHED
tcp 0 0 172.16.14.12:48923 172.16.12.10:7000 ESTABLISHED
tcp 0 0 172.16.14.12:7000 172.16.14.10:52346 ESTABLISHED
tcp 0 0 172.16.14.12:34007 172.16.14.10:7000 ESTABLISHED
tcp 0 0 172.16.14.12:7000 172.16.12.10:57794 ESTABLISHED
tcp 0 0 172.16.14.12:7000 172.16.12.11:33248 ESTABLISHED
netstat -an | grep 7000 (on 172.16.12.11):
tcp 0 0 172.16.12.11:7000 0.0.0.0:* LISTEN
tcp 0 0 172.16.12.11:37427 172.16.14.10:7000 ESTABLISHED
tcp 0 0 172.16.12.11:7000 172.16.14.10:50314 ESTABLISHED
tcp 0 0 172.16.12.11:7000 172.16.12.10:56657 ESTABLISHED
tcp 0 0 172.16.12.11:7000 172.16.14.10:47700 ESTABLISHED
tcp 0 0 172.16.12.11:50901 172.16.12.10:7000 ESTABLISHED
tcp 0 0 172.16.12.11:33248 172.16.14.12:7000 ESTABLISHED
tcp 0 0 172.16.12.11:51640 172.16.14.10:7000 ESTABLISHED
tcp 0 0 172.16.12.11:7000 172.16.14.12:43229 ESTABLISHED
tcp 0 0 172.16.12.11:34244 172.16.12.10:7000 ESTABLISHED
tcp 0 0 172.16.12.11:7000 172.16.14.12:42728 ESTABLISHED
tcp 0 0 172.16.12.11:7000 172.16.12.10:55986 ESTABLISHED
tcp 0 0 172.16.12.11:42336 172.16.14.12:7000 ESTABLISHED
tcp 0 1 172.16.12.11:42521 10.130.185.136:7000 SYN_SENT
cassandra.yaml configuration:
storage_port: 7000
listen_address: Configured as the 172.16.0.0/16 IP of each server listed above
rpc_address: 0.0.0.0
rpc_port: 9160
On the nodes having the issue, the "SYN_SENT" connection appears towards the internal EC2 IP of one of the "good" nodes. This shouldn't be happening: it should be using the private IP we have defined as the listen_address on each node.
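A quick way to spot this symptom is to filter the netstat output for gossip-port connections stuck in SYN_SENT. The awk field positions assume the six-column netstat layout shown above; the heredoc input here is a subset of the 172.16.12.11 output, and on a live node you would pipe `netstat -an` instead:

```shell
#!/bin/sh
# Print remote endpoints on the gossip port (7000) whose connection is
# stuck in SYN_SENT; a healthy node prints nothing, an affected node
# prints the stale internal EC2 IP it is still trying to reach.
awk '$NF == "SYN_SENT" && $5 ~ /:7000$/ {print $5}' <<'EOF'
tcp 0 0 172.16.12.11:7000 0.0.0.0:* LISTEN
tcp 0 0 172.16.12.11:37427 172.16.14.10:7000 ESTABLISHED
tcp 0 1 172.16.12.11:42521 10.130.185.136:7000 SYN_SENT
EOF
# → 10.130.185.136:7000
```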
[jira] [Assigned] (CASSANDRA-3175) Ec2Snitch nodetool issue after upgrade to 0.8.5
Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brandon Williams reassigned CASSANDRA-3175:
-------------------------------------------
Assignee: Vijay (was: Brandon Williams)
> Ec2Snitch nodetool issue after upgrade to 0.8.5
> -----------------------------------------------
>
> Key: CASSANDRA-3175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3175
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.8.5
> Environment: Amazon Ec2 with 4 nodes in region ap-southeast. 2 in 1a, 2 in 1b
> Reporter: Sasha Dolgy
> Assignee: Vijay
> Priority: Minor
> Labels: ec2snitch, nodetool
> Fix For: 0.8.6
[jira] [Commented] (CASSANDRA-3175) Ec2Snitch nodetool issue after
upgrade to 0.8.5
Posted by "Vijay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102979#comment-13102979 ]
Vijay commented on CASSANDRA-3175:
----------------------------------
Which node owns the 10.130.185.136 IP provided by EC2? I think you have to check your VPN settings...
[jira] [Commented] (CASSANDRA-3175) Ec2Snitch nodetool issue after
upgrade to 0.8.5
Posted by "Sasha Dolgy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102861#comment-13102861 ]
Sasha Dolgy commented on CASSANDRA-3175:
----------------------------------------
From a node that works:
[default@system] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.Ec2Snitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
1b871300-dbdc-11e0-0000-564008fe649f: [172.16.12.10, 172.16.12.11, 172.16.14.12, 172.16.14.10]
[default@system]
From a node that doesn't work:
[default@unknown] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.Ec2Snitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
1b871300-dbdc-11e0-0000-564008fe649f: [172.16.12.10, 172.16.12.11, 172.16.14.12, 172.16.14.10]
UNREACHABLE: [10.130.185.136]
[default@unknown]
We do not use any internal EC2 IPs ... how did this appear?
[jira] [Resolved] (CASSANDRA-3175) Ec2Snitch nodetool issue after
upgrade to 0.8.5
Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brandon Williams resolved CASSANDRA-3175.
-----------------------------------------
Resolution: Invalid
We tracked this down to the issue described in CASSANDRA-3186, so closing this ticket in favor of that one.
[jira] [Commented] (CASSANDRA-3175) Ec2Snitch nodetool issue after
upgrade to 0.8.5
Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102893#comment-13102893 ]
Brandon Williams commented on CASSANDRA-3175:
---------------------------------------------
It looks like at some point 10.130.185.136 was part of the cluster. I would try shutting down all the nodes that show it at the same time, then starting them back up. If it comes back after that, it's being re-gossiped from other machines and you'll likely have to do a removetoken on it, but that's complicated by the fact that getting the token will be difficult. A full ring restart will definitely solve it.
[jira] [Commented] (CASSANDRA-3175) Ec2Snitch nodetool issue after
upgrade to 0.8.5
Posted by "Sasha Dolgy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102548#comment-13102548 ]
Sasha Dolgy commented on CASSANDRA-3175:
----------------------------------------
Hi Vijay,
We don't use the "real IP" for intra-node communication. To circumvent the problems with Ec2Snitch between regions, we leverage VPN tunnels, as per: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/advice-for-EC2-deployment-td6294613i20.html .. "we use a combination of Vyatta & OpenVPN on the nodes that are EC2 and nodes that aren't EC2 ... works a treat." That statement is still accurate ... except that with 0.8.5, 'nodetool ring' throws an error. As the issue exists on only 2 of the 4 nodes, could you help me understand what is actually happening to cause this error?
Recently I've tried putting entries for the nodes in /etc/hosts on the nodes where the error occurs ... but that didn't fix it.
How does Cassandra determine the data center name for each node? Where does it get this information?
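For what it's worth, in the EC2 case the snitch derives its own data center and rack from the instance's availability-zone metadata, and learns the values for remote nodes via gossip rather than from DNS or /etc/hosts. A minimal sketch of that name derivation (an approximation for illustration, not Cassandra's actual Ec2Snitch code; the exact split rule is an assumption inferred from the ring output above):

```python
def az_to_dc_rack(az: str):
    """Split an EC2 availability-zone name into (datacenter, rack).

    Illustrative approximation: the real Ec2Snitch reads the zone name
    from the instance metadata service and may partition it differently.
    "ap-southeast-1a" -> datacenter "ap-southeast", rack "1a", matching
    the DC/Rack columns shown by 'nodetool ring' in this report.
    """
    parts = az.split("-")
    return "-".join(parts[:-1]), parts[-1]

assert az_to_dc_rack("ap-southeast-1a") == ("ap-southeast", "1a")
```

Because the rack and datacenter for a remote node come from gossip, a stale gossiped endpoint (like the UNREACHABLE 10.130.185.136 above) can be known to the ring while carrying no usable snitch state.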
> at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
> at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
> at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
> at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
> at sun.rmi.transport.Transport$1.run(Transport.java:159)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
> at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
> at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
> at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> I've stopped the node and started it ... still doesn't make a difference. I've also shut down all nodes in the ring so that it was fully offline, and then brought them all back up ... issue still persists on two of the nodes.
> There are no firewall rules restricting traffic between these nodes. For example, on a node where the ring throws the exception, I can still get netstats for the two hosts that don't show up:
> nodetool -h 172.16.12.11 -p 9090 netstats 172.16.14.10
> Mode: Normal
> Nothing streaming to /172.16.14.10
> Nothing streaming from /172.16.14.10
> Pool Name Active Pending Completed
> Commands n/a 0 3
> Responses n/a 1 1483
> nodetool -h 172.16.12.11 -p 9090 netstats 172.16.14.12
> Mode: Normal
> Nothing streaming to /172.16.14.12
> Nothing streaming from /172.16.14.12
> Pool Name Active Pending Completed
> Commands n/a 0 3
> Responses n/a 0 1784
> Problem only looks to appear via nodetool, and doesn't seem to be impacting anything else.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3175) Ec2Snitch nodetool issue after
upgrade to 0.8.5
Posted by "Vijay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102891#comment-13102891 ]
Vijay commented on CASSANDRA-3175:
----------------------------------
I am somewhat in the dark about what happened. Can you grep for 10.130.185.136 in the logs (system.log) and see if it was marked alive at some point in time? Which node owns 10.130.185.136?
Can you also try downgrading to 0.8.2 to see if the issue goes away?
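For reference, a quick way to run that log check. The log path and the sample line below are assumptions to keep the snippet self-contained; on a real node, grep your actual system.log (commonly /var/log/cassandra/system.log) directly:

```shell
# Hypothetical sample log line, only so the command is runnable here;
# a real Gossiper line on the node may look similar but will differ in detail.
printf 'INFO [GossipStage:1] 2011-09-12 10:00:00,000 Gossiper.java InetAddress /10.130.185.136 is now UP\n' > /tmp/sample-system.log

# Look for any gossip state changes involving the phantom endpoint:
grep -E '10\.130\.185\.136' /tmp/sample-system.log
```

Lines like "is now UP" / "is now dead" would tell you whether the node ever saw gossip from that address.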
[jira] [Issue Comment Edited] (CASSANDRA-3175) Ec2Snitch nodetool
issue after upgrade to 0.8.5
Posted by "Sasha Dolgy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102861#comment-13102861 ]
Sasha Dolgy edited comment on CASSANDRA-3175 at 9/12/11 6:10 PM:
-----------------------------------------------------------------
From the two nodes that work:
[default@system] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.Ec2Snitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
1b871300-dbdc-11e0-0000-564008fe649f: [172.16.12.10, 172.16.12.11, 172.16.14.12, 172.16.14.10]
[default@system]
From the two nodes that don't work:
[default@unknown] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.Ec2Snitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
1b871300-dbdc-11e0-0000-564008fe649f: [172.16.12.10, 172.16.12.11, 172.16.14.12, 172.16.14.10]
UNREACHABLE: [10.130.185.136]
[default@unknown]
We do not use any internal EC2 IPs ... how did this appear? Or better still, how can it be flushed?
was (Author: sdolgy):
From a node that works:
[default@system] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.Ec2Snitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
1b871300-dbdc-11e0-0000-564008fe649f: [172.16.12.10, 172.16.12.11, 172.16.14.12, 172.16.14.10]
[default@system]
From a node that doesn't work:
[default@unknown] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.Ec2Snitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
1b871300-dbdc-11e0-0000-564008fe649f: [172.16.12.10, 172.16.12.11, 172.16.14.12, 172.16.14.10]
UNREACHABLE: [10.130.185.136]
[default@unknown]
We do not use any internal EC2 IPs ... how did this appear?
[jira] [Updated] (CASSANDRA-3175) Ec2Snitch nodetool issue after
upgrade to 0.8.5
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-3175:
--------------------------------------
Priority: Minor (was: Major)
Fix Version/s: 0.8.6
[jira] [Commented] (CASSANDRA-3175) Ec2Snitch nodetool issue after
upgrade to 0.8.5
Posted by "Vijay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102188#comment-13102188 ]
Vijay commented on CASSANDRA-3175:
----------------------------------
Do you see an error in system.log? I couldn't reproduce the issue yet. All the nodes are on 0.8.5, right? Are you using nodetool -h localhost?
[jira] [Commented] (CASSANDRA-3175) Ec2Snitch nodetool issue after
upgrade to 0.8.5
Posted by "Vijay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102749#comment-13102749 ]
Vijay commented on CASSANDRA-3175:
----------------------------------
Sasha, this may happen when at least one node's gossip has not been received from the other nodes on the 172.16.14.x network. You might want to verify that the connections from all of the nodes on the 14.x network to the 12.x network are working as expected.
"How does Cassandra determine the data center name for each node? Where does it get this information?"
Ec2Snitch works by adding the rack/DC information to each node's gossip (application state) and letting the nodes discover each other's rack/DC information by gossiping; if we don't know any one node's information, this exception will occur.
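As a rough illustration of where the names come from: on EC2, the snitch reads its own availability zone from the instance metadata service (http://169.254.169.254/latest/meta-data/placement/availability-zone) and splits it into DC and rack before gossiping it to peers. This is a sketch using a hard-coded sample value so it runs anywhere, not the actual Java code:

```shell
# What the metadata service would return on one of the reporter's nodes:
AZ="ap-southeast-1a"

RACK=${AZ##*-}   # everything after the last "-"  -> "1a" (the rack)
DC=${AZ%-*}      # everything before the last "-" -> "ap-southeast" (the DC)

echo "DC=$DC rack=$RACK"
```

The resulting names match the DC/rack columns in the 'nodetool ring' output above ("ap-southeast" / "1a"); nodes that never gossip this state are the ones the snitch cannot resolve.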
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
> at sun.rmi.transport.Transport$1.run(Transport.java:159)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
> at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
> at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
> at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> I've stopped the node and started it ... still doesn't make a difference. I've also shut down all nodes in the ring so that it was fully offline, and then brought them all back up ... issue still persists on two of the nodes.
> There are no firewall rules restricting traffic between these nodes. For example, on a node where the ring throws the exception, I can still get netstats for the two hosts that don't show up:
> nodetool -h 172.16.12.11 -p 9090 netstats 172.16.14.10
> Mode: Normal
> Nothing streaming to /172.16.14.10
> Nothing streaming from /172.16.14.10
> Pool Name Active Pending Completed
> Commands n/a 0 3
> Responses n/a 1 1483
> nodetool -h 172.16.12.11 -p 9090 netstats 172.16.14.12
> Mode: Normal
> Nothing streaming to /172.16.14.12
> Nothing streaming from /172.16.14.12
> Pool Name Active Pending Completed
> Commands n/a 0 3
> Responses n/a 0 1784
> The problem only appears via nodetool and doesn't seem to be impacting anything else.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3175) Ec2Snitch nodetool issue after upgrade to 0.8.5
Posted by "Sasha Dolgy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102912#comment-13102912 ]
Sasha Dolgy commented on CASSANDRA-3175:
----------------------------------------
I shut down both nodes having the problem and brought only one back online; it still continues to have the problem.
[default@unknown] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.Ec2Snitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
UNREACHABLE: [10.130.185.136, 172.16.12.11]
03893e00-dd6b-11e0-0000-564008fe649f: [172.16.12.10, 172.16.14.12, 172.16.14.10]
[default@unknown]
A full ring restart didn't solve this ... I've already tried that the other day.
[jira] [Commented] (CASSANDRA-3175) Ec2Snitch nodetool issue after upgrade to 0.8.5
Posted by "Sasha Dolgy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102883#comment-13102883 ]
Sasha Dolgy commented on CASSANDRA-3175:
----------------------------------------
/mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:5] 2011-09-10 21:20:02,182 AntiEntropyService.java (line 658) Could not proceed on repair because a neighbor (/10.130.185.136) is dead: manual-repair-d8cdb59a-04a4-4596-b73f-cba3bd2b9eab failed.
/mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:7] 2011-09-11 21:20:02,258 AntiEntropyService.java (line 658) Could not proceed on repair because a neighbor (/10.130.185.136) is dead: manual-repair-ad17e938-f474-469c-9180-d88a9007b6b9 failed.
/mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:9] 2011-09-12 21:20:02,256 AntiEntropyService.java (line 658) Could not proceed on repair because a neighbor (/10.130.185.136) is dead: manual-repair-636150a5-4f0e-45b7-b400-24d8471a1c88 failed.
These entries appear only in the logs of one of the nodes generating the issue, 172.16.12.10.
Where do I find where AntiEntropyService.getNeighbors(tablename, range) is pulling its information from?
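As a rough illustration of why a no-longer-existing address like /10.130.185.136 can keep appearing as a dead repair neighbor, here is a hypothetical sketch (not Cassandra's actual getNeighbors implementation; the class and method names are made up for illustration): if neighbors are computed from ring/token metadata independently of liveness, a stale endpoint left over in ring state keeps surfacing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, NOT Cassandra's real code: neighbors come from the
// ring membership, while liveness comes from gossip, so an endpoint present
// in stale ring state but absent from the live set is reported as dead.
public class NeighborSketch {

    static List<String> deadNeighbors(Set<String> ringEndpoints,
                                      Set<String> liveEndpoints,
                                      String self) {
        List<String> dead = new ArrayList<>();
        for (String ep : ringEndpoints) {
            if (!ep.equals(self) && !liveEndpoints.contains(ep)) {
                dead.add(ep); // in the ring, but gossip says it is not alive
            }
        }
        return dead;
    }

    public static void main(String[] args) {
        Set<String> ring = new LinkedHashSet<>(Arrays.asList(
            "172.16.12.10", "172.16.12.11", "172.16.14.10",
            "172.16.14.12", "10.130.185.136")); // stale leftover address
        Set<String> live = new LinkedHashSet<>(Arrays.asList(
            "172.16.12.10", "172.16.12.11", "172.16.14.10", "172.16.14.12"));
        for (String ep : deadNeighbors(ring, live, "172.16.12.10")) {
            System.out.println(
                "Could not proceed on repair because a neighbor (/" + ep + ") is dead");
        }
    }
}
```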