Posted to user@cassandra.apache.org by K F <kf...@yahoo.com> on 2015/09/15 04:53:24 UTC

Seeing null pointer exception in 2.0.14 after purging gossip state

Hi,
I have Cassandra 2.0.14 deployed. After following the method described in the Apache Cassandra™ 2.0 documentation <http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_gossip_purge.html> to clear the gossip state of a node in one of the DCs of my cluster, I see a weird exception on the remaining nodes, not many, just a few per hour, for nodes that have already been successfully decommissioned from the cluster. As you can see from the exception below, 10.0.0.1 has already been decommissioned. Below is the exception snippet.
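For reference, the purge procedure on the linked DataStax page amounts to restarting the node with its saved ring state disabled. A minimal sketch, assuming a package install with cassandra-env.sh under /etc/cassandra (paths and service name will differ per install):

```shell
# Sketch of the gossip-state purge per the linked DataStax 2.0 docs;
# the paths and service name here are assumptions for a package install.
sudo service cassandra stop
# Tell the node to ignore its locally saved ring (gossip) state on startup:
echo 'JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"' \
  | sudo tee -a /etc/cassandra/cassandra-env.sh
sudo service cassandra start
# After the node rejoins with fresh gossip state, remove that line again
# so subsequent restarts load the saved ring state normally.
```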
Has anyone observed similar behaviour for a decommissioned node such as 10.0.0.1?
Is this a bug in Cassandra 2.0.x?
2015-09-15 02:35:14,056 [GossipStage:9] INFO Gossiper InetAddress /10.0.0.1 is now DOWN
2015-09-15 02:35:14,058 [GossipStage:9] INFO StorageService Removing tokens [15950735949418990474845684723364134913] for /10.0.0.1
2015-09-15 02:35:14,061 [GossipStage:9] ERROR CassandraDaemon Exception in thread Thread[GossipStage:9,5,main]
java.lang.NullPointerException
    at org.apache.cassandra.service.StorageService.getRpcaddress(StorageService.java:1067)
    at org.apache.cassandra.transport.Server$EventNotifier.getRpcAddress(Server.java:345)
    at org.apache.cassandra.transport.Server$EventNotifier.onLeaveCluster(Server.java:366)
    at org.apache.cassandra.service.StorageService.excise(StorageService.java:1790)
    at org.apache.cassandra.service.StorageService.excise(StorageService.java:1798)
    at org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1701)
    at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1361)
    at org.apache.cassandra.service.StorageService.onJoin(StorageService.java:1995)
    at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1003)
    at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1102)
    at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
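The trace bottoms out in StorageService.getRpcaddress, which reads gossip application state that has already been excised for the departed endpoint. A minimal, hypothetical Java sketch of that failure mode and the defensive fix (the map and names here are illustrative, not Cassandra's actual internals):

```java
import java.util.HashMap;
import java.util.Map;

public class GossipNpeSketch {
    // Stand-in for the gossiper's endpoint state: no entry once a node is excised.
    static final Map<String, String> endpointStateMap = new HashMap<>();

    // Shaped loosely like StorageService.getRpcaddress: look up the endpoint's
    // gossiped RPC address. Without the null guard, a decommissioned endpoint
    // (no map entry) would be dereferenced and throw NullPointerException.
    static String getRpcAddress(String endpoint) {
        String rpcAddress = endpointStateMap.get(endpoint); // null after excise
        if (rpcAddress == null) {
            return endpoint; // defensive fallback instead of an NPE
        }
        return rpcAddress;
    }

    public static void main(String[] args) {
        endpointStateMap.put("10.0.0.2", "192.168.1.2");
        System.out.println(getRpcAddress("10.0.0.2")); // known node
        System.out.println(getRpcAddress("10.0.0.1")); // decommissioned: falls back
    }
}
```

Since the guard has to live inside Cassandra itself, upgrading to a release where the relevant tickets are fixed is the practical remedy rather than patching around it.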




Re: Seeing null pointer exception in 2.0.14 after purging gossip state

Posted by Ryan Svihla <rs...@foundev.pro>.
Could it be related to CASSANDRA-9180 <https://issues.apache.org/jira/browse/CASSANDRA-9180>, which was fixed in 2.0.15? Although it really behaves like CASSANDRA-10231 <https://issues.apache.org/jira/browse/CASSANDRA-10231>, for which I don’t see any reference to a fix in 2.0.x.

> On Sep 24, 2015, at 12:57 PM, Robert Coli <rc...@eventbrite.com> wrote:
> 
> On Mon, Sep 14, 2015 at 7:53 PM, K F <kf200467@yahoo.com <ma...@yahoo.com>> wrote:
> I have cassandra 2.0.14 deployed and after following the method described in Apache Cassandra™ 2.0 <http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_gossip_purge.html> to clear the gossip state of the node in one of the dc of my cluster
> 
> Why did you need to do this?
> 
>  I see a weird exception on the nodes, not many, just a few per hour, for nodes that have already been successfully decommissioned from the cluster. As you can see from the exception below, 10.0.0.1 has already been decommissioned. Below is the exception snippet.
> 
> Have you done :
> 
> nodetool gossipinfo |grep SCHEMA |sort | uniq -c | sort -n
> 
> and checked for schema agreement... ?
> 
> =Rob
>  

Regards,

Ryan Svihla


Re: Seeing null pointer exception in 2.0.14 after purging gossip state

Posted by Robert Coli <rc...@eventbrite.com>.
On Mon, Sep 14, 2015 at 7:53 PM, K F <kf...@yahoo.com> wrote:

> I have cassandra 2.0.14 deployed and after following the method described
> in Apache Cassandra™ 2.0
> <http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_gossip_purge.html> to
> clear the gossip state of the node in one of the dc of my cluster
>

Why did you need to do this?

 I see a weird exception on the nodes, not many, just a few per hour, for
> nodes that have already been successfully decommissioned from the cluster.
> As you can see from the exception below, 10.0.0.1 has already been
> decommissioned. Below is the exception snippet.
>

Have you done :

nodetool gossipinfo |grep SCHEMA |sort | uniq -c | sort -n

and checked for schema agreement... ?
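To illustrate what that pipeline reports, here is a simulation with made-up gossipinfo lines (the UUIDs are hypothetical): each output line is a count of nodes gossiping a given schema version, so a single line means the cluster agrees on schema, and multiple lines mean disagreement.

```shell
# Simulated `nodetool gossipinfo` SCHEMA lines (UUIDs are made up),
# fed through the same pipeline as above. Two distinct schema versions
# here would produce two output lines, i.e. schema disagreement.
printf '%s\n' \
  'SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f' \
  'SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f' \
  'SCHEMA:86afa796-d883-3932-aa73-6b017cef0d19' \
  | grep SCHEMA | sort | uniq -c | sort -n
```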

=Rob