Posted to commits@cassandra.apache.org by "Richard Low (JIRA)" <ji...@apache.org> on 2014/01/24 15:12:37 UTC

[jira] [Commented] (CASSANDRA-5631) NPE when creating column family shortly after multinode startup

    [ https://issues.apache.org/jira/browse/CASSANDRA-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880982#comment-13880982 ] 

Richard Low commented on CASSANDRA-5631:
----------------------------------------

I've seen this on Cassandra 1.2.11.  It happens if you create a keyspace, followed quickly by creating a column family within that keyspace.  The NPE is thrown because Schema.instance.getTableDefinition returns null for the keyspace, but the result isn't checked for null.
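
To illustrate the failure mode, here is a minimal sketch, not Cassandra's actual code. The class KeyspaceLookupSketch and all of its methods are hypothetical stand-ins; the only thing taken from the description above is that the keyspace lookup returns null for an unknown keyspace and the caller doesn't check it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a cached schema; these are NOT Cassandra's real classes.
class KeyspaceLookupSketch {
    // keyspace name -> (column family name -> definition)
    private final Map<String, Map<String, String>> keyspaces = new HashMap<>();

    void addKeyspace(String name) {
        keyspaces.putIfAbsent(name, new HashMap<>());
    }

    // Returns null when the keyspace is unknown, like the lookup described above.
    Map<String, String> getKeyspace(String name) {
        return keyspaces.get(name);
    }

    // Unchecked variant: dereferences the lookup result directly, so it throws
    // a NullPointerException if the keyspace has not been created yet.
    void addColumnFamilyUnchecked(String keyspace, String cf) {
        getKeyspace(keyspace).put(cf, "definition of " + cf);
    }

    // Checked variant: reports failure instead of throwing, so the caller
    // can decide what to do with a CF whose keyspace is not known yet.
    boolean addColumnFamilyChecked(String keyspace, String cf) {
        Map<String, String> ks = getKeyspace(keyspace);
        if (ks == null) {
            return false; // keyspace not (yet) known on this node
        }
        ks.put(cf, "definition of " + cf);
        return true;
    }

    public static void main(String[] args) {
        KeyspaceLookupSketch schema = new KeyspaceLookupSketch();
        // The create-CF message arrives before the create-keyspace message.
        System.out.println(schema.addColumnFamilyChecked("ks1", "cf1"));
        schema.addKeyspace("ks1");
        System.out.println(schema.addColumnFamilyChecked("ks1", "cf1"));
    }
}
```

A null check alone only avoids the exception; the column family is still missing on that node afterwards.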

In the case I saw, the node that threw the NPE had problems, so it wasn't receiving many messages: it didn't get the create keyspace message but did get the create CF message.  Even if a node doesn't have any problems, the ordering of these messages is not guaranteed.  The node will get the create keyspace message some time later (probably about 60 seconds later, when another node has noticed the schema version is wrong), but it won't attempt to recreate the CF unless there is a further CF change (create, update or delete) within that keyspace.  Only then is the current cached schema compared with the on-disk schema (in DefsTable.mergeColumnFamilies), at which point the node notices the CF doesn't exist and creates it.  Such a change might never happen, in which case the node will never create the CF (unless it is restarted).

I think a fix would be to catch the NPEs above, and then, on learning about a new keyspace, check to see if any CFs should have been created for that keyspace.
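
As a rough sketch of that idea (an illustration under assumptions, not a patch): the class DeferredSchemaSketch and its methods are hypothetical, and it remembers orphaned CFs in an in-memory pending map, which is a simplification I've swapped in for comparing against the on-disk schema.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of a node's schema handling; NOT Cassandra's real classes.
class DeferredSchemaSketch {
    // keyspace name -> column families known on this node
    private final Map<String, Set<String>> keyspaces = new HashMap<>();
    // keyspace name -> CFs that arrived before their keyspace (assumed bookkeeping)
    private final Map<String, List<String>> pending = new HashMap<>();

    // Handles a create-CF message. If the keyspace is unknown, remember the CF
    // instead of dereferencing a null keyspace definition.
    void onCreateColumnFamily(String keyspace, String cf) {
        Set<String> cfs = keyspaces.get(keyspace);
        if (cfs == null) {
            pending.computeIfAbsent(keyspace, k -> new ArrayList<>()).add(cf);
            return;
        }
        cfs.add(cf);
    }

    // Handles a create-keyspace message. On learning about the keyspace,
    // create any CFs that should already have existed in it.
    void onCreateKeyspace(String keyspace) {
        Set<String> cfs = keyspaces.computeIfAbsent(keyspace, k -> new TreeSet<>());
        List<String> deferred = pending.remove(keyspace);
        if (deferred != null) {
            cfs.addAll(deferred);
        }
    }

    boolean hasColumnFamily(String keyspace, String cf) {
        Set<String> cfs = keyspaces.get(keyspace);
        return cfs != null && cfs.contains(cf);
    }

    public static void main(String[] args) {
        DeferredSchemaSketch node = new DeferredSchemaSketch();
        node.onCreateColumnFamily("ks1", "cf1"); // arrives first, out of order
        node.onCreateKeyspace("ks1");            // arrives later
        System.out.println(node.hasColumnFamily("ks1", "cf1"));
    }
}
```

With this shape, the out-of-order case no longer depends on a later CF change in the same keyspace to trigger a comparison.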

I haven't tried to repro this on 2.0 but the code looks almost identical so I would expect it to still be present.

Could someone reopen the ticket please?

> NPE when creating column family shortly after multinode startup
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-5631
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5631
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.0
>            Reporter: Martin Serrano
>
> I'm testing a 2-node cluster and creating a column family right after the nodes start up.  I am using the Astyanax client.  Sometimes column family creation fails and I see NPEs on the Cassandra server:
> {noformat}
> 2013-06-12 14:55:31,773 ERROR CassandraDaemon [MigrationStage:1] - Exception in thread Thread[MigrationStage:1,5,main]
> java.lang.NullPointerException
> 	at org.apache.cassandra.db.DefsTable.addColumnFamily(DefsTable.java:510)
> 	at org.apache.cassandra.db.DefsTable.mergeColumnFamilies(DefsTable.java:444)
> 	at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:354)
> 	at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:55)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:722)
> {noformat}
> {noformat}
> 2013-06-12 14:55:31,880 ERROR CassandraDaemon [MigrationStage:1] - Exception in thread Thread[MigrationStage:1,5,main]
> java.lang.NullPointerException
> 	at org.apache.cassandra.db.DefsTable.mergeColumnFamilies(DefsTable.java:475)
> 	at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:354)
> 	at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:55)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:722)
> {noformat}


