Posted to commits@cassandra.apache.org by "david.pan (JIRA)" <ji...@apache.org> on 2010/01/26 11:00:34 UTC
[jira] Created: (CASSANDRA-742) write operation will throw internal
error if the bootstrapping node is down
write operation will throw internal error if the bootstrapping node is down
---------------------------------------------------------------------------
Key: CASSANDRA-742
URL: https://issues.apache.org/jira/browse/CASSANDRA-742
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.5
Environment: linux2.6
Reporter: david.pan
Fix For: 0.6
Steps to reproduce:
1) bootstrap a node A;
2) keep inserting data while it is bootstrapping;
3) stop the service on node A;
4) the following exception then appears:
ERROR [pool-1-thread-9] 2010-01-26 10:32:39,688 Cassandra.java (line 1064) Internal error processing insert
java.lang.AssertionError
at org.apache.cassandra.locator.TokenMetadata.getToken(TokenMetadata.java:213)
at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedMapForEndpoints(AbstractReplicationStrategy.java:142)
at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedEndpoints(AbstractReplicationStrategy.java:76)
at org.apache.cassandra.service.StorageService.getHintedEndpointMap(StorageService.java:1188)
at org.apache.cassandra.service.StorageProxy.insertBlocking(StorageProxy.java:169)
at org.apache.cassandra.service.CassandraServer.doInsert(CassandraServer.java:466)
at org.apache.cassandra.service.CassandraServer.insert(CassandraServer.java:417)
at org.apache.cassandra.service.Cassandra$Processor$insert.process(Cassandra.java:1056)
at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:817)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:619)
I traced the code and found that "org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedMapForEndpoints(Collection<InetAddress>)" selects a hinted endpoint for every dead endpoint, no matter whether it is a normal node or a bootstrapping node. To get the token of the endpoint, this method calls "tokenMetadata_.getToken(ep);", but getToken() asserts that the endpoint is a member of the ring. The bootstrapping endpoint is not yet a member, so an internal error is thrown.
This exception keeps being thrown until I bootstrap the node again. This is a really big problem for me, because bootstrapping lasts 30 hours and my machines are not very reliable. I have to get out of bed at night to deal with this failure. :-(
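To make the failure described above concrete, here is a minimal, self-contained sketch of the lookup. It is not the Cassandra source: the class HintLookupSketch, the two maps, the string endpoints and the hintLookups() helper are all assumptions made for illustration. Only the idea that getToken() rejects endpoints that are not ring members comes from the report. The real code uses an assert; the sketch throws AssertionError explicitly so that it fires without the -ea flag.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the ring metadata; not the real TokenMetadata.
public class HintLookupSketch {
    // Endpoints that are full ring members, with their tokens.
    static final Map<String, Long> ringTokens = new LinkedHashMap<>();
    // Endpoints that are still bootstrapping: they receive writes but are not ring members yet.
    static final Map<String, Long> bootstrapTokens = new LinkedHashMap<>();

    // Mirrors the behaviour described in the report: only ring members have a token.
    static long getToken(String endpoint) {
        Long token = ringTokens.get(endpoint);
        if (token == null) {
            throw new AssertionError(endpoint + " is not a ring member");
        }
        return token;
    }

    // For every write target that is not alive, look up its token to choose a hint destination.
    static List<String> hintLookups(Set<String> liveEndpoints) {
        List<String> targets = new ArrayList<>(ringTokens.keySet());
        targets.addAll(bootstrapTokens.keySet());
        List<String> results = new ArrayList<>();
        for (String ep : targets) {
            if (liveEndpoints.contains(ep)) {
                continue; // live endpoints need no hint
            }
            try {
                results.add(ep + " -> token " + getToken(ep));
            } catch (AssertionError e) {
                // This is the case from the report: a dead bootstrapping node has no ring token.
                results.add(ep + " -> internal error");
            }
        }
        return results;
    }

    public static void main(String[] args) {
        ringTokens.put("10.0.0.1", 100L);
        ringTokens.put("10.0.0.2", 200L);
        bootstrapTokens.put("10.0.0.9", 150L); // node A from the report
        // Only 10.0.0.1 is alive; the other two targets need a hint lookup.
        System.out.println(hintLookups(Collections.singleton("10.0.0.1")));
    }
}
```

In this model a dead ring member resolves to a token as usual, while the dead bootstrapping endpoint hits the membership check on every write, which matches the repeated error in the log.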
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-742) write operation will throw internal
error if the bootstrapping node is down
Posted by "david.pan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
david.pan updated CASSANDRA-742:
--------------------------------
Attachment: 742-write_failed_when_bootstrapping_down.patch
This patch is not a perfect solution for this issue, but it lets me sleep at night and deal with the failure the next morning. :-)
The patch removes the bootstrapping endpoint from the tokenMetadata once the other nodes detect that it is down.
Write operations will time out until the other nodes detect that the bootstrapping node is down, but they succeed again once the other nodes have removed it from the pendingRanges.
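The idea behind the patch can be sketched roughly as follows. This is an illustration only, not the attached patch: the class PendingCleanupSketch, the onEndpointDown() callback and the isWriteTarget() helper are hypothetical names, and the two maps stand in for the ring and pending state kept in tokenMetadata.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the workaround; names and structure are assumptions.
public class PendingCleanupSketch {
    // Full ring members and their tokens.
    static final Map<String, Long> ringTokens = new LinkedHashMap<>();
    // Bootstrapping endpoints and the tokens they intend to take.
    static final Map<String, Long> bootstrapTokens = new LinkedHashMap<>();

    // Assumed to be called when the other nodes detect that an endpoint is down.
    static void onEndpointDown(String endpoint) {
        // Forget only endpoints that never finished joining.
        // Ring members stay registered so that hints can still be stored for them.
        bootstrapTokens.remove(endpoint);
    }

    // An endpoint receives writes if it is a ring member or still pending.
    static boolean isWriteTarget(String endpoint) {
        return ringTokens.containsKey(endpoint) || bootstrapTokens.containsKey(endpoint);
    }

    public static void main(String[] args) {
        ringTokens.put("10.0.0.1", 100L);
        bootstrapTokens.put("10.0.0.9", 150L); // node A, still bootstrapping

        onEndpointDown("10.0.0.9"); // dropped from the pending state
        onEndpointDown("10.0.0.1"); // ring member, left untouched

        System.out.println(isWriteTarget("10.0.0.9")); // false
        System.out.println(isWriteTarget("10.0.0.1")); // true
    }
}
```

Once the dead bootstrapping endpoint is no longer a write target, no hint lookup is attempted for it, so writes stop failing. The trade-off noted in the comment above still applies: the node has to bootstrap again later.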
> write operation will throw internal error if the bootstrapping node is down
> ---------------------------------------------------------------------------
>
> Key: CASSANDRA-742
> URL: https://issues.apache.org/jira/browse/CASSANDRA-742
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.5
> Environment: linux2.6
> Reporter: david.pan
> Fix For: 0.6
>
> Attachments: 742-write_failed_when_bootstrapping_down.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
[jira] Resolved: (CASSANDRA-742) write operation will throw
internal error if the bootstrapping node is down
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis resolved CASSANDRA-742.
--------------------------------------
Resolution: Duplicate
Fixed in CASSANDRA-722 for 0.5.1.