Posted to commits@cassandra.apache.org by "Harikrishnan (JIRA)" <ji...@apache.org> on 2017/01/12 01:34:16 UTC

[jira] [Comment Edited] (CASSANDRA-12844) nodetool drain causing multiple nodes crashing with hint file corruption in Cassandra 3.9

    [ https://issues.apache.org/jira/browse/CASSANDRA-12844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15819815#comment-15819815 ] 

Harikrishnan edited comment on CASSANDRA-12844 at 1/12/17 1:33 AM:
-------------------------------------------------------------------

Hi,
We reproduced this twice while trying to bring down a node by issuing nodetool drain. One interesting aspect is that there were a lot of dropped mutations, and hint replay to most of the nodes was in progress while the drain was being issued. We will try to reproduce it again.


was (Author: hari708):
Hi,
We reproduced this twice while trying to bring down a node by issuing nodetool drain. One interesting aspect is that there were a lot of dropped mutations, and hint replay to most of the nodes was in progress while the drain was being issued.

> nodetool drain causing multiple nodes crashing with hint file corruption in Cassandra 3.9
> -----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12844
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12844
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Harikrishnan
>            Priority: Critical
>              Labels: hints
>
> The steps are as follows.
> We have a 4/4 node Cassandra cluster running version 3.9.
> On one node, we made some changes to cassandra.yaml, issued a nodetool drain,
> killed the Cassandra process, and restarted the node. After some time, nodetool status reported that multiple nodes were down in that DC.
> We checked the system.log on all the nodes and found hint corruption occurring (CASSANDRA-12728). It is a big concern that nodetool drain causes this corruption and brings multiple nodes down.
> ERROR [HintsDispatcher:2] 2016-10-26 12:17:59,361 HintsDispatchExecutor.java:225 - Failed to dispatch hints file 4d1362f0-053c-4042-80a7-bfc85a26c90f-1477509190999-1.hints: file is corrupted ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
>         at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:284) ~[apache-cassandra-3.9.jar:3.9]
>         at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:254) ~[apache-cassandra-3.9.jar:3.9]
>         at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.9.jar:3.9]
>         at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) ~[apache-cassandra-3.9.jar:3.9]
>         at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) ~[apache-cassandra-3.9.jar:3.9]
>         at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) ~[apache-cassandra-3.9.jar:3.9]
>         at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) ~[apache-cassandra-3.9.jar:3.9]
>         at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) [apache-cassandra-3.9.jar:3.9]
>         at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) [apache-cassandra-3.9.jar:3.9]
>         at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) [apache-cassandra-3.9.jar:3.9]
>         at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) [apache-cassandra-3.9.jar:3.9]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_102]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_102]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
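
To check which nodes hit this and which hints files were affected, something like the following could be run over each node's system.log. This is only a rough sketch: the regular expression is modeled on the single ERROR line quoted above and may not match the wording in other Cassandra versions, and the names CorruptHint and find_corrupt_hints are made up for illustration, not part of Cassandra.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Modeled on the one ERROR line quoted in this report (assumption: other
# Cassandra versions may format this message differently).
FAILED_DISPATCH = re.compile(
    r"ERROR \[(?P<thread>[^\]]+)\] "
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"HintsDispatchExecutor\.java:\d+ - "
    r"Failed to dispatch hints file (?P<hints_file>\S+\.hints): file is corrupted"
)


class CorruptHint(NamedTuple):
    """One 'file is corrupted' dispatch failure (hypothetical helper type)."""
    timestamp: str
    thread: str
    hints_file: str


def find_corrupt_hints(lines: Iterable[str]) -> Iterator[CorruptHint]:
    """Yield one record per log line that reports a corrupted hints file."""
    for line in lines:
        # Strip mail-style "> " quoting so pasted excerpts also match.
        match = FAILED_DISPATCH.match(line.lstrip("> "))
        if match:
            yield CorruptHint(
                match.group("timestamp"),
                match.group("thread"),
                match.group("hints_file"),
            )
```

To use it, open a node's system.log (its location depends on the installation) and pass the file object to find_corrupt_hints. Collecting the results per node shows which hints files failed to dispatch and when.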



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)