Posted to commits@cassandra.apache.org by "Yuki Morishita (JIRA)" <ji...@apache.org> on 2015/06/16 20:56:00 UTC

[jira] [Commented] (CASSANDRA-9270) Running resetlocalschema during repair can cause repair to hang

    [ https://issues.apache.org/jira/browse/CASSANDRA-9270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588569#comment-14588569 ] 

Yuki Morishita commented on CASSANDRA-9270:
-------------------------------------------

I think this happens in many places that access {{Keyspace#getColumnFamilyStore}}, the related {{Schema#getCF}}, and the like.
We should reconsider those APIs so that the caller has to explicitly handle the dropped KS/CF case.
For example, we could return a Java 8 Optional or throw a checked exception instead.
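
To make the two options concrete, here is a minimal sketch of what such lookups could look like. It is not Cassandra's actual code: {{TableStore}}, {{UnknownTableException}}, {{KeyspaceSketch}}, {{findStore}} and {{getStoreOrThrow}} are all hypothetical names used only for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a column family store; not Cassandra's class.
final class TableStore {
    final String name;

    TableStore(String name) {
        this.name = name;
    }
}

// Hypothetical checked exception for the "throw" variant.
final class UnknownTableException extends Exception {
    UnknownTableException(String keyspace, String table) {
        super("Unknown keyspace/cf pair (" + keyspace + "." + table + ")");
    }
}

// Hypothetical registry showing both API shapes side by side.
final class KeyspaceSketch {
    private final String name;
    private final Map<String, TableStore> stores = new ConcurrentHashMap<>();

    KeyspaceSketch(String name) {
        this.name = name;
    }

    void add(String table) {
        stores.put(table, new TableStore(table));
    }

    // Simulates the table disappearing, e.g. after a schema reset.
    void drop(String table) {
        stores.remove(table);
    }

    // Variant 1: absence is visible in the return type.
    Optional<TableStore> findStore(String table) {
        return Optional.ofNullable(stores.get(table));
    }

    // Variant 2: absence is a checked exception the caller must catch or declare.
    TableStore getStoreOrThrow(String table) throws UnknownTableException {
        TableStore store = stores.get(table);
        if (store == null) {
            throw new UnknownTableException(name, table);
        }
        return store;
    }
}
```

With either variant the compiler forces the caller to deal with a missing KS/CF at the call site, instead of an unchecked {{IllegalArgumentException}} surfacing on an executor thread.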

Any opinion?

> Running resetlocalschema during repair can cause repair to hang
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-9270
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9270
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: T Jake Luciani
>            Assignee: Yuki Morishita
>            Priority: Minor
>             Fix For: 2.0.x
>
>
> If you run resetlocalschema during a repair the node doing the repair can hang.
> The following test reproduces the issue quite frequently:
> https://github.com/riptano/cassandra-dtest/pull/247
> This is from trunk, but happens in 2.0 and 2.1 as well.  My guess is there is some count down latch that isn't counted down when the repair msg fails to be parsed.
> {code}
> ERROR [Repair#1:10] 2015-04-30 12:57:56,675 CassandraDaemon.java: Exception in thread Thread[Repair#1:10,5,RMI Runtime]
> java.lang.IllegalArgumentException: Unknown keyspace/cf pair (keyspace1.standard1)
> 	at org.apache.cassandra.db.Keyspace.getColumnFamilyStore(Keyspace.java:172) ~[main/:na]
> 	at org.apache.cassandra.repair.RepairJob.sendValidationRequest(RepairJob.java:189) ~[main/:na]
> 	at org.apache.cassandra.repair.RepairJob.run(RepairJob.java:110) ~[main/:na]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_72]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_72]
> 	at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_72]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)