Posted to commits@cassandra.apache.org by "Paulo Motta (JIRA)" <ji...@apache.org> on 2015/11/28 02:48:10 UTC

[jira] [Commented] (CASSANDRA-10774) Fail stream session if receiver cannot process data

    [ https://issues.apache.org/jira/browse/CASSANDRA-10774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15030340#comment-15030340 ] 

Paulo Motta commented on CASSANDRA-10774:
-----------------------------------------

I was able to confirm the suspicion by injecting an exception into the {{OnCompletionRunnable}} of the receiving node during a bootstrap session in a ccm test cluster. On 2.1 and 2.2 the bootstrap hangs indefinitely, while on 3.0 the bootstrap succeeds even though processing failed.

A simple fix is to wrap {{OnCompletionRunnable.run()}} in a try-catch block and fail the stream session if an exception is caught. Otherwise the stream receive task is completed as usual. I tested with a scenario similar to the one above, and the bootstrap fails as expected on 2.1, 2.2 and 3.0.
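
The shape of that wrapping can be sketched roughly as follows. This is only an illustration under stated assumptions: {{SketchSession}} and {{GuardedCompletion}} are hypothetical stand-ins, not the actual Cassandra streaming classes, and catching {{Throwable}} rather than a narrower type is a choice made for the sketch, not something taken from the branches.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a stream session; NOT Cassandra's StreamSession API.
final class SketchSession {
    enum State { STREAMING, COMPLETE, FAILED }

    private final AtomicReference<State> state = new AtomicReference<>(State.STREAMING);
    private volatile Throwable cause;

    // Marks the receive task as completed, unless the session already failed.
    void taskCompleted() {
        state.compareAndSet(State.STREAMING, State.COMPLETE);
    }

    // Fails the session and remembers why.
    void onError(Throwable t) {
        cause = t;
        state.set(State.FAILED);
    }

    State state() { return state.get(); }

    Throwable cause() { return cause; }
}

// Hypothetical wrapper playing the role of the completion runnable on the receiver.
final class GuardedCompletion implements Runnable {
    private final SketchSession session;
    private final Runnable processing; // the work done on the received data

    GuardedCompletion(SketchSession session, Runnable processing) {
        this.session = session;
        this.processing = processing;
    }

    @Override
    public void run() {
        try {
            processing.run();
        } catch (Throwable t) {
            // Assumption of this sketch: any failure while processing the
            // received data fails the session instead of being lost.
            session.onError(t);
            return;
        }
        // No exception: complete the receive task as usual.
        session.taskCompleted();
    }
}
```

With this shape, a processing step that throws leaves the session in the {{FAILED}} state with the cause recorded, instead of hanging (as on 2.1 and 2.2) or completing as if nothing went wrong (as on 3.0).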

Below are branches and test results:
||2.1||2.2||3.0||3.1||trunk||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-10774]|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-10774]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-10774]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.1...pauloricardomg:3.1-10774]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-10774]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-10774-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-10774-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-10774-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.1-10774-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-10774-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-10774-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-10774-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-10774-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.1-10774-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-10774-dtest/lastCompletedBuild/testReport/]|

> Fail stream session if receiver cannot process data
> ---------------------------------------------------
>
>                 Key: CASSANDRA-10774
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10774
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>            Priority: Critical
>
> [~tjake] on CASSANDRA-10674:
> {quote}
> I think the underlying issue here is streaming failures only account for problems during the file send. Not any subsequent errors.
> We should probably add an acknowledgement to the streaming operation that it was processed by the receiver correctly.
> {quote}
> It seems the stream receive task (and thus the stream session) is only completed on [2.1|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java#L175] and [2.2|https://github.com/apache/cassandra/blob/cassandra-2.2/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java#L171] after the files are processed (otherwise it just hangs), but on [3.0|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java#L231] it's always completed even if there was a failure, which seems more critical. In any case, we should probably fail the stream session if there is a problem while processing the received data.


