Posted to commits@cassandra.apache.org by "Paulo Motta (JIRA)" <ji...@apache.org> on 2016/01/12 22:32:39 UTC

[jira] [Comment Edited] (CASSANDRA-10992) Hanging streaming sessions

    [ https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094989#comment-15094989 ] 

Paulo Motta edited comment on CASSANDRA-10992 at 1/12/16 9:31 PM:
------------------------------------------------------------------

I don't know exactly what's happening, but the {{AsynchronousCloseException}} makes it smell like the interrupt workaround for CASSANDRA-10012 is closing the channel after a genuine timeout, which prevents a retry. This was fixed in CASSANDRA-10961, so to test that hypothesis, could you try replacing the jar with the one I attached (which contains the 2.1 revert for CASSANDRA-10012) on all nodes involved in a repair of a specific subrange? A rolling restart will be needed. If this does not solve the issue, please attach the corresponding trace logs as instructed before (after replacing the jars, make sure to enable trace logging in the logback configuration before triggering the faulty repair operation).


was (Author: pauloricardomg):
I don't know exactly what's happening, but the {{AsynchronousCloseException}} makes it smell like the interrupt workaround for CASSANDRA-10012 is closing the channel after a genuine timeout, which prevents a retry. This was fixed in CASSANDRA-10961, so to test that hypothesis, could you try replacing the jar with the one I attached (which contains the 2.1 revert for CASSANDRA-10012) on a subset of the nodes involved in the repair? A rolling restart will be needed. If this does not solve the issue, please attach the corresponding trace logs as instructed before (after replacing the jars, make sure to enable trace logging in the logback configuration before triggering the faulty repair operation).

> Hanging streaming sessions
> --------------------------
>
>                 Key: CASSANDRA-10992
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10992
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: C* 2.1.12, Debian Wheezy
>            Reporter: mlowicki
>            Assignee: Paulo Motta
>             Fix For: 2.1.12
>
>         Attachments: apache-cassandra-2.1.12-SNAPSHOT.jar
>
>
> I've recently started running repairs using [Cassandra Reaper|https://github.com/spotify/cassandra-reaper] (the built-in {{nodetool repair}} doesn't work for me - CASSANDRA-9935). It behaves fine, but I've noticed hanging streaming sessions:
> {code}
> root@db1:~# date
> Sat Jan  9 16:43:00 UTC 2016
> root@db1:~# nt netstats -H | grep total
>         Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB total
>         Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
>         Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB total
>         Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
>         Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB total
>         Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
>         Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 MB total
>         Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
>         Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB total
>         Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB total
> root@db1:~# date
> Sat Jan  9 17:45:42 UTC 2016
> root@db1:~# nt netstats -H | grep total
>         Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB total
>         Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
>         Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB total
>         Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
>         Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB total
>         Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
>         Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 MB total
>         Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
>         Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB total
>         Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB total
> {code}
> Such sessions remain even long after the repair job has finished (confirmed by checking Reaper's and Cassandra's logs). {{streaming_socket_timeout_in_ms}} in cassandra.yaml is set to its default value (3600000).
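
The comparison made by eye in the description (two {{netstats}} snapshots taken about an hour apart, with identical progress) could be automated roughly as follows. This is only a sketch: the function names {{parse_snapshot}} and {{stalled_sessions}} are hypothetical, the line format is assumed to match the {{netstats -H | grep total}} output quoted above, and transfers are matched by identical counts and sizes rather than by session ID, so it is a heuristic and not a definitive check.

```python
import re

# Matches summary lines in the format quoted in the description, e.g.:
#   Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB total
LINE_RE = re.compile(
    r"(Receiving|Sending) (\d+) files, ([\d.]+ \w+) total\. "
    r"Already (?:received|sent) (\d+) files, ([\d.]+ \w+) total"
)


def parse_snapshot(text):
    """Parse one snapshot into (direction, total_files, total_size, done_files, done_size) tuples."""
    sessions = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            direction, total_files, total_size, done_files, done_size = match.groups()
            sessions.append(
                (direction, int(total_files), total_size, int(done_files), done_size)
            )
    return sessions


def stalled_sessions(earlier, later):
    """Return incomplete transfers from the earlier snapshot that appear unchanged in the later one.

    Assumption: a transfer with identical totals and identical progress in both
    snapshots is the same transfer and has made no progress in between.
    """
    later_seen = set(parse_snapshot(later))
    return [
        session
        for session in parse_snapshot(earlier)
        if session[3] < session[1] and session in later_seen
    ]
```

Applied to the two snapshots above, this would flag the five "Receiving" lines, since each shows fewer files received than expected and none changed between 16:43 and 17:45, while the "Sending" lines are skipped because they are complete.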


