Posted to commits@cassandra.apache.org by "Tania S Engel (JIRA)" <ji...@apache.org> on 2017/06/15 15:24:00 UTC

[jira] [Created] (CASSANDRA-13608) Connection closed/reopened during join causes Cassandra stream to close

Tania S Engel created CASSANDRA-13608:
-----------------------------------------

             Summary: Connection closed/reopened during join causes Cassandra stream to close
                 Key: CASSANDRA-13608
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13608
             Project: Cassandra
          Issue Type: Bug
          Components: Streaming and Messaging
         Environment: Cassandra 3.10. Windows Server 2016, 32GB ram, 2TB hard disk, RAID10 with 4 spindles, 8 Cores
            Reporter: Tania S Engel
             Fix For: 3.10
         Attachments: Cassandra 3.10 Join with lots GC collection leads to socket closure and join hang.mht

We start a JOIN (bootstrap). The primary seed node streams to the replica. The replica needs heavy garbage collection and experiences frequent pauses, including a 12-second old-gen collection following a memtable flush. Both the replica and the primary log _MessagingService IOException: An existing connection was forcibly closed by the remote host_. The replica's MessagingService-Outgoing re-establishes the connection immediately, but the primary's StreamKeepAliveExecutor throws _java.lang.RuntimeException: Outgoing stream handler has been closed_. From that point forward, the replica stays in JOIN mode, sending keep-alives to the primary. The primary receives the keep-alives but does not send its own, and it repeatedly fails to send a hints file to the replica. This limping state looks as though it would continue indefinitely; it ends only when we stop Cassandra on the replica. If we restart Cassandra on the replica, the JOIN picks up again but fails with _java.io.IOException: Corrupt value length 355151036 encountered, as it exceeds the maximum of 268435456, which is set via max_value_size_in_mb in cassandra.yaml_. We have not increased this value because we do not have values that large in our data, so we presume the data is indeed corrupt and that moving past it would not be a good idea. Please see the attachment for details.
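
For reference, the limit in that error message works out to 256 MB (256 * 1024 * 1024 = 268435456 bytes), while the reported length of 355151036 bytes is roughly 339 MB. The sketch below only illustrates that arithmetic and the general idea of rejecting an implausible length prefix before reading a value. It is not Cassandra's code: the function names and the framing (a 4-byte big-endian length followed by the payload) are assumptions made for the example.

```python
import io
import struct

BYTES_PER_MB = 1024 * 1024


def max_value_size_bytes(max_value_size_in_mb: int = 256) -> int:
    """Convert a max_value_size_in_mb-style setting to bytes (256 MB assumed here)."""
    return max_value_size_in_mb * BYTES_PER_MB


def read_value(stream, max_bytes: int) -> bytes:
    """Read one length-prefixed value from a hypothetical framing:
    a 4-byte big-endian signed length followed by that many payload bytes."""
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("truncated length prefix")
    (length,) = struct.unpack(">i", header)
    # A length outside the configured bound is treated as corruption,
    # so the reader refuses it instead of trying to consume the payload.
    if length < 0 or length > max_bytes:
        raise IOError(f"Corrupt value length {length} (maximum {max_bytes})")
    payload = stream.read(length)
    if len(payload) < length:
        raise EOFError("truncated payload")
    return payload


limit = max_value_size_bytes()
reported_length = 355151036  # the length quoted in the error above

# A well-formed value passes the check.
ok_stream = io.BytesIO(struct.pack(">i", 5) + b"hello")
value = read_value(ok_stream, limit)

# A prefix carrying the reported length is rejected before any payload is read.
bad_stream = io.BytesIO(struct.pack(">i", reported_length))
try:
    read_value(bad_stream, limit)
    rejected = False
except IOError:
    rejected = True
```

Under these assumptions a reader that lands on misaligned or damaged bytes will usually decode an arbitrary large integer as the length, which is consistent with our presumption that the data is corrupt rather than genuinely that large.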



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
