Posted to commits@cassandra.apache.org by "David Capwell (Jira)" <ji...@apache.org> on 2022/01/06 02:55:00 UTC

[jira] [Comment Edited] (CASSANDRA-17116) When zero-copy-streaming sees a channel close this triggers the disk failure policy

    [ https://issues.apache.org/jira/browse/CASSANDRA-17116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17469639#comment-17469639 ] 

David Capwell edited comment on CASSANDRA-17116 at 1/6/22, 2:54 AM:
--------------------------------------------------------------------

heh. https://app.circleci.com/pipelines/github/dcapwell/cassandra/1150/workflows/e1f0b429-c024-4b3e-8e97-07669f60996a/jobs/8438

test_preview - repair_tests.preview_repair_test.TestPreviewRepair

{code}
test teardown failure
Unexpected error found in node logs (see stdout for full details). Errors: [ERROR [Stream-Deserializer-/127.0.0.1:7000-e3a2ecc2] 2022-01-06 02:26:28,485 StreamSession.java:650 - [Stream #0d9ed480-6e98-11ec-91be-5f7fa67bb742] Socket closed before session completion, peer 127.0.0.1:7000 is probably down.
java.nio.channels.ClosedChannelException: null
	at org.apache.cassandra.net.AsyncStreamingInputPlus.reBuffer(AsyncStreamingInputPlus.java:136)
	at org.apache.cassandra.io.util.RebufferingInputStream.readByte(RebufferingInputStream.java:178)
	at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
	at org.apache.cassandra.streaming.StreamDeserializingTask.run(StreamDeserializingTask.java:59)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:834), ERROR [Stream-Deserializer-/127.0.0.1:7000-e3a2ecc2] 2022-01-06 02:26:28,485 StreamSession.java:650 - [Stream #0d9ed480-6e98-11ec-91be-5f7fa67bb742] Socket closed before session completion, peer 127.0.0.1:7000 is probably down.
java.nio.channels.ClosedChannelException: null
	at org.apache.cassandra.net.AsyncStreamingInputPlus.reBuffer(AsyncStreamingInputPlus.java:136)
	at org.apache.cassandra.io.util.RebufferingInputStream.readByte(RebufferingInputStream.java:178)
	at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49)
	at org.apache.cassandra.streaming.StreamDeserializingTask.run(StreamDeserializingTask.java:59)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:834)]
{code}

{code}
boolean isEofException = e instanceof EOFException || e instanceof ClosedChannelException;
if (isEofException)
{
    if (state.finalState)
    {
        logger.debug("[Stream #{}] Socket closed after session completed with state {}", planId(), state);

        return null;
    }
    else
    {
        logger.error("[Stream #{}] Socket closed before session completion, peer {} is probably down.",
                     planId(),
                     peer.getHostAddressAndPort(),
                     e);

        return closeSession(State.FAILED);
    }
}
{code}


> When zero-copy-streaming sees a channel close this triggers the disk failure policy
> -----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17116
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17116
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Streaming
>            Reporter: David Capwell
>            Assignee: David Capwell
>            Priority: Normal
>             Fix For: 4.x
>
>
> Found in CASSANDRA-17085.
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/1069/workflows/26b7b83a-686f-4516-a56a-0709d428d4f2/jobs/7264
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/1069/workflows/26b7b83a-686f-4516-a56a-0709d428d4f2/jobs/7256
> {code}
> ERROR [Stream-Deserializer-/127.0.0.1:7000-f2eb1a15] 2021-11-02 21:35:40,983 DefaultFSErrorHandler.java:104 - Exiting forcefully due to file system exception on startup, disk failure policy "stop"
> org.apache.cassandra.io.FSWriteError: java.nio.channels.ClosedChannelException
> 	at org.apache.cassandra.io.sstable.format.big.BigTableZeroCopyWriter.write(BigTableZeroCopyWriter.java:227)
> 	at org.apache.cassandra.io.sstable.format.big.BigTableZeroCopyWriter.writeComponent(BigTableZeroCopyWriter.java:206)
> 	at org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:125)
> 	at org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:84)
> 	at org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:51)
> 	at org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:37)
> 	at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:50)
> 	at org.apache.cassandra.streaming.StreamDeserializingTask.run(StreamDeserializingTask.java:62)
> 	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: java.nio.channels.ClosedChannelException: null
> 	at org.apache.cassandra.net.AsyncStreamingInputPlus.reBuffer(AsyncStreamingInputPlus.java:136)
> 	at org.apache.cassandra.net.AsyncStreamingInputPlus.consume(AsyncStreamingInputPlus.java:155)
> 	at org.apache.cassandra.io.sstable.format.big.BigTableZeroCopyWriter.write(BigTableZeroCopyWriter.java:217)
> 	... 9 common frames omitted
> {code}
> When bootstrap fails and streaming is closed, this triggers the disk failure policy, which halts the JVM by default (if this happens outside of bootstrap, we stop transports and keep the JVM up).
> org.apache.cassandra.streaming.StreamDeserializingTask attempts to handle this by ignoring this exception, but the call to org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize does a try/catch and inspects the exception, which triggers this condition.
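
For illustration only, the stack trace above shows the ClosedChannelException arriving as the cause of an FSWriteError, so a direct instanceof check against the outer exception would not match it. The sketch below shows that effect and one way to walk the cause chain. It is not the patch for this ticket: WrappedWriteError is a hypothetical stand-in for FSWriteError, and hasCause is a made-up helper name.

```java
import java.nio.channels.ClosedChannelException;

public class CauseChainSketch
{
    // Hypothetical stand-in for org.apache.cassandra.io.FSWriteError; not the real class.
    static class WrappedWriteError extends RuntimeException
    {
        WrappedWriteError(Throwable cause)
        {
            super(cause);
        }
    }

    // Hypothetical helper: reports whether t, or any exception in its cause chain,
    // is an instance of the given type.
    static boolean hasCause(Throwable t, Class<? extends Throwable> type)
    {
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++)
        {
            if (type.isInstance(t))
                return true;
            t = t.getCause();
        }
        return false;
    }

    public static void main(String[] args)
    {
        Throwable wrapped = new WrappedWriteError(new ClosedChannelException());

        // A direct check does not see through the wrapper.
        System.out.println(wrapped instanceof ClosedChannelException); // prints false

        // Walking the causes finds the channel close.
        System.out.println(hasCause(wrapped, ClosedChannelException.class)); // prints true
    }
}
```

Whether a wrapped channel close should be treated as a stream failure rather than a disk failure is the question this ticket raises; the sketch only shows how the two cases could be told apart.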
