Posted to user@cassandra.apache.org by George Sigletos <si...@textkernel.nl> on 2016/05/25 06:31:23 UTC

Error while rebuilding a node: Stream failed

I keep getting this error while trying to add a new DC, consisting of one
node in AWS, to my existing cluster. I have tried 5 times already. I am
running Cassandra 2.1.13.

I have also set:
streaming_socket_timeout_in_ms: 3600000
on all of my nodes.

Does anybody have any idea how this can be fixed? Thanks in advance.

Kind regards,
George

P.S.
The complete stack trace:
-- StackTrace --
java.lang.RuntimeException: Error while rebuilding node: Stream failed
        at org.apache.cassandra.service.StorageService.rebuild(StorageService.java:1076)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at sun.reflect.misc.Trampoline.invoke(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at sun.reflect.misc.MethodUtil.invoke(Unknown Source)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Unknown Source)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Unknown Source)
        at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(Unknown Source)
        at com.sun.jmx.mbeanserver.PerInterface.invoke(Unknown Source)
        at com.sun.jmx.mbeanserver.MBeanSupport.invoke(Unknown Source)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(Unknown Source)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(Unknown Source)
        at javax.management.remote.rmi.RMIConnectionImpl.doOperation(Unknown Source)
        at javax.management.remote.rmi.RMIConnectionImpl.access$300(Unknown Source)
        at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(Unknown Source)
        at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(Unknown Source)
        at javax.management.remote.rmi.RMIConnectionImpl.invoke(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at sun.rmi.server.UnicastServerRef.dispatch(Unknown Source)
        at sun.rmi.transport.Transport$2.run(Unknown Source)
        at sun.rmi.transport.Transport$2.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Unknown Source)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.access$400(Unknown Source)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(Unknown Source)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

Re: Error while rebuilding a node: Stream failed

Posted by Sebastian Estevez <se...@datastax.com>.
Check ifconfig for dropped TCP packets. Let's rule out your network.

all the best,

Sebastián
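
A minimal sketch of the check suggested above, assuming a Linux host. The per-interface drop counters that ifconfig reports are also exposed in /proc/net/dev, so they can be read directly. The function name iface_drops and its optional file argument are hypothetical, introduced only for illustration.

```shell
# Print receive/transmit drop counters per network interface.
# Assumption: Linux /proc/net/dev layout (two header lines, then one line
# per interface with 8 receive fields followed by 8 transmit fields).
iface_drops() {
    # Replace the colon after the interface name with a space so the name
    # is always its own field, even when it is fused with the byte count.
    awk 'NR > 2 { sub(/:/, " "); print $1, "rx_drop=" $5, "tx_drop=" $13 }' "${1:-/proc/net/dev}"
}
```

Running iface_drops on both the source and the destination node before and after a failed stream, and comparing the counters, may show whether packets are being dropped on either host. Non-zero deltas would only be a hint, not proof of the root cause.
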
On May 27, 2016 10:45 AM, "George Sigletos" <si...@textkernel.nl> wrote:

> Hello,
>
> No there is no version mix. The first stack traces were indeed from
> 2.1.13. Then I upgraded all nodes to 2.1.14. Still getting the same errors

Re: Error while rebuilding a node: Stream failed

Posted by George Sigletos <si...@textkernel.nl>.
Hello,

No, there is no version mix. The first stack traces were indeed from 2.1.13.
Then I upgraded all nodes to 2.1.14. I am still getting the same errors.


On Fri, May 27, 2016 at 4:39 PM, Eric Evans <jo...@gmail.com>
wrote:

> From the various stacktraces in this thread, it's obvious you are
> mixing versions 2.1.13 and 2.1.14.  Topology changes like this aren't
> supported with mixed Cassandra versions.  Sometimes it will work,
> sometimes it won't (and it will definitely not work in this instance).
>
> You should either upgrade your 2.1.13 nodes to 2.1.14 first, or add
> the new nodes using 2.1.13, and upgrade after.
>
> On Fri, May 27, 2016 at 8:41 AM, George Sigletos <si...@textkernel.nl>
> wrote:
>
> >>>> ERROR [STREAM-IN-/192.168.1.141] 2016-05-26 09:08:05,027 StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9] Streaming error occurred
> >>>> java.lang.RuntimeException: Outgoing stream handler has been closed
> >>>>         at org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:138) ~[apache-cassandra-2.1.14.jar:2.1.14]
> >>>>         at org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:568) ~[apache-cassandra-2.1.14.jar:2.1.14]
> >>>>         at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:457) ~[apache-cassandra-2.1.14.jar:2.1.14]
> >>>>         at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:263) ~[apache-cassandra-2.1.14.jar:2.1.14]
> >>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
> >>>>
> >>>> And this is from the source node:
> >>>>
> >>>> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,097 StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9] Streaming error occurred
> >>>> java.io.IOException: Broken pipe
> >>>>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method) ~[na:1.7.0_79]
> >>>>         at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown Source) ~[na:1.7.0_79]
> >>>>         at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source) ~[na:1.7.0_79]
> >>>>         at org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:84) ~[apache-cassandra-2.1.14.jar:2.1.14]
> >>>>         at org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:88) ~[apache-cassandra-2.1.14.jar:2.1.14]
> >>>>         at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49) ~[apache-cassandra-2.1.14.jar:2.1.14]
> >>>>         at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41) ~[apache-cassandra-2.1.14.jar:2.1.14]
> >>>>         at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) ~[apache-cassandra-2.1.14.jar:2.1.14]
> >>>>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358) [apache-cassandra-2.1.14.jar:2.1.14]
> >>>>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:330) [apache-cassandra-2.1.14.jar:2.1.14]
>
>
> >>>>>>>>>>> ERROR [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:57,704 StreamSession.java:620 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Remote peer 192.168.1.140 failed stream session.
> >>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705 StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Streaming error occurred
> >>>>>>>>>>> java.io.IOException: Connection timed out
> >>>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_79]
> >>>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source) ~[na:1.7.0_79]
> >>>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source) ~[na:1.7.0_79]
> >>>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
> >>>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source) ~[na:1.7.0_79]
> >>>>>>>>>>>         at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48) ~[apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44) ~[apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) [apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:323) [apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
> >>>>>>>>>>> INFO  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,625 StreamResultFuture.java:180 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /192.168.1.140 is complete
> >>>>>>>>>>> WARN  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,627 StreamResultFuture.java:207 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
> >>>>>>>>>>> ERROR [RMI TCP Connection(24)-127.0.0.1] 2016-05-24 22:44:58,628 StorageService.java:1075 - Error while rebuilding node
> >>>>>>>>>>> org.apache.cassandra.streaming.StreamException: Stream failed
> >>>>>>>>>>>         at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) ~[apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na]
> >>>>>>>>>>>         at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) ~[guava-16.0.jar:na]
> >>>>>>>>>>>         at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) ~[guava-16.0.jar:na]
> >>>>>>>>>>>         at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) ~[guava-16.0.jar:na]
> >>>>>>>>>>>         at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) ~[guava-16.0.jar:na]
> >>>>>>>>>>>         at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208) ~[apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184) ~[apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:415) ~[apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:621) ~[apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:475) ~[apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:256) ~[apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) ~[na:1.7.0_79]
> >>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:58,629 StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Streaming error occurred
> >>>>>>>>>>> java.io.IOException: Broken pipe
> >>>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_79]
> >>>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source) ~[na:1.7.0_79]
> >>>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source) ~[na:1.7.0_79]
> >>>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
> >>>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source) ~[na:1.7.0_79]
> >>>>>>>>>>>         at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48) ~[apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44) ~[apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) [apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331) [apache-cassandra-2.1.13.jar:2.1.13]
> >>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>
>
>
> --
> Eric Evans
> john.eric.evans@gmail.com
>

Re: Error while rebuilding a node: Stream failed

Posted by George Sigletos <si...@textkernel.nl>.
I gave up on rebuild completely.

Now I am running `nodetool repair`, and in case of network issues I retry
the token ranges that failed using the -st and -et options of `nodetool
repair`.

That will be good enough for now, until we fix our network problems.
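
The retry approach described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name repair_range, the NODETOOL and RETRY_DELAY variables, and the default of 3 attempts are hypothetical choices for illustration. Only the -st and -et options come from the message itself.

```shell
# Retry `nodetool repair` for one token subrange until it succeeds or the
# attempts run out. Arguments: keyspace, start token, end token, and an
# optional maximum number of attempts (hypothetical default: 3).
# NODETOOL may be overridden, e.g. NODETOOL=echo for a dry run.
repair_range() {
    rr_keyspace="$1"
    rr_start="$2"
    rr_end="$3"
    rr_max="${4:-3}"
    rr_attempt=1
    while [ "$rr_attempt" -le "$rr_max" ]; do
        if ${NODETOOL:-nodetool} repair -st "$rr_start" -et "$rr_end" "$rr_keyspace"; then
            return 0
        fi
        echo "repair of range $rr_start..$rr_end failed (attempt $rr_attempt)" >&2
        rr_attempt=$((rr_attempt + 1))
        # Wait before retrying, in case the network problem is transient.
        sleep "${RETRY_DELAY:-60}"
    done
    return 1
}
```

For example, `NODETOOL=echo repair_range my_keyspace 0 1000` prints the command that would be run without touching the cluster (my_keyspace and the token values are placeholders).
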

On Sat, May 28, 2016 at 7:05 PM, George Sigletos <si...@textkernel.nl>
wrote:

> No luck unfortunately. It seems that the connection to the destination
> node was lost.
>
> However there was progress compared to the previous times. A lot more data
> was streamed.
>
> (From source node)
> INFO  [GossipTasks:1] 2016-05-28 17:53:57,155 Gossiper.java:1008 - InetAddress /54.172.235.227 is now DOWN
> INFO  [HANDSHAKE-/54.172.235.227] 2016-05-28 17:53:58,238 OutboundTcpConnection.java:487 - Handshaking version with /54.172.235.227
> ERROR [STREAM-IN-/54.172.235.227] 2016-05-28 17:54:08,938 StreamSession.java:505 - [Stream #d25a05c0-241f-11e6-bb50-1b05ac77baf9] Streaming error occurred
> java.io.IOException: Connection timed out
>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_79]
>         at sun.nio.ch.SocketDispatcher.read(Unknown Source) ~[na:1.7.0_79]
>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source) ~[na:1.7.0_79]
>         at sun.nio.ch.IOUtil.read(Unknown Source) ~[na:1.7.0_79]
>         at sun.nio.ch.SocketChannelImpl.read(Unknown Source) ~[na:1.7.0_79]
>         at sun.nio.ch.SocketAdaptor$SocketInputStream.read(Unknown Source) ~[na:1.7.0_79]
>         at sun.nio.ch.ChannelInputStream.read(Unknown Source) ~[na:1.7.0_79]
>         at java.nio.channels.Channels$ReadableByteChannelImpl.read(Unknown Source) ~[na:1.7.0_79]
>         at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:51) ~[apache-cassandra-2.1.14.jar:2.1.14]
>         at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:257) ~[apache-cassandra-2.1.14.jar:2.1.14]
>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
> INFO  [SharedPool-Worker-1] 2016-05-28 17:54:59,612 Gossiper.java:993 - InetAddress /54.172.235.227 is now UP
>
> On Fri, May 27, 2016 at 5:37 PM, George Sigletos <si...@textkernel.nl>
> wrote:
>
>> I am trying once more using more aggressive TCP settings, as recommended
>> here
>> <https://docs.datastax.com/en/cassandra/2.1/cassandra/troubleshooting/trblshootIdleFirewall.html>
>>
>> sudo sysctl -w net.ipv4.tcp_keepalive_time=60 net.ipv4.tcp_keepalive_probes=3 net.ipv4.tcp_keepalive_intvl=10
>>
>> (added to /etc/sysctl.conf, then ran sysctl -p /etc/sysctl.conf on all
>> nodes)
>>
>> Let's see what happens. I don't know what else to try. I have even
>> further increased streaming_socket_timeout_in_ms.
>>
>>
>>
>> On Fri, May 27, 2016 at 4:56 PM, Paulo Motta <pa...@gmail.com>
>> wrote:
>>
>>> I'm afraid raising streaming_socket_timeout_in_ms won't help much in
>>> this case, because the incoming connection on the source node is timing out
>>> at the network layer. streaming_socket_timeout_in_ms controls the socket
>>> timeout at the application layer and throws SocketTimeoutException (not
>>> java.io.IOException: Connection timed out). So you should probably use more
>>> aggressive TCP keep-alive settings (net.ipv4.tcp_keepalive_*) on both hosts.
>>> Did you try tuning that? Even that might not be sufficient, as some routers
>>> tend to ignore TCP keep-alives and just kill idle connections.
>>>
>>> As said before, this will ultimately be fixed by adding keep-alive to
>>> the app layer in CASSANDRA-11841. If tuning TCP keep-alives does not help,
>>> one extreme approach would be to backport this to 2.1 (unless some
>>> experienced operator out there has a more creative approach).
>>>
>>> @eevans, I'm not sure he is using a mixed-version cluster; it seems he
>>> finished the upgrade from 2.1.13 to 2.1.14 before performing the rebuild.
>>>
>>> 2016-05-27 11:39 GMT-03:00 Eric Evans <jo...@gmail.com>:
>>>
>>>> From the various stacktraces in this thread, it's obvious you are
>>>> mixing versions 2.1.13 and 2.1.14.  Topology changes like this aren't
>>>> supported with mixed Cassandra versions.  Sometimes it will work,
>>>> sometimes it won't (and it will definitely not work in this instance).
>>>>
>>>> You should either upgrade your 2.1.13 nodes to 2.1.14 first, or add
>>>> the new nodes using 2.1.13, and upgrade after.
>>>>
>>>> On Fri, May 27, 2016 at 8:41 AM, George Sigletos <
>>>> sigletos@textkernel.nl> wrote:
>>>>
>>>> >>>> ERROR [STREAM-IN-/192.168.1.141] 2016-05-26 09:08:05,027
>>>> >>>> StreamSession.java:505 - [Stream
>>>> #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>>>> >>>> Streaming error occurred
>>>> >>>> java.lang.RuntimeException: Outgoing stream handler has been closed
>>>> >>>>         at
>>>> >>>>
>>>> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:138)
>>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>> >>>>         at
>>>> >>>>
>>>> org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:568)
>>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>> >>>>         at
>>>> >>>>
>>>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:457)
>>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>> >>>>         at
>>>> >>>>
>>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:263)
>>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>> >>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>> >>>>
>>>> >>>> And this is from the source node:
>>>> >>>>
>>>> >>>> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,097
>>>> >>>> StreamSession.java:505 - [Stream
>>>> #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>>>> >>>> Streaming error occurred
>>>> >>>> java.io.IOException: Broken pipe
>>>> >>>>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>>>> >>>> ~[na:1.7.0_79]
>>>> >>>>         at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown
>>>> Source)
>>>> >>>> ~[na:1.7.0_79]
>>>> >>>>         at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
>>>> >>>> ~[na:1.7.0_79]
>>>> >>>>         at
>>>> >>>>
>>>> org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:84)
>>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>> >>>>         at
>>>> >>>>
>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:88)
>>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>> >>>>         at
>>>> >>>>
>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49)
>>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>> >>>>         at
>>>> >>>>
>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
>>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>> >>>>         at
>>>> >>>>
>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
>>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>> >>>>         at
>>>> >>>>
>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358)
>>>> >>>> [apache-cassandra-2.1.14.jar:2.1.14]
>>>> >>>>         at
>>>> >>>>
>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:330)
>>>> >>>> [apache-cassandra-2.1.14.jar:2.1.14]
>>>>
>>>>
>>>> >>>>>>>>>>> ERROR [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:57,704
>>>> >>>>>>>>>>> StreamSession.java:620 - [Stream
>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>> >>>>>>>>>>> Remote peer 192.168.1.140 failed stream session.
>>>> >>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705
>>>> >>>>>>>>>>> StreamSession.java:505 - [Stream
>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>> >>>>>>>>>>> Streaming error occurred
>>>> >>>>>>>>>>> java.io.IOException: Connection timed out
>>>> >>>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native
>>>> Method)
>>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>>> >>>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>>> >>>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown
>>>> >>>>>>>>>>> Source) ~[na:1.7.0_79]
>>>> >>>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source)
>>>> ~[na:1.7.0_79]
>>>> >>>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown
>>>> Source)
>>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>> >>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:323)
>>>> >>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at java.lang.Thread.run(Unknown Source)
>>>> [na:1.7.0_79]
>>>> >>>>>>>>>>> INFO  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,625
>>>> >>>>>>>>>>> StreamResultFuture.java:180 - [Stream
>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>> >>>>>>>>>>> Session with /192.168.1.140 is complete
>>>> >>>>>>>>>>> WARN  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,627
>>>> >>>>>>>>>>> StreamResultFuture.java:207 - [Stream
>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>> >>>>>>>>>>> Stream failed
>>>> >>>>>>>>>>> ERROR [RMI TCP Connection(24)-127.0.0.1] 2016-05-24
>>>> 22:44:58,628
>>>> >>>>>>>>>>> StorageService.java:1075 - Error while rebuilding node
>>>> >>>>>>>>>>> org.apache.cassandra.streaming.StreamException: Stream
>>>> failed
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
>>>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>>>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>>>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>>>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>>>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208)
>>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184)
>>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:415)
>>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:621)
>>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:475)
>>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:256)
>>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at java.lang.Thread.run(Unknown Source)
>>>> ~[na:1.7.0_79]
>>>> >>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:58,629
>>>> >>>>>>>>>>> StreamSession.java:505 - [Stream
>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>> >>>>>>>>>>> Streaming error occurred
>>>> >>>>>>>>>>> java.io.IOException: Broken pipe
>>>> >>>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native
>>>> Method)
>>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>>> >>>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>>> >>>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown
>>>> >>>>>>>>>>> Source) ~[na:1.7.0_79]
>>>> >>>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source)
>>>> ~[na:1.7.0_79]
>>>> >>>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown
>>>> Source)
>>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>> >>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at
>>>> >>>>>>>>>>>
>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>>>> >>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>> >>>>>>>>>>>         at java.lang.Thread.run(Unknown Source)
>>>> [na:1.7.0_79]
>>>>
>>>>
>>>>
>>>> --
>>>> Eric Evans
>>>> john.eric.evans@gmail.com
>>>>
>>>
>>>
>>
>

Re: Error while rebuilding a node: Stream failed

Posted by George Sigletos <si...@textkernel.nl>.
No luck, unfortunately. It seems that the connection to the destination node
was lost.

However, there was progress compared to previous attempts: a lot more data
was streamed.

(From source node)
INFO  [GossipTasks:1] 2016-05-28 17:53:57,155 Gossiper.java:1008 - InetAddress /54.172.235.227 is now DOWN
INFO  [HANDSHAKE-/54.172.235.227] 2016-05-28 17:53:58,238 OutboundTcpConnection.java:487 - Handshaking version with /54.172.235.227
ERROR [STREAM-IN-/54.172.235.227] 2016-05-28 17:54:08,938 StreamSession.java:505 - [Stream #d25a05c0-241f-11e6-bb50-1b05ac77baf9] Streaming error occurred
java.io.IOException: Connection timed out
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_79]
        at sun.nio.ch.SocketDispatcher.read(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.read(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.SocketChannelImpl.read(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.SocketAdaptor$SocketInputStream.read(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.ChannelInputStream.read(Unknown Source) ~[na:1.7.0_79]
        at java.nio.channels.Channels$ReadableByteChannelImpl.read(Unknown Source) ~[na:1.7.0_79]
        at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:51) ~[apache-cassandra-2.1.14.jar:2.1.14]
        at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:257) ~[apache-cassandra-2.1.14.jar:2.1.14]
        at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
INFO  [SharedPool-Worker-1] 2016-05-28 17:54:59,612 Gossiper.java:993 - InetAddress /54.172.235.227 is now UP

On Fri, May 27, 2016 at 5:37 PM, George Sigletos <si...@textkernel.nl>
wrote:

> I am trying once more with more aggressive TCP keep-alive settings, as
> recommended here:
> <https://docs.datastax.com/en/cassandra/2.1/cassandra/troubleshooting/trblshootIdleFirewall.html>
>
> sudo sysctl -w net.ipv4.tcp_keepalive_time=60 net.ipv4.tcp_keepalive_probes=3 net.ipv4.tcp_keepalive_intvl=10
>
> (I added these to /etc/sysctl.conf and ran sysctl -p /etc/sysctl.conf on all nodes.)
>
> Let's see what happens. I don't know what else to try. I have also increased
> streaming_socket_timeout_in_ms even further.

Re: Error while rebuilding a node: Stream failed

Posted by George Sigletos <si...@textkernel.nl>.
I am trying once more with more aggressive TCP keep-alive settings, as
recommended here:
<https://docs.datastax.com/en/cassandra/2.1/cassandra/troubleshooting/trblshootIdleFirewall.html>

sudo sysctl -w net.ipv4.tcp_keepalive_time=60 net.ipv4.tcp_keepalive_probes=3 net.ipv4.tcp_keepalive_intvl=10

(I added these to /etc/sysctl.conf and ran sysctl -p /etc/sysctl.conf on all nodes.)

Let's see what happens. I don't know what else to try. I have also increased
streaming_socket_timeout_in_ms even further.
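
For reference, here is a rough sketch of what those three values mean in
practice, assuming the usual Linux semantics for net.ipv4.tcp_keepalive_*: an
idle connection gets its first probe after tcp_keepalive_time seconds, and a
peer that never answers is dropped after tcp_keepalive_probes unanswered
probes sent tcp_keepalive_intvl seconds apart. The function name
keepalive_detect_seconds is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of sysctl or Cassandra): approximate number
# of seconds before the kernel gives up on an idle TCP connection whose
# peer has silently gone away.
#   $1 = tcp_keepalive_time   (idle seconds before the first probe)
#   $2 = tcp_keepalive_probes (unanswered probes before the drop)
#   $3 = tcp_keepalive_intvl  (seconds between probes)
keepalive_detect_seconds() {
    echo $(( $1 + $2 * $3 ))
}

# Values from the sysctl command in this message: prints 90
keepalive_detect_seconds 60 3 10

# Stock Linux defaults (7200 s idle, 9 probes, 75 s apart): prints 7875
keepalive_detect_seconds 7200 9 75
```

The first number is probably the more important one here: with
tcp_keepalive_time=60 a probe crosses the link every minute of idleness, which
only helps if the firewall or router in between allows at least that much idle
time and counts keep-alive probes as traffic.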



On Fri, May 27, 2016 at 4:56 PM, Paulo Motta <pa...@gmail.com>
wrote:

> I'm afraid raising streaming_socket_timeout_in_ms won't help much in this
> case, because the incoming connection on the source node is timing out at
> the network layer. streaming_socket_timeout_in_ms controls the socket
> timeout at the app layer, and when it fires it throws SocketTimeoutException
> (not java.io.IOException: Connection timed out). So you should probably use
> more aggressive TCP keep-alive settings (net.ipv4.tcp_keepalive_*) on both
> hosts. Did you try tuning that? Even that might not be sufficient, as some
> routers tend to ignore TCP keep-alives and just kill idle connections.
>
> As said before, this will ultimately be fixed by adding keep-alive to the
> app layer in CASSANDRA-11841. If tuning TCP keep-alives does not help, one
> extreme approach would be to backport this to 2.1 (unless some experienced
> operator out there has a more creative approach).
>
> @eevans, I'm not sure he is using a mixed-version cluster; it seems he
> finished the upgrade from 2.1.13 to 2.1.14 before performing the rebuild.

Re: Error while rebuilding a node: Stream failed

Posted by Paulo Motta <pa...@gmail.com>.
I'm afraid raising streaming_socket_timeout_in_ms won't help much in this
case, because the incoming connection on the source node is timing out at
the network layer. streaming_socket_timeout_in_ms controls the socket
timeout at the app layer, and when it fires it throws SocketTimeoutException
(not java.io.IOException: Connection timed out). So you should probably use
more aggressive TCP keep-alive settings (net.ipv4.tcp_keepalive_*) on both
hosts. Did you try tuning that? Even that might not be sufficient, as some
routers tend to ignore TCP keep-alives and just kill idle connections.
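
One way to check which of the two timeouts is actually firing is to count
each signature in the Cassandra log. This is only a rough sketch: the
function name count_timeout_kinds is hypothetical, the log path is whatever
you pass in, and it assumes the app-layer timeout is logged with the text
SocketTimeoutException while the network-layer one is logged as
"java.io.IOException: Connection timed out", as in the traces in this thread.

```shell
#!/bin/sh
# Hypothetical helper: count app-layer vs network-layer timeout signatures
# in the log file given as $1. grep -c exits non-zero when the count is 0,
# so "|| true" keeps the function usable under "set -e".
count_timeout_kinds() {
    log_file=$1
    app=$(grep -cF 'SocketTimeoutException' "$log_file" || true)
    net=$(grep -cF 'java.io.IOException: Connection timed out' "$log_file" || true)
    echo "app_layer=$app network_layer=$net"
}

# Usage (the path is an assumption; adjust to your install):
#   count_timeout_kinds /var/log/cassandra/system.log
```

If network_layer is non-zero and app_layer is zero, that would be consistent
with the connection being killed below Cassandra, which is the case that
streaming_socket_timeout_in_ms does not cover.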

As said before, this will ultimately be fixed by adding keep-alive to the
app layer in CASSANDRA-11841. If tuning TCP keep-alives does not help, one
extreme approach would be to backport that change to 2.1 (unless some
experienced operator out there has a more creative approach).
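
For the TCP keep-alive tuning, a rough sketch of the arithmetic may help. On
Linux the first probe is sent only after tcp_keepalive_time seconds of
idleness, and the peer is declared dead after tcp_keepalive_probes unanswered
probes sent tcp_keepalive_intvl seconds apart. The 2 hours and 9 probes are
the values reported in this thread. The 75 second interval (the usual Linux
default) and the 1 hour firewall idle timeout are assumptions for
illustration only, and the function names are hypothetical.

```python
def dead_peer_detected_after(time_s, intvl_s, probes):
    """Seconds of silence before the kernel gives up on an unresponsive peer."""
    return time_s + intvl_s * probes


def first_probe_beats_idle_timeout(time_s, idle_timeout_s):
    """True if the first keep-alive probe is sent before a middlebox drops the idle flow."""
    return time_s < idle_timeout_s


REPORTED_TIME_S = 2 * 60 * 60        # tcp_keepalive_time reported in the thread
REPORTED_PROBES = 9                  # tcp_keepalive_probes reported in the thread
ASSUMED_INTVL_S = 75                 # assumption: not stated in the thread
ASSUMED_FIREWALL_IDLE_S = 60 * 60    # assumption: hypothetical firewall idle timeout

print(dead_peer_detected_after(REPORTED_TIME_S, ASSUMED_INTVL_S, REPORTED_PROBES))
print(first_probe_beats_idle_timeout(REPORTED_TIME_S, ASSUMED_FIREWALL_IDLE_S))
print(first_probe_beats_idle_timeout(60, ASSUMED_FIREWALL_IDLE_S))
```

Under these assumptions a connection that sits idle for an hour would be
dropped before the kernel sends its first probe, which is why a much shorter
tcp_keepalive_time is worth trying.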

@eevans, I'm not sure he is using a mixed-version cluster; it seems he
finished the upgrade from 2.1.13 to 2.1.14 before performing the rebuild.
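
One way to check this against the logs is to group the jar versions that
appear in the stack traces by log date. The sketch below is only an
illustration: the function name is hypothetical, the sample is condensed from
log lines posted in this thread, and it assumes each log header sits on a
single line, which is not true of the wrapped mail quotes.

```python
import re
from collections import defaultdict

# Log header, e.g. "ERROR [STREAM-IN-/192.168.1.141] 2016-05-26 09:08:05,027 ..."
HEADER = re.compile(r"^(?:ERROR|WARN|INFO)\s+\[[^\]]+\]\s+(\d{4}-\d{2}-\d{2})")
# Jar tag on a stack frame, e.g. "~[apache-cassandra-2.1.14.jar:2.1.14]"
JAR = re.compile(r"apache-cassandra-(\d+(?:\.\d+)+)\.jar")


def versions_by_date(log_text):
    """Map each log date to the set of Cassandra jar versions seen under it."""
    seen = defaultdict(set)
    current_date = None
    for raw in log_text.splitlines():
        line = raw.lstrip("> ")  # drop mail quote markers and indentation
        header = HEADER.match(line)
        if header:
            current_date = header.group(1)
        if current_date is not None:
            for version in JAR.findall(line):
                seen[current_date].add(version)
    return dict(seen)


sample = """\
ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705 StreamSession.java:505 - Streaming error occurred
        at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44) ~[apache-cassandra-2.1.13.jar:2.1.13]
ERROR [STREAM-IN-/192.168.1.141] 2016-05-26 09:08:05,027 StreamSession.java:505 - Streaming error occurred
        at org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:568) ~[apache-cassandra-2.1.14.jar:2.1.14]
"""

result = versions_by_date(sample)
mixed_dates = sorted(d for d, versions in result.items() if len(versions) > 1)
print(result)
print(mixed_dates)
```

In the excerpts posted so far, the 2.1.13 frames carry 2016-05-24 and
2016-05-25 timestamps and the 2.1.14 frames carry 2016-05-26 and later, which
fits an upgrade followed by a new rebuild attempt. Logs from one or two nodes
cannot rule out an older version elsewhere in the cluster, though.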

2016-05-27 11:39 GMT-03:00 Eric Evans <jo...@gmail.com>:

> From the various stacktraces in this thread, it's obvious you are
> mixing versions 2.1.13 and 2.1.14.  Topology changes like this aren't
> supported with mixed Cassandra versions.  Sometimes it will work,
> sometimes it won't (and it will definitely not work in this instance).
>
> You should either upgrade your 2.1.13 nodes to 2.1.14 first, or add
> the new nodes using 2.1.13, and upgrade after.

Re: Error while rebuilding a node: Stream failed

Posted by Eric Evans <jo...@gmail.com>.
From the various stacktraces in this thread, it's obvious you are
mixing versions 2.1.13 and 2.1.14.  Topology changes like this aren't
supported with mixed Cassandra versions.  Sometimes it will work,
sometimes it won't (and it will definitely not work in this instance).

You should either upgrade your 2.1.13 nodes to 2.1.14 first, or add
the new nodes using 2.1.13, and upgrade after.

On Fri, May 27, 2016 at 8:41 AM, George Sigletos <si...@textkernel.nl> wrote:

>>>> ERROR [STREAM-IN-/192.168.1.141] 2016-05-26 09:08:05,027
>>>> StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>>>> Streaming error occurred
>>>> java.lang.RuntimeException: Outgoing stream handler has been closed
>>>>         at
>>>> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:138)
>>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>>         at
>>>> org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:568)
>>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>>         at
>>>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:457)
>>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>>         at
>>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:263)
>>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>
>>>> And this is from the source node:
>>>>
>>>> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,097
>>>> StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>>>> Streaming error occurred
>>>> java.io.IOException: Broken pipe
>>>>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>>>> ~[na:1.7.0_79]
>>>>         at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown Source)
>>>> ~[na:1.7.0_79]
>>>>         at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
>>>> ~[na:1.7.0_79]
>>>>         at
>>>> org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:84)
>>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>>         at
>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:88)
>>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>>         at
>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49)
>>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>>         at
>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
>>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>>         at
>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
>>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>>         at
>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358)
>>>> [apache-cassandra-2.1.14.jar:2.1.14]
>>>>         at
>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:330)
>>>> [apache-cassandra-2.1.14.jar:2.1.14]


>>>>>>>>>>> ERROR [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:57,704
>>>>>>>>>>> StreamSession.java:620 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>>>>> Remote peer 192.168.1.140 failed stream session.
>>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705
>>>>>>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>>>>> Streaming error occurred
>>>>>>>>>>> java.io.IOException: Connection timed out
>>>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown
>>>>>>>>>>> Source) ~[na:1.7.0_79]
>>>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>>>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:323)
>>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>>>>>>>> INFO  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,625
>>>>>>>>>>> StreamResultFuture.java:180 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>>>>> Session with /192.168.1.140 is complete
>>>>>>>>>>> WARN  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,627
>>>>>>>>>>> StreamResultFuture.java:207 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>>>>> Stream failed
>>>>>>>>>>> ERROR [RMI TCP Connection(24)-127.0.0.1] 2016-05-24 22:44:58,628
>>>>>>>>>>> StorageService.java:1075 - Error while rebuilding node
>>>>>>>>>>> org.apache.cassandra.streaming.StreamException: Stream failed
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at
>>>>>>>>>>> com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
>>>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>>>         at
>>>>>>>>>>> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>>>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>>>         at
>>>>>>>>>>> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>>>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>>>         at
>>>>>>>>>>> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>>>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>>>         at
>>>>>>>>>>> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>>>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208)
>>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184)
>>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:415)
>>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:621)
>>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:475)
>>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:256)
>>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) ~[na:1.7.0_79]
>>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:58,629
>>>>>>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>>>>> Streaming error occurred
>>>>>>>>>>> java.io.IOException: Broken pipe
>>>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown
>>>>>>>>>>> Source) ~[na:1.7.0_79]
>>>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>>>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at
>>>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]



-- 
Eric Evans
john.eric.evans@gmail.com

Re: Error while rebuilding a node: Stream failed

Posted by George Sigletos <si...@textkernel.nl>.
Still failing.

Should I maybe set a higher value for streaming_socket_timeout_in_ms? Maybe
2-3 days?
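
For reference, a small sketch of what those values would be in milliseconds.
The helper name is made up for illustration. Note that, as said elsewhere in
this thread, this setting governs the app-layer socket timeout, so it may not
affect the "Connection timed out" errors below.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # the "1 day" default mentioned in this thread


def days_to_ms(days):
    """Convert a whole number of days to milliseconds."""
    return days * MS_PER_DAY


for days in (1, 2, 3):
    print(f"streaming_socket_timeout_in_ms: {days_to_ms(days)}")
```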

Source node:
ERROR [STREAM-OUT-/54.172.235.227] 2016-05-27 14:30:34,401
StreamSession.java:505 - [Stream #45017970-234c-11e6-9452-1b05ac77baf9]
Streaming error occurred
java.lang.AssertionError: Memory was freed
        at
org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at org.apache.cassandra.io.util.Memory.getLong(Memory.java:249)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:338)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]

(Previous error on source node 6 hours before)
ERROR [STREAM-IN-/54.172.235.227] 2016-05-27 08:31:45,868
StreamSession.java:505 - [Stream #45017970-234c-11e6-9452-1b05ac77baf9]
Streaming error occurred
java.io.IOException: Connection timed out
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_79]
        at sun.nio.ch.SocketDispatcher.read(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.read(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.SocketChannelImpl.read(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.SocketAdaptor$SocketInputStream.read(Unknown Source)
~[na:1.7.0_79]
        at sun.nio.ch.ChannelInputStream.read(Unknown Source) ~[na:1.7.0_79]
        at java.nio.channels.Channels$ReadableByteChannelImpl.read(Unknown
Source) ~[na:1.7.0_79]
        at
org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:51)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:257)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]


Destination node (Amazon):
ERROR [STREAM-OUT-/192.168.3.4] 2016-05-27 12:30:37,116
StreamSession.java:505 - [Stream #45017970-234c-11e6-9452-1b05ac77baf9]
Streaming error occurred
java.io.IOException: Connection timed out
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
~[na:1.7.0_79]
        at sun.nio.ch.SocketDispatcher.write(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.SocketChannelImpl.write(Unknown Source) ~[na:1.7.0_79]
        at
org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358)
[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:330)
[apache-cassandra-2.1.14.jar:2.1.14]
        at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
INFO  [STREAM-IN-/192.168.3.4] 2016-05-27 12:30:37,593
StreamResultFuture.java:180 - [Stream
#45017970-234c-11e6-9452-1b05ac77baf9] Session with /192.168.3.4 is complete
ERROR [STREAM-OUT-/192.168.3.4] 2016-05-27 12:30:37,594
StreamSession.java:505 - [Stream #45017970-234c-11e6-9452-1b05ac77baf9]
Streaming error occurred
java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
~[na:1.7.0_79]
        at sun.nio.ch.SocketDispatcher.write(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.SocketChannelImpl.write(Unknown Source) ~[na:1.7.0_79]
        at
org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358)
[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:338)
[apache-cassandra-2.1.14.jar:2.1.14]
        at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]

On Thu, May 26, 2016 at 7:05 PM, George Sigletos <si...@textkernel.nl>
wrote:

> The time the first streaming failure occurs varies from a few hours to 1+
> day.
>
> We also experience slowness problems with the destination node on Amazon.
> Rebuild is slow. That may also contribute to the problem.
>
> Unfortunately we only kept the logs of the source node and there is no
> other error prior to the streaming failure.
>
> Only compaction, flushing and writing memtable info messages.
>
> We are running the rebuild once more using destination node's external IP.
> If it fails again I will post the errors here.
>
> On Thu, May 26, 2016 at 5:20 PM, Paulo Motta <pa...@gmail.com>
> wrote:
>
>> How long does it take after you trigger the rebuild process before it
>> fails?
>>
>> Was there any error before [STREAM-IN-/192.168.1.141] on the destination
>> node or [STREAM-OUT-/172.31.22.104] on the source node? Those are
>> showing consequences of the root error. In particular what were the last
>> messages on [STREAM-OUT-/192.168.1.141] and [STREAM-IN-/172.31.22.104] ?
>>
>> > Streaming does not seem to be resumed again from this node. Shall I
>> just kill again the entire rebuild process?
>>
>> Yes, resumable rebuild will be supported on CASSANDRA-10810.
>>
>> 2016-05-26 8:20 GMT-03:00 George Sigletos <si...@textkernel.nl>:
>>
>>> I tried again after setting streaming_socket_timeout_in_ms to 1 day on
>>> all nodes and after upgrading to 2.1.14.
>>>
>>> My tcp_keepalive_time is set to 2 hours and tcp_keepalive_probes to 9.
>>> That should be OK, I believe.
>>>
>>> I get a streaming error again, shortly after starting the rebuild process.
>>> This is from the destination node:
>>>
>>> ERROR [STREAM-IN-/192.168.1.141] 2016-05-26 09:08:05,027
>>> StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>>> Streaming error occurred
>>> java.lang.RuntimeException: Outgoing stream handler has been closed
>>>         at
>>> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:138)
>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>         at
>>> org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:568)
>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>         at
>>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:457)
>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>         at
>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:263)
>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>
>>> And this is from the source node:
>>>
>>> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,097
>>> StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>>> Streaming error occurred
>>> java.io.IOException: Broken pipe
>>>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>>> ~[na:1.7.0_79]
>>>         at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown Source)
>>> ~[na:1.7.0_79]
>>>         at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
>>> ~[na:1.7.0_79]
>>>         at
>>> org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:84)
>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>         at
>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:88)
>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>         at
>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49)
>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>         at
>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>         at
>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>>         at
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358)
>>> [apache-cassandra-2.1.14.jar:2.1.14]
>>>         at
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:330)
>>> [apache-cassandra-2.1.14.jar:2.1.14]
>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>> INFO  [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,111
>>> StreamResultFuture.java:180 - [Stream
>>> #74c57bc0-231a-11e6-a698-1b05ac77baf9] Session with /172.31.22.104 is
>>> complete
>>> WARN  [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,114
>>> StreamResultFuture.java:207 - [Stream
>>> #74c57bc0-231a-11e6-a698-1b05ac77baf9] Stream failed
>>>
>>> > Streaming does not seem to be resumed again from this node. Shall I
>>> just kill again the entire rebuild process?
>>>
>>
>>> On Thu, May 26, 2016 at 12:17 AM, Paulo Motta <pa...@gmail.com>
>>> wrote:
>>>
>>>> If increasing or disabling streaming_socket_timeout_in_ms on the source
>>>> node does not fix it, you may want to have a look at the TCP keep-alive
>>>> settings on the source and destination nodes, as intermediate
>>>> routers/firewalls may be killing the connections due to inactivity. See
>>>> this for more information:
>>>> https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html
>>>>
>>>> This will ultimately be fixed by CASSANDRA-11841, which adds keep-alive
>>>> to the streaming protocol.
>>>>
>>>> 2016-05-25 18:09 GMT-03:00 George Sigletos <si...@textkernel.nl>:
>>>>
>>>>> Thanks a lot for your help. I will try that tomorrow. The first time
>>>>> that I tried to rebuild, streaming_socket_timeout_in_ms was 0 and it
>>>>> still failed. Below is the immediately preceding error on the source node:
>>>>>
>>>>> ERROR [STREAM-IN-/172.31.22.104] 2016-05-24 22:32:20,437
>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>> Streaming error occurred
>>>>> java.io.IOException: Connection timed out
>>>>>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>>>> ~[na:1.7.0_79]
>>>>>         at sun.nio.ch.SocketDispatcher.read(Unknown Source)
>>>>> ~[na:1.7.0_79]
>>>>>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
>>>>> ~[na:1.7.0_79]
>>>>>         at sun.nio.ch.IOUtil.read(Unknown Source) ~[na:1.7.0_79]
>>>>>         at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
>>>>> ~[na:1.7.0_79]
>>>>>         at
>>>>> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:51)
>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>         at
>>>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:250)
>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>>
>>>>> On Wed, May 25, 2016 at 10:28 PM, Paulo Motta <
>>>>> pauloricardomg@gmail.com> wrote:
>>>>>
>>>>>> > Workaround is to set a larger streaming_socket_timeout_in_ms **on
>>>>>> the source node**; the new default will be 86400000ms (1 day).
>>>>>>
>>>>>> 2016-05-25 17:23 GMT-03:00 Paulo Motta <pa...@gmail.com>:
>>>>>>
>>>>>>> Was there any other ERROR preceding this on this node (in particular
>>>>>>> the last few lines of [STREAM-IN-/172.31.22.104])? If it's a
>>>>>>> SocketTimeoutException, then what is happening is that the default
>>>>>>> streaming socket timeout of 1 hour is not sufficient to stream a single
>>>>>>> file, so the stream session fails. The workaround is to set a larger
>>>>>>> streaming_socket_timeout_in_ms; the new default will be 86400000ms
>>>>>>> (1 day).
>>>>>>>
>>>>>>> We are addressing this on
>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-11839.
>>>>>>>
>>>>>>> 2016-05-25 16:42 GMT-03:00 George Sigletos <si...@textkernel.nl>:
>>>>>>>
>>>>>>>> Hello again,
>>>>>>>>
>>>>>>>> Here is the error message from the source
>>>>>>>>
>>>>>>>> INFO  [STREAM-IN-/172.31.22.104] 2016-05-25 00:44:57,275
>>>>>>>> StreamResultFuture.java:180 - [Stream
>>>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /172.31.22.104
>>>>>>>> is complete
>>>>>>>> WARN  [STREAM-IN-/172.31.22.104] 2016-05-25 00:44:57,276
>>>>>>>> StreamResultFuture.java:207 - [Stream
>>>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
>>>>>>>> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-25 00:44:57,353
>>>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>> Streaming error occurred
>>>>>>>> java.lang.AssertionError: Memory was freed
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.io.util.Memory.getLong(Memory.java:249)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>>>>>
>>>>>>>> On Wed, May 25, 2016 at 8:49 PM, Paulo Motta <
>>>>>>>> pauloricardomg@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> This is the log of the destination/rebuilding node. You need to
>>>>>>>>> check what the error message is on the stream source node (192.168.1.140).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2016-05-25 15:22 GMT-03:00 George Sigletos <sigletos@textkernel.nl
>>>>>>>>> >:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> Here is additional stack trace from system.log:
>>>>>>>>>>
>>>>>>>>>> ERROR [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:57,704
>>>>>>>>>> StreamSession.java:620 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>>>> Remote peer 192.168.1.140 failed stream session.
>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705
>>>>>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>>>> Streaming error occurred
>>>>>>>>>> java.io.IOException: Connection timed out
>>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown
>>>>>>>>>> Source) ~[na:1.7.0_79]
>>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:323)
>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>>>>>>> INFO  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,625
>>>>>>>>>> StreamResultFuture.java:180 - [Stream
>>>>>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /
>>>>>>>>>> 192.168.1.140 is complete
>>>>>>>>>> WARN  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,627
>>>>>>>>>> StreamResultFuture.java:207 - [Stream
>>>>>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
>>>>>>>>>> ERROR [RMI TCP Connection(24)-127.0.0.1] 2016-05-24 22:44:58,628
>>>>>>>>>> StorageService.java:1075 - Error while rebuilding node
>>>>>>>>>> org.apache.cassandra.streaming.StreamException: Stream failed
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at
>>>>>>>>>> com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
>>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>>         at
>>>>>>>>>> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>>         at
>>>>>>>>>> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>>         at
>>>>>>>>>> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>>         at
>>>>>>>>>> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208)
>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184)
>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:415)
>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:621)
>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:475)
>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:256)
>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) ~[na:1.7.0_79]
>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:58,629
>>>>>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>>>> Streaming error occurred
>>>>>>>>>> java.io.IOException: Broken pipe
>>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown
>>>>>>>>>> Source) ~[na:1.7.0_79]
>>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at
>>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, May 25, 2016 at 5:23 PM, Paulo Motta <
>>>>>>>>>> pauloricardomg@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> The stack trace from the rebuild command does not show the root cause
>>>>>>>>>>> of the rebuild stream error. Can you check the system.log for ERROR logs
>>>>>>>>>>> during streaming and paste them here?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Error while rebuilding a node: Stream failed

Posted by George Sigletos <si...@textkernel.nl>.
The time until the first streaming failure varies from a few hours to more
than a day.

We also experience slowness problems with the destination node on Amazon.
Rebuild is slow. That may also contribute to the problem.

Unfortunately we only kept the logs of the source node and there is no
other error prior to the streaming failure.

Only compaction, flushing and writing memtable info messages.

We are running the rebuild once more using the destination node's external IP.
If it fails again I will post the errors here.
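
For reference, the keep-alive settings quoted further down this thread (a keep-alive time of 2 hours and 9 probes) can be checked against an idle-connection timeout with a small calculation. This is only a sketch: the function names are made up for illustration, the 75-second probe interval is the usual Linux default and is assumed to be unchanged here, and the one-hour firewall idle timeout is a purely hypothetical value, not something measured on this cluster.

```python
def keepalive_timeline(time_s, intvl_s, probes):
    """Return (first_probe_s, dead_peer_detected_s) for an idle TCP connection.

    time_s   -- idle seconds before the first keep-alive probe is sent
    intvl_s  -- seconds between unanswered probes
    probes   -- number of unanswered probes before the peer is declared dead
    """
    first_probe = time_s
    dead_detected = time_s + intvl_s * probes
    return first_probe, dead_detected


def survives_idle_timeout(time_s, idle_timeout_s):
    """True if the first probe goes out before a middlebox that drops
    connections after idle_timeout_s seconds of silence would act."""
    return time_s < idle_timeout_s


# Values from this thread: keep-alive time 2 hours, 9 probes.
# Assumed: 75 s probe interval (Linux default), 3600 s firewall idle timeout.
first_probe, dead_detected = keepalive_timeline(7200, 75, 9)
print(first_probe, dead_detected)
print(survives_idle_timeout(7200, 3600))
```

Under these assumptions the first probe would only be sent after the hypothetical firewall had already dropped the idle connection, which is why a shorter keep-alive time is sometimes suggested in that situation.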

On Thu, May 26, 2016 at 5:20 PM, Paulo Motta <pa...@gmail.com>
wrote:

> How long does it take after you trigger the rebuild process before it
> fails?
>
> Was there any error before [STREAM-IN-/192.168.1.141] on the destination
> node or [STREAM-OUT-/172.31.22.104] on the source node? Those are showing
> consequences of the root error. In particular what were the last messages
> on [STREAM-OUT-/192.168.1.141] and [STREAM-IN-/172.31.22.104] ?
>
> > Streaming does not seem to be resumed again from this node. Shall I just
> > kill the entire rebuild process again?
>
> Yes, resumable rebuild will be supported on CASSANDRA-10810.
>
> 2016-05-26 8:20 GMT-03:00 George Sigletos <si...@textkernel.nl>:
>
>> I tried again with setting streaming_socket_timeout_in_ms to 1 day on all
>> nodes and after having upgraded to 2.1.14.
>>
>> My tcp_keepalive_time is set to 2 hours and tcp_keepalive_probes to 9.
>> I believe that should be OK.
>>
>> I get a streaming error again, shortly after starting the rebuild process.
>> This is from the destination node:
>>
>> ERROR [STREAM-IN-/192.168.1.141] 2016-05-26 09:08:05,027
>> StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>> Streaming error occurred
>> java.lang.RuntimeException: Outgoing stream handler has been closed
>>         at
>> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:138)
>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>         at
>> org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:568)
>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>         at
>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:457)
>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>         at
>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:263)
>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>
>> And this is from the source node:
>>
>> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,097
>> StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>> Streaming error occurred
>> java.io.IOException: Broken pipe
>>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>> ~[na:1.7.0_79]
>>         at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown Source)
>> ~[na:1.7.0_79]
>>         at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
>> ~[na:1.7.0_79]
>>         at
>> org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:84)
>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>         at
>> org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:88)
>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>         at
>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49)
>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>         at
>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>         at
>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>         at
>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358)
>> [apache-cassandra-2.1.14.jar:2.1.14]
>>         at
>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:330)
>> [apache-cassandra-2.1.14.jar:2.1.14]
>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>> INFO  [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,111
>> StreamResultFuture.java:180 - [Stream
>> #74c57bc0-231a-11e6-a698-1b05ac77baf9] Session with /172.31.22.104 is
>> complete
>> WARN  [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,114
>> StreamResultFuture.java:207 - [Stream
>> #74c57bc0-231a-11e6-a698-1b05ac77baf9] Stream failed
>>
>> > Streaming does not seem to be resumed again from this node. Shall I
>> > just kill the entire rebuild process again?
>>
>
>> On Thu, May 26, 2016 at 12:17 AM, Paulo Motta <pa...@gmail.com>
>> wrote:
>>
>>> If increasing or disabling streaming_socket_timeout_in_ms on the source
>>> node does not fix it, you may want to have a look at your TCP keep-alive
>>> settings on the source and destination nodes, as intermediate
>>> routers/firewalls may be killing the connections due to inactivity. See
>>> this for more information:
>>> https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html
>>>
>>> This will ultimately be fixed by CASSANDRA-11841 by adding keep-alive to
>>> the streaming protocol.
>>>
>>> 2016-05-25 18:09 GMT-03:00 George Sigletos <si...@textkernel.nl>:
>>>
>>>> Thanks a lot for your help. I will try that tomorrow. The first time
>>>> that I tried to rebuild, streaming_socket_timeout_in_ms was 0 and it still
>>>> failed. Below is the immediately preceding error on the source node:
>>>>
>>>> ERROR [STREAM-IN-/172.31.22.104] 2016-05-24 22:32:20,437
>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>> Streaming error occurred
>>>> java.io.IOException: Connection timed out
>>>>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>>> ~[na:1.7.0_79]
>>>>         at sun.nio.ch.SocketDispatcher.read(Unknown Source)
>>>> ~[na:1.7.0_79]
>>>>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
>>>> ~[na:1.7.0_79]
>>>>         at sun.nio.ch.IOUtil.read(Unknown Source) ~[na:1.7.0_79]
>>>>         at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
>>>> ~[na:1.7.0_79]
>>>>         at
>>>> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:51)
>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>         at
>>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:250)
>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>
>>>> On Wed, May 25, 2016 at 10:28 PM, Paulo Motta <pauloricardomg@gmail.com
>>>> > wrote:
>>>>
>>>>> > Workaround is to set a larger streaming_socket_timeout_in_ms **on
>>>>> the source node**; the new default will be 86400000ms (1 day).
>>>>>
>>>>> 2016-05-25 17:23 GMT-03:00 Paulo Motta <pa...@gmail.com>:
>>>>>
>>>>>> Was there any other ERROR preceding this on this node (in particular
>>>>>> the last few lines of [STREAM-IN-/172.31.22.104])? If it's a
>>>>>> SocketTimeoutException, then what is happening is that the default
>>>>>> streaming socket timeout of 1 hour is not sufficient to stream a single
>>>>>> file, so the stream session fails. The workaround is to set a larger
>>>>>> streaming_socket_timeout_in_ms; the new default will be 86400000ms
>>>>>> (1 day).
>>>>>>
>>>>>> We are addressing this on
>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-11839.
>>>>>>
>>>>>> 2016-05-25 16:42 GMT-03:00 George Sigletos <si...@textkernel.nl>:
>>>>>>
>>>>>>> Hello again,
>>>>>>>
>>>>>>> Here is the error message from the source
>>>>>>>
>>>>>>> INFO  [STREAM-IN-/172.31.22.104] 2016-05-25 00:44:57,275
>>>>>>> StreamResultFuture.java:180 - [Stream
>>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /172.31.22.104
>>>>>>> is complete
>>>>>>> WARN  [STREAM-IN-/172.31.22.104] 2016-05-25 00:44:57,276
>>>>>>> StreamResultFuture.java:207 - [Stream
>>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
>>>>>>> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-25 00:44:57,353
>>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>> Streaming error occurred
>>>>>>> java.lang.AssertionError: Memory was freed
>>>>>>>         at
>>>>>>> org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97)
>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>         at
>>>>>>> org.apache.cassandra.io.util.Memory.getLong(Memory.java:249)
>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>         at
>>>>>>> org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247)
>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>         at
>>>>>>> org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112)
>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>         at
>>>>>>> org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546)
>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>         at
>>>>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50)
>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>         at
>>>>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>         at
>>>>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>         at
>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>         at
>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>>>>
>>>>>>> On Wed, May 25, 2016 at 8:49 PM, Paulo Motta <
>>>>>>> pauloricardomg@gmail.com> wrote:
>>>>>>>
>>>>>>>> This is the log of the destination/rebuilding node. You need to
>>>>>>>> check what the error message is on the stream source node (192.168.1.140).
>>>>>>>>
>>>>>>>>
>>>>>>>> 2016-05-25 15:22 GMT-03:00 George Sigletos <si...@textkernel.nl>
>>>>>>>> :
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Here is additional stack trace from system.log:
>>>>>>>>>
>>>>>>>>> ERROR [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:57,704
>>>>>>>>> StreamSession.java:620 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>>> Remote peer 192.168.1.140 failed stream session.
>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705
>>>>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>>> Streaming error occurred
>>>>>>>>> java.io.IOException: Connection timed out
>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:323)
>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>>>>>> INFO  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,625
>>>>>>>>> StreamResultFuture.java:180 - [Stream
>>>>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /192.168.1.140
>>>>>>>>> is complete
>>>>>>>>> WARN  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,627
>>>>>>>>> StreamResultFuture.java:207 - [Stream
>>>>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
>>>>>>>>> ERROR [RMI TCP Connection(24)-127.0.0.1] 2016-05-24 22:44:58,628
>>>>>>>>> StorageService.java:1075 - Error while rebuilding node
>>>>>>>>> org.apache.cassandra.streaming.StreamException: Stream failed
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at
>>>>>>>>> com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>         at
>>>>>>>>> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>         at
>>>>>>>>> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>         at
>>>>>>>>> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>         at
>>>>>>>>> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208)
>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184)
>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:415)
>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:621)
>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:475)
>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:256)
>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at java.lang.Thread.run(Unknown Source) ~[na:1.7.0_79]
>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:58,629
>>>>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>>> Streaming error occurred
>>>>>>>>> java.io.IOException: Broken pipe
>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at
>>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, May 25, 2016 at 5:23 PM, Paulo Motta <
>>>>>>>>> pauloricardomg@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> The stack trace from the rebuild command does not show the root cause
>>>>>>>>>> of the rebuild stream error. Can you check the system.log for ERROR logs
>>>>>>>>>> during streaming and paste them here?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Error while rebuilding a node: Stream failed

Posted by Paulo Motta <pa...@gmail.com>.
How long does it take after you trigger the rebuild process before it fails?

Was there any error before [STREAM-IN-/192.168.1.141] on the destination
node or [STREAM-OUT-/172.31.22.104] on the source node? Those are showing
consequences of the root error. In particular what were the last messages
on [STREAM-OUT-/192.168.1.141] and [STREAM-IN-/172.31.22.104] ?
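
A rough way to pull the last messages of a given stream thread out of system.log is sketched below. It is only an illustration: the function name last_entries and the regular expression are assumptions modelled on the log lines quoted in this thread, not anything shipped with Cassandra.

```python
import re
from collections import deque

# Start of a log entry as it appears in this thread, e.g.
# "ERROR [STREAM-IN-/172.31.22.104] 2016-05-24 22:32:20,437 StreamSession.java:505 - ..."
ENTRY_RE = re.compile(r"^(TRACE|DEBUG|INFO|WARN|ERROR)\s+\[([^\]]+)\]\s+(.*)$")


def last_entries(lines, thread_name, count=5):
    """Return the last `count` log entries written by `thread_name`.

    Lines that do not start a new entry (stack-trace lines) are kept
    together with the entry they follow.
    """
    kept = deque(maxlen=count)
    current = None
    for line in lines:
        line = line.rstrip("\n")
        match = ENTRY_RE.match(line)
        if match:
            # A new entry starts: track it only if it belongs to the thread.
            current = [line] if match.group(2) == thread_name else None
            if current is not None:
                kept.append(current)
        elif current is not None:
            # Continuation of the entry currently being tracked.
            current.append(line)
    return ["\n".join(entry) for entry in kept]
```

It could be run against an open log file, for example last_entries(open(path), "STREAM-IN-/172.31.22.104"), where path points at system.log on the node in question (the location depends on the installation).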

> Streaming does not seem to be resumed again from this node. Shall I just
> kill the entire rebuild process again?

Yes, resumable rebuild will be supported on CASSANDRA-10810.

2016-05-26 8:20 GMT-03:00 George Sigletos <si...@textkernel.nl>:

> I tried again with setting streaming_socket_timeout_in_ms to 1 day on all
> nodes and after having upgraded to 2.1.14.
>
> My tcp_keepalive_time is set to 2 hours and tcp_keepalive_probes to 9.
> I believe that should be OK.
>
> I get a streaming error again, shortly after starting the rebuild process.
> This is from the destination node:
>
> ERROR [STREAM-IN-/192.168.1.141] 2016-05-26 09:08:05,027
> StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9]
> Streaming error occurred
> java.lang.RuntimeException: Outgoing stream handler has been closed
>         at
> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:138)
> ~[apache-cassandra-2.1.14.jar:2.1.14]
>         at
> org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:568)
> ~[apache-cassandra-2.1.14.jar:2.1.14]
>         at
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:457)
> ~[apache-cassandra-2.1.14.jar:2.1.14]
>         at
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:263)
> ~[apache-cassandra-2.1.14.jar:2.1.14]
>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>
> And this is from the source node:
>
> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,097
> StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9]
> Streaming error occurred
> java.io.IOException: Broken pipe
>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
> ~[na:1.7.0_79]
>         at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown Source)
> ~[na:1.7.0_79]
>         at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
> ~[na:1.7.0_79]
>         at
> org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:84)
> ~[apache-cassandra-2.1.14.jar:2.1.14]
>         at
> org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:88)
> ~[apache-cassandra-2.1.14.jar:2.1.14]
>         at
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49)
> ~[apache-cassandra-2.1.14.jar:2.1.14]
>         at
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
> ~[apache-cassandra-2.1.14.jar:2.1.14]
>         at
> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
> ~[apache-cassandra-2.1.14.jar:2.1.14]
>         at
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358)
> [apache-cassandra-2.1.14.jar:2.1.14]
>         at
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:330)
> [apache-cassandra-2.1.14.jar:2.1.14]
>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
> INFO  [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,111
> StreamResultFuture.java:180 - [Stream
> #74c57bc0-231a-11e6-a698-1b05ac77baf9] Session with /172.31.22.104 is
> complete
> WARN  [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,114
> StreamResultFuture.java:207 - [Stream
> #74c57bc0-231a-11e6-a698-1b05ac77baf9] Stream failed
>
> > Streaming does not seem to be resumed again from this node. Shall I just
> > kill the entire rebuild process again?
>

> On Thu, May 26, 2016 at 12:17 AM, Paulo Motta <pa...@gmail.com>
> wrote:
>
>> If increasing or disabling streaming_socket_timeout_in_ms on the source
>> node does not fix it, you may want to have a look at your TCP keep-alive
>> settings on the source and destination nodes, as intermediate
>> routers/firewalls may be killing the connections due to inactivity. See
>> this for more information:
>> https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html
>>
>> This will ultimately be fixed by CASSANDRA-11841 by adding keep-alive to the
>> streaming protocol.
>>
>> 2016-05-25 18:09 GMT-03:00 George Sigletos <si...@textkernel.nl>:
>>
>>> Thanks a lot for your help. I will try that tomorrow. The first time
>>> that I tried to rebuild, streaming_socket_timeout_in_ms was 0 and it still
>>> failed. Below is the immediately preceding error on the source node:
>>>
>>> ERROR [STREAM-IN-/172.31.22.104] 2016-05-24 22:32:20,437
>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>> Streaming error occurred
>>> java.io.IOException: Connection timed out
>>>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>> ~[na:1.7.0_79]
>>>         at sun.nio.ch.SocketDispatcher.read(Unknown Source)
>>> ~[na:1.7.0_79]
>>>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
>>> ~[na:1.7.0_79]
>>>         at sun.nio.ch.IOUtil.read(Unknown Source) ~[na:1.7.0_79]
>>>         at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
>>> ~[na:1.7.0_79]
>>>         at
>>> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:51)
>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>         at
>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:250)
>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>
>>> On Wed, May 25, 2016 at 10:28 PM, Paulo Motta <pa...@gmail.com>
>>> wrote:
>>>
>>>> > The workaround is to set a larger streaming_socket_timeout_in_ms **on
>>>> the source node**; the new default will be 86400000ms (1 day).
>>>>
>>>> 2016-05-25 17:23 GMT-03:00 Paulo Motta <pa...@gmail.com>:
>>>>
>>>>> Was there any other ERROR preceding this on this node (in particular
>>>>> the last few lines of [STREAM-IN-/172.31.22.104])? If it's a
>>>>> SocketTimeoutException, then what is happening is that the default
>>>>> streaming socket timeout of 1 hour is not sufficient to stream a single
>>>>> file, so the stream session fails. The workaround is to set a larger
>>>>> streaming_socket_timeout_in_ms; the new default will be 86400000ms (1
>>>>> day).
>>>>>
>>>>> We are addressing this in
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-11839.
>>>>>
>>>>> 2016-05-25 16:42 GMT-03:00 George Sigletos <si...@textkernel.nl>:
>>>>>
>>>>>> Hello again,
>>>>>>
>>>>>> Here is the error message from the source
>>>>>>
>>>>>> INFO  [STREAM-IN-/172.31.22.104] 2016-05-25 00:44:57,275
>>>>>> StreamResultFuture.java:180 - [Stream
>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /172.31.22.104
>>>>>> is complete
>>>>>> WARN  [STREAM-IN-/172.31.22.104] 2016-05-25 00:44:57,276
>>>>>> StreamResultFuture.java:207 - [Stream
>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
>>>>>> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-25 00:44:57,353
>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>> Streaming error occurred
>>>>>> java.lang.AssertionError: Memory was freed
>>>>>>         at
>>>>>> org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.io.util.Memory.getLong(Memory.java:249)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>>>
>>>>>> On Wed, May 25, 2016 at 8:49 PM, Paulo Motta <
>>>>>> pauloricardomg@gmail.com> wrote:
>>>>>>
>>>>>>> This is the log of the destination/rebuilding node; you need to
>>>>>>> check the error message on the stream source node (192.168.1.140).
>>>>>>>
>>>>>>>
>>>>>>> 2016-05-25 15:22 GMT-03:00 George Sigletos <si...@textkernel.nl>:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Here is an additional stack trace from system.log:
>>>>>>>>
>>>>>>>> ERROR [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:57,704
>>>>>>>> StreamSession.java:620 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>> Remote peer 192.168.1.140 failed stream session.
>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705
>>>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>> Streaming error occurred
>>>>>>>> java.io.IOException: Connection timed out
>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:323)
>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>>>>> INFO  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,625
>>>>>>>> StreamResultFuture.java:180 - [Stream
>>>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /192.168.1.140
>>>>>>>> is complete
>>>>>>>> WARN  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,627
>>>>>>>> StreamResultFuture.java:207 - [Stream
>>>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
>>>>>>>> ERROR [RMI TCP Connection(24)-127.0.0.1] 2016-05-24 22:44:58,628
>>>>>>>> StorageService.java:1075 - Error while rebuilding node
>>>>>>>> org.apache.cassandra.streaming.StreamException: Stream failed
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>         at
>>>>>>>> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>         at
>>>>>>>> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>         at
>>>>>>>> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>         at
>>>>>>>> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>>>>>>>> ~[guava-16.0.jar:na]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:415)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:621)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:475)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:256)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at java.lang.Thread.run(Unknown Source) ~[na:1.7.0_79]
>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:58,629
>>>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>>>> Streaming error occurred
>>>>>>>> java.io.IOException: Broken pipe
>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>>>>>>> ~[na:1.7.0_79]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at
>>>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, May 25, 2016 at 5:23 PM, Paulo Motta <
>>>>>>>> pauloricardomg@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> The stack trace from the rebuild command does not show the root
>>>>>>>>> cause of the rebuild stream error. Can you check the system.log for
>>>>>>>>> ERROR logs during streaming and paste them here?
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Error while rebuilding a node: Stream failed

Posted by George Sigletos <si...@textkernel.nl>.
I tried again after setting streaming_socket_timeout_in_ms to 1 day on all
nodes and upgrading to 2.1.14.

My tcp_keepalive_time is set to 2 hours and tcp_keepalive_probes to 9.
That should be OK, I believe.
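
Whether those values are adequate depends on how quickly any intermediate
firewall drops idle connections, which is not known here. A minimal Python
sketch of the arithmetic follows. It assumes the Linux default probe interval
of 75 seconds (tcp_keepalive_intvl was not mentioned above) and a purely
hypothetical firewall idle timeout of 1 hour; the function names are
illustrative only.

```python
def dead_peer_detection_s(keepalive_time_s, keepalive_intvl_s, keepalive_probes):
    """Seconds of silence before the kernel gives up on an unresponsive peer.

    The first probe is sent after keepalive_time_s of idleness; each further
    unanswered probe adds keepalive_intvl_s.
    """
    return keepalive_time_s + keepalive_intvl_s * keepalive_probes


def first_probe_beats_firewall(keepalive_time_s, firewall_idle_timeout_s):
    """True if the first keep-alive probe is sent before the firewall's idle timeout."""
    return keepalive_time_s < firewall_idle_timeout_s


if __name__ == "__main__":
    keepalive_time_s = 2 * 60 * 60   # 2 hours, as stated in this message
    keepalive_probes = 9             # as stated in this message
    keepalive_intvl_s = 75           # ASSUMPTION: Linux default, not stated in the thread
    firewall_idle_s = 60 * 60        # ASSUMPTION: hypothetical 1-hour idle timeout

    print("dead peer detected after (s):",
          dead_peer_detection_s(keepalive_time_s, keepalive_intvl_s, keepalive_probes))
    print("first probe sent before firewall timeout:",
          first_probe_beats_firewall(keepalive_time_s, firewall_idle_s))
```

With a 2-hour keepalive time the first probe goes out only after 7200 seconds
of idleness, so any device that drops idle connections sooner than that would
cut the connection before a single probe is sent.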

I get a streaming error again, shortly after starting the rebuild process.
This is from the destination node:

ERROR [STREAM-IN-/192.168.1.141] 2016-05-26 09:08:05,027
StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9]
Streaming error occurred
java.lang.RuntimeException: Outgoing stream handler has been closed
        at
org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:138)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:568)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:457)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:263)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]

And this is from the source node:

ERROR [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,097
StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9]
Streaming error occurred
java.io.IOException: Broken pipe
        at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
~[na:1.7.0_79]
        at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown Source)
~[na:1.7.0_79]
        at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
~[na:1.7.0_79]
        at
org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:84)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:88)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358)
[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:330)
[apache-cassandra-2.1.14.jar:2.1.14]
        at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
INFO  [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,111
StreamResultFuture.java:180 - [Stream
#74c57bc0-231a-11e6-a698-1b05ac77baf9] Session with /172.31.22.104 is
complete
WARN  [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,114
StreamResultFuture.java:207 - [Stream
#74c57bc0-231a-11e6-a698-1b05ac77baf9] Stream failed


Streaming does not seem to resume from this node. Shall I just
kill the entire rebuild process again?
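
The two excerpts in this message carry the same stream session ID, which makes
it possible to line up the logs of both nodes. A minimal Python sketch of that
idea follows. It assumes each log entry sits on a single line, as in an
unwrapped system.log (the excerpts here are wrapped by the mail client); the
regex and the function name are illustrative and not part of any Cassandra
tooling.

```python
import re
from collections import defaultdict

# Matches the "[Stream #<uuid>]" tag seen in the streaming log lines.
STREAM_ID = re.compile(r"\[Stream #([0-9a-f-]{36})\]")


def group_stream_problems(lines):
    """Group ERROR and WARN lines by the stream session ID they mention."""
    sessions = defaultdict(list)
    for line in lines:
        match = STREAM_ID.search(line)
        if match and line.startswith(("ERROR", "WARN")):
            sessions[match.group(1)].append(line.rstrip("\n"))
    return dict(sessions)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python group_streams.py /var/log/cassandra/system.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for session_id, entries in group_stream_problems(log).items():
            print(session_id)
            for entry in entries:
                print("   ", entry)
```

Running it against the system.log of each node and comparing the entries for
the same session ID shows which side reported a problem first.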


Re: Error while rebuilding a node: Stream failed

Posted by Paulo Motta <pa...@gmail.com>.
If increasing or disabling streaming_socket_timeout_in_ms on the source
node does not fix it, you may want to have a look at the TCP keep-alive
settings on the source and destination nodes, as intermediate
routers/firewalls may be killing the connections due to inactivity. See
this for more information:
https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html

This will ultimately be fixed by CASSANDRA-11841, which will add keep-alive
to the streaming protocol.
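
On Linux, the current kernel keep-alive settings can be read from
/proc/sys/net/ipv4. A minimal Python sketch for checking them on both nodes
follows. It assumes a Linux host; the helper name and its optional directory
argument are illustrative and not part of any Cassandra tooling.

```python
from pathlib import Path

# Kernel parameters that control TCP keep-alive on Linux.
KEEPALIVE_KEYS = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")


def read_keepalive_settings(proc_dir="/proc/sys/net/ipv4"):
    """Return the keep-alive parameters found under proc_dir as integers.

    Parameters whose file does not exist (for example on a non-Linux host)
    are left out of the result.
    """
    settings = {}
    for key in KEEPALIVE_KEYS:
        path = Path(proc_dir) / key
        if path.is_file():
            settings[key] = int(path.read_text().strip())
    return settings


if __name__ == "__main__":
    for key, value in read_keepalive_settings().items():
        print(f"net.ipv4.{key} = {value}")
```

tcp_keepalive_time and tcp_keepalive_intvl are in seconds, and
tcp_keepalive_probes is a count of unanswered probes.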

2016-05-25 18:09 GMT-03:00 George Sigletos <si...@textkernel.nl>:

> Thanks a lot for your help. I will try that tomorrow. The first time that
> I tried to rebuild, streaming_socket_timeout_in_ms was 0 and it still failed.
> Below is the error that immediately preceded it on the source node:
>
> ERROR [STREAM-IN-/172.31.22.104] 2016-05-24 22:32:20,437
> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
> Streaming error occurred
> java.io.IOException: Connection timed out
>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> ~[na:1.7.0_79]
>         at sun.nio.ch.SocketDispatcher.read(Unknown Source) ~[na:1.7.0_79]
>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
> ~[na:1.7.0_79]
>         at sun.nio.ch.IOUtil.read(Unknown Source) ~[na:1.7.0_79]
>         at sun.nio.ch.SocketChannelImpl.read(Unknown Source) ~[na:1.7.0_79]
>         at
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:51)
> ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:250)
> ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>
> On Wed, May 25, 2016 at 10:28 PM, Paulo Motta <pa...@gmail.com>
> wrote:
>
>> > The workaround is to set a larger streaming_socket_timeout_in_ms **on
>> the source node**; the new default will be 86400000ms (1 day).
>>
>> 2016-05-25 17:23 GMT-03:00 Paulo Motta <pa...@gmail.com>:
>>
>>> Was there any other ERROR preceding this on this node (in particular the
>>> last few lines of [STREAM-IN-/172.31.22.104])? If it's a
>>> SocketTimeoutException, then what is happening is that the default
>>> streaming socket timeout of 1 hour is not sufficient to stream a single
>>> file, so the stream session fails. The workaround is to set a larger
>>> streaming_socket_timeout_in_ms; the new default will be 86400000ms (1
>>> day).
>>>
>>> We are addressing this in
>>> https://issues.apache.org/jira/browse/CASSANDRA-11839.
>>>
>>> 2016-05-25 16:42 GMT-03:00 George Sigletos <si...@textkernel.nl>:
>>>
>>>> Hello again,
>>>>
>>>> Here is the error message from the source
>>>>
>>>> INFO  [STREAM-IN-/172.31.22.104] 2016-05-25 00:44:57,275
>>>> StreamResultFuture.java:180 - [Stream
>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /172.31.22.104 is
>>>> complete
>>>> WARN  [STREAM-IN-/172.31.22.104] 2016-05-25 00:44:57,276
>>>> StreamResultFuture.java:207 - [Stream
>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
>>>> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-25 00:44:57,353
>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>> Streaming error occurred
>>>> java.lang.AssertionError: Memory was freed
>>>>         at
>>>> org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97)
>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>         at org.apache.cassandra.io.util.Memory.getLong(Memory.java:249)
>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>         at
>>>> org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247)
>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>         at
>>>> org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112)
>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>         at
>>>> org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546)
>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>         at
>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50)
>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>         at
>>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>         at
>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>         at
>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>         at
>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>
>>>> On Wed, May 25, 2016 at 8:49 PM, Paulo Motta <pa...@gmail.com>
>>>> wrote:
>>>>
>>>>> This is the log of the destination/rebuilding node; you need to check
>>>>> the error message on the stream source node (192.168.1.140).
>>>>>
>>>>>
>>>>> 2016-05-25 15:22 GMT-03:00 George Sigletos <si...@textkernel.nl>:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Here is an additional stack trace from system.log:
>>>>>>
>>>>>> ERROR [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:57,704
>>>>>> StreamSession.java:620 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>> Remote peer 192.168.1.140 failed stream session.
>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705
>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>> Streaming error occurred
>>>>>> java.io.IOException: Connection timed out
>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>>>> ~[na:1.7.0_79]
>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>>>>> ~[na:1.7.0_79]
>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>>>>>> ~[na:1.7.0_79]
>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>>>>> ~[na:1.7.0_79]
>>>>>>         at
>>>>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:323)
>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>>> INFO  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,625
>>>>>> StreamResultFuture.java:180 - [Stream
>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /192.168.1.140
>>>>>> is complete
>>>>>> WARN  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,627
>>>>>> StreamResultFuture.java:207 - [Stream
>>>>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
>>>>>> ERROR [RMI TCP Connection(24)-127.0.0.1] 2016-05-24 22:44:58,628
>>>>>> StorageService.java:1075 - Error while rebuilding node
>>>>>> org.apache.cassandra.streaming.StreamException: Stream failed
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
>>>>>> ~[guava-16.0.jar:na]
>>>>>>         at
>>>>>> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>>>>>> ~[guava-16.0.jar:na]
>>>>>>         at
>>>>>> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>>>>>> ~[guava-16.0.jar:na]
>>>>>>         at
>>>>>> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>>>>>> ~[guava-16.0.jar:na]
>>>>>>         at
>>>>>> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>>>>>> ~[guava-16.0.jar:na]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:415)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:621)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:475)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:256)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at java.lang.Thread.run(Unknown Source) ~[na:1.7.0_79]
>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:58,629
>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>>>>> Streaming error occurred
>>>>>> java.io.IOException: Broken pipe
>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>>>> ~[na:1.7.0_79]
>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>>>>> ~[na:1.7.0_79]
>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>>>>>> ~[na:1.7.0_79]
>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>>>>> ~[na:1.7.0_79]
>>>>>>         at
>>>>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at
>>>>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>>>>
>>>>>>
>>>>>> On Wed, May 25, 2016 at 5:23 PM, Paulo Motta <
>>>>>> pauloricardomg@gmail.com> wrote:
>>>>>>
>>>>>>> The stack trace from the rebuild command not show the root cause of
>>>>>>> the rebuild stream error. Can you check the system.log for ERROR logs
>>>>>>> during streaming and paste here?
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Error while rebuilding a node: Stream failed

Posted by George Sigletos <si...@textkernel.nl>.
Thanks a lot for your help. I will try that tomorrow. The first time I
tried to rebuild, streaming_socket_timeout_in_ms was 0 and it still failed.
Below is the error immediately preceding that one on the source node:

ERROR [STREAM-IN-/172.31.22.104] 2016-05-24 22:32:20,437 StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Streaming error occurred
java.io.IOException: Connection timed out
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_79]
        at sun.nio.ch.SocketDispatcher.read(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.read(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.SocketChannelImpl.read(Unknown Source) ~[na:1.7.0_79]
        at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:51) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:250) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
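
Finding the ERROR entries that precede a stream failure in system.log can be scripted. The following is only a rough Python sketch: the function name stream_errors and the regular expression are hypothetical, and it assumes each log entry starts on one line in the "LEVEL [thread] timestamp ..." layout seen in the excerpts in this thread.

```python
import re

# Start of a log entry, e.g.
# "ERROR [STREAM-IN-/172.31.22.104] 2016-05-24 22:32:20,437 StreamSession.java:505 - ..."
# (layout assumed from the excerpts in this thread)
ENTRY_START = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(?P<rest>.*)$"
)


def stream_errors(lines, stream_id=None):
    """Return (timestamp, thread, text) for ERROR entries logged by STREAM-* threads.

    Continuation lines (exception message, stack frames) are appended to the
    entry they follow. If stream_id is given, only entries mentioning it are kept.
    """
    entries = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = ENTRY_START.match(line)
        if match:
            # A new entry begins: store the previous one if we were collecting it.
            if current is not None:
                entries.append(current)
            wanted = (match.group("level") == "ERROR"
                      and match.group("thread").startswith("STREAM-"))
            current = ([match.group("ts"), match.group("thread"), match.group("rest")]
                       if wanted else None)
        elif current is not None:
            current[2] += "\n" + line
    if current is not None:
        entries.append(current)
    return [tuple(e) for e in entries if stream_id is None or stream_id in e[2]]
```

It could be called as stream_errors(open("system.log"), "2c290460-20d4-11e6-930f-1b05ac77baf9"), where the log path is a placeholder for wherever the node writes its system.log.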

On Wed, May 25, 2016 at 10:28 PM, Paulo Motta <pa...@gmail.com>
wrote:

> > The workaround is to set a larger streaming_socket_timeout_in_ms **on
> the source node**; the new default will be 86400000 ms (1 day).

Re: Error while rebuilding a node: Stream failed

Posted by Paulo Motta <pa...@gmail.com>.
> The workaround is to set a larger streaming_socket_timeout_in_ms **on the
> source node**; the new default will be 86400000 ms (1 day).
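
Since the setting has to be raised on the source node, it can help to confirm what each node's cassandra.yaml actually contains. The following is a small Python sketch under stated assumptions: the helper name streaming_timeout_ms is hypothetical, and it only does a line-based match on the setting name rather than parsing the YAML properly.

```python
import re

# Matches an active (uncommented) line such as
# "streaming_socket_timeout_in_ms: 86400000", optionally followed by a comment.
SETTING = re.compile(
    r"^\s*streaming_socket_timeout_in_ms\s*:\s*(\d+)\s*(?:#.*)?$"
)


def streaming_timeout_ms(yaml_text):
    """Return the configured timeout in ms, or None if the setting is absent
    or commented out in the given cassandra.yaml text."""
    for line in yaml_text.splitlines():
        match = SETTING.match(line)
        if match:
            return int(match.group(1))
    return None
```

Running it against the text of the configuration file on every node, and on the stream source in particular, shows whether the larger value is actually in place there.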


Re: Error while rebuilding a node: Stream failed

Posted by Paulo Motta <pa...@gmail.com>.
Was there any other ERROR preceding this one on this node (in particular in
the last few lines logged by [STREAM-IN-/172.31.22.104])? If it is a
SocketTimeoutException, then the default streaming socket timeout of 1 hour
is not sufficient to stream a single file, and the stream session fails. The
workaround is to set a larger streaming_socket_timeout_in_ms; the new default
will be 86400000 ms (1 day).

We are addressing this in
https://issues.apache.org/jira/browse/CASSANDRA-11839.
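
To make the arithmetic concrete, here is a back-of-the-envelope Python sketch. It follows the explanation above and treats the timeout as a time budget for streaming one file; the function names, the file size, and the throughput figure are all hypothetical values chosen for illustration, not measurements from this cluster.

```python
def seconds_to_stream(file_size_bytes, throughput_megabits_per_s):
    """Time needed to send one file at a steady throughput (megabits per second)."""
    bytes_per_second = throughput_megabits_per_s * 1_000_000 / 8
    return file_size_bytes / bytes_per_second


def fits_in_timeout(file_size_bytes, throughput_megabits_per_s, timeout_ms):
    """True if the file can be streamed within streaming_socket_timeout_in_ms."""
    needed_ms = seconds_to_stream(file_size_bytes, throughput_megabits_per_s) * 1000
    return needed_ms <= timeout_ms


# Hypothetical example: a 100 GB SSTable streamed at 200 Mbit/s.
size = 100 * 10**9
print(seconds_to_stream(size, 200))          # seconds needed for this one file
print(fits_in_timeout(size, 200, 3600000))   # against a 1 hour timeout
print(fits_in_timeout(size, 200, 86400000))  # against a 1 day timeout
```

With these assumed numbers the single file needs longer than one hour but far less than one day, which is the situation the larger timeout is meant to cover.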

2016-05-25 16:42 GMT-03:00 George Sigletos <si...@textkernel.nl>:

> Hello again,
>
> Here is the error message from the source
>
> INFO  [STREAM-IN-/172.31.22.104] 2016-05-25 00:44:57,275
> StreamResultFuture.java:180 - [Stream
> #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /172.31.22.104 is
> complete
> WARN  [STREAM-IN-/172.31.22.104] 2016-05-25 00:44:57,276
> StreamResultFuture.java:207 - [Stream
> #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-25 00:44:57,353
> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
> Streaming error occurred
> java.lang.AssertionError: Memory was freed
>         at
> org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97)
> ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at org.apache.cassandra.io.util.Memory.getLong(Memory.java:249)
> ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at
> org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247)
> ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at
> org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112)
> ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at
> org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546)
> ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50)
> ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
> ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at
> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
> ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
> ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
> ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>
> On Wed, May 25, 2016 at 8:49 PM, Paulo Motta <pa...@gmail.com>
> wrote:
>
>> This is the log of the destination/rebuilding node, you need to check
>> what is the error message on the stream source node (192.168.1.140).
>>
>>
>> 2016-05-25 15:22 GMT-03:00 George Sigletos <si...@textkernel.nl>:
>>
>>> Hello,
>>>
>>> Here is additional stack trace from system.log:
>>>
>>> ERROR [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:57,704
>>> StreamSession.java:620 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>> Remote peer 192.168.1.140 failed stream session.
>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705
>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>> Streaming error occurred
>>> java.io.IOException: Connection timed out
>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>> ~[na:1.7.0_79]
>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>> ~[na:1.7.0_79]
>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>>> ~[na:1.7.0_79]
>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>> ~[na:1.7.0_79]
>>>         at
>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>         at
>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>         at
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>         at
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:323)
>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>> INFO  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,625
>>> StreamResultFuture.java:180 - [Stream
>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /192.168.1.140 is
>>> complete
>>> WARN  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,627
>>> StreamResultFuture.java:207 - [Stream
>>> #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
>>> ERROR [RMI TCP Connection(24)-127.0.0.1] 2016-05-24 22:44:58,628
>>> StorageService.java:1075 - Error while rebuilding node
>>> org.apache.cassandra.streaming.StreamException: Stream failed
>>>         at
>>> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>         at
>>> com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
>>> ~[guava-16.0.jar:na]
>>>         at
>>> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>>> ~[guava-16.0.jar:na]
>>>         at
>>> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>>> ~[guava-16.0.jar:na]
>>>         at
>>> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>>> ~[guava-16.0.jar:na]
>>>         at
>>> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>>> ~[guava-16.0.jar:na]
>>>         at
>>> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208)
>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>         at
>>> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184)
>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>         at
>>> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:415)
>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>         at
>>> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:621)
>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>         at
>>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:475)
>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>         at
>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:256)
>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>         at java.lang.Thread.run(Unknown Source) ~[na:1.7.0_79]
>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:58,629
>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>> Streaming error occurred
>>> java.io.IOException: Broken pipe
>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>> ~[na:1.7.0_79]
>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>> ~[na:1.7.0_79]
>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>>> ~[na:1.7.0_79]
>>>         at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>> ~[na:1.7.0_79]
>>>         at
>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>         at
>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>>         at
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>         at
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]

Re: Error while rebuilding a node: Stream failed

Posted by George Sigletos <si...@textkernel.nl>.
Hello again,

Here is the error message from the source node:

INFO  [STREAM-IN-/172.31.22.104] 2016-05-25 00:44:57,275 StreamResultFuture.java:180 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /172.31.22.104 is complete
WARN  [STREAM-IN-/172.31.22.104] 2016-05-25 00:44:57,276 StreamResultFuture.java:207 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
ERROR [STREAM-OUT-/172.31.22.104] 2016-05-25 00:44:57,353 StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Streaming error occurred
java.lang.AssertionError: Memory was freed
        at org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.io.util.Memory.getLong(Memory.java:249) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]

On Wed, May 25, 2016 at 8:49 PM, Paulo Motta <pa...@gmail.com>
wrote:

> That is the log of the destination/rebuilding node. You need to check
> the error message on the stream source node (192.168.1.140).
>

Re: Error while rebuilding a node: Stream failed

Posted by Paulo Motta <pa...@gmail.com>.
That is the log of the destination/rebuilding node. You need to check
the error message on the stream source node (192.168.1.140).
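
One way to find the matching entries on the source node is to search its
system.log for the same stream session ID, since that ID appears in the
logs of both nodes in this thread while the timestamps are about two hours
apart. A minimal Python sketch, assuming each log entry sits on a single
line in the file (not wrapped as in these emails); the helper names
stream_ids and lines_for_session are made up for illustration:

```python
import re

# Streaming log lines carry a tag of the form "[Stream #<session UUID>]".
STREAM_ID = re.compile(r"\[Stream #([0-9a-fA-F-]{36})\]")


def stream_ids(lines):
    """Collect the stream session IDs mentioned in an iterable of log lines."""
    return {m.group(1) for line in lines for m in STREAM_ID.finditer(line)}


def lines_for_session(lines, session_id):
    """Return the log lines that mention the given stream session ID."""
    tag = "[Stream #%s]" % session_id
    return [line.rstrip("\n") for line in lines if tag in line]
```

Intersecting stream_ids() over the destination log and the source log gives
the sessions both nodes took part in; lines_for_session() then pulls out the
source-side entries for one of them. The stack trace lines that follow an
ERROR entry do not carry the tag, so they still have to be read from the
file around the matching line.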

Re: Error while rebuilding a node: Stream failed

Posted by George Sigletos <si...@textkernel.nl>.
Hello,

Here is the additional stack trace from system.log:

ERROR [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:57,704 StreamSession.java:620 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Remote peer 192.168.1.140 failed stream session.
ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705 StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Streaming error occurred
java.io.IOException: Connection timed out
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_79]
        at sun.nio.ch.SocketDispatcher.write(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.SocketChannelImpl.write(Unknown Source) ~[na:1.7.0_79]
        at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) [apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:323) [apache-cassandra-2.1.13.jar:2.1.13]
        at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
INFO  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,625 StreamResultFuture.java:180 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /192.168.1.140 is complete
WARN  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,627 StreamResultFuture.java:207 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
ERROR [RMI TCP Connection(24)-127.0.0.1] 2016-05-24 22:44:58,628 StorageService.java:1075 - Error while rebuilding node
org.apache.cassandra.streaming.StreamException: Stream failed
        at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na]
        at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) ~[guava-16.0.jar:na]
        at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) ~[guava-16.0.jar:na]
        at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) ~[guava-16.0.jar:na]
        at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) ~[guava-16.0.jar:na]
        at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:415) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:621) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:475) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:256) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at java.lang.Thread.run(Unknown Source) ~[na:1.7.0_79]
ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:58,629 StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Streaming error occurred
java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_79]
        at sun.nio.ch.SocketDispatcher.write(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.SocketChannelImpl.write(Unknown Source) ~[na:1.7.0_79]
        at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) [apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331) [apache-cassandra-2.1.13.jar:2.1.13]
        at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]


On Wed, May 25, 2016 at 5:23 PM, Paulo Motta <pa...@gmail.com>
wrote:

> The stack trace from the rebuild command does not show the root cause
> of the rebuild stream error. Can you check the system.log for ERROR logs
> during streaming and paste them here?
>

Re: Error while rebuilding a node: Stream failed

Posted by Paulo Motta <pa...@gmail.com>.
The stack trace from the rebuild command does not show the root cause of
the rebuild stream error. Can you check the system.log for ERROR logs
during streaming and paste them here?
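
A minimal Python sketch of one way to pull those entries out of system.log,
assuming the line layout seen in this thread ("LEVEL [thread] timestamp
File.java:NN - message", one entry per line in the file). The function name
stream_errors and the returned dictionary keys are made up for
illustration:

```python
import re

# Head of a log entry, e.g.
# "ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705 StreamSession.java:505 - ..."
LOG_HEAD = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<src>\S+)\s+-\s+(?P<msg>.*)$"
)


def stream_errors(lines):
    """Return ERROR entries from streaming threads or about stream sessions.

    Each entry also records the first exception line that follows it, if any.
    """
    entries = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        head = LOG_HEAD.match(line)
        if head:
            current = None  # a new entry starts; stop attaching exception lines
            is_stream = (head.group("thread").startswith("STREAM-")
                         or "[Stream #" in head.group("msg"))
            if head.group("level") == "ERROR" and is_stream:
                current = {
                    "timestamp": head.group("ts"),
                    "thread": head.group("thread"),
                    "message": head.group("msg"),
                    "exception": None,
                }
                entries.append(current)
        elif current is not None and current["exception"] is None:
            stripped = line.strip()
            # Skip blank lines and stack frames; keep the exception summary line.
            if stripped and not stripped.startswith("at "):
                current["exception"] = stripped
    return entries
```

It could be run as stream_errors(open(path)) with path pointing at the
node's system.log; the location depends on the installation, so it is left
to the caller.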

Re: Error while rebuilding a node: Stream failed

Posted by George Sigletos <si...@textkernel.nl>.
Hi Mike,

Yes, I am using NetworkTopologyStrategy. I checked
cassandra-rackdc.properties on the new node:
dc=DCamazon-1
rack=RACamazon-1

I also checked the JIRA link you sent me. My network topology seems
correct: I have 4 nodes in DC1 and 1 node in DCamazon-1, and I can verify
that by running "nodetool status".
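
The same check can be sketched in Python: read the dc name from
cassandra-rackdc.properties and confirm that the keyspace's
NetworkTopologyStrategy options name that data center with a non-zero
replication factor. The helper names are made up for illustration, and the
replication map has to be supplied by the caller (for example, copied from
the keyspace definition); the values in the usage note below are
hypothetical and are not my actual keyspace settings.

```python
def parse_rackdc(text):
    """Parse key=value lines from cassandra-rackdc.properties, skipping comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def dc_in_replication(props, replication):
    """True if the node's dc has a positive replication factor in the keyspace.

    Data center names are compared exactly, including case.
    """
    dc = props.get("dc")
    return dc is not None and int(replication.get(dc, 0)) > 0
```

For example, with props = parse_rackdc("dc=DCamazon-1\nrack=RACamazon-1"),
dc_in_replication(props, {"class": "NetworkTopologyStrategy", "DC1": "3"})
is False, because that map does not name DCamazon-1 at all.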

I am now running a full repair on the Amazon node. I have given up on
rebuilding.

Kind regards,
George



On Wed, May 25, 2016 at 8:50 AM, Mike Yeap <wk...@gmail.com> wrote:

> Hi George, are you using NetworkTopologyStrategy as the replication
> strategy for your keyspace? If yes, can you check the
> cassandra-rackdc.properties of this new node?
>
> https://issues.apache.org/jira/browse/CASSANDRA-8279
>
>
> Regards,
> Mike Yeap

Re: Error while rebuilding a node: Stream failed

Posted by Mike Yeap <wk...@gmail.com>.
Hi George, are you using NetworkTopologyStrategy as the replication
strategy for your keyspace? If yes, can you check the
cassandra-rackdc.properties of this new node?

https://issues.apache.org/jira/browse/CASSANDRA-8279


Regards,
Mike Yeap