Posted to commits@cassandra.apache.org by "Tyler Hobbs (JIRA)" <ji...@apache.org> on 2015/05/12 22:49:00 UTC

[jira] [Commented] (CASSANDRA-9341) IndexOutOfBoundsException on server when unlogged batch write times out

    [ https://issues.apache.org/jira/browse/CASSANDRA-9341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540681#comment-14540681 ] 

Tyler Hobbs commented on CASSANDRA-9341:
----------------------------------------

With the way that message decoding is implemented, it's difficult to distinguish between internal errors (i.e. C* bugs) and malformed messages.  When we don't know what the problem is, we default to error code 0 (internal server error).

Now, we _could_ assume that Cassandra's message decoding is bug-free, and always return a ProtocolError to the client if there is a problem decoding a message.  However, we would need to add a lot more fine-grained error handling to the decoding logic to make the error messages useful at all.  Given that very few people are writing drivers, I'm not sure that's worth the effort right now.
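
To give a rough idea of what "fine-grained error handling" could look like for a single read, here is a minimal sketch. It is not Cassandra's CBUtil code: it uses java.nio.ByteBuffer instead of Netty buffers, and the names CheckedValueReader and MalformedMessageException are hypothetical. The only protocol detail it relies on is that a [bytes] value is a 4-byte signed length followed by that many bytes, with a negative length meaning null.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; class and method names are illustrative, not Cassandra's API.
final class CheckedValueReader {

    /** Stand-in for an error the server would report to the client as a protocol error. */
    static final class MalformedMessageException extends RuntimeException {
        MalformedMessageException(String message) {
            super(message);
        }
    }

    /**
     * Reads one [bytes] value: a 4-byte signed length n followed by n bytes.
     * A negative length denotes a null value.
     */
    static ByteBuffer readValue(ByteBuffer body) {
        if (body.remaining() < 4) {
            throw new MalformedMessageException(
                "Truncated value: expected a 4-byte length but only "
                    + body.remaining() + " byte(s) remain");
        }
        int length = body.getInt();
        if (length < 0) {
            return null;
        }
        // Validate the declared length before slicing, so that bad input is
        // reported as a malformed message instead of an IndexOutOfBoundsException.
        if (length > body.remaining()) {
            throw new MalformedMessageException(
                "Value length " + length + " exceeds the "
                    + body.remaining() + " byte(s) remaining in the frame body");
        }
        ByteBuffer value = body.slice();
        value.limit(length);
        body.position(body.position() + length);
        return value;
    }
}
```

Every read in the decoder would need a similar check to make the resulting messages useful, which is where the effort comes from.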

bq.  I'm not sure if the database should throw an IndexOutOfBoundsException back to the client (is this a security issue?)

I don't believe there are any security concerns here.  However, just for the sake of message cleanliness, we could potentially return a generic message like "there was an error decoding the message, check your server logs" and simply log the exception details whenever we catch an unexpected exception.
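
As a sketch of that idea (not a patch): the class and method names below are hypothetical, and java.util.logging stands in for whatever logger the server actually uses. The two error codes are the native protocol's "Server error" (0x0000) and "Protocol error" (0x000A). Exceptions the decoder raises deliberately keep their message; anything unexpected is logged in full and replaced by a generic message.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; names are illustrative and not taken from Cassandra.
final class ClientErrorSketch {
    private static final Logger LOGGER = Logger.getLogger(ClientErrorSketch.class.getName());

    static final int SERVER_ERROR = 0x0000;   // native protocol "Server error"
    static final int PROTOCOL_ERROR = 0x000A; // native protocol "Protocol error"

    static final String GENERIC_MESSAGE =
        "There was an error decoding the message; check your server logs";

    /** Stand-in for an exception the decoder throws on purpose for bad input. */
    static final class ProtocolViolation extends RuntimeException {
        ProtocolViolation(String message) {
            super(message);
        }
    }

    /** The code and message that would be written into the ERROR frame. */
    static final class ErrorResponse {
        final int code;
        final String message;

        ErrorResponse(int code, String message) {
            this.code = code;
            this.message = message;
        }
    }

    static ErrorResponse toResponse(Throwable t) {
        if (t instanceof ProtocolViolation) {
            // Raised deliberately, so the message is safe and useful to the client.
            return new ErrorResponse(PROTOCOL_ERROR, t.getMessage());
        }
        // Unexpected: keep the details in the server log only.
        LOGGER.log(Level.SEVERE, "Unexpected exception during request", t);
        return new ErrorResponse(SERVER_ERROR, GENERIC_MESSAGE);
    }
}
```

With this mapping, the IndexOutOfBoundsException from the report would still produce error code 0, but the client would see only the generic text.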

> IndexOutOfBoundsException on server when unlogged batch write times out
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-9341
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9341
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Ubuntu 14.04 LTS 64bit
> Cassandra 2.1.5
>            Reporter: Nimi Wariboko Jr.
>            Assignee: Tyler Hobbs
>            Priority: Minor
>             Fix For: 2.1.x
>
>
> In our application (golang) we were debugging an issue that caused our entire app to lock up (I think this is community-driver related, and has little to do with the server).
> What caused this issue is that we were rapidly sending large batches, and (pretty rarely) one of these write requests would time out. I think what may have happened is that we end up writing incomplete data to the server.
> When this happens we get this response frame from the server (this is with native protocol version 2):
> {code}
>  flags=0x0 
> stream=9 
> op=ERROR 
> length=107
> Error Code: 0
> Message: java.lang.IndexOutOfBoundsException: index: 1408818, length: 1375797264 (expected: range(0, 1506453))
> {code}
> And in the Cassandra logs on that node:
> {code}
> ERROR [SharedPool-Worker-28] 2015-05-10 22:32:15,242 Message.java:538 - Unexpected exception during request; channel = [id: 0x68d4acfb, /10.129.196.41:33549 => /10.129.196.24:9042]
> java.lang.IndexOutOfBoundsException: index: 1408818, length: 1375797264 (expected: range(0, 1506453))
> 	at io.netty.buffer.AbstractByteBuf.checkIndex(AbstractByteBuf.java:1143) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.buffer.SlicedByteBuf.slice(SlicedByteBuf.java:155) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.buffer.AbstractByteBuf.readSlice(AbstractByteBuf.java:669) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at org.apache.cassandra.transport.CBUtil.readValue(CBUtil.java:336) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at org.apache.cassandra.transport.CBUtil.readValueList(CBUtil.java:386) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at org.apache.cassandra.transport.messages.BatchMessage$1.decode(BatchMessage.java:64) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at org.apache.cassandra.transport.messages.BatchMessage$1.decode(BatchMessage.java:45) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:247) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:235) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:722) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76]
> ERROR [SharedPool-Worker-28] 2015-05-10 22:32:15,248 Message.java:538 - Unexpected exception during request; channel = [id: 0x68d4acfb, /10.129.196.41:33549 => /10.129.196.24:9042]
> io.netty.handler.codec.DecoderException: org.apache.cassandra.transport.ProtocolException: Invalid or unsupported protocol version: 110
> 	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:280) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:149) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:722) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76]
> Caused by: org.apache.cassandra.transport.ProtocolException: Invalid or unsupported protocol version: 110
> 	at org.apache.cassandra.transport.Frame$Decoder.decode(Frame.java:184) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:249) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	... 10 common frames omitted
> ERROR [SharedPool-Worker-22] 2015-05-10 22:32:15,260 Message.java:538 - Unexpected exception during request; channel = [id: 0x68d4acfb, /10.129.196.41:33549 => /10.129.196.24:9042]
> io.netty.handler.codec.DecoderException: org.apache.cassandra.transport.ProtocolException: Invalid or unsupported protocol version: 110
> 	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:280) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:149) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:722) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76]
> Caused by: org.apache.cassandra.transport.ProtocolException: Invalid or unsupported protocol version: 110
> 	at org.apache.cassandra.transport.Frame$Decoder.decode(Frame.java:184) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:249) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	... 10 common frames omitted
> ERROR [SharedPool-Worker-19] 2015-05-10 22:32:15,260 Message.java:538 - Unexpected exception during request; channel = [id: 0x68d4acfb, /10.129.196.41:33549 => /10.129.196.24:9042]
> io.netty.handler.codec.DecoderException: org.apache.cassandra.transport.ProtocolException: Invalid or unsupported protocol version: 110
> 	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:280) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:149) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:722) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76]
> Caused by: org.apache.cassandra.transport.ProtocolException: Invalid or unsupported protocol version: 110
> 	at org.apache.cassandra.transport.Frame$Decoder.decode(Frame.java:184) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:249) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	... 10 common frames omitted
> ... repeated a couple more times ... 
> {code}
> I'm ultimately unfamiliar with what should happen here, but I'm not sure if the database should throw an IndexOutOfBoundsException back to the client (is this a security issue?). In any case, I wanted to bring up this issue in case this exception is something that shouldn't happen in normal operation.
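
For readers puzzled by the "protocol version: 110" errors that follow the first exception, a small sketch of how a native protocol v2 frame header is laid out may help. A v2 header is 8 bytes: version, flags, stream, opcode, and a 4-byte body length, with the top bit of the version byte marking a response. One plausible reading, which the issue does not confirm, is that after the failed batch decode the connection was no longer aligned on frame boundaries, so payload bytes were interpreted as headers; a first byte of 0x6E would then be read as version 110. The class FrameHeaderSketch is hypothetical and is not Cassandra's Frame decoder.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the 8-byte native protocol v2 frame header.
final class FrameHeaderSketch {
    static final int HEADER_LENGTH = 8;

    final boolean response; // top bit of the first byte
    final int version;      // low 7 bits of the first byte
    final int flags;
    final int stream;       // a signed byte in protocol v1/v2
    final int opcode;
    final int bodyLength;

    private FrameHeaderSketch(boolean response, int version, int flags,
                              int stream, int opcode, int bodyLength) {
        this.response = response;
        this.version = version;
        this.flags = flags;
        this.stream = stream;
        this.opcode = opcode;
        this.bodyLength = bodyLength;
    }

    static FrameHeaderSketch parse(ByteBuffer in) {
        if (in.remaining() < HEADER_LENGTH) {
            throw new IllegalArgumentException(
                "Need " + HEADER_LENGTH + " header bytes, got " + in.remaining());
        }
        int first = in.get() & 0xFF;
        boolean response = (first & 0x80) != 0;
        int version = first & 0x7F;
        int flags = in.get() & 0xFF;
        int stream = in.get();
        int opcode = in.get() & 0xFF;
        int bodyLength = in.getInt(); // big-endian, ByteBuffer's default order
        return new FrameHeaderSketch(response, version, flags, stream, opcode, bodyLength);
    }
}
```

Feeding it the bytes 0x82 0x00 0x09 0x00 0x00 0x00 0x00 0x6B yields a version 2 response on stream 9 with opcode 0 (ERROR) and a 107-byte body, matching the frame quoted in the report, while any header starting with 0x6E parses as version 110.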


