Posted to dev@bookkeeper.apache.org by eolivelli <gi...@git.apache.org> on 2017/06/21 14:20:16 UTC

[GitHub] bookkeeper issue #198: TestBackwardCompat.testCompat410 often fails due to i...

GitHub user eolivelli opened an issue:

    https://github.com/apache/bookkeeper/issues/198

    TestBackwardCompat.testCompat410 often fails due to io.netty.util.IllegalReferenceCountException

    The test fails very often on Jenkins.
    It is a bug related to Netty 4 buffer usage.
    
    > Error
    > Shouldn't be able to write
    > Stacktrace
    > java.lang.AssertionError: Shouldn't be able to write
    > 	at org.apache.bookkeeper.test.TestBackwardCompat.testCompat410(TestBackwardCompat.java:647)
    > 
    
    ```
    2017-06-21 12:52:51,207 - WARN  - [bookkeeper-io-0:Slf4JLogger@151] - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
    io.netty.util.IllegalReferenceCountException: refCnt: 0, decrement: 1
    	at io.netty.buffer.AbstractReferenceCountedByteBuf.release0(AbstractReferenceCountedByteBuf.java:101)
    	at io.netty.buffer.AbstractReferenceCountedByteBuf.release(AbstractReferenceCountedByteBuf.java:89)
    	at io.netty.util.ReferenceCountUtil.release(ReferenceCountUtil.java:84)
    	at io.netty.channel.DefaultChannelPipeline.onUnhandledInboundMessage(DefaultChannelPipeline.java:1169)
    	at io.netty.channel.DefaultChannelPipeline$TailContext.channelRead(DefaultChannelPipeline.java:1221)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
    	at org.apache.bookkeeper.proto.PerChannelBookieClient.channelRead(PerChannelBookieClient.java:1175)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
    	at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
    	at org.apache.bookkeeper.proto.AuthHandler$ClientSideHandler.channelRead(AuthHandler.java:272)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
    	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
    	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293)
    	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
    	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
    	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:1017)
    	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:394)
    	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:299)
    	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    	at java.lang.Thread.run(Thread.java:748)
    ```
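
The message `refCnt: 0, decrement: 1` at the top of the trace is what a reference-counted buffer reports when it is released after its count has already reached zero, and the trace shows the release coming from the tail of the pipeline. The following is a minimal stand-alone model of that situation, not Netty code: `CountedBuffer`, `earlierStage` and `tail` are hypothetical names, and the assumption is that some earlier stage released the same buffer it then passed on.

```java
/** Minimal stand-in for a reference-counted buffer; this is NOT Netty's ByteBuf. */
final class RefCountSketch {

    static final class CountedBuffer {
        private int refCnt = 1; // a new buffer starts with one reference

        int refCnt() {
            return refCnt;
        }

        CountedBuffer retain() {
            if (refCnt == 0) {
                throw new IllegalStateException("refCnt: 0, increment: 1");
            }
            refCnt++;
            return this;
        }

        boolean release() {
            if (refCnt == 0) {
                // Same wording as the exception message in the trace above.
                throw new IllegalStateException("refCnt: 0, decrement: 1");
            }
            return --refCnt == 0;
        }
    }

    /**
     * Hypothetical earlier stage: it hands its input on as the output and
     * then releases the input, as a stage that "consumed" the message would.
     */
    static CountedBuffer earlierStage(CountedBuffer in, boolean retainOutput) {
        try {
            return retainOutput ? in.retain() : in;
        } finally {
            in.release();
        }
    }

    /** Hypothetical pipeline tail: releases any message nobody handled. */
    static void tail(CountedBuffer msg) {
        msg.release();
    }

    public static void main(String[] args) {
        try {
            tail(earlierStage(new CountedBuffer(), false));
        } catch (IllegalStateException e) {
            System.out.println("unbalanced release: " + e.getMessage());
        }

        CountedBuffer balanced = earlierStage(new CountedBuffer(), true);
        tail(balanced);
        System.out.println("refCnt on the retained path: " + balanced.refCnt());
    }
}
```

In this model the second release fails only when the output was passed on without an extra `retain()`. Whether that is exactly what happens in the BookKeeper pipeline is an assumption of the sketch.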


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] bookkeeper issue #198: TestBackwardCompat.testCompat410 often fails due to i...

Posted by sijie <gi...@git.apache.org>.
Github user sijie commented on the issue:

    https://github.com/apache/bookkeeper/issues/198
  
    Marked this as a blocker for 4.5.0, since it seems to indicate some kind of reference leak in the current code base.



[GitHub] bookkeeper issue #198: TestBackwardCompat.testCompat410 often fails due to i...

Posted by jiazhai <gi...@git.apache.org>.
Github user jiazhai closed the issue at:

    https://github.com/apache/bookkeeper/issues/198



[GitHub] bookkeeper issue #198: TestBackwardCompat.testCompat410 often fails due to i...

Posted by eolivelli <gi...@git.apache.org>.
Github user eolivelli commented on the issue:

    https://github.com/apache/bookkeeper/issues/198
  
    @kishorekasi my error report was against an old version of master.
    I think that @merlimat's suggestion is good. Can you give it a try?



[GitHub] bookkeeper issue #198: TestBackwardCompat.testCompat410 often fails due to i...

Posted by kishorekasi <gi...@git.apache.org>.
Github user kishorekasi commented on the issue:

    https://github.com/apache/bookkeeper/issues/198
  
    Enrico,
    
    Let me look into this.
    
    Kishore
    
    On Wed, Jun 21, 2017 at 7:21 AM, Enrico Olivelli <no...@github.com>
    wrote:
    
    > @merlimat <https://github.com/merlimat> @kishorekasi
    > <https://github.com/kishorekasi>
    > Do you have time to check? At the moment you are the most experienced Netty 4
    > people in the group




[GitHub] bookkeeper issue #198: TestBackwardCompat.testCompat410 often fails due to i...

Posted by kishorekasi <gi...@git.apache.org>.
Github user kishorekasi commented on the issue:

    https://github.com/apache/bookkeeper/issues/198
  
    Enrico,
    
    I found a few refcount leaks, fixed them, and created a pull request. I
    couldn't reproduce the test failure locally, so could you please check and
    let me know whether this fixes the issue?
    
    Kishore
    




[GitHub] bookkeeper issue #198: TestBackwardCompat.testCompat410 often fails due to i...

Posted by eolivelli <gi...@git.apache.org>.
Github user eolivelli commented on the issue:

    https://github.com/apache/bookkeeper/issues/198
  
    @merlimat @kishorekasi
    Do you have time to check? At the moment you are the most experienced Netty 4 people in the group.



[GitHub] bookkeeper issue #198: TestBackwardCompat.testCompat410 often fails due to i...

Posted by sijie <gi...@git.apache.org>.
Github user sijie commented on the issue:

    https://github.com/apache/bookkeeper/issues/198
  
    @eolivelli @kishorekasi 
    
    @merlimat and I were looking at this issue together. @merlimat found a problem in how we handle an invalid op code in the v2 protocol.
    
    In TestBackwardCompat#testCompat400, the test verifies that the current client can't talk to a 4.0.0 server.
    
    The 4.5.0 client sends a protobuf-encoded request to the 4.0.0 server. The 4.0.0 server tries to interpret the request, realizes it is a bad request, and sends back a response encoded in the v2 protocol. Because the request is bad, that response carries an unknown op code.
    
    In the current v2 ResponseEnDecoder (listed below), when the decoder doesn't know the op code, it returns the buffer to the channel. This might cause the channel pipeline to misbehave when it decrements the reference count.
    
    ```
            @Override
            public Object decode(ByteBuf buffer)
                    throws Exception {
                int rc;
                long ledgerId, entryId;
    
                int packetHeader = buffer.readInt();
                byte version = PacketHeader.getVersion(packetHeader);
                byte opCode = PacketHeader.getOpCode(packetHeader);
    
                switch (opCode) {
                case BookieProtocol.ADDENTRY:
                    rc = buffer.readInt();
                    ledgerId = buffer.readLong();
                    entryId = buffer.readLong();
                    return new BookieProtocol.AddResponse(version, rc, ledgerId, entryId);
                case BookieProtocol.READENTRY:
                    rc = buffer.readInt();
                    ledgerId = buffer.readLong();
                    entryId = buffer.readLong();
    
                    if (rc == BookieProtocol.EOK) {
                        ByteBuf content = buffer.slice();
                        return new BookieProtocol.ReadResponse(version, rc, ledgerId, entryId, content.retain());
                    } else {
                        return new BookieProtocol.ReadResponse(version, rc, ledgerId, entryId);
                    }
                case BookieProtocol.AUTH:
                    ByteBufInputStream bufStream = new ByteBufInputStream(buffer);
                    BookkeeperProtocol.AuthMessage.Builder builder
                        = BookkeeperProtocol.AuthMessage.newBuilder();
                    builder.mergeFrom(bufStream, extensionRegistry);
                    BookkeeperProtocol.AuthMessage am = builder.build();
                    return new BookieProtocol.AuthResponse(version, am);
                default:
                    return buffer;
                }
            }
    ```
    
    One fix suggested by @merlimat is to throw an exception in the EnDecoder when it receives an unknown op code, so that Netty can close the connection, error out the pending requests, and clean up the resources.
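
A rough sketch of the suggested behaviour, written against `java.nio.ByteBuffer` so that it runs without Netty or BookKeeper on the classpath. It is not the actual patch: the op code value, the header layout (version in the top byte, op code in the next one), the choice of `IllegalStateException`, and the `AddResponse` class are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;

/** Stand-alone model of a v2 response decoder that rejects unknown op codes. */
final class UnknownOpCodeSketch {

    // Assumed op code value, for illustration only.
    static final byte ADDENTRY = 1;

    /** Hypothetical decoded response; not BookieProtocol.AddResponse. */
    static final class AddResponse {
        final byte version;
        final int rc;
        final long ledgerId;
        final long entryId;

        AddResponse(byte version, int rc, long ledgerId, long entryId) {
            this.version = version;
            this.rc = rc;
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }
    }

    /** Packs a header using the assumed layout: version, then op code. */
    static int header(byte version, byte opCode) {
        return ((version & 0xFF) << 24) | ((opCode & 0xFF) << 16);
    }

    static Object decode(ByteBuffer buffer) {
        int packetHeader = buffer.getInt();
        byte version = (byte) (packetHeader >>> 24);
        byte opCode = (byte) ((packetHeader >>> 16) & 0xFF);

        switch (opCode) {
        case ADDENTRY:
            int rc = buffer.getInt();
            long ledgerId = buffer.getLong();
            long entryId = buffer.getLong();
            return new AddResponse(version, rc, ledgerId, entryId);
        default:
            // Fail loudly instead of handing the raw buffer back to the caller.
            throw new IllegalStateException("Received unknown response: op code = " + opCode);
        }
    }

    public static void main(String[] args) {
        ByteBuffer known = ByteBuffer.allocate(24);
        known.putInt(header((byte) 2, ADDENTRY)).putInt(0).putLong(7L).putLong(42L);
        known.flip();
        AddResponse response = (AddResponse) decode(known);
        System.out.println("decoded entry " + response.entryId + " of ledger " + response.ledgerId);

        ByteBuffer unknown = ByteBuffer.allocate(4);
        unknown.putInt(header((byte) 2, (byte) 99));
        unknown.flip();
        try {
            decode(unknown);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In the real decoder the exception would propagate through the Netty pipeline, where, per the suggestion above, the connection can be closed and the pending requests failed. That part is not modelled here.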

