You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@kudu.apache.org by "Andrew Wong (JIRA)" <ji...@apache.org> on 2018/01/05 23:53:00 UTC

[jira] [Comment Edited] (KUDU-2236) org.apache.kudu.client.TestKuduClient flaky

    [ https://issues.apache.org/jira/browse/KUDU-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314158#comment-16314158 ] 

Andrew Wong edited comment on KUDU-2236 at 1/5/18 11:52 PM:
------------------------------------------------------------

So after looking a bit more, I think it was a regression _and maybe_ an incomplete bug fix in the first place. Taking out the regression and looping the test still spews some WARN-level exception messages stemming from ConnectToMasterErrCB.call() that seems to at least have been since a pretty old refactoring of ConnectToCluster (see 3102e3833bcf32a290b63e8d6cae917b62194ac6).

Looping the test after taking out the WARN-level exception logging _and_ removing the regression from 986e8de63d8687421c476de07c4e889129062637, this passes 100/100.

Looping leaving in the WARN-level logging led to ~50/100 failures. Looping leaving in the regression only and commenting out the WARN-level logging led to ~20/100 failures. A fix for the regression seems straightforward (conditional logging, maybe conditionally logging DEBUG level instead of INFO and updating so only INFO-level logs are parsed, in case the logging is still valuable for debugging). A fix for the WARN-level logging, though, I have to think about it a bit more (suggestions welcome :) ).


was (Author: andrew.wong):
After experimenting a bit more, it seems the culprit is exclusively the WARN-level exception messages stemming from ConnectToMasterErrCB.call(), which 
So after looking a bit more, I think it was a regression _and maybe_ an incomplete bug fix in the first place. Taking out the regression and looping the test still spews some WARN-level exception messages stemming from ConnectToMasterErrCB.call() that seems to at least have been since a pretty old refactoring of ConnectToCluster (see 3102e3833bcf32a290b63e8d6cae917b62194ac6).

Looping the test after taking out the WARN-level exception logging _and_ removing the regression from 986e8de63d8687421c476de07c4e889129062637, this passes 100/100. Looping leaving in the WARN-level logging led to ~50/100 failures. Looping leaving in the regression only and commenting out the WARN-level logging led to ~20/100 failures. A fix for the regression seems straightforward (conditional logging, maybe conditionally logging DEBUG level instead of INFO and updating so only INFO-level logs are parsed, in case the logging is still valuable for debugging). A fix for the WARN-level logging, though, I have to think about it a bit more (suggestions welcome :) ).

> org.apache.kudu.client.TestKuduClient flaky
> -------------------------------------------
>
>                 Key: KUDU-2236
>                 URL: https://issues.apache.org/jira/browse/KUDU-2236
>             Project: Kudu
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 1.6.0
>            Reporter: Edward Fancher
>            Assignee: Andrew Wong
>
> Last seen in org.apache.kudu.client.TestKuduClient.testCloseShortlyAfterOpen
> DEBUG - Could not login via JAAS. Using no credentials: Unable to obtain Principal Name for authentication 
> DEBUG - SASL mechanism PLAIN chosen for peer 127.63.177.1
> DEBUG - SASL mechanism PLAIN chosen for peer 127.63.177.1
> DEBUG - SASL mechanism PLAIN chosen for peer 127.63.177.1
> DEBUG - Learned about tablet Kudu Master for table 'Kudu Master' with partition [<start>, <end>)
> DEBUG - Releasing all remaining resources
> DEBUG - [peer master-127.63.177.1:64030] cleaning up while in state READY due to: connection disconnected
> INFO - W1206 07:14:39.727399 16334 connection.cc:511] server connection from 127.63.177.1:43497 recv error: Network error: recv error: Connection reset by peer (error 104)
> DEBUG - [peer master-127.63.177.1:64034] cleaning up while in state NEGOTIATING due to: connection disconnected
> WARN - Error receiving response from 127.63.177.1:64030
> org.apache.kudu.client.RecoverableException: connection disconnected
>  at org.apache.kudu.client.Connection.channelDisconnected(Connection.java:244)
>  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
>  at org.apache.kudu.client.Connection.handleUpstream(Connection.java:236)
>  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>  at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.channelDisconnected(SimpleChannelUpstreamHandler.java:208)
>  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
>  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>  at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>  at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:60)
>  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>  at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>  at org.jboss.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:493)
>  at org.jboss.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
>  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
>  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
>  at org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
>  at org.jboss.netty.channel.Channels$4.run(Channels.java:386)
>  at org.jboss.netty.channel.socket.ChannelRunnableWrapper.run(ChannelRunnableWrapper.java:40)
>  at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:391)
>  at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:315)
>  at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
>  at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>  at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>  at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> INFO - W1206 07:14:39.741600 17527 negotiation.cc:311] Failed RPC negotiation. Trace:
> INFO - 1206 07:14:39.695892 (+     0us) reactor.cc:499] Submitting negotiation task for server connection from 127.63.177.1:48551
> INFO - 1206 07:14:39.722215 (+ 26323us) server_negotiation.cc:173] Beginning negotiation
> INFO - 1206 07:14:39.722236 (+    21us) server_negotiation.cc:361] Waiting for connection header
> DEBUG - [peer master-127.63.177.1:64032] cleaning up while in state READY due to: connection disconnected
> WARN - Error receiving response from 127.63.177.1:64034
> org.apache.kudu.client.RecoverableException: connection disconnected
>  at org.apache.kudu.client.Connection.channelDisconnected(Connection.java:244)
>  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
>  at org.apache.kudu.client.Connection.handleUpstream(Connection.java:236)
>  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>  at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.channelDisconnected(SimpleChannelUpstreamHandler.java:208)
>  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
>  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>  at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.channelDisconnected(SimpleChannelUpstreamHandler.java:208)
>  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
>  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>  at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>  at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:60)
>  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>  at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>  at org.jboss.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:493)
>  at org.jboss.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
>  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
>  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
>  at org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
>  at org.jboss.netty.channel.Channels$4.run(Channels.java:386)
>  at org.jboss.netty.channel.socket.ChannelRunnableWrapper.run(ChannelRunnableWrapper.java:40)
>  at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:391)
>  at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:315)
>  at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
>  at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>  at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>  at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> INFO - 1206 07:14:39.722274 (+    38us) server_negotiation.cc:369] Connection header received
> INFO - 1206 07:14:39.722401 (+   127us) server_negotiation.cc:325] Received NEGOTIATE NegotiatePB request
> INFO - 1206 07:14:39.722424 (+    23us) server_negotiation.cc:410] Received NEGOTIATE request from client
> INFO - 1206 07:14:39.722534 (+   110us) server_negotiation.cc:337] Sending NEGOTIATE NegotiatePB response
> INFO - 1206 07:14:39.722662 (+   128us) server_negotiation.cc:194] Negotiated authn=SASL
> INFO - 1206 07:14:39.724372 (+  1710us) server_negotiation.cc:325] Received TLS_HANDSHAKE NegotiatePB request
> INFO - 1206 07:14:39.724946 (+   574us) server_negotiation.cc:337] Sending TLS_HANDSHAKE NegotiatePB response
> INFO - 1206 07:14:39.730617 (+  5671us) server_negotiation.cc:325] Received TLS_HANDSHAKE NegotiatePB request
> INFO - 1206 07:14:39.736652 (+  6035us) server_negotiation.cc:337] Sending TLS_HANDSHAKE NegotiatePB response
> INFO - 1206 07:14:39.737144 (+   492us) negotiation.cc:302] Negotiation complete: Network error: Server connection negotiation failed: server connection from 127.63.177.1:48551: BlockingWrite error: write error: Broken pipe (error 32)
> INFO - Metrics: {"server-negotiator.queue_time_us":26096,"thread_start_us":25983,"threads_started":1}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)