Posted to dev@kafka.apache.org by "Xavier Léauté (JIRA)" <ji...@apache.org> on 2017/04/17 22:12:41 UTC

[jira] [Commented] (KAFKA-5079) ProducerBounceTest fails occasionally with a SocketTimeoutException

    [ https://issues.apache.org/jira/browse/KAFKA-5079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971735#comment-15971735 ] 

Xavier Léauté commented on KAFKA-5079:
--------------------------------------

A possible workaround for the failing PR builds would be to run each test build in its own container, which would avoid the port conflicts.
Are there any plans to support Docker builds, or something equivalent, within the Apache Jenkins build infrastructure?

> ProducerBounceTest fails occasionally with a SocketTimeoutException
> -------------------------------------------------------------------
>
>                 Key: KAFKA-5079
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5079
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Apurva Mehta
>
> {noformat}
> java.net.SocketTimeoutException
> 	at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
> 	at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> 	at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
> 	at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:85)
> 	at kafka.network.BlockingChannel.readCompletely(BlockingChannel.scala:129)
> 	at kafka.network.BlockingChannel.receive(BlockingChannel.scala:120)
> 	at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:100)
> 	at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:84)
> 	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:133)
> 	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:133)
> 	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:133)
> 	at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:32)
> 	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:132)
> 	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:132)
> 	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:132)
> 	at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:32)
> 	at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:131)
> 	at kafka.api.ProducerBounceTest$$anonfun$2.apply(ProducerBounceTest.scala:116)
> 	at kafka.api.ProducerBounceTest$$anonfun$2.apply(ProducerBounceTest.scala:113)
> {noformat}
> This is expected occasionally, since the ports are preallocated and the brokers are bounced in quick succession. Here is the relevant comment from the code: 
> {noformat}
>   // This is one of the few tests we currently allow to preallocate ports, despite the fact that this can result in transient
>   // failures due to ports getting reused. We can't use random ports because of bad behavior that can result from bouncing
>   // brokers too quickly when they get new, random ports. If we're not careful, the client can end up in a situation
>   // where metadata is not refreshed quickly enough, and by the time it's actually trying to, all the servers have
>   // been bounced and have new addresses. None of the bootstrap nodes or current metadata can get them connected to a
>   // running server.
>   //
>   // Since such quick rotation of servers is incredibly unrealistic, we allow this one test to preallocate ports, leaving
>   // a small risk of hitting errors due to port conflicts. Hopefully this is infrequent enough to not cause problems.
> {noformat}
> We should try to look into handling this exception better so that the test doesn't fail occasionally. 
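
One way the test could tolerate an occasional timeout is to retry the call a bounded number of times before giving up. The following is only a rough sketch in Java, assuming that the fetch is safe to repeat; the class name, `retryOnSocketTimeout`, and its parameters are hypothetical and not part of Kafka or the existing test.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Hypothetical helper for illustration only; not part of Kafka.
final class RetryOnTimeout {

    // Runs the action, retrying only when it throws SocketTimeoutException.
    // Any other exception propagates immediately. After maxAttempts
    // timeouts, the last SocketTimeoutException is rethrown.
    static <T> T retryOnSocketTimeout(int maxAttempts, long backoffMs, Callable<T> action)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SocketTimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (SocketTimeoutException e) {
                last = e;
                // Pause before the next attempt so a bounced broker
                // has a chance to come back up.
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMs);
                }
            }
        }
        throw last;
    }
}
```

The fetch that currently throws in the test could be wrapped in such a helper. The attempt count and backoff would need tuning against how quickly the brokers are bounced, and a persistent timeout should still fail the test.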


