Posted to dev@kafka.apache.org by "Ismael Juma (JIRA)" <ji...@apache.org> on 2019/08/17 17:23:00 UTC

[jira] [Resolved] (KAFKA-7553) Jenkins PR tests hung

     [ https://issues.apache.org/jira/browse/KAFKA-7553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismael Juma resolved KAFKA-7553.
--------------------------------
    Resolution: Fixed

I'm marking this as fixed because the report is pretty old. I have seen a few builds time out more recently, but that is a new issue, not this one.

> Jenkins PR tests hung
> ---------------------
>
>                 Key: KAFKA-7553
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7553
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: John Roesler
>            Priority: Minor
>         Attachments: consoleText-2.txt, consoleText.txt
>
>
> I wouldn't worry about this unless it continues to happen, but I wanted to document it.
> This was a Java 11 build: [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/266/]
> It was for this PR: [https://github.com/apache/kafka/pull/5795]
> And this commit: [https://github.com/apache/kafka/pull/5795/commits/5bdcd0e023c6f406d585155399f6541bb6a9f9c2]
>  
> It looks like the tests just hung after 46 minutes, until the build timed out at 180 minutes.
> End of the output:
> {noformat}
> ...
> 00:46:27.275 kafka.server.ServerGenerateBrokerIdTest > testConsistentBrokerIdFromUserConfigAndMetaProps STARTED
> 00:46:29.775 
> 00:46:29.775 kafka.server.ServerGenerateBrokerIdTest > testConsistentBrokerIdFromUserConfigAndMetaProps PASSED
> 03:00:51.124 Build timed out (after 180 minutes). Marking the build as aborted.
> 03:00:51.440 Build was aborted
> 03:00:51.492 [FINDBUGS] Skipping publisher since build result is ABORTED
> 03:00:51.492 Recording test results
> 03:00:51.495 Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1
> 03:00:58.017 Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1
> 03:00:59.330 Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1
> 03:00:59.331 Adding one-line test results to commit status...
> 03:00:59.332 Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1
> 03:00:59.334 Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1
> 03:00:59.335 Setting status of 5bdcd0e023c6f406d585155399f6541bb6a9f9c2 to FAILURE with url https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/266/ and message: 'FAILURE
> 03:00:59.335  9053 tests run, 1 skipped, 0 failed.'
> 03:00:59.335 Using context: JDK 11 and Scala 2.12
> 03:00:59.541 Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1
> 03:00:59.542 Finished: ABORTED{noformat}
>  
> I did find one test that started but did not finish:
> {noformat}
> 00:23:29.576 kafka.api.PlaintextConsumerTest > testLowMaxFetchSizeForRequestAndPartition STARTED
> {noformat}
> But note that the tests continued to run for another 23 minutes after this one started.
>  
> Just for completeness, there were 4 failures:
> {noformat}
> 00:22:06.875 kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsNotExistingGroup FAILED
> 00:22:06.875     java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The coordinator is not available.
> 00:22:06.875         at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
> 00:22:06.875         at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
> 00:22:06.875         at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
> 00:22:06.876         at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:262)
> 00:22:06.876         at kafka.admin.ConsumerGroupCommand$ConsumerGroupService.resetOffsets(ConsumerGroupCommand.scala:307)
> 00:22:06.876         at kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup(ResetConsumerGroupOffsetTest.scala:89)
> 00:22:06.876 
> 00:22:06.876         Caused by:
> 00:22:06.876         org.apache.kafka.common.errors.CoordinatorNotAvailableException: The coordinator is not available.{noformat}
>  
> {noformat}
> 00:25:22.175 kafka.api.CustomQuotaCallbackTest > testCustomQuotaCallback FAILED
> 00:25:22.175     java.lang.AssertionError: Partition [group1_largeTopic,69] metadata not propagated after 15000 ms
> 00:25:22.176         at kafka.utils.TestUtils$.fail(TestUtils.scala:351)
> 00:25:22.176         at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:741)
> 00:25:22.176         at kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:831)
> 00:25:22.176         at kafka.utils.TestUtils$$anonfun$createTopic$2.apply(TestUtils.scala:330)
> 00:25:22.176         at kafka.utils.TestUtils$$anonfun$createTopic$2.apply(TestUtils.scala:329)
> 00:25:22.176         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> 00:25:22.176         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> 00:25:22.176         at scala.collection.Iterator$class.foreach(Iterator.scala:891)
> 00:25:22.176         at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
> 00:25:22.176         at scala.collection.MapLike$DefaultKeySet.foreach(MapLike.scala:174)
> 00:25:22.176         at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> 00:25:22.176         at scala.collection.AbstractSet.scala$collection$SetLike$$super$map(Set.scala:47)
> 00:25:22.176         at scala.collection.SetLike$class.map(SetLike.scala:92)
> 00:25:22.176         at scala.collection.AbstractSet.map(Set.scala:47)
> 00:25:22.176         at kafka.utils.TestUtils$.createTopic(TestUtils.scala:329)
> 00:25:22.176         at kafka.utils.TestUtils$.createTopic(TestUtils.scala:312)
> 00:25:22.176         at kafka.api.CustomQuotaCallbackTest.createTopic(CustomQuotaCallbackTest.scala:180)
> 00:25:22.176         at kafka.api.CustomQuotaCallbackTest.testCustomQuotaCallback(CustomQuotaCallbackTest.scala:135){noformat}
>  
> {noformat}
> 00:22:28.075 kafka.api.SaslSslAdminClientIntegrationTest > testAclDescribe FAILED
> 00:22:28.075     org.junit.runners.model.TestTimedOutException: test timed out after 120000 milliseconds
> 00:22:28.075         at java.base@11/java.io.FileDescriptor.sync(Native Method)
> 00:22:28.075         at app//jdbm.recman.TransactionManager.sync(TransactionManager.java:385)
> 00:22:28.075         at app//jdbm.recman.TransactionManager.close(TransactionManager.java:405)
> 00:22:28.075         at app//jdbm.recman.TransactionManager.synchronizeLogFromMemory(TransactionManager.java:173)
> 00:22:28.075         at app//jdbm.recman.TransactionManager.shutdown(TransactionManager.java:395)
> 00:22:28.075         at app//jdbm.recman.RecordFile.close(RecordFile.java:365)
> 00:22:28.075         at app//jdbm.recman.BaseRecordManager.close(BaseRecordManager.java:167)
> 00:22:28.075         at app//jdbm.recman.CacheRecordManager.close(CacheRecordManager.java:297)
> 00:22:28.075         at app//org.apache.directory.server.core.partition.impl.btree.jdbm.JdbmIndex.close(JdbmIndex.java:580)
> 00:22:28.075         at app//org.apache.directory.server.core.partition.impl.btree.AbstractBTreePartition.doDestroy(AbstractBTreePartition.java:524)
> 00:22:28.075         at app//org.apache.directory.server.core.partition.impl.btree.jdbm.JdbmPartition.doDestroy(JdbmPartition.java:744)
> 00:22:28.075         at app//org.apache.directory.server.core.api.partition.AbstractPartition.destroy(AbstractPartition.java:153)
> 00:22:28.075         at app//org.apache.directory.server.core.shared.partition.DefaultPartitionNexus.removeContextPartition(DefaultPartitionNexus.java:886)
> 00:22:28.075         at app//org.apache.directory.server.core.shared.partition.DefaultPartitionNexus.doDestroy(DefaultPartitionNexus.java:287)
> 00:22:28.075         at app//org.apache.directory.server.core.api.partition.AbstractPartition.destroy(AbstractPartition.java:153)
> 00:22:28.075         at app//org.apache.directory.server.core.DefaultDirectoryService.shutdown(DefaultDirectoryService.java:1313)
> 00:22:28.075         at app//kafka.security.minikdc.MiniKdc.stop(MiniKdc.scala:278)
> 00:22:28.075         at app//kafka.api.SaslSetup$class.closeSasl(SaslSetup.scala:118)
> 00:22:28.075         at app//kafka.api.SaslSslAdminClientIntegrationTest.closeSasl(SaslSslAdminClientIntegrationTest.scala:34)
> 00:22:28.075         at app//kafka.api.SaslSslAdminClientIntegrationTest.tearDown(SaslSslAdminClientIntegrationTest.scala:91){noformat}
>  
> {noformat}
> 00:22:30.775 kafka.tools.MirrorMakerIntegrationTest > testCommaSeparatedRegex FAILED
> 00:22:30.775     kafka.tools.MirrorMaker$NoRecordsException
> 00:22:30.776         at kafka.tools.MirrorMaker$ConsumerWrapper.receive(MirrorMaker.scala:483)
> 00:22:30.776         at kafka.tools.MirrorMakerIntegrationTest$$anonfun$testCommaSeparatedRegex$1.apply$mcZ$sp(MirrorMakerIntegrationTest.scala:92)
> 00:22:30.776         at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:738)
> 00:22:30.776         at kafka.tools.MirrorMakerIntegrationTest.testCommaSeparatedRegex(MirrorMakerIntegrationTest.scala:90){noformat}
>  
>  
> It's not clear whether there's something wrong with the tests that made the job fail (especially the one that started but didn't end), or if there's something wrong with the build machine that made the tests fail.
> I've attached the console output for completeness.
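
The manual check described above (finding a test that logged STARTED but never logged a result) could be automated against the attached console output. What follows is only a minimal sketch, not something taken from the Kafka build tooling: the function name `unfinished_tests` is hypothetical, and the line layout it assumes ("<timestamp> <class> > <method> <STATUS>", with the timestamp optional) is inferred from the excerpts quoted in this issue.

```python
import re

# Test-event lines as they appear in the quoted console output, e.g.
# "00:23:29.576 kafka.api.PlaintextConsumerTest > testFoo STARTED".
# The leading timestamp is optional so that raw Gradle output also matches.
EVENT = re.compile(
    r"^(?:\d{2}:\d{2}:\d{2}\.\d{3}\s+)?"
    r"(\S+) > (.+?) (STARTED|PASSED|FAILED|SKIPPED)\s*$"
)


def unfinished_tests(lines):
    """Return (class, method) pairs that logged STARTED but no later result.

    `lines` is any iterable of log lines, such as an open file.
    Results are returned in the order the tests started.
    """
    started = {}  # (class, method) -> line number of the STARTED event
    for lineno, line in enumerate(lines, start=1):
        match = EVENT.match(line)
        if match is None:
            continue  # not a test event (blank line, build message, stack trace)
        cls, method, status = match.groups()
        if status == "STARTED":
            started[(cls, method)] = lineno
        else:
            # PASSED, FAILED, or SKIPPED closes out an earlier STARTED event.
            started.pop((cls, method), None)
    return list(started)
```

Passing an open handle on a saved copy of consoleText.txt to `unfinished_tests` would list every test with no recorded result. A test reported this way is only a candidate for the hang: as noted above, other tests kept running for another 23 minutes after the unfinished one started.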


