You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/03/29 16:03:42 UTC

[GitHub] [pulsar] dlg99 opened a new issue #14936: pulsar-io Sinks & Sources tests fail (flake?) with "Failed to open ledger"

dlg99 opened a new issue #14936:
URL: https://github.com/apache/pulsar/issues/14936


   **Describe the bug**
   
   "CI - Integration - Pulsar-IO Sinks and Sources / pulsar-io" job fails on multiple PRs and fail on re-runs.
   Failed integration test seems to not be affected by the changes in the PR per se.
   
   Most commonly I see testRabbitMQSink failures, but sometimes testKinesis or testKafka.
   
   Example PRs: https://github.com/apache/pulsar/pull/14922 https://github.com/apache/pulsar/pull/14912 https://github.com/apache/pulsar/pull/14916
   
   **To Reproduce**
   
   Rerun "CI - Integration - Pulsar-IO Sinks and Sources / pulsar-io" on PRs, the job fails quite repeatedly.
   
   **Expected behavior**
   
   The job pass.
   
   **Log**
   
   ```
   2022-03-29T14:25:38,055+0000 INFO  [main] o.a.p.t.MockitoCleanupListener@40 - Cleaning up Mockito's ThreadSafeMockingProgress.MOCKING_PROGRESS_PROVIDER thread local state. {}
   Error:  Tests run: 12, Failures: 3, Errors: 0, Skipped: 6, Time elapsed: 866.072 s <<< FAILURE! - in TestSuite
   Error:  testKafkaSink(org.apache.pulsar.tests.integration.io.sinks.PulsarSinksTest)  Time elapsed: 132.423 s  <<< FAILURE!
   org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.ExecutionException: org.apache.pulsar.client.api.PulsarClientException: {"errorMsg":"org.apache.pulsar.broker.service.schema.exceptions.SchemaException: Error while recovering ledger -  ledger=5 - operation=Failed to open ledger caused by org.apache.pulsar.broker.service.schema.exceptions.SchemaException: Error while recovering ledger -  ledger=5 - operation=Failed to open ledger","reqId":1253756777559247167, "remote":"localhost/127.0.0.1:49251", "local":"/127.0.0.1:48208"}
   	at org.apache.pulsar.client.api.PulsarClientException.unwrap(PulsarClientException.java:1074)
   	at org.apache.pulsar.client.impl.ProducerBuilderImpl.create(ProducerBuilderImpl.java:88)
   	at org.apache.pulsar.tests.integration.io.sinks.SinkTester.produceMessage(SinkTester.java:118)
   	at org.apache.pulsar.tests.integration.io.PulsarIOTestRunner.produceMessagesToInputTopic(PulsarIOTestRunner.java:93)
   	at org.apache.pulsar.tests.integration.io.sinks.PulsarIOSinkRunner.runSinkTester(PulsarIOSinkRunner.java:116)
   	at org.apache.pulsar.tests.integration.io.PulsarIOTestBase.testSink(PulsarIOTestBase.java:50)
   	at org.apache.pulsar.tests.integration.io.sinks.PulsarSinksTest.testKafkaSink(PulsarSinksTest.java:34)
   	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
   	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:132)
   	at org.testng.internal.InvokeMethodRunnable.runOne(InvokeMethodRunnable.java:45)
   	at org.testng.internal.InvokeMethodRunnable.call(InvokeMethodRunnable.java:73)
   	at org.testng.internal.InvokeMethodRunnable.call(InvokeMethodRunnable.java:11)
   	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
   	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
   	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
   	at java.base/java.lang.Thread.run(Thread.java:829)
   Caused by: java.util.concurrent.ExecutionException: org.apache.pulsar.client.api.PulsarClientException: {"errorMsg":"org.apache.pulsar.broker.service.schema.exceptions.SchemaException: Error while recovering ledger -  ledger=5 - operation=Failed to open ledger caused by org.apache.pulsar.broker.service.schema.exceptions.SchemaException: Error while recovering ledger -  ledger=5 - operation=Failed to open ledger","reqId":1253756777559247167, "remote":"localhost/127.0.0.1:49251", "local":"/127.0.0.1:48208"}
   	at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
   	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
   	at org.apache.pulsar.client.impl.ProducerBuilderImpl.create(ProducerBuilderImpl.java:86)
   	... 17 more
   Caused by: org.apache.pulsar.client.api.PulsarClientException: {"errorMsg":"org.apache.pulsar.broker.service.schema.exceptions.SchemaException: Error while recovering ledger -  ledger=5 - operation=Failed to open ledger caused by org.apache.pulsar.broker.service.schema.exceptions.SchemaException: Error while recovering ledger -  ledger=5 - operation=Failed to open ledger","reqId":1253756777559247167, "remote":"localhost/127.0.0.1:49251", "local":"/127.0.0.1:48208"}
   	at org.apache.pulsar.client.impl.ClientCnx.getPulsarClientException(ClientCnx.java:1196)
   	at org.apache.pulsar.client.impl.ClientCnx.handleError(ClientCnx.java:726)
   	at org.apache.pulsar.common.protocol.PulsarDecoder.channelRead(PulsarDecoder.java:178)
   	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
   	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
   	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
   	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:327)
   	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:299)
   	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
   	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
   	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
   	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
   	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
   	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
   	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
   	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795)
   	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480)
   	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
   	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
   	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
   	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
   	... 1 more
   
   Error:  testKinesis(org.apache.pulsar.tests.integration.io.sinks.PulsarSinksTest)  Time elapsed: 0 s  <<< FAILURE!
   org.apache.pulsar.tests.FailFastNotifier$FailFastSkipException: Skipped after failure since testFailFast system property is set.
   	at org.apache.pulsar.tests.FailFastNotifier.beforeInvocation(FailFastNotifier.java:88)
   
   Error:  testRabbitMQSink(org.apache.pulsar.tests.integration.io.sinks.PulsarSinksTest)  Time elapsed: 0 s  <<< FAILURE!
   org.apache.pulsar.tests.FailFastNotifier$FailFastSkipException: Skipped after failure since testFailFast system property is set.
   	at org.apache.pulsar.tests.FailFastNotifier.beforeInvocation(FailFastNotifier.java:88)
   
   [INFO] 
   [INFO] Results:
   [INFO] 
   Error:  Failures: 
   Error:  org.apache.pulsar.tests.integration.io.sinks.PulsarSinksTest.testKafkaSink(org.apache.pulsar.tests.integration.io.sinks.PulsarSinksTest)
   [INFO]   Run 1: PASS
   Error:    Run 2: PulsarSinksTest.testKafkaSink:34->PulsarIOTestBase.testSink:50 » PulsarClient ...
   [INFO] 
   Error:    PulsarSinksTest.testKinesis » FailFastSkip Skipped after failure since testFai...
   Error:    PulsarSinksTest.testRabbitMQSink » FailFastSkip Skipped after failure since te...
   [INFO] 
   Error:  Tests run: 9, Failures: 3, Errors: 0, Skipped: 3
   [INFO] 
   [INFO] ------------------------------------------------------------------------
   [INFO] Reactor Summary for Apache Pulsar :: Tests 2.10.0-SNAPSHOT:
   [INFO] 
   [INFO] Apache Pulsar :: Tests ............................. SUCCESS [  2.587 s]
   [INFO] Apache Pulsar :: Tests :: Integration .............. FAILURE [14:35 min]
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD FAILURE
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time:  14:38 min
   [INFO] Finished at: 2022-03-29T14:25:38Z
   [INFO] ------------------------------------------------------------------------
   Error:  Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M3:test (default-test) on project integration: There are test failures.
   Error:  
   Error:  Please refer to /home/runner/work/pulsar/pulsar/tests/integration/target/surefire-reports for the individual test results.
   Error:  Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
   Error:  -> [Help 1]
   Error:  
   Error:  To see the full stack trace of the errors, re-run Maven with the -e switch.
   Error:  Re-run Maven using the -X switch to enable full debug logging.
   Error:  
   Error:  For more information about the errors and possible solutions, please read the following articles:
   Error:  [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
   Error:  
   Error:  After correcting the problems, you can resume the build with the command
   Error:    mvn <args> -rf :integration
   Error: Process completed with exit code 1.
   6s
   
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] eolivelli commented on issue #14936: pulsar-io Sinks & Sources tests fail (flake?) with "Failed to open ledger"

Posted by GitBox <gi...@apache.org>.
eolivelli commented on issue #14936:
URL: https://github.com/apache/pulsar/issues/14936#issuecomment-1082084509


   In integration tests we run the bookies and the brokers in docker containers.
   maybe there is some leak ?
   cc @lhotari @nicoloboschi 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] nicoloboschi commented on issue #14936: pulsar-io Sinks & Sources tests fail (flake?) with "Failed to open ledger"

Posted by GitBox <gi...@apache.org>.
nicoloboschi commented on issue #14936:
URL: https://github.com/apache/pulsar/issues/14936#issuecomment-1082125792


   Maybe related to the recent pull https://github.com/apache/pulsar/pull/14796
   btw the containers shutdown is handled in the same way for every tester


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on issue #14936: pulsar-io Sinks & Sources tests fail (flake?) with "Failed to open ledger"

Posted by GitBox <gi...@apache.org>.
lhotari commented on issue #14936:
URL: https://github.com/apache/pulsar/issues/14936#issuecomment-1084174032


   Will be fixed by #14962


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari closed issue #14936: pulsar-io Sinks & Sources tests fail (flake?) with "Failed to open ledger"

Posted by GitBox <gi...@apache.org>.
lhotari closed issue #14936:
URL: https://github.com/apache/pulsar/issues/14936


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org