Posted to dev@flink.apache.org by "Robert Metzger (Jira)" <ji...@apache.org> on 2021/02/09 07:17:00 UTC

[jira] [Created] (FLINK-21329) "Local recovery and sticky scheduling end-to-end test" does not finish within 600 seconds

Robert Metzger created FLINK-21329:
--------------------------------------

             Summary: "Local recovery and sticky scheduling end-to-end test" does not finish within 600 seconds
                 Key: FLINK-21329
                 URL: https://issues.apache.org/jira/browse/FLINK-21329
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination
    Affects Versions: 1.13.0
            Reporter: Robert Metzger


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=13118&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=ff888d9b-cd34-53cc-d90f-3e446d355529

{code}
Feb 08 22:25:46 ==============================================================================
Feb 08 22:25:46 Running 'Local recovery and sticky scheduling end-to-end test'
Feb 08 22:25:46 ==============================================================================
Feb 08 22:25:46 TEST_DATA_DIR: /home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-46881214821
Feb 08 22:25:47 Flink dist directory: /home/vsts/work/1/s/flink-dist/target/flink-1.13-SNAPSHOT-bin/flink-1.13-SNAPSHOT
Feb 08 22:25:47 Running local recovery test with configuration:
Feb 08 22:25:47         parallelism: 4
Feb 08 22:25:47         max attempts: 10
Feb 08 22:25:47         backend: rocks
Feb 08 22:25:47         incremental checkpoints: false
Feb 08 22:25:47         kill JVM: false
Feb 08 22:25:47 Starting zookeeper daemon on host fv-az127-394.
Feb 08 22:25:47 Starting HA cluster with 1 masters.
Feb 08 22:25:48 Starting standalonesession daemon on host fv-az127-394.
Feb 08 22:25:49 Starting taskexecutor daemon on host fv-az127-394.
Feb 08 22:25:49 Waiting for Dispatcher REST endpoint to come up...
Feb 08 22:25:50 Waiting for Dispatcher REST endpoint to come up...
Feb 08 22:25:51 Waiting for Dispatcher REST endpoint to come up...
Feb 08 22:25:53 Waiting for Dispatcher REST endpoint to come up...
Feb 08 22:25:54 Dispatcher REST endpoint is up.
Feb 08 22:25:54 Started TM watchdog with PID 28961.
Feb 08 22:25:58 Job has been submitted with JobID e790e85a39040539f9386c0df7ca4812
Feb 08 22:35:47 Test (pid: 27970) did not finish after 600 seconds.
Feb 08 22:35:47 Printing Flink logs and killing it:

{code}

and the following stack trace (the exception header line is not included):

{code}

	at org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalDriver.unhandledError(ZooKeeperLeaderRetrievalDriver.java:184)
	at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:713)
	at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:709)
	at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:100)
	at org.apache.flink.shaded.curator4.org.apache.curator.shaded.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
	at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:92)
	at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.logError(CuratorFrameworkImpl.java:708)
	at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:874)
	at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:990)
	at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:943)
	at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:66)
	at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:346)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
	at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
	at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:862)
	... 10 more

{code}


