Posted to user-zh@flink.apache.org by pengchenglin <pe...@163.com> on 2019/12/02 07:54:51 UTC

Per-job mode: YARN web UI links to the wrong Flink page

Hi all,

    Has anyone run into this problem? Per-job mode, 4 machines, several jobs running at the same time: on the YARN management page, clicking "Tracking UI" jumps to a Flink web UI, but for every application it is the Flink page of the same job.

The Flink configuration is as follows:
high-availability.zookeeper.client.max-retry-attempts: 10
historyserver.web.address: 0.0.0.0
state.checkpoints.num-retained: 3
historyserver.web.port: 8082
env.java.opts: -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/xx/oomdump -XX:ReservedCodeCacheSize=2048M
high-availability.cluster-id: /v1.1/perjob-20191130
jobmanager.execution.failover-strategy: region
#jobmanager.rpc.address: xx
state.savepoints.dir: hdfs://xx:8020/flink/v1.1/mxm1130_savepoint
high-availability.zookeeper.path.root: /flink
high-availability.zookeeper.client.session-timeout: 300000
taskmanager.registration.timeout: 20 min
high-availability.storageDir: hdfs://xx:8020/flink/v1.1/mxm_ha/
task.cancellation.timeout: 0
taskmanager.network.numberOfBuffers: 10240
parallelism.default: 8
taskmanager.numberOfTaskSlots: 8
akka.ask.timeout: 600 s
historyserver.archive.fs.dir: hdfs://xx:8020/flink/v1.1/completed-jobs-mxm1130/
jobmanager.heap.size: 2048m
jobmanager.archive.fs.dir: hdfs://xx:8020/flink/v1.1/mxm1130completed-jobs/
heartbeat.timeout: 300000
restart-strategy.fixed-delay.attempts: 360
high-availability.zookeeper.client.connection-timeout: 60000
historyserver.archive.fs.refresh-interval: 10000
jobmanager.rpc.port: 6123
jobstore.expiration-time: 14400
#rest.port: 8983
high-availability.zookeeper.quorum: xx:6666,xx:6666,xx:6666
restart-strategy.fixed-delay.delay: 1 s
high-availability: zookeeper
state.backend: filesystem
restart-strategy: fixed-delay
taskmanager.heap.size: 8192m
akka.client.timeout: 600 s
state.checkpoints.dir: hdfs://xx:8020/flink/v1.1/mxm1130_checkpoint
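
(For context, each job here is presumably submitted as its own per-job YARN application, roughly as in the sketch below; the jar paths, application names and options are placeholders, not taken from this post.)

# one YARN application per job, Flink 1.9-era CLI syntax (placeholders only)
./bin/flink run -m yarn-cluster -ynm job-a -p 8 /path/to/job-a.jar
./bin/flink run -m yarn-cluster -ynm job-b -p 8 /path/to/job-b.jar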

Re: Per-job mode: YARN web UI links to the wrong Flink page

Posted by pengchenglin <pe...@163.com>.
In per-job mode with several jobs running at once, if both high-availability.zookeeper.path.root: /flink
and high-availability.cluster-id: /v1.1/perjob-20191130 are configured, then, since each job starts its own cluster, every cluster uses the same directory in ZooKeeper, which is why YARN ends up associating all applications with the same Flink UI. For per-job mode I recommend removing the two settings above.
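
A minimal sketch of what this looks like (the keys are the ones quoted above; the -yD flag and the per-job id value below are only illustrative assumptions):

# flink-conf.yaml used for per-job submissions: drop the shared HA path settings,
# so each per-job cluster gets its own ZooKeeper sub-path (Flink derives the
# cluster id from the YARN application id when the key is left unset)
# high-availability.cluster-id: /v1.1/perjob-20191130
# high-availability.zookeeper.path.root: /flink

Alternatively, instead of relying on the default, a unique cluster id can be passed per submission as a YARN dynamic property, for example:

./bin/flink run -m yarn-cluster -yD high-availability.cluster-id=/v1.1/perjob-job-a /path/to/job-a.jar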
 