Posted to user-zh@flink.apache.org by "casel.chen" <ca...@126.com> on 2021/11/04 10:42:15 UTC

flink fails to start yarn-session

flink 1.13.2 + hadoop 3.2.1
Hive and Spark jobs already run successfully on this YARN cluster.
When I start a YARN session cluster with bin/yarn-session.sh, Flink keeps printing the INFO log below, and the YARN web console shows that no Flink session cluster was ever started. My flink-conf.yaml is attached. The Hadoop cluster has no Kerberos/SSL-style authentication enabled, and in standalone mode I can start a 3-node cluster just fine. What could be causing this, and how do I fix it? Thanks!



2021-11-04 16:51:39,964 INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

2021-11-04 16:51:39,986 INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

2021-11-04 16:51:40,004 INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

............. (the same line keeps repeating, every few tens of milliseconds)
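That SASL trust-check line is only an INFO message that the HDFS client prints once per data transfer, so by itself it does not say what is failing; raising the client-side log level usually surfaces the underlying activity and retries. A minimal sketch, assuming the stock log4j2 files of the Flink 1.13 distribution, where bin/yarn-session.sh reads conf/log4j-session.properties:

# conf/log4j-session.properties (read by bin/yarn-session.sh in the stock distribution)
# Switching the root logger from INFO to DEBUG shows the HDFS/YARN client calls
# behind the repeated SASL lines.
rootLogger.level = DEBUG
rootLogger.appenderRef.console.ref = ConsoleAppender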






Re:Re:Re: flink fails to start yarn-session

Posted by "casel.chen" <ca...@126.com>.
Has nobody run into this kind of problem?



[docker@master flink-1.13.2]$ ./bin/yarn-session.sh

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/home/docker/flink-1.13.2/lib/log4j-slf4j-impl-2.12.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/home/docker/hadoop-3.2.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

2021-11-08 14:36:43,651 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: jobmanager.rpc.address, master

2021-11-08 14:36:43,657 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: jobmanager.rpc.port, 6123

2021-11-08 14:36:43,657 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: jobmanager.memory.process.size, 1600m

2021-11-08 14:36:43,657 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: taskmanager.memory.process.size, 1728m

2021-11-08 14:36:43,657 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: taskmanager.numberOfTaskSlots, 8

2021-11-08 14:36:43,657 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: parallelism.default, 1

2021-11-08 14:36:43,658 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: execution.checkpointing.interval, 30s

2021-11-08 14:36:43,658 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: execution.checkpointing.unaligned, true

2021-11-08 14:36:43,658 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: execution.checkpointing.timeout, 1200s

2021-11-08 14:36:43,658 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: state.backend, filesystem

2021-11-08 14:36:43,658 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: state.checkpoints.dir, oss://datalake-huifu/hudi/flink/checkpoints

2021-11-08 14:36:43,659 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: state.savepoints.dir, oss://datalake-huifu/hudi/flink/savepoints

2021-11-08 14:36:43,659 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: state.backend.incremental, true

2021-11-08 14:36:43,659 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: jobmanager.execution.failover-strategy, region

2021-11-08 14:36:43,659 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: rest.port, 18081

2021-11-08 14:36:43,660 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: classloader.resolve-order, parent-first

2021-11-08 14:36:43,660 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: web.submit.enabled, false

2021-11-08 14:36:43,660 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: fs.oss.endpoint, oss-cn-shanghai.aliyuncs.com

2021-11-08 14:36:43,660 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: fs.oss.accessKeyId, LTAI5tJ4k9pk1KwZVsLd8NHd

2021-11-08 14:36:43,661 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: fs.oss.accessKeySecret, ******

2021-11-08 14:36:44,042 WARN  org.apache.hadoop.util.NativeCodeLoader                      [] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

2021-11-08 14:36:44,155 INFO  org.apache.flink.runtime.security.modules.HadoopModule       [] - Hadoop user set to docker (auth:SIMPLE)

2021-11-08 14:36:44,171 INFO  org.apache.flink.runtime.security.modules.JaasModule         [] - Jaas file will be created as /tmp/jaas-5414790866026859677.conf.

2021-11-08 14:36:44,213 WARN  org.apache.flink.yarn.configuration.YarnLogConfigUtil        [] - The configuration directory ('/home/docker/flink-1.13.2/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.

2021-11-08 14:36:44,340 INFO  org.apache.hadoop.yarn.client.RMProxy                        [] - Connecting to ResourceManager at master/192.168.16.191:8032

2021-11-08 14:36:44,711 INFO  org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The derived from fraction jvm overhead memory (160.000mb (167772162 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead

2021-11-08 14:36:44,734 INFO  org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The derived from fraction jvm overhead memory (172.800mb (181193935 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead

2021-11-08 14:36:45,025 INFO  org.apache.hadoop.conf.Configuration                         [] - resource-types.xml not found

2021-11-08 14:36:45,026 INFO  org.apache.hadoop.yarn.util.resource.ResourceUtils           [] - Unable to find 'resource-types.xml'.

2021-11-08 14:36:45,091 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - The configured JobManager memory is 1600 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 448 MB may not be used by Flink.

2021-11-08 14:36:45,092 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - The configured TaskManager memory is 1728 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 320 MB may not be used by Flink.

2021-11-08 14:36:45,092 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Cluster specification: ClusterSpecification{masterMemoryMB=1600, taskManagerMemoryMB=1728, slotsPerTaskManager=8}

2021-11-08 14:36:45,847 INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

2021-11-08 14:36:45,979 INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

2021-11-08 14:36:46,004 INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

2021-11-08 14:36:46,032 INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

2021-11-08 14:36:46,191 INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

2021-11-08 14:36:46,248 INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

2021-11-08 14:36:46,287 INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

2021-11-08 14:36:46,325 INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

2021-11-08 14:36:46,368 INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

2021-11-08 14:36:46,546 INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

.............

It keeps printing

INFO  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

and the YARN console shows that no Flink session cluster has been created.
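Since SaslDataTransferClient logs this line once per HDFS data transfer, an endless stream of it right after the "Cluster specification" line suggests the client is stuck uploading the Flink jars to the YARN staging directory on HDFS; the run above never reaches the "Submitting application master ..." line that a successful startup normally prints next. A quick way to test the upload path outside of Flink, assuming the hadoop CLI on this host points at the same cluster (the flink-dist jar name depends on the Scala suffix of the download):

hdfs dfsadmin -report                                             # are all DataNodes live and reachable from this host?
hdfs dfs -put lib/flink-dist_2.11-1.13.2.jar /tmp/dist-test.jar   # does a plain multi-block upload complete?
hdfs dfs -rm /tmp/dist-test.jar                                   # clean up the test file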
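It can also be worth confirming from the YARN side whether an application was submitted at all, since the web console view may filter out failed or killed attempts. For example:

yarn application -list -appStates ALL    # lists finished/failed/killed applications as well as running ones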












Re:Re: flink fails to start yarn-session

Posted by "casel.chen" <ca...@126.com>.
The contents of flink-conf.yaml are as follows:


################################################################################
#  Licensed to the Apache Software Foundation (ASF) under one
#  or more contributor license agreements.  See the NOTICE file
#  distributed with this work for additional information
#  regarding copyright ownership.  The ASF licenses this file
#  to you under the Apache License, Version 2.0 (the
#  "License"); you may not use this file except in compliance
#  with the License.  You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
#  Unless required by applicable law or agreed to in writing, software
#  distributed under the License is distributed on an "AS IS" BASIS,
#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#  See the License for the specific language governing permissions and
# limitations under the License.
################################################################################




#==============================================================================
# Common
#==============================================================================


# The external address of the host on which the JobManager runs and can be
# reached by the TaskManagers and any clients which want to connect. This setting
# is only used in Standalone mode and may be overwritten on the JobManager side
# by specifying the --host <hostname> parameter of the bin/jobmanager.sh executable.
# In high availability mode, if you use the bin/start-cluster.sh script and setup
# the conf/masters file, this will be taken care of automatically. Yarn/Mesos
# automatically configure the host name based on the hostname of the node where the
# JobManager runs.


jobmanager.rpc.address: master


# The RPC port where the JobManager is reachable.


jobmanager.rpc.port: 6123




# The total process memory size for the JobManager.
#
# Note this accounts for all memory usage within the JobManager process, including JVM metaspace and other overhead.


jobmanager.memory.process.size: 1600m




# The total process memory size for the TaskManager.
#
# Note this accounts for all memory usage within the TaskManager process, including JVM metaspace and other overhead.


taskmanager.memory.process.size: 1728m


# To exclude JVM metaspace and overhead, please, use total Flink memory size instead of 'taskmanager.memory.process.size'.
# It is not recommended to set both 'taskmanager.memory.process.size' and Flink memory.
#
# taskmanager.memory.flink.size: 1280m


# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.


taskmanager.numberOfTaskSlots: 8


# The parallelism used for programs that did not specify and other parallelism.


parallelism.default: 1


# The default file system scheme and authority.
# 
# By default file paths without scheme are interpreted relative to the local
# root file system 'file:///'. Use this to override the default and interpret
# relative paths relative to a different file system,
# for example 'hdfs://mynamenode:12345'
#
# fs.default-scheme


#==============================================================================
# High Availability
#==============================================================================


# The high-availability mode. Possible options are 'NONE' or 'zookeeper'.
#
# high-availability: zookeeper


# The path where metadata for master recovery is persisted. While ZooKeeper stores
# the small ground truth for checkpoint and leader election, this location stores
# the larger objects, like persisted dataflow graphs.
# 
# Must be a durable file system that is accessible from all nodes
# (like HDFS, S3, Ceph, nfs, ...) 
#
# high-availability.storageDir: hdfs:///flink/ha/


# The list of ZooKeeper quorum peers that coordinate the high-availability
# setup. This must be a list of the form:
# "host1:clientPort,host2:clientPort,..." (default clientPort: 2181)
#
# high-availability.zookeeper.quorum: localhost:2181




# ACL options are based on https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_BuiltinACLSchemes
# It can be either "creator" (ZOO_CREATE_ALL_ACL) or "open" (ZOO_OPEN_ACL_UNSAFE)
# The default value is "open" and it can be changed to "creator" if ZK security is enabled
#
# high-availability.zookeeper.client.acl: open


#==============================================================================
# Fault tolerance and checkpointing
#==============================================================================


execution.checkpointing.interval: 60s


execution.checkpointing.unaligned: true


execution.checkpointing.timeout: 1200s


# The backend that will be used to store operator state checkpoints if
# checkpointing is enabled.
#
# Supported backends are 'jobmanager', 'filesystem', 'rocksdb', or the
# <class-name-of-factory>.
#
state.backend: filesystem


# Directory for checkpoints filesystem, when using any of the default bundled
# state backends.
#
state.checkpoints.dir: oss://datalake-huifu/hudi/flink/checkpoints


# Default target directory for savepoints, optional.
#
state.savepoints.dir: oss://datalake-huifu/hudi/flink/savepoints


# Flag to enable/disable incremental checkpoints for backends that
# support incremental checkpoints (like the RocksDB state backend). 
#
state.backend.incremental: true


# The failover strategy, i.e., how the job computation recovers from task failures.
# Only restart tasks that may have been affected by the task failure, which typically includes
# downstream tasks and potentially upstream tasks if their produced data is no longer available for consumption.


jobmanager.execution.failover-strategy: region


#==============================================================================
# Rest & web frontend
#==============================================================================


# The port to which the REST client connects to. If rest.bind-port has
# not been specified, then the server will bind to this port as well.
#
rest.port: 18081


# The address to which the REST client will connect to
#
#rest.address: 0.0.0.0


# Port range for the REST and web server to bind to.
#
#rest.bind-port: 8080-8090


# The address that the REST & web server binds to
#
#rest.bind-address: 0.0.0.0


# Flag to specify whether job submission is enabled from the web-based
# runtime monitor. Uncomment to disable.


#web.submit.enable: false


#==============================================================================
# Advanced
#==============================================================================


# Override the directories for temporary files. If not specified, the
# system-specific Java temporary directory (java.io.tmpdir property) is taken.
#
# For framework setups on Yarn or Mesos, Flink will automatically pick up the
# containers' temp directories without any need for configuration.
#
# Add a delimited list for multiple directories, using the system directory
# delimiter (colon ':' on unix) or a comma, e.g.:
#     /data1/tmp:/data2/tmp:/data3/tmp
#
# Note: Each directory entry is read from and written to by a different I/O
# thread. You can include the same directory multiple times in order to create
# multiple I/O threads against that directory. This is for example relevant for
# high-throughput RAIDs.
#
# io.tmp.dirs: /tmp


# The classloading resolve order. Possible values are 'child-first' (Flink's default)
# and 'parent-first' (Java's default).
#
# Child first classloading allows users to use different dependency/library
# versions in their application than those in the classpath. Switching back
# to 'parent-first' may help with debugging dependency issues.
#
classloader.resolve-order: parent-first


# The amount of memory going to the network stack. These numbers usually need 
# no tuning. Adjusting them may be necessary in case of an "Insufficient number
# of network buffers" error. The default min is 64MB, the default max is 1GB.
# 
# taskmanager.memory.network.fraction: 0.1
# taskmanager.memory.network.min: 64mb
# taskmanager.memory.network.max: 1gb


#==============================================================================
# Flink Cluster Security Configuration
#==============================================================================


# Kerberos authentication for various components - Hadoop, ZooKeeper, and connectors -
# may be enabled in four steps:
# 1. configure the local krb5.conf file
# 2. provide Kerberos credentials (either a keytab or a ticket cache w/ kinit)
# 3. make the credentials available to various JAAS login contexts
# 4. configure the connector to use JAAS/SASL


# The below configure how Kerberos credentials are provided. A keytab will be used instead of
# a ticket cache if the keytab path and principal are set.


# security.kerberos.login.use-ticket-cache: true
# security.kerberos.login.keytab: /path/to/kerberos/keytab
# security.kerberos.login.principal: flink-user


# The configuration below defines which JAAS login contexts


# security.kerberos.login.contexts: Client,KafkaClient


#==============================================================================
# ZK Security Configuration
#==============================================================================


# Below configurations are applicable if ZK ensemble is configured for security


# Override below configuration to provide custom ZK service name if configured
# zookeeper.sasl.service-name: zookeeper


# The configuration below must match one of the values set in "security.kerberos.login.contexts"
# zookeeper.sasl.login-context-name: Client


#==============================================================================
# HistoryServer
#==============================================================================


# The HistoryServer is started and stopped via bin/historyserver.sh (start|stop)


# Directory to upload completed jobs to. Add this directory to the list of
# monitored directories of the HistoryServer as well (see below).
#jobmanager.archive.fs.dir: hdfs:///completed-jobs/


# The address under which the web-based HistoryServer listens.
#historyserver.web.address: 0.0.0.0


# The port under which the web-based HistoryServer listens.
#historyserver.web.port: 8082


# Comma separated list of directories to monitor for completed jobs.
#historyserver.archive.fs.dir: hdfs:///completed-jobs/


# Interval in milliseconds for refreshing the monitored directories.
#historyserver.archive.fs.refresh-interval: 10000


fs.oss.endpoint: oss-cn-shanghai.aliyuncs.com
fs.oss.accessKeyId: xxx
fs.oss.accessKeySecret: xxx


metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
metrics.reporter.promgateway.host: 192.168.24.41
metrics.reporter.promgateway.port: 9091
metrics.reporter.promgateway.jobName: PerfTest
metrics.reporter.promgateway.randomJobNameSuffix: true
metrics.reporter.promgateway.deleteOnShutdown: false
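One thing worth double-checking in this configuration: state.checkpoints.dir and state.savepoints.dir point at oss://, so the Aliyun OSS filesystem must be available to both the client and the cluster. In Flink 1.13 it ships in opt/ and is enabled by copying it into plugins/; a sketch, assuming the jar name from the 1.13.2 distribution:

mkdir -p ./plugins/oss-fs-hadoop
cp ./opt/flink-oss-fs-hadoop-1.13.2.jar ./plugins/oss-fs-hadoop/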


























Re: flink fails to start yarn-session

Posted by Caizhi Weng <ts...@gmail.com>.
Hi!

I didn't find an attachment in the email. Consider pasting the contents of flink-conf.yaml into the email body, or into an external pastebin.
