Posted to user-zh@flink.apache.org by "casel.chen" <ca...@126.com> on 2021/11/04 10:42:15 UTC
Flink fails to start yarn-session
flink 1.13.2 + hadoop 3.2.1
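Hive and Spark jobs already run successfully on YARN.
When I run bin/yarn-session.sh to start a YARN session cluster, Flink keeps printing the INFO log below, and the YARN web console shows that no Flink session cluster was started. My flink-conf.yaml is attached. The Hadoop cluster has no authentication, SSL, or similar features enabled, and in standalone mode I can start a 3-node cluster without problems. What could be causing this, and how can I fix it? Thanks!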
2021-11-04 16:51:39,964 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-04 16:51:39,986 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-04 16:51:40,004 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-04 16:51:40,020 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-04 16:51:40,041 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-04 16:51:40,059 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-04 16:51:40,078 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-04 16:51:40,097 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-04 16:51:40,114 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-04 16:51:40,134 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-04 16:51:40,155 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-04 16:51:40,175 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-04 16:51:40,193 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-04 16:51:40,212 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
Re:Re:Re: Flink fails to start yarn-session
Posted by "casel.chen" <ca...@126.com>.
Has nobody run into this kind of problem?
[docker@master flink-1.13.2]$ ./bin/yarn-session.sh
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/docker/flink-1.13.2/lib/log4j-slf4j-impl-2.12.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/docker/hadoop-3.2.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2021-11-08 14:36:43,651 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.rpc.address, master
2021-11-08 14:36:43,657 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.rpc.port, 6123
2021-11-08 14:36:43,657 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.memory.process.size, 1600m
2021-11-08 14:36:43,657 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: taskmanager.memory.process.size, 1728m
2021-11-08 14:36:43,657 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: taskmanager.numberOfTaskSlots, 8
2021-11-08 14:36:43,657 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: parallelism.default, 1
2021-11-08 14:36:43,658 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: execution.checkpointing.interval, 30s
2021-11-08 14:36:43,658 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: execution.checkpointing.unaligned, true
2021-11-08 14:36:43,658 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: execution.checkpointing.timeout, 1200s
2021-11-08 14:36:43,658 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: state.backend, filesystem
2021-11-08 14:36:43,658 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: state.checkpoints.dir, oss://datalake-huifu/hudi/flink/checkpoints
2021-11-08 14:36:43,659 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: state.savepoints.dir, oss://datalake-huifu/hudi/flink/savepoints
2021-11-08 14:36:43,659 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: state.backend.incremental, true
2021-11-08 14:36:43,659 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.execution.failover-strategy, region
2021-11-08 14:36:43,659 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: rest.port, 18081
2021-11-08 14:36:43,660 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: classloader.resolve-order, parent-first
2021-11-08 14:36:43,660 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: web.submit.enabled, false
2021-11-08 14:36:43,660 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: fs.oss.endpoint, oss-cn-shanghai.aliyuncs.com
2021-11-08 14:36:43,660 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: fs.oss.accessKeyId, LTAI5tJ4k9pk1KwZVsLd8NHd
2021-11-08 14:36:43,661 INFO org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: fs.oss.accessKeySecret, ******
2021-11-08 14:36:44,042 WARN org.apache.hadoop.util.NativeCodeLoader [] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-11-08 14:36:44,155 INFO org.apache.flink.runtime.security.modules.HadoopModule [] - Hadoop user set to docker (auth:SIMPLE)
2021-11-08 14:36:44,171 INFO org.apache.flink.runtime.security.modules.JaasModule [] - Jaas file will be created as /tmp/jaas-5414790866026859677.conf.
2021-11-08 14:36:44,213 WARN org.apache.flink.yarn.configuration.YarnLogConfigUtil [] - The configuration directory ('/home/docker/flink-1.13.2/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2021-11-08 14:36:44,340 INFO org.apache.hadoop.yarn.client.RMProxy [] - Connecting to ResourceManager at master/192.168.16.191:8032
2021-11-08 14:36:44,711 INFO org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The derived from fraction jvm overhead memory (160.000mb (167772162 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead
2021-11-08 14:36:44,734 INFO org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The derived from fraction jvm overhead memory (172.800mb (181193935 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead
2021-11-08 14:36:45,025 INFO org.apache.hadoop.conf.Configuration [] - resource-types.xml not found
2021-11-08 14:36:45,026 INFO org.apache.hadoop.yarn.util.resource.ResourceUtils [] - Unable to find 'resource-types.xml'.
2021-11-08 14:36:45,091 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - The configured JobManager memory is 1600 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 448 MB may not be used by Flink.
2021-11-08 14:36:45,092 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - The configured TaskManager memory is 1728 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 320 MB may not be used by Flink.
2021-11-08 14:36:45,092 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Cluster specification: ClusterSpecification{masterMemoryMB=1600, taskManagerMemoryMB=1728, slotsPerTaskManager=8}
2021-11-08 14:36:45,847 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-08 14:36:45,979 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-08 14:36:46,004 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-08 14:36:46,032 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-08 14:36:46,191 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-08 14:36:46,248 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-08 14:36:46,287 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-08 14:36:46,325 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-08 14:36:46,368 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-11-08 14:36:46,546 INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
.............
It keeps printing
INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
The YARN console shows that no Flink session cluster was created.
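
As a side note on the memory messages in the log above, the figures can be reproduced with a short sketch. The function names are hypothetical, and the overhead fraction (0.1) and upper bound (1024 MB) are assumed to be the defaults; only the 192 MB minimum and the 1024 MB YARN minimum allocation appear in the log itself. This only illustrates the arithmetic and does not explain why the session fails to start.

```python
import math


def yarn_container_mb(requested_mb, min_allocation_mb=1024):
    """Round a memory request up to the next multiple of YARN's
    minimum allocation (yarn.scheduler.minimum-allocation-mb)."""
    return math.ceil(requested_mb / min_allocation_mb) * min_allocation_mb


def jvm_overhead_mb(process_mb, fraction=0.1, min_mb=192, max_mb=1024):
    """Derive the JVM overhead from a fraction of the process size,
    then clamp it to [min_mb, max_mb]. fraction and max_mb are assumed defaults."""
    return min(max(process_mb * fraction, min_mb), max_mb)


# Process sizes taken from the posted flink-conf.yaml.
for name, process_mb in (("JobManager", 1600), ("TaskManager", 1728)):
    container = yarn_container_mb(process_mb)
    print(
        f"{name}: configured {process_mb} MB, "
        f"YARN container {container} MB, "
        f"extra {container - process_mb} MB, "
        f"JVM overhead {jvm_overhead_mb(process_mb)} MB"
    )
```

Both processes end up in 2048 MB containers because 1600 MB and 1728 MB each fall between one and two multiples of the 1024 MB minimum allocation.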
At 2021-11-04 21:18:03, "casel.chen" <ca...@126.com> wrote:
>The contents of flink-conf.yaml are as follows:
>
>
>################################################################################
># Licensed to the Apache Software Foundation (ASF) under one
># or more contributor license agreements. See the NOTICE file
># distributed with this work for additional information
># regarding copyright ownership. The ASF licenses this file
># to you under the Apache License, Version 2.0 (the
># "License"); you may not use this file except in compliance
># with the License. You may obtain a copy of the License at
>#
># http://www.apache.org/licenses/LICENSE-2.0
>#
># Unless required by applicable law or agreed to in writing, software
># distributed under the License is distributed on an "AS IS" BASIS,
># WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
># See the License for the specific language governing permissions and
># limitations under the License.
>################################################################################
>
>
>
>
>#==============================================================================
># Common
>#==============================================================================
>
>
># The external address of the host on which the JobManager runs and can be
># reached by the TaskManagers and any clients which want to connect. This setting
># is only used in Standalone mode and may be overwritten on the JobManager side
># by specifying the --host <hostname> parameter of the bin/jobmanager.sh executable.
># In high availability mode, if you use the bin/start-cluster.sh script and setup
># the conf/masters file, this will be taken care of automatically. Yarn/Mesos
># automatically configure the host name based on the hostname of the node where the
># JobManager runs.
>
>
>jobmanager.rpc.address: master
>
>
># The RPC port where the JobManager is reachable.
>
>
>jobmanager.rpc.port: 6123
>
>
>
>
># The total process memory size for the JobManager.
>#
># Note this accounts for all memory usage within the JobManager process, including JVM metaspace and other overhead.
>
>
>jobmanager.memory.process.size: 1600m
>
>
>
>
># The total process memory size for the TaskManager.
>#
># Note this accounts for all memory usage within the TaskManager process, including JVM metaspace and other overhead.
>
>
>taskmanager.memory.process.size: 1728m
>
>
># To exclude JVM metaspace and overhead, please, use total Flink memory size instead of 'taskmanager.memory.process.size'.
># It is not recommended to set both 'taskmanager.memory.process.size' and Flink memory.
>#
># taskmanager.memory.flink.size: 1280m
>
>
># The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
>
>
>taskmanager.numberOfTaskSlots: 8
>
>
># The parallelism used for programs that did not specify and other parallelism.
>
>
>parallelism.default: 1
>
>
># The default file system scheme and authority.
>#
># By default file paths without scheme are interpreted relative to the local
># root file system 'file:///'. Use this to override the default and interpret
># relative paths relative to a different file system,
># for example 'hdfs://mynamenode:12345'
>#
># fs.default-scheme
>
>
>#==============================================================================
># High Availability
>#==============================================================================
>
>
># The high-availability mode. Possible options are 'NONE' or 'zookeeper'.
>#
># high-availability: zookeeper
>
>
># The path where metadata for master recovery is persisted. While ZooKeeper stores
># the small ground truth for checkpoint and leader election, this location stores
># the larger objects, like persisted dataflow graphs.
>#
># Must be a durable file system that is accessible from all nodes
># (like HDFS, S3, Ceph, nfs, ...)
>#
># high-availability.storageDir: hdfs:///flink/ha/
>
>
># The list of ZooKeeper quorum peers that coordinate the high-availability
># setup. This must be a list of the form:
># "host1:clientPort,host2:clientPort,..." (default clientPort: 2181)
>#
># high-availability.zookeeper.quorum: localhost:2181
>
>
>
>
># ACL options are based on https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_BuiltinACLSchemes
># It can be either "creator" (ZOO_CREATE_ALL_ACL) or "open" (ZOO_OPEN_ACL_UNSAFE)
># The default value is "open" and it can be changed to "creator" if ZK security is enabled
>#
># high-availability.zookeeper.client.acl: open
>
>
>#==============================================================================
># Fault tolerance and checkpointing
>#==============================================================================
>
>
>execution.checkpointing.interval: 60s
>
>
>execution.checkpointing.unaligned: true
>
>
>execution.checkpointing.timeout: 1200s
>
>
># The backend that will be used to store operator state checkpoints if
># checkpointing is enabled.
>#
># Supported backends are 'jobmanager', 'filesystem', 'rocksdb', or the
># <class-name-of-factory>.
>#
>state.backend: filesystem
>
>
># Directory for checkpoints filesystem, when using any of the default bundled
># state backends.
>#
>state.checkpoints.dir: oss://datalake-huifu/hudi/flink/checkpoints
>
>
># Default target directory for savepoints, optional.
>#
>state.savepoints.dir: oss://datalake-huifu/hudi/flink/savepoints
>
>
># Flag to enable/disable incremental checkpoints for backends that
># support incremental checkpoints (like the RocksDB state backend).
>#
>state.backend.incremental: true
>
>
># The failover strategy, i.e., how the job computation recovers from task failures.
># Only restart tasks that may have been affected by the task failure, which typically includes
># downstream tasks and potentially upstream tasks if their produced data is no longer available for consumption.
>
>
>jobmanager.execution.failover-strategy: region
>
>
>#==============================================================================
># Rest & web frontend
>#==============================================================================
>
>
># The port to which the REST client connects to. If rest.bind-port has
># not been specified, then the server will bind to this port as well.
>#
>rest.port: 18081
>
>
># The address to which the REST client will connect to
>#
>#rest.address: 0.0.0.0
>
>
># Port range for the REST and web server to bind to.
>#
>#rest.bind-port: 8080-8090
>
>
># The address that the REST & web server binds to
>#
>#rest.bind-address: 0.0.0.0
>
>
># Flag to specify whether job submission is enabled from the web-based
># runtime monitor. Uncomment to disable.
>
>
>#web.submit.enable: false
>
>
>#==============================================================================
># Advanced
>#==============================================================================
>
>
># Override the directories for temporary files. If not specified, the
># system-specific Java temporary directory (java.io.tmpdir property) is taken.
>#
># For framework setups on Yarn or Mesos, Flink will automatically pick up the
># containers' temp directories without any need for configuration.
>#
># Add a delimited list for multiple directories, using the system directory
># delimiter (colon ':' on unix) or a comma, e.g.:
># /data1/tmp:/data2/tmp:/data3/tmp
>#
># Note: Each directory entry is read from and written to by a different I/O
># thread. You can include the same directory multiple times in order to create
># multiple I/O threads against that directory. This is for example relevant for
># high-throughput RAIDs.
>#
># io.tmp.dirs: /tmp
>
>
># The classloading resolve order. Possible values are 'child-first' (Flink's default)
># and 'parent-first' (Java's default).
>#
># Child first classloading allows users to use different dependency/library
># versions in their application than those in the classpath. Switching back
># to 'parent-first' may help with debugging dependency issues.
>#
>classloader.resolve-order: parent-first
>
>
># The amount of memory going to the network stack. These numbers usually need
># no tuning. Adjusting them may be necessary in case of an "Insufficient number
># of network buffers" error. The default min is 64MB, the default max is 1GB.
>#
># taskmanager.memory.network.fraction: 0.1
># taskmanager.memory.network.min: 64mb
># taskmanager.memory.network.max: 1gb
>
>
>#==============================================================================
># Flink Cluster Security Configuration
>#==============================================================================
>
>
># Kerberos authentication for various components - Hadoop, ZooKeeper, and connectors -
># may be enabled in four steps:
># 1. configure the local krb5.conf file
># 2. provide Kerberos credentials (either a keytab or a ticket cache w/ kinit)
># 3. make the credentials available to various JAAS login contexts
># 4. configure the connector to use JAAS/SASL
>
>
># The below configure how Kerberos credentials are provided. A keytab will be used instead of
># a ticket cache if the keytab path and principal are set.
>
>
># security.kerberos.login.use-ticket-cache: true
># security.kerberos.login.keytab: /path/to/kerberos/keytab
># security.kerberos.login.principal: flink-user
>
>
># The configuration below defines which JAAS login contexts
>
>
># security.kerberos.login.contexts: Client,KafkaClient
>
>
>#==============================================================================
># ZK Security Configuration
>#==============================================================================
>
>
># Below configurations are applicable if ZK ensemble is configured for security
>
>
># Override below configuration to provide custom ZK service name if configured
># zookeeper.sasl.service-name: zookeeper
>
>
># The configuration below must match one of the values set in "security.kerberos.login.contexts"
># zookeeper.sasl.login-context-name: Client
>
>
>#==============================================================================
># HistoryServer
>#==============================================================================
>
>
># The HistoryServer is started and stopped via bin/historyserver.sh (start|stop)
>
>
># Directory to upload completed jobs to. Add this directory to the list of
># monitored directories of the HistoryServer as well (see below).
>#jobmanager.archive.fs.dir: hdfs:///completed-jobs/
>
>
># The address under which the web-based HistoryServer listens.
>#historyserver.web.address: 0.0.0.0
>
>
># The port under which the web-based HistoryServer listens.
>#historyserver.web.port: 8082
>
>
># Comma separated list of directories to monitor for completed jobs.
>#historyserver.archive.fs.dir: hdfs:///completed-jobs/
>
>
># Interval in milliseconds for refreshing the monitored directories.
>#historyserver.archive.fs.refresh-interval: 10000
>
>
>fs.oss.endpoint: oss-cn-shanghai.aliyuncs.com
>fs.oss.accessKeyId: xxx
>fs.oss.accessKeySecret: xxx
>
>
>metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
>metrics.reporter.promgateway.host: 192.168.24.41
>metrics.reporter.promgateway.port: 9091
>metrics.reporter.promgateway.jobName: PerfTest
>metrics.reporter.promgateway.randomJobNameSuffix: true
>metrics.reporter.promgateway.deleteOnShutdown: false
>
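>At 2021-11-04 19:32:45, "Caizhi Weng" <ts...@gmail.com> wrote:
>>Hi!
>>
>>I could not find an attachment in your email. Consider pasting the contents of flink-conf.yaml into the email body, or sharing them via an external pastebin.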
Re:Re: Flink fails to start yarn-session
Posted by "casel.chen" <ca...@126.com>.
The contents of flink-conf.yaml are as follows:
################################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################
#==============================================================================
# Common
#==============================================================================
# The external address of the host on which the JobManager runs and can be
# reached by the TaskManagers and any clients which want to connect. This setting
# is only used in Standalone mode and may be overwritten on the JobManager side
# by specifying the --host <hostname> parameter of the bin/jobmanager.sh executable.
# In high availability mode, if you use the bin/start-cluster.sh script and setup
# the conf/masters file, this will be taken care of automatically. Yarn/Mesos
# automatically configure the host name based on the hostname of the node where the
# JobManager runs.
jobmanager.rpc.address: master
# The RPC port where the JobManager is reachable.
jobmanager.rpc.port: 6123
# The total process memory size for the JobManager.
#
# Note this accounts for all memory usage within the JobManager process, including JVM metaspace and other overhead.
jobmanager.memory.process.size: 1600m
# The total process memory size for the TaskManager.
#
# Note this accounts for all memory usage within the TaskManager process, including JVM metaspace and other overhead.
taskmanager.memory.process.size: 1728m
# To exclude JVM metaspace and overhead, please, use total Flink memory size instead of 'taskmanager.memory.process.size'.
# It is not recommended to set both 'taskmanager.memory.process.size' and Flink memory.
#
# taskmanager.memory.flink.size: 1280m
# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
taskmanager.numberOfTaskSlots: 8
# The parallelism used for programs that did not specify and other parallelism.
parallelism.default: 1
# The default file system scheme and authority.
#
# By default file paths without scheme are interpreted relative to the local
# root file system 'file:///'. Use this to override the default and interpret
# relative paths relative to a different file system,
# for example 'hdfs://mynamenode:12345'
#
# fs.default-scheme
#==============================================================================
# High Availability
#==============================================================================
# The high-availability mode. Possible options are 'NONE' or 'zookeeper'.
#
# high-availability: zookeeper
# The path where metadata for master recovery is persisted. While ZooKeeper stores
# the small ground truth for checkpoint and leader election, this location stores
# the larger objects, like persisted dataflow graphs.
#
# Must be a durable file system that is accessible from all nodes
# (like HDFS, S3, Ceph, nfs, ...)
#
# high-availability.storageDir: hdfs:///flink/ha/
# The list of ZooKeeper quorum peers that coordinate the high-availability
# setup. This must be a list of the form:
# "host1:clientPort,host2:clientPort,..." (default clientPort: 2181)
#
# high-availability.zookeeper.quorum: localhost:2181
# ACL options are based on https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_BuiltinACLSchemes
# It can be either "creator" (ZOO_CREATE_ALL_ACL) or "open" (ZOO_OPEN_ACL_UNSAFE)
# The default value is "open" and it can be changed to "creator" if ZK security is enabled
#
# high-availability.zookeeper.client.acl: open
#==============================================================================
# Fault tolerance and checkpointing
#==============================================================================
execution.checkpointing.interval: 60s
execution.checkpointing.unaligned: true
execution.checkpointing.timeout: 1200s
# The backend that will be used to store operator state checkpoints if
# checkpointing is enabled.
#
# Supported backends are 'jobmanager', 'filesystem', 'rocksdb', or the
# <class-name-of-factory>.
#
state.backend: filesystem
# Directory for checkpoints filesystem, when using any of the default bundled
# state backends.
#
state.checkpoints.dir: oss://datalake-huifu/hudi/flink/checkpoints
# Default target directory for savepoints, optional.
#
state.savepoints.dir: oss://datalake-huifu/hudi/flink/savepoints
# Flag to enable/disable incremental checkpoints for backends that
# support incremental checkpoints (like the RocksDB state backend).
#
state.backend.incremental: true
# The failover strategy, i.e., how the job computation recovers from task failures.
# Only restart tasks that may have been affected by the task failure, which typically includes
# downstream tasks and potentially upstream tasks if their produced data is no longer available for consumption.
jobmanager.execution.failover-strategy: region
#==============================================================================
# Rest & web frontend
#==============================================================================
# The port to which the REST client connects to. If rest.bind-port has
# not been specified, then the server will bind to this port as well.
#
rest.port: 18081
# The address to which the REST client will connect
#
#rest.address: 0.0.0.0
# Port range for the REST and web server to bind to.
#
#rest.bind-port: 8080-8090
# The address that the REST & web server binds to
#
#rest.bind-address: 0.0.0.0
# Flag to specify whether job submission is enabled from the web-based
# runtime monitor. Uncomment to disable.
#web.submit.enable: false
#==============================================================================
# Advanced
#==============================================================================
# Override the directories for temporary files. If not specified, the
# system-specific Java temporary directory (java.io.tmpdir property) is taken.
#
# For framework setups on Yarn or Mesos, Flink will automatically pick up the
# containers' temp directories without any need for configuration.
#
# Add a delimited list for multiple directories, using the system directory
# delimiter (colon ':' on unix) or a comma, e.g.:
# /data1/tmp:/data2/tmp:/data3/tmp
#
# Note: Each directory entry is read from and written to by a different I/O
# thread. You can include the same directory multiple times in order to create
# multiple I/O threads against that directory. This is for example relevant for
# high-throughput RAIDs.
#
# io.tmp.dirs: /tmp
# The classloading resolve order. Possible values are 'child-first' (Flink's default)
# and 'parent-first' (Java's default).
#
# Child first classloading allows users to use different dependency/library
# versions in their application than those in the classpath. Switching back
# to 'parent-first' may help with debugging dependency issues.
#
classloader.resolve-order: parent-first
# The amount of memory going to the network stack. These numbers usually need
# no tuning. Adjusting them may be necessary in case of an "Insufficient number
# of network buffers" error. The default min is 64MB, the default max is 1GB.
#
# taskmanager.memory.network.fraction: 0.1
# taskmanager.memory.network.min: 64mb
# taskmanager.memory.network.max: 1gb
#==============================================================================
# Flink Cluster Security Configuration
#==============================================================================
# Kerberos authentication for various components - Hadoop, ZooKeeper, and connectors -
# may be enabled in four steps:
# 1. configure the local krb5.conf file
# 2. provide Kerberos credentials (either a keytab or a ticket cache w/ kinit)
# 3. make the credentials available to various JAAS login contexts
# 4. configure the connector to use JAAS/SASL
# The settings below configure how Kerberos credentials are provided. A keytab will be used instead of
# a ticket cache if the keytab path and principal are set.
# security.kerberos.login.use-ticket-cache: true
# security.kerberos.login.keytab: /path/to/kerberos/keytab
# security.kerberos.login.principal: flink-user
# The configuration below defines which JAAS login contexts receive the Kerberos credentials
# security.kerberos.login.contexts: Client,KafkaClient
#==============================================================================
# ZK Security Configuration
#==============================================================================
# Below configurations are applicable if ZK ensemble is configured for security
# Override below configuration to provide custom ZK service name if configured
# zookeeper.sasl.service-name: zookeeper
# The configuration below must match one of the values set in "security.kerberos.login.contexts"
# zookeeper.sasl.login-context-name: Client
#==============================================================================
# HistoryServer
#==============================================================================
# The HistoryServer is started and stopped via bin/historyserver.sh (start|stop)
# Directory to upload completed jobs to. Add this directory to the list of
# monitored directories of the HistoryServer as well (see below).
#jobmanager.archive.fs.dir: hdfs:///completed-jobs/
# The address under which the web-based HistoryServer listens.
#historyserver.web.address: 0.0.0.0
# The port under which the web-based HistoryServer listens.
#historyserver.web.port: 8082
# Comma separated list of directories to monitor for completed jobs.
#historyserver.archive.fs.dir: hdfs:///completed-jobs/
# Interval in milliseconds for refreshing the monitored directories.
#historyserver.archive.fs.refresh-interval: 10000
fs.oss.endpoint: oss-cn-shanghai.aliyuncs.com
fs.oss.accessKeyId: xxx
fs.oss.accessKeySecret: xxx
metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
metrics.reporter.promgateway.host: 192.168.24.41
metrics.reporter.promgateway.port: 9091
metrics.reporter.promgateway.jobName: PerfTest
metrics.reporter.promgateway.randomJobNameSuffix: true
metrics.reporter.promgateway.deleteOnShutdown: false
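Most of the file above consists of commented-out defaults, so the settings that are actually in effect are easy to miss. The following is a minimal Python sketch, not part of the original message, that lists only the active entries. It assumes the flat `key: value` layout shown above; the function name `active_settings` and the shortened sample text are illustrative.

```python
def active_settings(text):
    """Return {key: value} for every uncommented `key: value` line."""
    settings = {}
    for raw in text.splitlines():
        # Drop everything after '#', then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        # Split on the first ": " so values such as oss://... stay intact.
        key, sep, value = line.partition(": ")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# A shortened excerpt of the configuration above, used only as sample input.
sample = """\
# high-availability.zookeeper.client.acl: open
execution.checkpointing.interval: 60s
state.backend: filesystem
state.checkpoints.dir: oss://datalake-huifu/hudi/flink/checkpoints
#rest.bind-port: 8080-8090
rest.port: 18081
"""

for key, value in active_settings(sample).items():
    print(f"{key} = {value}")
```

Run against the full file, this kind of listing makes it easier to compare the handful of overridden values (checkpoint directories, REST port, classloader order, OSS and metrics settings) with a known-good setup.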
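On 2021-11-04 19:32:45, "Caizhi Weng" <ts...@gmail.com> wrote:
>Hi!
>
>I could not find an attachment in the email. Consider pasting the contents of flink-conf.yaml into the email body, or sharing it through an external pastebin.
>
>casel.chen <ca...@126.com> wrote on Thu, Nov 4, 2021 at 6:42 PM:
>
>> flink 1.13.2 + hadoop 3.2.1
>> Hive and Spark jobs already run successfully on YARN.
>> When I run bin/yarn-session.sh to start a YARN session cluster, Flink keeps printing the INFO log below, and the YARN web
>> console shows that no flink-session cluster was started. My flink-conf.yaml is attached. The Hadoop cluster has no authentication, SSL, or similar enabled, and in standalone mode I can start a 3-node cluster. What could be causing this, and how can I fix it? Thanks!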
>>
>>
>>
>> 2021-11-04 16:51:39,964 INFO
>> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient []
>> - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted
>> = false
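Re: Flink fails to start yarn-session
Posted by Caizhi Weng <ts...@gmail.com>.
Hi!
I could not find an attachment in the email. Consider pasting the contents of flink-conf.yaml into the email body, or sharing it through an external pastebin.
casel.chen <ca...@126.com> wrote on Thu, Nov 4, 2021 at 6:42 PM:
> flink 1.13.2 + hadoop 3.2.1
> Hive and Spark jobs already run successfully on YARN.
> When I run bin/yarn-session.sh to start a YARN session cluster, Flink keeps printing the INFO log below, and the YARN web
> console shows that no flink-session cluster was started. My flink-conf.yaml is attached. The Hadoop cluster has no authentication, SSL, or similar enabled, and in standalone mode I can start a 3-node cluster. What could be causing this, and how can I fix it? Thanks!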
>
>
>
> 2021-11-04 16:51:39,964 INFO
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient []
> - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted
> = false
>
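The output quoted in this thread is a single INFO message repeated with different timestamps. As an illustration only (it is not something proposed in the thread), the Python sketch below counts log entries by their text once the timestamp is stripped, so that any other message in the client log stands out. It assumes one entry per line, as in the original console output rather than the mail-wrapped quote; `count_messages` and the placeholder WARN line are hypothetical.

```python
import re
from collections import Counter

# Leading timestamp in the format used by the quoted log, e.g. "2021-11-04 16:51:39,964 ".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\s+")


def count_messages(lines):
    """Count log entries by their text once the timestamp is stripped."""
    counts = Counter()
    for line in lines:
        message = TIMESTAMP.sub("", line.strip())
        if message:
            counts[message] += 1
    return counts


SASL_INFO = (
    "INFO org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient [] - "
    "SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false"
)
sample = [
    "2021-11-04 16:51:39,964 " + SASL_INFO,
    "2021-11-04 16:51:39,986 " + SASL_INFO,
    # Hypothetical placeholder, not a real Flink or Hadoop message.
    "2021-11-04 16:51:41,000 WARN example.Placeholder [] - some other message",
]

for message, n in count_messages(sample).most_common():
    print(f"{n}x {message}")
```

Feeding the complete client log through such a summary shows at a glance whether anything besides the repeated INFO line was printed.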