Posted to user-zh@flink.apache.org by yidan zhao <hi...@gmail.com> on 2021/03/03 06:47:27 UTC

After Flink starts, a slot on one of the TMs does not work; it looks as if it is not communicating at all.

As the subject says. Logs:
2021-03-03 11:03:17,151 WARN org.apache.flink.runtime.util.HadoopUtils [] -
Could not find Hadoop configuration via any of the supported methods (Flink
configuration, environment variables).

2021-03-03 11:03:17,344 WARN org.apache.hadoop.util.NativeCodeLoader [] -
Unable to load native-hadoop library for your platform... using
builtin-java classes where applicable

2021-03-03 11:03:17,441 WARN org.apache.flink.runtime.util.HadoopUtils [] -
Could not find Hadoop configuration via any of the supported methods (Flink
configuration, environment variables).

2021-03-03 11:03:18,226 WARN
org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn [] -
SASL configuration failed: javax.security.auth.login.LoginException: No
JAAS configuration section named 'Client' was found in specified JAAS
configuration file:
'/home/work/antibotFlink/flink-1.12.0/tmp/jaas-1092430908919603833.conf'.
Will continue connection to Zookeeper server without SASL authentication,
if Zookeeper server allows it.

2021-03-03 11:03:18,227 ERROR
org.apache.flink.shaded.curator4.org.apache.curator.ConnectionState [] -
Authentication failed

2021-03-03 11:03:18,957 ERROR akka.remote.transport.netty.NettyTransport []
- failed to bind to /0.0.0.0:2027, shutting down Netty transport
2021-03-03 11:03:18,973 ERROR akka.remote.Remoting [] - Remoting system has
been terminated abrubtly. Attempting to shut down transports

As shown above, there is an error about a port. Does anyone know the cause?
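
One way to narrow down the bind failure in the log is to test whether the port can be bound on the affected host at all. The sketch below is not part of Flink: `can_bind` is a hypothetical helper, and the port number is simply the one from the log line. A negative result only says that the bind failed at that moment (port already in use, or binding not permitted); it does not say which process holds the port.

```python
import socket


def can_bind(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission error.
            return False
        return True


if __name__ == "__main__":
    port = 2027  # port taken from the NettyTransport error in the log
    state = "free" if can_bind(port) else "not bindable (in use or not permitted)"
    print(f"port {port}: {state}")
```

Running this on the TaskManager host while the cluster is stopped and again while it is running would show whether something else already occupies the port.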
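The direct symptom and impact: one of my source tasks produces no output at all (no output of any kind; bytes sent is 0). As a result, the downstream nodes get no watermark, and backpressure then stays at 1 permanently. (This also produces a situation I had already found strange before: backpressure so severe that nothing runs and the CPU is no longer used at all.)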
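
The link between one silent source task and the missing downstream watermark can be illustrated with a small model. An operator with several inputs advances its event-time clock to the minimum of the watermarks received on those inputs, so a single input that never sends anything keeps the operator without a watermark. The function and input names below are hypothetical and only model that rule; this is not Flink code.

```python
from typing import Dict, Optional


def downstream_watermark(inputs: Dict[str, Optional[int]]) -> Optional[int]:
    """Combine per-input watermarks by taking the minimum.

    None stands for an input that has not delivered any watermark yet.
    """
    if not inputs or any(wm is None for wm in inputs.values()):
        # One silent input is enough to leave the operator without a watermark.
        return None
    return min(inputs.values())


# Both source subtasks emit: the slower one determines the result.
print(downstream_watermark({"source-0": 1_000, "source-1": 1_200}))  # 1000

# One subtask emits nothing: no watermark downstream.
print(downstream_watermark({"source-0": 1_000, "source-1": None}))  # None
```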

Re: After Flink starts, a slot on one of the TMs does not work; it looks as if it is not communicating at all.

Posted by yidan zhao <hi...@gmail.com>.
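This problem has occurred many times. There is now also another case, one that does not happen at startup.
When it happens at startup, the watermark is shown as absent, i.e. there is no watermark at all.
In the other case, the job starts and runs normally, and then after some time (possibly a few hours, possibly many days) the watermark suddenly stalls indefinitely, which causes endless backpressure.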
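
For the second case, it may help to pin down when the watermark stopped advancing so that the moment can be matched against the TaskManager logs. The following is a minimal sketch under stated assumptions: `first_stall` is a hypothetical helper, and the samples are assumed to be (wall-clock seconds, watermark) pairs collected periodically by whatever means is available, for example by noting the value shown in the web UI.

```python
from typing import Iterable, Optional, Tuple


def first_stall(samples: Iterable[Tuple[float, int]], max_stall_s: float) -> Optional[float]:
    """Return the sample time at which the watermark last advanced before a
    stall longer than max_stall_s, or None if no such stall is present.

    samples: (wall_clock_seconds, watermark) pairs in time order.
    """
    last_advance_t: Optional[float] = None
    last_wm: Optional[int] = None
    for t, wm in samples:
        if last_wm is None or wm > last_wm:
            # The watermark moved forward: remember when and to what value.
            last_advance_t, last_wm = t, wm
        elif t - last_advance_t > max_stall_s:
            return last_advance_t
    return None


# Watermark advances at t=10 and is then stuck at 200.
samples = [(0, 100), (10, 200), (20, 200), (80, 200)]
print(first_stall(samples, max_stall_s=30))  # 10
```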
