Posted to issues@flink.apache.org by "Chesnay Schepler (Jira)" <ji...@apache.org> on 2020/04/29 13:37:00 UTC
[jira] [Comment Edited] (FLINK-17443) Flink's ZK in HA mode setup
is unable to start up if any of the zk hosts are unreachable
[ https://issues.apache.org/jira/browse/FLINK-17443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17095457#comment-17095457 ]
Chesnay Schepler edited comment on FLINK-17443 at 4/29/20, 1:36 PM:
--------------------------------------------------------------------
I'm not actively working on it; I can assign it to you. You basically have to do the thing you proposed for Flink, just against the [flink-shaded|https://github.com/apache/flink-shaded/] repo.
I would then close this ticket as a duplicate.
was (Author: zentol):
I'm not actively working on it; I can assign it to you. You basically have to the thing you proposed for Flink, just against the [flink-shaded|https://github.com/apache/flink-shaded/] repo.
I would then close this ticket as a duplicate.
> Flink's ZK in HA mode setup is unable to start up if any of the zk hosts are unreachable
> ----------------------------------------------------------------------------------------
>
> Key: FLINK-17443
> URL: https://issues.apache.org/jira/browse/FLINK-17443
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Reporter: Piyush Narang
> Priority: Major
> Labels: pull-request-available
>
> We occasionally hit an issue where our Flink cluster will not start up if any of the ZooKeeper hosts passed in the "high-availability.zookeeper.quorum" config setting are unreachable. This seems to stem from us using an older ZooKeeper dependency version (3.4.10). The problem has been fixed in the 3.4.x branch as part of https://issues.apache.org/jira/browse/ZOOKEEPER-1576 ([https://github.com/apache/zookeeper/commit/be1409cc9a14ac2e28693e0e02a0ba6d9713565e]).
> A sample of the error we see is shown below.
> {code:java}
> java.net.UnknownHostException: zk01-pa4.hpc.criteo.prod: Name or service not known
> java.net.UnknownHostException: zk01-pa4.hpc.criteo.prod: Name or service not known
>     at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>     at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
>     at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
>     at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
>     at java.net.InetAddress.getAllByName(InetAddress.java:1193)
>     at java.net.InetAddress.getAllByName(InetAddress.java:1127)
>     at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)
>     at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
>     at org.apache.flink.shaded.curator.org.apache.curator.utils.DefaultZookeeperFactory.newZooKeeper(DefaultZookeeperFactory.java:29)
>     at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl$2.newZooKeeper(CuratorFrameworkImpl.java:150)
>     at org.apache.flink.shaded.curator.org.apache.curator.HandleHolder$1.getZooKeeper(HandleHolder.java:94)
>     at org.apache.flink.shaded.curator.org.apache.curator.HandleHolder.getZooKeeper(HandleHolder.java:55)
>     at org.apache.flink.shaded.curator.org.apache.curator.ConnectionState.reset(ConnectionState.java:262)
>     at org.apache.flink.shaded.curator.org.apache.curator.ConnectionState.start(ConnectionState.java:109)
>     at org.apache.flink.shaded.curator.org.apache.curator.CuratorZookeeperClient.start(CuratorZookeeperClient.java:191)
>     at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl.start(CuratorFrameworkImpl.java:259)
>     at org.apache.flink.runtime.util.ZooKeeperUtils.startCuratorFramework(ZooKeeperUtils.java:131)
>     at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:123)
>     at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:292)
>     at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:257)
> {code}
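For readers following the trace: the failure comes from resolving every quorum host eagerly and aborting on the first java.net.UnknownHostException. The sketch below contrasts that strict behaviour with a lenient one that skips unresolvable hosts and fails only when none resolve, which is roughly the idea behind ZOOKEEPER-1576. It is not the ZooKeeper or Flink code; the class LenientQuorumResolver, the Resolver hook, and the host names are hypothetical and exist only for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical names, not the actual StaticHostProvider. */
public class LenientQuorumResolver {

    /** Hypothetical lookup hook so the sketch can be exercised without real DNS. */
    public interface Resolver {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    /** Strict variant: the first unresolvable host aborts the whole lookup. */
    public static List<InetAddress> resolveStrict(List<String> hosts, Resolver resolver)
            throws UnknownHostException {
        List<InetAddress> resolved = new ArrayList<>();
        for (String host : hosts) {
            resolved.addAll(Arrays.asList(resolver.resolve(host)));
        }
        return resolved;
    }

    /** Lenient variant: skip unresolvable hosts, fail only if nothing resolved. */
    public static List<InetAddress> resolveLenient(List<String> hosts, Resolver resolver) {
        List<InetAddress> resolved = new ArrayList<>();
        for (String host : hosts) {
            try {
                resolved.addAll(Arrays.asList(resolver.resolve(host)));
            } catch (UnknownHostException e) {
                // Report the bad host, but keep going with the remaining ones.
                System.err.println("Skipping unresolvable host: " + host);
            }
        }
        if (resolved.isEmpty()) {
            throw new IllegalArgumentException("No host in the quorum could be resolved");
        }
        return resolved;
    }

    public static void main(String[] args) {
        // "zk-unreachable.invalid" is a made-up name under the reserved .invalid TLD.
        List<String> quorum = Arrays.asList("127.0.0.1", "zk-unreachable.invalid");
        List<InetAddress> usable = resolveLenient(quorum, InetAddress::getAllByName);
        System.out.println("Usable addresses: " + usable);
    }
}
```

With the strict variant, a single stale DNS entry in "high-availability.zookeeper.quorum" is enough to stop the cluster entrypoint; with the lenient variant, the client can still connect through the remaining hosts.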
--
This message was sent by Atlassian Jira
(v8.3.4#803005)