Posted to issues@flink.apache.org by "Philipp von dem Bussche (JIRA)" <ji...@apache.org> on 2017/02/01 11:55:52 UTC

[jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs

    [ https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15848287#comment-15848287 ] 

Philipp von dem Bussche commented on FLINK-2821:
------------------------------------------------

Hello [~mxm], after being quiet for a while I wanted to give some feedback on the setup I am running at the moment.
To recap (I had to think about my setup myself again after not spending much time on it lately ;) ):
- job manager and task manager run in Docker containers
- I am using an orchestration engine called Rancher on top of Docker, which introduces another network with its own set of IP addresses on top of Docker's.

Since I communicate with the JobManager both from within the Docker / Rancher network and from outside (from my local build server), I had to have the JobManager register under a hostname that is resolvable on the internet. Both the task manager (coming from within the Docker / Rancher network) and the build server now connect via the internet hostname. Since the task manager lives right next to the job manager, the preferred solution would obviously be for the task manager to connect locally (meaning through the Docker / Rancher network), but since one can only specify one listener address, it has to go through the internet hostname.

However, this does not solve the problem completely yet, because if I just tell the JobManager to bind to the internet hostname, I get the following exception while the JobManager starts up:

2017-02-01 11:13:51,997 INFO  org.apache.flink.util.NetUtils                                - Unable to allocate on port 6123, due to error: Address not available (Bind failed)
2017-02-01 11:13:51,999 ERROR org.apache.flink.runtime.jobmanager.JobManager                - Failed to run JobManager.
java.lang.RuntimeException: Unable to do further retries starting the actor system
        at org.apache.flink.runtime.jobmanager.JobManager$.retryOnBindException(JobManager.scala:2136)
        at org.apache.flink.runtime.jobmanager.JobManager$.runJobManager(JobManager.scala:2076)
        at org.apache.flink.runtime.jobmanager.JobManager$$anon$12.call(JobManager.scala:1971)
        at org.apache.flink.runtime.jobmanager.JobManager$$anon$12.call(JobManager.scala:1969)
        at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:29)
        at org.apache.flink.runtime.jobmanager.JobManager$.main(JobManager.scala:1969)
        at org.apache.flink.runtime.jobmanager.JobManager.main(JobManager.scala)

So additionally I had to put an entry into /etc/hosts that maps the internet hostname to the Docker IP address of the JobManager container, so that the JobManager tries to bind to the Docker IP address rather than the Amazon AWS IP address (which is the IP that the internet hostname resolves to).
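
To illustrate why the bind fails without the /etc/hosts entry, here is a minimal Java sketch. It is not Flink code: the class BindCheck and its methods resolvesToLocalInterface and canBind are names made up for this example. It assumes that a process can only bind a listening socket to an address that is assigned to one of its own network interfaces, which is what the "Address not available (Bind failed)" message above suggests.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NetworkInterface;
import java.net.ServerSocket;
import java.net.SocketException;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Flink: checks whether a hostname resolves
// to an address this machine (or container) can actually bind to.
class BindCheck {

    // True if the host resolves to loopback, the wildcard address, or an
    // address assigned to one of the local network interfaces.
    static boolean resolvesToLocalInterface(String host) {
        try {
            InetAddress addr = InetAddress.getByName(host);
            if (addr.isAnyLocalAddress() || addr.isLoopbackAddress()) {
                return true;
            }
            return NetworkInterface.getByInetAddress(addr) != null;
        } catch (UnknownHostException | SocketException e) {
            return false;
        }
    }

    // Tries to bind a server socket to host:port and closes it again.
    // Port 0 asks the OS for any free port.
    static boolean canBind(String host, int port) {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress(InetAddress.getByName(host), port));
            return true;
        } catch (IOException e) {
            // Covers UnknownHostException and BindException.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        InetAddress addr = InetAddress.getByName(host);
        System.out.println(host + " resolves to " + addr.getHostAddress());
        System.out.println("assigned to a local interface: " + resolvesToLocalInterface(host));
        System.out.println("bindable on a free port: " + canBind(host, 0));
    }
}
```

Run inside the JobManager container with the internet hostname as the argument, one would expect both checks to report false while the name resolves to the AWS IP address, and true once /etc/hosts maps the name to the container's Docker IP address.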

This works for me now, though I would not call it ideal.

I have to admit I have not tested this with the latest RC; I will do that later in the week.
Thanks

> Change Akka configuration to allow accessing actors from different URLs
> -----------------------------------------------------------------------
>
>                 Key: FLINK-2821
>                 URL: https://issues.apache.org/jira/browse/FLINK-2821
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>            Reporter: Robert Metzger
>            Assignee: Maximilian Michels
>             Fix For: 1.2.0
>
>
> Akka expects the actor's URL to be exactly matching.
> As pointed out in http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html, there are cases where users have complained about this:
>   - Proxy routing (as described here, send to the proxy URL, receiver recognizes only original URL)
>   - Using hostname / IP interchangeably does not work (we solved this by always putting IP addresses into URLs, never hostnames)
>   - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still no solution to that (but seems not too much of a restriction)
> I am aware that this is not possible due to Akka, so it is actually not a Flink bug. But I think we should track the resolution of the issue here anyway because it's affecting our users' satisfaction.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)