Posted to user@flink.apache.org by mrooding <ad...@webresource.nl> on 2017/10/31 12:55:03 UTC

JobManager web interface redirect strategy when running in HA

Hi

We're running 3 job managers in high availability cluster mode backed by
OpenStack/OpenShift. We're currently exposing all 3 job managers using 3
different routes (flink-1.domain.tld, flink-2.domain.tld,
flink-3.domain.tld). When accessing the route for a job manager which isn't
the leader, it automatically redirects the user to the host and port of the
leading job manager. From what I've seen in the source code, the RPC address
and port are used to build the redirect. Since the internal hostnames are not
accessible outside the cluster, this obviously doesn't work.

The nicest solution would be a single route (flink.domain.tld) which would
correctly delegate requests to the leading job manager. The second best
solution would probably be the possibility to declare a public URL in the
flink configuration file.
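
To make the "public URL" idea a bit more concrete, here is a minimal Python
sketch of the rewrite such a setting would imply. It is only an illustration,
not how Flink behaves: the PUBLIC_URLS mapping, the rewrite_redirect helper,
the internal hostnames and port 8081 are all assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping from internal job manager hostnames to public routes.
# Neither the hostnames nor this mapping come from Flink itself.
PUBLIC_URLS = {
    "flink-jobmanager-1.env.svc.cluster.local": "http://flink-1.domain.tld",
    "flink-jobmanager-2.env.svc.cluster.local": "http://flink-2.domain.tld",
    "flink-jobmanager-3.env.svc.cluster.local": "http://flink-3.domain.tld",
}


def rewrite_redirect(location, public_urls):
    """Replace the internal host of a redirect target with its public route.

    Path, query and fragment are kept. Targets whose host is not in the
    mapping are returned unchanged.
    """
    parts = urlsplit(location)
    public = public_urls.get(parts.hostname)
    if public is None:
        return location
    pub = urlsplit(public)
    return urlunsplit(
        (pub.scheme, pub.netloc, parts.path, parts.query, parts.fragment)
    )
```

With this mapping, a redirect to
http://flink-jobmanager-2.env.svc.cluster.local:8081/jobs?x=1 would be
rewritten to http://flink-2.domain.tld/jobs?x=1, which is reachable from
outside the cluster.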

I'd be more than happy to contribute to Flink and add support for this but
I'd love to hear your ideas about it.

Kind regards

Marc




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: JobManager web interface redirect strategy when running in HA

Posted by mrooding <ad...@webresource.nl>.

Chesnay, your solution is definitely the best approach. I was already
wondering why the decision was made to support the UI only through the
leading job manager.

Jürgen, I don't think that your solution will work in our setup. We're
currently running 3 services, one for each job manager. We need a service
per job manager because they obviously need to be able to talk to each
other. In the latest version of OpenShift you can use a StatefulSet to
handle these situations but unfortunately, StatefulSets seem to rely on each
node receiving its own persistent volume claim whereas Flink seems to share
1 persistent volume claim for all nodes.

I've been going through the Kubernetes documentation about Load Balancers
but I'm unable to find a solution which handles both cases:
- each node being available through a cluster name (e.g.
flink-jobmanager-1.env.svc.cluster.local)
- exposing 1 URL which uses the load balancing solution proposed by you

Worst case, we would have to wait for Flink 1.5 and keep using 3 distinct
URLs. It's not ideal, but there are also bigger fish to fry.

Marc




Re: JobManager web interface redirect strategy when running in HA

Posted by Jürgen Thomann <ju...@innogames.com>.

I think you can already solve this with a health check (health monitor
in OpenStack?). I'm currently using GET requests to /, and if a job manager
doesn't reply with a 200 code the LB will not use it. Only the leader answers
with a 200 code, whereas the others send a redirect with a 30x code, which
should ensure that requests always go to the leader.
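
As a rough illustration of the check described above, here is a minimal Python
sketch. It is not the actual load balancer configuration; the function names
probe_status and healthy_backends and the 2 second timeout are assumptions
made for the example. It sends a GET to each job manager without following
redirects and keeps only the ones that answer 200.

```python
import http.client
from urllib.parse import urlsplit


def probe_status(url, timeout=2.0):
    """Return the HTTP status of a GET to url, or None if it is unreachable.

    http.client does not follow redirects, so a 30x from a non-leading
    job manager is reported as-is.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn_cls = http.client.HTTPSConnection
    else:
        conn_cls = http.client.HTTPConnection
    conn = conn_cls(parts.hostname, parts.port, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()


def healthy_backends(urls, probe=probe_status):
    """Keep only the backends that answer 200, i.e. the current leader."""
    return [url for url in urls if probe(url) == 200]
```

Calling healthy_backends with the three routes from the original mail
(http://flink-1.domain.tld/ and so on) should return just the leader's route,
as long as the non-leaders keep answering with a redirect. The probe parameter
is only there so the selection logic can be tried out without a live cluster.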


On 01.11.2017 12:34, Chesnay Schepler wrote:
> We intend to change the redirection behavior such that any jobmanager
> (leading or not) can accept requests and communicate internally with the
> leader. In this model you could set up flink.domain.tld to point to any
> jobmanager (or distribute requests among them).
>
> Would this work for you?
>
> I believe this is targeted for 1.5.
>

-- 
Jürgen Thomann
Software Developer



Re: JobManager web interface redirect strategy when running in HA

Posted by Chesnay Schepler <ch...@apache.org>.

We intend to change the redirection behavior such that any jobmanager
(leading or not) can accept requests and communicate internally with the
leader. In this model you could set up flink.domain.tld to point to any
jobmanager (or distribute requests among them).

Would this work for you?

I believe this is targeted for 1.5.
