Posted to user@flink.apache.org by Elias Levy <fe...@gmail.com> on 2017/09/24 01:04:51 UTC

high-availability.jobmanager.port vs jobmanager.rpc.port

I am wondering why, in HA mode, there is a need for a separate config
parameter to set the JM RPC port (high-availability.jobmanager.port) and why
this parameter accepts a range, unlike jobmanager.rpc.port.

Re: high-availability.jobmanager.port vs jobmanager.rpc.port

Posted by Till Rohrmann <tr...@apache.org>.
Yes exactly.

On Tue, Sep 26, 2017 at 5:07 PM, Elias Levy <fe...@gmail.com>
wrote:

> I presume then that the Job Managers and Task Managers are performing
> service discovery via ZooKeeper in HA mode, rather than from the config
> file or the masters file.  Yes?
>
> On Mon, Sep 25, 2017 at 11:14 PM, Till Rohrmann <tr...@apache.org>
> wrote:
>
>> Because a single port could easily lead to clashes if there is another
>> JobManager running on the same machine with the same port (e.g. due to
>> standby JobManagers).
>>
>> Cheers,
>> Till
>>
>> On Sep 26, 2017 03:20, "Elias Levy" <fe...@gmail.com> wrote:
>>
>>> Why a range instead of just a single port in HA mode?
>>>
>>> On Mon, Sep 25, 2017 at 1:49 PM, Till Rohrmann <tr...@apache.org>
>>> wrote:
>>>
>>>> Yes, with Flip-6 it will most likely look like how Stephan described
>>>> it. We need the explicit port in standalone mode so that TMs can connect to
>>>> the JM. In the other deployment scenarios, the port can be randomly picked
>>>> unless you want to specify a port range, e.g. for firewall configuration
>>>> purposes.
>>>>
>>>
>

Re: high-availability.jobmanager.port vs jobmanager.rpc.port

Posted by Elias Levy <fe...@gmail.com>.
I presume then that the Job Managers and Task Managers are performing
service discovery via ZooKeeper in HA mode, rather than from the config
file or the masters file.  Yes?

On Mon, Sep 25, 2017 at 11:14 PM, Till Rohrmann <tr...@apache.org>
wrote:

> Because a single port could easily lead to clashes if there is another
> JobManager running on the same machine with the same port (e.g. due to
> standby JobManagers).
>
> Cheers,
> Till
>
> On Sep 26, 2017 03:20, "Elias Levy" <fe...@gmail.com> wrote:
>
>> Why a range instead of just a single port in HA mode?
>>
>> On Mon, Sep 25, 2017 at 1:49 PM, Till Rohrmann <tr...@apache.org>
>> wrote:
>>
>>> Yes, with Flip-6 it will most likely look like how Stephan described it.
>>> We need the explicit port in standalone mode so that TMs can connect to the
>>> JM. In the other deployment scenarios, the port can be randomly picked
>>> unless you want to specify a port range, e.g. for firewall configuration
>>> purposes.
>>>
>>

Re: high-availability.jobmanager.port vs jobmanager.rpc.port

Posted by Till Rohrmann <tr...@apache.org>.
Because a single port could easily lead to clashes if there is another
JobManager running on the same machine with the same port (e.g. due to
standby JobManagers).

Cheers,
Till
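
To illustrate the clash described above, here is a minimal sketch, not
Flink's actual code, of two processes on one host sharing a configured port
range. Each one walks the range and keeps the first port it can bind. The
class name PortRangeBinder, the method bindFirstFree, and the range
50000-50025 are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class PortRangeBinder {

    /** Binds to the first free port in [start, end]; throws if none is free. */
    public static ServerSocket bindFirstFree(int start, int end) throws IOException {
        for (int port = start; port <= end; port++) {
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                // Port is taken or cannot be bound; try the next candidate.
            }
        }
        throw new IOException("No free port in range " + start + "-" + end);
    }

    public static void main(String[] args) throws IOException {
        // Two "JobManagers" on the same machine use the same range and
        // still end up on different ports.
        try (ServerSocket active = bindFirstFree(50000, 50025);
             ServerSocket standby = bindFirstFree(50000, 50025)) {
            System.out.println("active JM port:  " + active.getLocalPort());
            System.out.println("standby JM port: " + standby.getLocalPort());
        }
    }
}
```

With a single fixed port, the second bind attempt would fail outright; with
a range, it simply moves on to the next free port.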

On Sep 26, 2017 03:20, "Elias Levy" <fe...@gmail.com> wrote:

> Why a range instead of just a single port in HA mode?
>
> On Mon, Sep 25, 2017 at 1:49 PM, Till Rohrmann <tr...@apache.org>
> wrote:
>
>> Yes, with Flip-6 it will most likely look like how Stephan described it.
>> We need the explicit port in standalone mode so that TMs can connect to the
>> JM. In the other deployment scenarios, the port can be randomly picked
>> unless you want to specify a port range, e.g. for firewall configuration
>> purposes.
>>
>

Re: high-availability.jobmanager.port vs jobmanager.rpc.port

Posted by Elias Levy <fe...@gmail.com>.
Why a range instead of just a single port in HA mode?

On Mon, Sep 25, 2017 at 1:49 PM, Till Rohrmann <tr...@apache.org> wrote:

> Yes, with Flip-6 it will most likely look like how Stephan described it.
> We need the explicit port in standalone mode so that TMs can connect to the
> JM. In the other deployment scenarios, the port can be randomly picked
> unless you want to specify a port range, e.g. for firewall configuration
> purposes.
>

Re: high-availability.jobmanager.port vs jobmanager.rpc.port

Posted by Till Rohrmann <tr...@apache.org>.
Yes, with Flip-6 it will most likely look like how Stephan described it. We
need the explicit port in standalone mode so that TMs can connect to the
JM. In the other deployment scenarios, the port can be randomly picked
unless you want to specify a port range, e.g. for firewall configuration
purposes.

However, if you look at it closely, it is mainly a renaming of the existing
configuration parameters: jobmanager.rpc.port ->
standalone.jobmanager.rpc.port and high-availability.jobmanager.port ->
jobmanager.rpc.ports.

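
As a config fragment, the renaming could look roughly like the sketch below.
The first two keys are the ones discussed in this thread; the commented-out
keys are only the names proposed here for FLIP-6, not confirmed final names.
The value 50000-50025 is an illustrative range, while 6123 and 50000-51000
are the values used as examples in this thread.

```yaml
# Keys as discussed in this thread
jobmanager.rpc.port: 6123                        # single port (standalone, non-HA)
high-availability.jobmanager.port: 50000-50025   # port or port range (HA mode)

# Names proposed in this thread for FLIP-6 (a proposal only)
# standalone.jobmanager.rpc.port: 6123
# jobmanager.rpc.ports: 50000-51000
```
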
Cheers,
Till

On Mon, Sep 25, 2017 at 3:42 PM, Stephan Ewen <se...@apache.org> wrote:

> /cc Till for real this time ;-)
>
> Hi!
>
> I think that can probably be simplified in the FLIP-6 case:
>
>   - All RPC is only between JM and TM and the port should be completely
> random (optionally within a range). TM and JM discover each other via HA
> (ZK) or the TM gets the JM RPC port as a parameter when the container is
> started.
>   (Parameter should be something like 'jobmanager.rpc.ports: 50000-51000')
>
>   - An exception is the standalone non-HA case, because there is no
> service-discovery mechanism. That should probably be a config key like
> 'standalone.jobmanager.rpc.port: 6123'
>
>   - The client calls come via HTTP/REST and should have one specific port
> that may optionally be discovered/redirected via YARN or the dispatchers.
>
> /cc Till for your thoughts
>
> Best,
> Stephan
>
>
> On Mon, Sep 25, 2017 at 3:31 PM, Nico Kruber <ni...@data-artisans.com>
> wrote:
>
>> Hi Elias,
>> indeed that looks strange but was introduced with FLINK-3172 [1] with an
>> argument about using the same configuration key (as opposed to having two
>> different keys as mentioned) starting at
>> https://issues.apache.org/jira/browse/FLINK-3172?focusedCommentId=15091940#comment-15091940
>>
>>
>> Nico
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-3172
>>
>> On Sunday, 24 September 2017 03:04:51 CEST Elias Levy wrote:
>> > I am wondering why, in HA mode, there is a need for a separate config
>> > parameter to set the JM RPC port (high-availability.jobmanager.port) and
>> > why this parameter accepts a range, unlike jobmanager.rpc.port.
>>
>>
>
>

Re: high-availability.jobmanager.port vs jobmanager.rpc.port

Posted by Stephan Ewen <se...@apache.org>.
/cc Till for real this time ;-)

Hi!

I think that can probably be simplified in the FLIP-6 case:

  - All RPC is only between JM and TM and the port should be completely
random (optionally within a range). TM and JM discover each other via HA
(ZK) or the TM gets the JM RPC port as a parameter when the container is
started.
  (Parameter should be something like 'jobmanager.rpc.ports: 50000-51000')

  - An exception is the standalone non-HA case, because there is no
service-discovery mechanism. That should probably be a config key like
'standalone.jobmanager.rpc.port: 6123'

  - The client calls come via HTTP/REST and should have one specific port
that may optionally be discovered/redirected via YARN or the dispatchers.

/cc Till for your thoughts

Best,
Stephan


On Mon, Sep 25, 2017 at 3:31 PM, Nico Kruber <ni...@data-artisans.com> wrote:

> Hi Elias,
> indeed that looks strange but was introduced with FLINK-3172 [1] with an
> argument about using the same configuration key (as opposed to having two
> different keys as mentioned) starting at
> https://issues.apache.org/jira/browse/FLINK-3172?focusedCommentId=15091940#comment-15091940
>
>
> Nico
>
> [1] https://issues.apache.org/jira/browse/FLINK-3172
>
> On Sunday, 24 September 2017 03:04:51 CEST Elias Levy wrote:
> > I am wondering why, in HA mode, there is a need for a separate config
> > parameter to set the JM RPC port (high-availability.jobmanager.port) and
> > why this parameter accepts a range, unlike jobmanager.rpc.port.
>
>

Re: high-availability.jobmanager.port vs jobmanager.rpc.port

Posted by Stephan Ewen <se...@apache.org>.
Hi!

I think that can probably be simplified in the FLIP-6 case:

  - All RPC is only between JM and TM and the port should be completely
random (optionally within a range). TM and JM discover each other via HA
(ZK) or the TM gets the JM RPC port as a parameter when the container is
started.
  (Parameter should be something like 'jobmanager.rpc.ports: 50000-51000')

  - An exception is the standalone non-HA case, because there is no
service-discovery mechanism. That should probably be a config key like
'standalone.jobmanager.rpc.port: 6123'

  - The client calls come via HTTP/REST and should have one specific port
that may optionally be discovered/redirected via YARN or the dispatchers.

/cc Till for your thoughts

Best,
Stephan


On Mon, Sep 25, 2017 at 3:31 PM, Nico Kruber <ni...@data-artisans.com> wrote:

> Hi Elias,
> indeed that looks strange but was introduced with FLINK-3172 [1] with an
> argument about using the same configuration key (as opposed to having two
> different keys as mentioned) starting at
> https://issues.apache.org/jira/browse/FLINK-3172?focusedCommentId=15091940#comment-15091940
>
>
> Nico
>
> [1] https://issues.apache.org/jira/browse/FLINK-3172
>
> On Sunday, 24 September 2017 03:04:51 CEST Elias Levy wrote:
> > I am wondering why, in HA mode, there is a need for a separate config
> > parameter to set the JM RPC port (high-availability.jobmanager.port) and
> > why this parameter accepts a range, unlike jobmanager.rpc.port.
>
>

Re: high-availability.jobmanager.port vs jobmanager.rpc.port

Posted by Nico Kruber <ni...@data-artisans.com>.
Hi Elias,
indeed that looks strange but was introduced with FLINK-3172 [1] with an 
argument about using the same configuration key (as opposed to having two 
different keys as mentioned) starting at
https://issues.apache.org/jira/browse/FLINK-3172?focusedCommentId=15091940#comment-15091940


Nico

[1] https://issues.apache.org/jira/browse/FLINK-3172

On Sunday, 24 September 2017 03:04:51 CEST Elias Levy wrote:
> I am wondering why, in HA mode, there is a need for a separate config
> parameter to set the JM RPC port (high-availability.jobmanager.port) and
> why this parameter accepts a range, unlike jobmanager.rpc.port.