Posted to common-user@hadoop.apache.org by geoffroy chollon <un...@gmail.com> on 2015/11/20 15:50:35 UTC

Yarn: how to specify the hostname/IP for communication between the ResourceManager and the NodeManager for a Kubernetes deployment.

Hello,


TL;DR question: how can I make the NodeManager give the
"yarn.nodemanager.hostname" value to the YARN ResourceManager during
registration, so that it is used for IPC communication?


Explanation:

I am having trouble getting a YARN cluster to work in a Kubernetes
deployment where every component runs as a Docker container.

When the NodeManager registers with the YARN ResourceManager, it gives its
hostname as the NodeId. The ResourceManager then uses this NodeId to
communicate with the node.

The problem is that the hostname comes from system calls and not from
YARN's configuration files. The ResourceManager is unable to talk to the
NodeManager because the container's hostname does not resolve in DNS.
Moreover, Docker does not allow a container's hostname to be changed at
runtime.
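
To make the distinction above concrete, here is a minimal Python sketch
(not Hadoop code) that contrasts the name reported by the operating system
with a value read from a YARN-style configuration file. The function names
and the sample XML are assumptions made for illustration only; how the
NodeManager actually builds its NodeId should be checked against the Hadoop
source.

```python
import socket
import xml.etree.ElementTree as ET

# Sample configuration, assumed for illustration; it reuses the
# yarn.nodemanager.hostname value quoted in this message.
SAMPLE_YARN_SITE = """<configuration>
  <property>
    <name>yarn.nodemanager.hostname</name>
    <value>100-66-10-4.default.pod.cluster.docker</value>
  </property>
</configuration>"""


def os_reported_hostname() -> str:
    """Return the hostname the operating system reports for this machine."""
    return socket.gethostname()


def configured_value(xml_text: str, key: str):
    """Return the <value> of `key` in a Hadoop-style XML config, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == key:
            return prop.findtext("value")
    return None


if __name__ == "__main__":
    # Inside a container these two names usually differ, which is the
    # mismatch described above.
    print("OS hostname:        ", os_reported_hostname())
    print("Configured hostname:",
          configured_value(SAMPLE_YARN_SITE, "yarn.nodemanager.hostname"))
```

Running it inside a container would typically print the container ID as
the OS hostname, next to the configured pod name.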

My YARN NodeManager configuration looks like this:

      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>yarnrm-int-svc.default.svc.cluster.docker</value>
      </property>

      <property>
        <name>yarn.nodemanager.hostname</name>
        <value>100-66-10-4.default.pod.cluster.docker</value>
      </property>
      <property>
        <name>yarn.nodemanager.address</name>
        <value>100-66-10-4.default.pod.cluster.docker:8111</value>
      </property>
      <property>
        <name>yarn.nodemanager.webapp.address</name>
        <value>100-66-10-4.default.pod.cluster.docker:8042</value>
      </property>
      <property>
        <name>yarn.nodemanager.localizer.address</name>
        <value>100-66-10-4.default.pod.cluster.docker:8040</value>
      </property>
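
Since every pod gets a different address, these properties would have to be
generated per container. The following is a rough Python sketch of how that
could be done. It assumes the pod name follows the pattern visible in the
values above (the pod IP with dashes, then namespace, "pod" and the cluster
domain); the IP 100.66.10.4, the helper names, and the defaults are
assumptions for illustration. The ports are the ones from the configuration
above.

```python
import ipaddress
from xml.sax.saxutils import escape


def pod_dns_name(pod_ip: str, namespace: str = "default",
                 cluster_domain: str = "cluster.docker") -> str:
    """Build a pod DNS name such as 100-66-10-4.default.pod.cluster.docker."""
    ip = ipaddress.IPv4Address(pod_ip)  # raises ValueError on bad input
    return f"{str(ip).replace('.', '-')}.{namespace}.pod.{cluster_domain}"


def nodemanager_properties(host: str) -> dict:
    """Return the NodeManager properties from this message for `host`."""
    return {
        "yarn.nodemanager.hostname": host,
        "yarn.nodemanager.address": f"{host}:8111",
        "yarn.nodemanager.webapp.address": f"{host}:8042",
        "yarn.nodemanager.localizer.address": f"{host}:8040",
    }


def render_properties(props: dict) -> str:
    """Render name/value pairs as Hadoop <property> elements."""
    lines = []
    for name, value in props.items():
        lines += [
            "<property>",
            f"  <name>{escape(name)}</name>",
            f"  <value>{escape(value)}</value>",
            "</property>",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    host = pod_dns_name("100.66.10.4")
    print(render_properties(nodemanager_properties(host)))
```

Note that this only produces the configuration values; it does not change
which name the NodeManager reports when it registers.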

I also tried simply using the IP address, but that didn't work either. For
now I have had to fork the Kubernetes DNS plugin to please YARN, but it's
not a proper solution.

It's difficult to understand why the NodeId is not configurable in this
setup. If it really isn't, why does the "yarn.nodemanager.hostname"
property exist in the first place?
Anyway, if someone knows a solution to this problem, it would be very
helpful.


Thanks
Geoffroy

Re: Yarn: how to specify the hostname/IP for communication between the ResourceManager and the NodeManager for a Kubernetes deployment.

Posted by 林家銘 <ro...@gmail.com>.
Hi

I think you should try the Service IP and DNS solution.

The Service IP is a virtual IP and is also a rule in iptables. Its life
cycle is managed by end users, so it is suitable for recording in
DNS.

Back to this case: you could create the Service layer first. Since
its FQDN is specified by the user, you can then pass the FQDN to the
Replication Controller when you create it. You might then have to
write a script in the Docker image to replace the host name with the
FQDN.

In my experience, there are several problems.
1) You can only change the host name in privileged containers.
2) YARN seems to listen on some random ports, but as far as I know the
Pod and Service specifications do not support exposing a range of
ports (Kubernetes 1.0). So we ended up just using /etc/hosts
to record hostname and IP pairs.
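
As a rough illustration of the /etc/hosts approach, the Python sketch
below renders hostname and IP pairs as hosts-file lines. The hostnames and
IPs in the example are made up, and the function name is hypothetical; how
the pairs are collected and how the result is written into each container
is left out.

```python
import ipaddress

# Hypothetical hostname -> IP pairs, for illustration only.
EXAMPLE_PAIRS = {
    "yarn-nm-0.example": "100.66.10.4",
    "yarn-rm.example": "100.66.10.5",
}


def render_hosts_entries(pairs: dict) -> str:
    """Render hostname -> IP pairs as /etc/hosts lines ("IP<TAB>hostname")."""
    lines = []
    for hostname, ip in sorted(pairs.items()):
        ipaddress.ip_address(ip)  # raises ValueError if the IP is malformed
        lines.append(f"{ip}\t{hostname}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the entries; appending them to /etc/hosts is up to the caller.
    print(render_hosts_entries(EXAMPLE_PAIRS), end="")
```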

I am also curious: where do you deploy HDFS? Also on Kubernetes and
Docker?


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@hadoop.apache.org
For additional commands, e-mail: user-help@hadoop.apache.org

