Posted to dev@samza.apache.org by Debraj Manna <su...@gmail.com> on 2019/12/17 18:12:29 UTC

Running Samza with YARN Node label support

Hi

I see that running Samza with YARN node labels was resolved in 0.12.

https://issues.apache.org/jira/browse/SAMZA-1013?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel

But I am not able to find the relevant section in the samza-yarn
documentation
<https://samza.apache.org/learn/documentation/latest/deployment/yarn.html>

Can someone point me to the relevant documentation?

Re: Running Samza with YARN Node label support

Posted by Debraj Manna <su...@gmail.com>.
Thanks Bharath and Yang. It is clear now.

Re: Running Samza with YARN Node label support

Posted by Bharath Kumara Subramanian <co...@gmail.com>.
Hi Debraj,

I forgot to call this out earlier. Some distributions of YARN don't
support a node label combined with a rack or node in the same request. If you
use node labels together with the host affinity feature
<https://samza.apache.org/learn/documentation/latest/yarn/yarn-host-affinity.html>
in Samza, you might run into the following issue:

19:25:10.032 [main] ClusterBasedJobCoordinator [ERROR] Exception thrown in the JobCoordinator loop
org.apache.hadoop.yarn.client.api.InvalidContainerRequestException: Cannot specify node label with rack and node
    at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.checkNodeLabelExpression(AMRMClientImpl.java:617)
    at


Refer to https://jira.apache.org/jira/browse/YARN-4925 for
more information. You may want to back-port the patch to your custom YARN
distribution, if applicable.

Thanks,
Bharath
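
To make the failure above concrete, here is a minimal sketch of the kind of client-side check that produces this message. It is a simplified model written for illustration, not the actual AMRMClientImpl code: the class name, method name, exception type and host name are all hypothetical, and only the error text is taken from the log above. The assumption is that host affinity adds preferred nodes to the container request, which is what collides with the label.

```java
import java.util.Collections;
import java.util.List;

// Simplified, hypothetical model of the validation; this is NOT Hadoop code.
final class NodeLabelRequestCheck {

    /** Stand-in for YARN's InvalidContainerRequestException. */
    static final class InvalidRequestException extends RuntimeException {
        InvalidRequestException(String message) {
            super(message);
        }
    }

    /**
     * Rejects a request that combines a node label with explicit nodes or
     * racks. A request with no label is always accepted by this check.
     */
    static void check(String nodeLabel, List<String> nodes, List<String> racks) {
        if (nodeLabel == null || nodeLabel.isEmpty()) {
            return; // no label requested, nothing to validate
        }
        boolean hasNodes = nodes != null && !nodes.isEmpty();
        boolean hasRacks = racks != null && !racks.isEmpty();
        if (hasNodes || hasRacks) {
            throw new InvalidRequestException(
                "Cannot specify node label with rack and node");
        }
    }

    public static void main(String[] args) {
        // Label only: accepted.
        check("hdd", Collections.emptyList(), Collections.emptyList());

        // Label plus a preferred host (hypothetical host name): rejected.
        try {
            check("hdd",
                  Collections.singletonList("host-1.example.com"),
                  Collections.emptyList());
        } catch (InvalidRequestException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If this model matches your distribution, the practical choice is between using node labels without host affinity or running a YARN build that includes the YARN-4925 change.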

Re: Running Samza with YARN Node label support

Posted by Bharath Kumara Subramanian <co...@gmail.com>.
Hi Debraj,

To get the node label working, set the label configurations[1] pointed out
by Yang in your application config. Samza will automatically embed the
node label in the resource request if it finds the label
configuration in your application.
The Samza framework respects the node label configurations even though they are
not documented in the configuration table. I have created SAMZA-2422
<https://issues.apache.org/jira/browse/SAMZA-2422> to track this work item.

Let us know if you run into any issues.

Thanks,
Bharath

[1] -
*yarn.container.label* for specifying node label for the containers
*yarn.am.container.label*  for specifying node label for the application
master
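
As an illustration of the two settings in [1], the sketch below writes them as plain key=value lines, the form a Samza job .properties file would typically use, and reads them back with java.util.Properties. The label value "hdd" is borrowed from Yang's example, and the class and method names are hypothetical. This only shows the shape of the configuration; it does not call any Samza API.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for illustration; not part of Samza.
final class NodeLabelJobConfig {

    // The two keys named in this thread; "hdd" is an example label value.
    static final String JOB_PROPERTIES =
        "yarn.container.label=hdd\n" +
        "yarn.am.container.label=hdd\n";

    /** Parses the key=value lines above into a Properties object. */
    static Properties load() {
        Properties props = new Properties();
        try {
            props.load(new StringReader(JOB_PROPERTIES));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load();
        System.out.println("containers: " + props.getProperty("yarn.container.label"));
        System.out.println("app master: " + props.getProperty("yarn.am.container.label"));
    }
}
```

The label must already exist on the cluster and be assigned to nodes, as described in the YARN node label documentation.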

Re: Running Samza with YARN Node label support

Posted by Debraj Manna <su...@gmail.com>.
I understand how to assign labels to YARN nodes.

But it is still not clear to me how I can specify the node label for a
Samza application. I am referring to the section "Specifying node label for
application" in the link
<https://hadoop.apache.org/docs/r2.7.3/hadoop-yarn/hadoop-yarn-site/NodeLabel.html>
you shared in your last email.

Re: Running Samza with YARN Node label support

Posted by Yang Zhang <zh...@umn.edu>.
Hi Debraj Manna,

The app-def in the previous email is just an example of where you can
configure node labels. YARN node labels
<https://hadoop.apache.org/docs/r2.7.3/hadoop-yarn/hadoop-yarn-site/NodeLabel.html>
are a general feature (not specific to Samza), and how you set them depends
on the configuration system you use. The example uses XML to configure the
Samza job, but Samza as a framework does not restrict the configuration
format. Please let us know if you have further questions; we should expand
the open-source documentation to describe the usage of such features.

Best,
Yang

Re: Running Samza with YARN Node label support

Posted by Debraj Manna <su...@gmail.com>.
Thanks, Yang, for replying.

Yes, my use case is very similar.

Can you let me know which app-def you are referring to? I am not able to
locate yarn.am.container.label in samza-configurations
<http://samza.apache.org/learn/documentation/latest/jobs/samza-configurations.html>.
Is there any Samza project whose code I can refer to for the usage of
these configurations?

Re: Running Samza with YARN Node label support

Posted by Yang Zhang <zh...@umn.edu>.
Hello Debraj,

We do not have formal open-source documentation describing how YARN
node labels are used in general. However, we do have an example of using a
YARN node label to run Samza containers on "HDD" nodes rather than the
default "SSD" nodes. Please take a look at the following guide and let us
know whether it applies to your use case. Thank you for reporting
this issue!
=================================================
Step-by-step guide

   1. Add the *yarn.container.label* and *yarn.am.container.label* to the
      job's *app-def* if not already present. The default of an empty string
      will keep the current default behavior of using SSD nodes.

      <?xml version="1.0" encoding="UTF-8"?>
      <application xmlns="urn:com:linkedin:ns:configuration:definition:1.0"
                   name="my-application" version="">
          <configuration-definition>
              <!-- the label used for launching the application master -->
              <property name="yarn.am.container.label" default="" />
              <!-- the label used for other containers -->
              <property name="yarn.container.label" default="" />
          </configuration-definition>
      </application>

   2. If you had to modify your *app-def* in step 1, you will need to do a
      trigger-build to get the change to take effect.

   3. Add the label to *application.src* for your job. The *hdd* label will
      assign your containers to machines with spinning disks instead of
      solid-state disks.

      <?xml version="1.0" encoding="UTF-8"?>
      <application xmlns="urn:com:linkedin:ns:configuration:source:1.0"
                   name="my-application">
        <configuration-source>
          <property name="yarn.container.label" value="hdd" />
          <property name="yarn.am.container.label" value="hdd" />
        </configuration-source>
      </application>

   4. Deploy.

=================================================


Best,

Yang
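
For readers without this XML-based configuration system: the application.src in step 3 appears to reduce to two flat key/value pairs. The sketch below shows that with the JDK's DOM parser, reading the name and value attributes of each property element into a map. It assumes only the structure shown in the guide, and the class and method names are hypothetical. It is not part of Samza or of the configuration tooling the guide refers to.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader for illustration only.
final class ApplicationSrcLabels {

    // The application.src content from step 3 of the guide.
    static final String APPLICATION_SRC =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
        "<application xmlns=\"urn:com:linkedin:ns:configuration:source:1.0\"\n" +
        "             name=\"my-application\">\n" +
        "  <configuration-source>\n" +
        "    <property name=\"yarn.container.label\" value=\"hdd\" />\n" +
        "    <property name=\"yarn.am.container.label\" value=\"hdd\" />\n" +
        "  </configuration-source>\n" +
        "</application>\n";

    /** Collects each <property name=... value=...> into an ordered map. */
    static Map<String, String> readProperties(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("property");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                result.put(property.getAttribute("name"), property.getAttribute("value"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) {
        readProperties(APPLICATION_SRC)
            .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```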
