Posted to hdfs-user@hadoop.apache.org by Yue Wang <te...@gmail.com> on 2013/12/01 12:42:00 UTC

Implementing and running an applicationmaster

Hi,

I found the page (
http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html)
and know how to write an ApplicationMaster.

However, is there a complete example showing how to run this
ApplicationMaster with a real Hadoop Program (e.g. WordCount) on YARN?

Thanks!



Yue

Re: Implementing and running an applicationmaster

Posted by Rob Blah <tm...@gmail.com>.

Hi

There is a way, but it's not an easy one. You should overwrite the container
request code in MR_AM. As each container in MapReduce gets the same amount
of memory, OOM shouldn't be a problem, as the inner task "buffers" can be
spilled to disk. I am no MapReduce (code) specialist, but I would start by
finding MR_Driver.class and MR_AM.class. Then overwrite the Driver.class to
execute your class Custom_MR_AM (C_MR_AM). C_MR_AM will be a copy of MR_AM,
but you should change the container request code so that you can allocate
N containers with X memory and M containers with Y memory.
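
To make the "N containers with X memory and M containers with Y memory" idea
concrete, here is a minimal, self-contained sketch in plain Java. It does not
use any Hadoop classes: ContainerPlan, Request, plan and roundUp are
hypothetical names made up for illustration. In a real ApplicationMaster each
planned entry would become a container request submitted through the YARN
client API (for example AMRMClient.addContainerRequest). The rounding step
assumes the scheduler rounds every request up to a multiple of its configured
minimum allocation (yarn.scheduler.minimum-allocation-mb); the 1024 MB value
used below is an assumption, and the exact behaviour depends on the scheduler
and Hadoop version.

```java
import java.util.ArrayList;
import java.util.List;

public class ContainerPlan {
    /** One planned container request (hypothetical type, not a YARN class). */
    static final class Request {
        final int memoryMb;
        Request(int memoryMb) { this.memoryMb = memoryMb; }
    }

    /** Round a requested size up to the next multiple of the assumed scheduler minimum. */
    static int roundUp(int requestedMb, int minAllocationMb) {
        int units = (requestedMb + minAllocationMb - 1) / minAllocationMb;
        return Math.max(1, units) * minAllocationMb;
    }

    /** Plan n containers of xMb each, followed by m containers of yMb each. */
    static List<Request> plan(int n, int xMb, int m, int yMb) {
        List<Request> requests = new ArrayList<>();
        for (int i = 0; i < n; i++) requests.add(new Request(xMb));
        for (int i = 0; i < m; i++) requests.add(new Request(yMb));
        return requests;
    }

    public static void main(String[] args) {
        int minAllocationMb = 1024; // assumed value of the scheduler minimum
        for (Request r : plan(2, 100, 2, 200)) {
            System.out.println(r.memoryMb + " MB requested, "
                + roundUp(r.memoryMb, minAllocationMb) + " MB after rounding (assumed)");
        }
    }
}
```

Under that assumption, asking for 100 MB and 200 MB containers would give four
containers of the same size, so the sizes X and Y only differ in practice if
they land in different multiples of the scheduler minimum.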

The hadoop-mapreduce-examples.jar is just a bunch of HelloWorld jobs, so that
a new user can pick up and "learn" MR quickly.

Maybe some real MR specialist can give you better advice than me.

regards
tmp


2013/12/5 Yue Wang <te...@gmail.com>

> Hi,
>
> Thank you for your answer. Now I understand the connection between the two
> ways.
>
> I asked this question because I want to benefit from the YARN
> architecture.
> If I understood correctly, I can let my ApplicationMaster request
> containers more flexibly. For example, I can request two containers with
> 100MB memory and two containers with 200MB memory for my mappers on YARN.
> However, I cannot do that on MRv1.
>
> So if I execute a WordCount program by typing "yarn jar
> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount
> wordcount/ wc-output/", such flexibility is gone.
>
> Is there a way to let my ApplicationMaster execute WordCount on HDFS on
> containers?
>
>
> Thanks!
>
>
> On Thu, Dec 5, 2013 at 4:28 AM, Rob Blah <tm...@gmail.com> wrote:
>
>> Hi
>>
>> If I understood you correctly, you would like to run your AM with the YARN
>> Client from the shell, as opposed to running the Driver as in MRv1. But it's
>> the same thing (more or less). In the example you provided
>> (org.apache.hadoop.yarn.applications.DistributedShell) the Client.class is
>> the "driver". However, since distributed-shell is a "simple" application, you
>> do not need a lot of configuration (setting fields in Configuration.class,
>> I/O formats, etc.). The same goes for any other application. As for the
>> second example (org.apache.hadoop.examples.WordCount), the MapReduce AM
>> requires certain configuration, thus you have to do it the "old way". The main
>> difference would be: MR -> end-user-config -> driver, DS -> driver (but you
>> can still create your own end-user-config). Hope this answers your question
>> and that I understood it correctly.
>>
>> regards
>> tmp
>>
>>
>> 2013/12/5 Yue Wang <te...@gmail.com>
>>
>>> Hi,
>>>
>>> I took a look at the code and found some examples on the web.
>>> One example is: http://wiki.opf-labs.org/display/SP/Resource+management
>>>
>>> It seems that users can run simple shell commands using Client of YARN.
>>> But when it comes to a practical MapReduce example like WordCount,
>>> people still run commands in the old way as in MRv1.
>>>
>>> How can I run WordCount using Client and ApplicationMaster of YARN so
>>> that I can request resources flexibly?
>>>
>>>
>>> Thanks!
>>>
>>>
>>> On Mon, Dec 2, 2013 at 11:26 AM, Rob Blah <tm...@gmail.com> wrote:
>>>
>>>> Hi
>>>>
>>>> Follow the example provided in
>>>> Yarn_dist/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.
>>>>
>>>> regards
>>>> tmp
>>>>
>>>>
>>>> 2013/12/1 Yue Wang <te...@gmail.com>
>>>>
>>>>> Hi,
>>>>>
>>>>> I found the page (
>>>>> http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html)
>>>>> and know how to write an ApplicationMaster.
>>>>>
>>>>> However, is there a complete example showing how to run this
>>>>> ApplicationMaster with a real Hadoop Program (e.g. WordCount) on YARN?
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>>
>>>>> Yue
>>>>>
>>>>
>>>>
>>>
>>
>

Re: Implementing and running an applicationmaster

Posted by Yue Wang <te...@gmail.com>.

Hi,

Thank you for your answer. Now I understand the connection between the two
ways.

I asked this question because I want to benefit from the YARN
architecture.
If I understood correctly, I can let my ApplicationMaster request
containers more flexibly. For example, I can request two containers with
100MB memory and two containers with 200MB memory for my mappers on YARN.
However, I cannot do that on MRv1.

So if I execute a WordCount program by typing "yarn jar
/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount
wordcount/ wc-output/", such flexibility is gone.

Is there a way to let my ApplicationMaster execute WordCount on HDFS on
containers?


Thanks!


On Thu, Dec 5, 2013 at 4:28 AM, Rob Blah <tm...@gmail.com> wrote:

> Hi
>
> If I understood you correctly, you would like to run your AM with the YARN
> Client from the shell, as opposed to running the Driver as in MRv1. But it's
> the same thing (more or less). In the example you provided
> (org.apache.hadoop.yarn.applications.DistributedShell) the Client.class is
> the "driver". However, since distributed-shell is a "simple" application, you
> do not need a lot of configuration (setting fields in Configuration.class,
> I/O formats, etc.). The same goes for any other application. As for the
> second example (org.apache.hadoop.examples.WordCount), the MapReduce AM
> requires certain configuration, thus you have to do it the "old way". The main
> difference would be: MR -> end-user-config -> driver, DS -> driver (but you
> can still create your own end-user-config). Hope this answers your question
> and that I understood it correctly.
>
> regards
> tmp
>
>
> 2013/12/5 Yue Wang <te...@gmail.com>
>
>> Hi,
>>
>> I took a look at the code and found some examples on the web.
>> One example is: http://wiki.opf-labs.org/display/SP/Resource+management
>>
>> It seems that users can run simple shell commands using Client of YARN.
>> But when it comes to a practical MapReduce example like WordCount, people
>> still run commands in the old way as in MRv1.
>>
>> How can I run WordCount using Client and ApplicationMaster of YARN so
>> that I can request resources flexibly?
>>
>>
>> Thanks!
>>
>>
>> On Mon, Dec 2, 2013 at 11:26 AM, Rob Blah <tm...@gmail.com> wrote:
>>
>>> Hi
>>>
>>> Follow the example provided in
>>> Yarn_dist/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.
>>>
>>> regards
>>> tmp
>>>
>>>
>>> 2013/12/1 Yue Wang <te...@gmail.com>
>>>
>>>> Hi,
>>>>
>>>> I found the page (
>>>> http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html)
>>>> and know how to write an ApplicationMaster.
>>>>
>>>> However, is there a complete example showing how to run this
>>>> ApplicationMaster with a real Hadoop Program (e.g. WordCount) on YARN?
>>>>
>>>> Thanks!
>>>>
>>>>
>>>>
>>>> Yue
>>>>
>>>
>>>
>>
>

Re: Implementing and running an applicationmaster

Posted by Rob Blah <tm...@gmail.com>.

Hi

If I understood you correctly, you would like to run your AM with the YARN
Client from the shell, as opposed to running the Driver as in MRv1. But it's
the same thing (more or less). In the example you provided
(org.apache.hadoop.yarn.applications.DistributedShell) the Client.class is
the "driver". However, since distributed-shell is a "simple" application, you
do not need a lot of configuration (setting fields in Configuration.class,
I/O formats, etc.). The same goes for any other application. As for the
second example (org.apache.hadoop.examples.WordCount), the MapReduce AM
requires certain configuration, thus you have to do it the "old way". The main
difference would be: MR -> end-user-config -> driver, DS -> driver (but you
can still create your own end-user-config). Hope this answers your question
and that I understood it correctly.

regards
tmp

2013/12/5 Yue Wang <te...@gmail.com>

> Hi,
>
> I took a look at the code and found some examples on the web.
> One example is: http://wiki.opf-labs.org/display/SP/Resource+management
>
> It seems that users can run simple shell commands using Client of YARN.
> But when it comes to a practical MapReduce example like WordCount, people
> still run commands in the old way as in MRv1.
>
> How can I run WordCount using Client and ApplicationMaster of YARN so that
> I can request resources flexibly?
>
>
> Thanks!
>
>
> On Mon, Dec 2, 2013 at 11:26 AM, Rob Blah <tm...@gmail.com> wrote:
>
>> Hi
>>
>> Follow the example provided in
>> Yarn_dist/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.
>>
>> regards
>> tmp
>>
>>
>> 2013/12/1 Yue Wang <te...@gmail.com>
>>
>>> Hi,
>>>
>>> I found the page (
>>> http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html)
>>> and know how to write an ApplicationMaster.
>>>
>>> However, is there a complete example showing how to run this
>>> ApplicationMaster with a real Hadoop Program (e.g. WordCount) on YARN?
>>>
>>> Thanks!
>>>
>>>
>>>
>>> Yue
>>>
>>
>>
>

Re: Implementing and running an applicationmaster

Posted by Yue Wang <te...@gmail.com>.

Hi,

I took a look at the code and found some examples on the web.
One example is: http://wiki.opf-labs.org/display/SP/Resource+management

It seems that users can run simple shell commands using the YARN Client.
But when it comes to a practical MapReduce example like WordCount, people
still run commands the old way, as in MRv1.

How can I run WordCount using the YARN Client and ApplicationMaster so that
I can request resources flexibly?

Thanks!

On Mon, Dec 2, 2013 at 11:26 AM, Rob Blah <tm...@gmail.com> wrote:

> Hi
>
> Follow the example provided in
> Yarn_dist/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.
>
> regards
> tmp
>
>
> 2013/12/1 Yue Wang <te...@gmail.com>
>
>> Hi,
>>
>> I found the page (
>> http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html)
>> and know how to write an ApplicationMaster.
>>
>> However, is there a complete example showing how to run this
>> ApplicationMaster with a real Hadoop Program (e.g. WordCount) on YARN?
>>
>> Thanks!
>>
>>
>>
>> Yue
>>
>
>

Re: Implementing and running an applicationmaster

Posted by Rob Blah <tm...@gmail.com>.

Hi

Follow the example provided in
Yarn_dist/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.

regards
tmp


2013/12/1 Yue Wang <te...@gmail.com>

> Hi,
>
> I found the page (
> http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html)
> and know how to write an ApplicationMaster.
>
> However, is there a complete example showing how to run this
> ApplicationMaster with a real Hadoop Program (e.g. WordCount) on YARN?
>
> Thanks!
>
>
>
> Yue
>
