Posted to common-user@hadoop.apache.org by ch huang <ju...@gmail.com> on 2013/12/03 03:07:39 UTC

issue about MR job on YARN framework

hi, maillist:
              I am running a job on my CDH 4.4 YARN cluster. Its map tasks
finish very fast, but the reduce phase is very slow. Checking a reduce task
with the ps command, I found its working heap size is only 200 MB, so I tried
to increase the heap size used by reduce tasks by adding the following to the
yarn-env.sh file:

YARN_OPTS="$YARN_OPTS
-Dmapreduce.reduce.java.opts=-Xmx1024m -verbose:gc -XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xloggc:$YARN_LOG_DIR/gc-$(hostname)-resourcemanager.log
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=15M
-XX:-UseGCOverheadLimit"

However, after restarting the NodeManager, new reduce tasks still use a
200 MB heap. Why?

# jps
2853 DataNode
19533 Jps
10949 YarnChild
10661 NodeManager
15130 HRegionServer
# ps -ef|grep 10949
yarn     10949 10661 99 09:52 ?        00:19:31
/usr/java/jdk1.7.0_45/bin/java -Djava.net.preferIPv4Stack=true
-Dhadoop.metrics.log.level=WARN -Xmx200m
-Djava.io.tmpdir=/data/1/mrlocal/yarn/local/usercache/hdfs/appcache/application_1385983958793_0022/container_1385983958793_0022_01_005650/tmp
-Dlog4j.configuration=container-log4j.properties
-Dyarn.app.mapreduce.container.log.dir=/data/2/mrlocal/yarn/logs/application_1385983958793_0022/container_1385983958793_0022_01_005650
-Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
org.apache.hadoop.mapred.YarnChild 192.168.11.10 48936
attempt_1385983958793_0022_r_000000_14 5650

Re: issue about MR job on YARN framework

Posted by Adam Kawa <ka...@gmail.com>.
What command are you using to submit a job?

If your job driver implements the Tool interface and is run through ToolRunner,
then you can pass the setting per job on the command line:
$ hadoop jar your.jar DriverClass -Dmapreduce.reduce.java.opts="-Xmx1024m"
<input-dir> <output-dir>
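For reference, a minimal sketch of such a driver (the class and job names here are illustrative, not from this thread; it assumes the Hadoop client libraries are on the classpath). Only a main method that goes through ToolRunner.run() has its -D options parsed by GenericOptionsParser and applied to the job's Configuration before run() is called:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver: because it runs through ToolRunner, options such as
// -Dmapreduce.reduce.java.opts=-Xmx1024m on the command line are merged
// into getConf() before run() is invoked.
public class DriverClass extends Configured implements Tool {
  @Override
  public int run(String[] args) throws Exception {
    // getConf() already contains any -D overrides from the command line.
    Job job = Job.getInstance(getConf(), "example-job");
    job.setJarByClass(DriverClass.class);
    // ... set mapper/reducer classes and input/output paths from args ...
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new Configuration(), new DriverClass(), args));
  }
}
```

If the driver instead builds its Configuration directly in main() without ToolRunner, the -D options after the class name arrive as plain program arguments and are silently ignored, which would leave the reducers at the default heap.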

There are two settings for controlling the memory of map and reduce tasks,

e.g. for a map task:

mapreduce.map.memory.mb <- the logical size of the "container" that is used to
run a map task
mapreduce.map.java.opts <- the JVM options, including the actual heap size
(-Xmx), that a map task gets inside the container

mapreduce.map.memory.mb should be bigger than the -Xmx value in
mapreduce.map.java.opts.
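The same pair of properties exists for reducers and can also be set in mapred-site.xml (or per-job configuration) rather than on the command line. A sketch with illustrative values (not taken from this thread); the container size needs some headroom above the JVM heap:

```xml
<!-- Illustrative values: 1536 MB container holding a 1024 MB JVM heap,
     leaving headroom for JVM overhead beyond -Xmx. -->
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>1536</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx1024m</value>
</property>
```

Setting these in YARN_OPTS in yarn-env.sh only affects the YARN daemons themselves, not the task JVMs that the NodeManager launches, which is consistent with the reduce task above still showing -Xmx200m.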


2013/12/3 Jian He <jh...@hortonworks.com>

> This link may help you with the configuration
>
> http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
>
> Thanks,
> Jian
>
>
> On Mon, Dec 2, 2013 at 7:53 PM, ch huang <ju...@gmail.com> wrote:
>
>> 13/12/03 11:46:56 INFO mapreduce.Job:  map 100% reduce 19%
>> 13/12/03 11:47:33 INFO mapreduce.Job:  map 100% reduce 20%
>> 13/12/03 11:47:54 INFO mapreduce.Job:  map 100% reduce 21%
>> 13/12/03 11:48:06 INFO mapreduce.Job:  map 100% reduce 22%
>> 13/12/03 11:48:17 INFO mapreduce.Job:  map 100% reduce 23%
>> 13/12/03 11:48:29 INFO mapreduce.Job:  map 100% reduce 24%
>> 13/12/03 11:49:23 INFO mapreduce.Job:  map 83% reduce 25%
>> 13/12/03 11:49:39 INFO mapreduce.Job:  map 84% reduce 25%
>> 13/12/03 11:49:52 INFO mapreduce.Job:  map 85% reduce 25%
>> 13/12/03 11:50:08 INFO mapreduce.Job:  map 86% reduce 25%
>> 13/12/03 11:50:16 INFO mapreduce.Job:  map 87% reduce 25%
>> 13/12/03 11:50:21 INFO mapreduce.Job:  map 88% reduce 25%
>> 13/12/03 11:50:30 INFO mapreduce.Job:  map 89% reduce 25%
>> 13/12/03 11:50:42 INFO mapreduce.Job:  map 90% reduce 25%
>> 13/12/03 11:50:57 INFO mapreduce.Job:  map 91% reduce 25%
>> 13/12/03 11:51:10 INFO mapreduce.Job:  map 92% reduce 25%
>> 13/12/03 11:51:18 INFO mapreduce.Job:  map 92% reduce 26%
>> 13/12/03 11:51:20 INFO mapreduce.Job:  map 93% reduce 26%
>> 13/12/03 11:51:25 INFO mapreduce.Job:  map 94% reduce 26%
>> 13/12/03 11:51:31 INFO mapreduce.Job:  map 95% reduce 26%
>> 13/12/03 11:51:43 INFO mapreduce.Job:  map 96% reduce 26%
>> 13/12/03 11:51:50 INFO mapreduce.Job:  map 97% reduce 26%
>> 13/12/03 11:51:59 INFO mapreduce.Job:  map 98% reduce 26%
>> 13/12/03 11:52:19 INFO mapreduce.Job:  map 99% reduce 26%
>> 13/12/03 11:52:29 INFO mapreduce.Job:  map 100% reduce 26%
>>
>>
>>
>> On Tue, Dec 3, 2013 at 10:25 AM, ch huang <ju...@gmail.com> wrote:
>>
>>> another question: why does the map progress go backward after it has
>>> reached 100%?
>>
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.


Re: issue about MR job on YARN framework

Posted by Jian He <jh...@hortonworks.com>.
This link may help you with the configuration

http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/

Thanks,
Jian



-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: issure about MR job on yarn framework

Posted by Jian He <jh...@hortonworks.com>.
This link may help you with the configuration

http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/

Thanks,
Jian


On Mon, Dec 2, 2013 at 7:53 PM, ch huang <ju...@gmail.com> wrote:

> 13/12/03 11:46:56 INFO mapreduce.Job:  map 100% reduce 19%
> 13/12/03 11:47:33 INFO mapreduce.Job:  map 100% reduce 20%
> 13/12/03 11:47:54 INFO mapreduce.Job:  map 100% reduce 21%
> 13/12/03 11:48:06 INFO mapreduce.Job:  map 100% reduce 22%
> 13/12/03 11:48:17 INFO mapreduce.Job:  map 100% reduce 23%
> 13/12/03 11:48:29 INFO mapreduce.Job:  map 100% reduce 24%
> 13/12/03 11:49:23 INFO mapreduce.Job:  map 83% reduce 25%
> 13/12/03 11:49:39 INFO mapreduce.Job:  map 84% reduce 25%
> 13/12/03 11:49:52 INFO mapreduce.Job:  map 85% reduce 25%
> 13/12/03 11:50:08 INFO mapreduce.Job:  map 86% reduce 25%
> 13/12/03 11:50:16 INFO mapreduce.Job:  map 87% reduce 25%
> 13/12/03 11:50:21 INFO mapreduce.Job:  map 88% reduce 25%
> 13/12/03 11:50:30 INFO mapreduce.Job:  map 89% reduce 25%
> 13/12/03 11:50:42 INFO mapreduce.Job:  map 90% reduce 25%
> 13/12/03 11:50:57 INFO mapreduce.Job:  map 91% reduce 25%
> 13/12/03 11:51:10 INFO mapreduce.Job:  map 92% reduce 25%
> 13/12/03 11:51:18 INFO mapreduce.Job:  map 92% reduce 26%
> 13/12/03 11:51:20 INFO mapreduce.Job:  map 93% reduce 26%
> 13/12/03 11:51:25 INFO mapreduce.Job:  map 94% reduce 26%
> 13/12/03 11:51:31 INFO mapreduce.Job:  map 95% reduce 26%
> 13/12/03 11:51:43 INFO mapreduce.Job:  map 96% reduce 26%
> 13/12/03 11:51:50 INFO mapreduce.Job:  map 97% reduce 26%
> 13/12/03 11:51:59 INFO mapreduce.Job:  map 98% reduce 26%
> 13/12/03 11:52:19 INFO mapreduce.Job:  map 99% reduce 26%
> 13/12/03 11:52:29 INFO mapreduce.Job:  map 100% reduce 26%
>
>
>
> On Tue, Dec 3, 2013 at 10:25 AM, ch huang <ju...@gmail.com> wrote:
>
>> another question is why the map process progress will back when it reach
>> 100%?
>>
>>
>>
>>
>> On Tue, Dec 3, 2013 at 10:07 AM, ch huang <ju...@gmail.com> wrote:
>>
>>> hi,maillist:
>>>               i run a job on my CDH4.4 yarn framework ,it's map task
>>> finished very fast,but reduce is very slow, i check it use ps command
>>> find it's work heap size is 200m,so i try to increase heap size used by
>>> reduce task,i add "YARN_OPTS="$YARN_OPTS
>>> -Dmapreduce.reduce.java.opts=-Xmx1024m -verbose:gc -XX:+PrintGCDetails
>>> -XX:+PrintGCDateStamps
>>> -Xloggc:$YARN_LOG_DIR/gc-$(hostname)-resourcemanager.log
>>> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=15M
>>> -XX:-UseGCOverheadLimit"    in yarn-env.sh file ,but when i restart the
>>> nodemanager ,i find new reduce task still use 200m heap ,why?
>>>
>>> # jps
>>> 2853 DataNode
>>> 19533 Jps
>>> 10949 YarnChild
>>> 10661 NodeManager
>>> 15130 HRegionServer
>>> # ps -ef|grep 10949
>>> yarn     10949 10661 99 09:52 ?        00:19:31
>>> /usr/java/jdk1.7.0_45/bin/java -Djava.net.preferIPv4Stack=true
>>> -Dhadoop.metrics.log.level=WARN -Xmx200m
>>> -Djava.io.tmpdir=/data/1/mrlocal/yarn/local/usercache/hdfs/appcache/application_1385983958793_0022/container_1385983958793_0022_01_005650/tmp
>>> -Dlog4j.configuration=container-log4j.properties
>>> -Dyarn.app.mapreduce.container.log.dir=/data/2/mrlocal/yarn/logs/application_1385983958793_0022/container_1385983958793_0022_01_005650
>>> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
>>> org.apache.hadoop.mapred.YarnChild 192.168.11.10 48936
>>> attempt_1385983958793_0022_r_000000_14 5650
>>>
>>>
>>>
>>>
>>>
>>
>>
>

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: issure about MR job on yarn framework

Posted by Jian He <jh...@hortonworks.com>.
This link may help you with the configuration

http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/

Thanks,
Jian


On Mon, Dec 2, 2013 at 7:53 PM, ch huang <ju...@gmail.com> wrote:

> 13/12/03 11:46:56 INFO mapreduce.Job:  map 100% reduce 19%
> 13/12/03 11:47:33 INFO mapreduce.Job:  map 100% reduce 20%
> 13/12/03 11:47:54 INFO mapreduce.Job:  map 100% reduce 21%
> 13/12/03 11:48:06 INFO mapreduce.Job:  map 100% reduce 22%
> 13/12/03 11:48:17 INFO mapreduce.Job:  map 100% reduce 23%
> 13/12/03 11:48:29 INFO mapreduce.Job:  map 100% reduce 24%
> 13/12/03 11:49:23 INFO mapreduce.Job:  map 83% reduce 25%
> 13/12/03 11:49:39 INFO mapreduce.Job:  map 84% reduce 25%
> 13/12/03 11:49:52 INFO mapreduce.Job:  map 85% reduce 25%
> 13/12/03 11:50:08 INFO mapreduce.Job:  map 86% reduce 25%
> 13/12/03 11:50:16 INFO mapreduce.Job:  map 87% reduce 25%
> 13/12/03 11:50:21 INFO mapreduce.Job:  map 88% reduce 25%
> 13/12/03 11:50:30 INFO mapreduce.Job:  map 89% reduce 25%
> 13/12/03 11:50:42 INFO mapreduce.Job:  map 90% reduce 25%
> 13/12/03 11:50:57 INFO mapreduce.Job:  map 91% reduce 25%
> 13/12/03 11:51:10 INFO mapreduce.Job:  map 92% reduce 25%
> 13/12/03 11:51:18 INFO mapreduce.Job:  map 92% reduce 26%
> 13/12/03 11:51:20 INFO mapreduce.Job:  map 93% reduce 26%
> 13/12/03 11:51:25 INFO mapreduce.Job:  map 94% reduce 26%
> 13/12/03 11:51:31 INFO mapreduce.Job:  map 95% reduce 26%
> 13/12/03 11:51:43 INFO mapreduce.Job:  map 96% reduce 26%
> 13/12/03 11:51:50 INFO mapreduce.Job:  map 97% reduce 26%
> 13/12/03 11:51:59 INFO mapreduce.Job:  map 98% reduce 26%
> 13/12/03 11:52:19 INFO mapreduce.Job:  map 99% reduce 26%
> 13/12/03 11:52:29 INFO mapreduce.Job:  map 100% reduce 26%
>
>
>
> On Tue, Dec 3, 2013 at 10:25 AM, ch huang <ju...@gmail.com> wrote:
>
>> another question is why the map process progress will back when it reach
>> 100%?
>>
>>
>>
>>
>> On Tue, Dec 3, 2013 at 10:07 AM, ch huang <ju...@gmail.com> wrote:
>>
>>> hi,maillist:
>>>               i run a job on my CDH4.4 yarn framework ,it's map task
>>> finished very fast,but reduce is very slow, i check it use ps command
>>> find it's work heap size is 200m,so i try to increase heap size used by
>>> reduce task,i add "YARN_OPTS="$YARN_OPTS
>>> -Dmapreduce.reduce.java.opts=-Xmx1024m -verbose:gc -XX:+PrintGCDetails
>>> -XX:+PrintGCDateStamps
>>> -Xloggc:$YARN_LOG_DIR/gc-$(hostname)-resourcemanager.log
>>> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=15M
>>> -XX:-UseGCOverheadLimit"    in yarn-env.sh file ,but when i restart the
>>> nodemanager ,i find new reduce task still use 200m heap ,why?
>>>
>>> # jps
>>> 2853 DataNode
>>> 19533 Jps
>>> 10949 YarnChild
>>> 10661 NodeManager
>>> 15130 HRegionServer
>>> # ps -ef|grep 10949
>>> yarn     10949 10661 99 09:52 ?        00:19:31
>>> /usr/java/jdk1.7.0_45/bin/java -Djava.net.preferIPv4Stack=true
>>> -Dhadoop.metrics.log.level=WARN -Xmx200m
>>> -Djava.io.tmpdir=/data/1/mrlocal/yarn/local/usercache/hdfs/appcache/application_1385983958793_0022/container_1385983958793_0022_01_005650/tmp
>>> -Dlog4j.configuration=container-log4j.properties
>>> -Dyarn.app.mapreduce.container.log.dir=/data/2/mrlocal/yarn/logs/application_1385983958793_0022/container_1385983958793_0022_01_005650
>>> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
>>> org.apache.hadoop.mapred.YarnChild 192.168.11.10 48936
>>> attempt_1385983958793_0022_r_000000_14 5650
>>>
>>>
>>>
>>>
>>>
>>
>>
>

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: issure about MR job on yarn framework

Posted by Jian He <jh...@hortonworks.com>.
This link may help you with the configuration

http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
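As that guide describes, the reduce heap is a per-job MapReduce setting
(mapreduce.reduce.java.opts, with mapreduce.reduce.memory.mb sizing the YARN
container it must fit inside); YARN_OPTS in yarn-env.sh only affects the YARN
daemons themselves, not the task JVMs. A minimal mapred-site.xml sketch (the
1536/1024m values are only illustrative, not a recommendation):

```xml
<!-- mapred-site.xml: per-task memory settings for MRv2 (example values) -->
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <!-- size of the container YARN allocates for each reduce task -->
  <value>1536</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <!-- JVM heap of the reduce task; keep it below the container size -->
  <value>-Xmx1024m</value>
</property>
```

The equivalent pair exists for map tasks (mapreduce.map.memory.mb /
mapreduce.map.java.opts), and both can also be passed per job with -D flags
at submission time.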

Thanks,
Jian


On Mon, Dec 2, 2013 at 7:53 PM, ch huang <ju...@gmail.com> wrote:

> 13/12/03 11:46:56 INFO mapreduce.Job:  map 100% reduce 19%
> 13/12/03 11:47:33 INFO mapreduce.Job:  map 100% reduce 20%
> 13/12/03 11:47:54 INFO mapreduce.Job:  map 100% reduce 21%
> 13/12/03 11:48:06 INFO mapreduce.Job:  map 100% reduce 22%
> 13/12/03 11:48:17 INFO mapreduce.Job:  map 100% reduce 23%
> 13/12/03 11:48:29 INFO mapreduce.Job:  map 100% reduce 24%
> 13/12/03 11:49:23 INFO mapreduce.Job:  map 83% reduce 25%
> 13/12/03 11:49:39 INFO mapreduce.Job:  map 84% reduce 25%
> 13/12/03 11:49:52 INFO mapreduce.Job:  map 85% reduce 25%
> 13/12/03 11:50:08 INFO mapreduce.Job:  map 86% reduce 25%
> 13/12/03 11:50:16 INFO mapreduce.Job:  map 87% reduce 25%
> 13/12/03 11:50:21 INFO mapreduce.Job:  map 88% reduce 25%
> 13/12/03 11:50:30 INFO mapreduce.Job:  map 89% reduce 25%
> 13/12/03 11:50:42 INFO mapreduce.Job:  map 90% reduce 25%
> 13/12/03 11:50:57 INFO mapreduce.Job:  map 91% reduce 25%
> 13/12/03 11:51:10 INFO mapreduce.Job:  map 92% reduce 25%
> 13/12/03 11:51:18 INFO mapreduce.Job:  map 92% reduce 26%
> 13/12/03 11:51:20 INFO mapreduce.Job:  map 93% reduce 26%
> 13/12/03 11:51:25 INFO mapreduce.Job:  map 94% reduce 26%
> 13/12/03 11:51:31 INFO mapreduce.Job:  map 95% reduce 26%
> 13/12/03 11:51:43 INFO mapreduce.Job:  map 96% reduce 26%
> 13/12/03 11:51:50 INFO mapreduce.Job:  map 97% reduce 26%
> 13/12/03 11:51:59 INFO mapreduce.Job:  map 98% reduce 26%
> 13/12/03 11:52:19 INFO mapreduce.Job:  map 99% reduce 26%
> 13/12/03 11:52:29 INFO mapreduce.Job:  map 100% reduce 26%
>
>
>
> On Tue, Dec 3, 2013 at 10:25 AM, ch huang <ju...@gmail.com> wrote:
>
>> Another question: why does the map progress go backwards after it reaches
>> 100%?
>>
>>
>>
>>
>> On Tue, Dec 3, 2013 at 10:07 AM, ch huang <ju...@gmail.com> wrote:
>>
>>> hi, maillist:
>>>               I run a job on my CDH4.4 YARN framework. Its map tasks
>>> finish very fast, but the reduce is very slow. Checking with the ps
>>> command, I found its working heap size is 200m, so I tried to increase
>>> the heap size used by reduce tasks by adding "YARN_OPTS="$YARN_OPTS
>>> -Dmapreduce.reduce.java.opts=-Xmx1024m -verbose:gc -XX:+PrintGCDetails
>>> -XX:+PrintGCDateStamps
>>> -Xloggc:$YARN_LOG_DIR/gc-$(hostname)-resourcemanager.log
>>> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=15M
>>> -XX:-UseGCOverheadLimit" to the yarn-env.sh file. But after I restart the
>>> nodemanager, new reduce tasks still use a 200m heap. Why?
>>>
>>> # jps
>>> 2853 DataNode
>>> 19533 Jps
>>> 10949 YarnChild
>>> 10661 NodeManager
>>> 15130 HRegionServer
>>> # ps -ef|grep 10949
>>> yarn     10949 10661 99 09:52 ?        00:19:31
>>> /usr/java/jdk1.7.0_45/bin/java -Djava.net.preferIPv4Stack=true
>>> -Dhadoop.metrics.log.level=WARN -Xmx200m
>>> -Djava.io.tmpdir=/data/1/mrlocal/yarn/local/usercache/hdfs/appcache/application_1385983958793_0022/container_1385983958793_0022_01_005650/tmp
>>> -Dlog4j.configuration=container-log4j.properties
>>> -Dyarn.app.mapreduce.container.log.dir=/data/2/mrlocal/yarn/logs/application_1385983958793_0022/container_1385983958793_0022_01_005650
>>> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
>>> org.apache.hadoop.mapred.YarnChild 192.168.11.10 48936
>>> attempt_1385983958793_0022_r_000000_14 5650
>>>
>>>
>>>
>>>
>>>
>>
>>
>


Re: issure about MR job on yarn framework

Posted by ch huang <ju...@gmail.com>.
13/12/03 11:46:56 INFO mapreduce.Job:  map 100% reduce 19%
13/12/03 11:47:33 INFO mapreduce.Job:  map 100% reduce 20%
13/12/03 11:47:54 INFO mapreduce.Job:  map 100% reduce 21%
13/12/03 11:48:06 INFO mapreduce.Job:  map 100% reduce 22%
13/12/03 11:48:17 INFO mapreduce.Job:  map 100% reduce 23%
13/12/03 11:48:29 INFO mapreduce.Job:  map 100% reduce 24%
13/12/03 11:49:23 INFO mapreduce.Job:  map 83% reduce 25%
13/12/03 11:49:39 INFO mapreduce.Job:  map 84% reduce 25%
13/12/03 11:49:52 INFO mapreduce.Job:  map 85% reduce 25%
13/12/03 11:50:08 INFO mapreduce.Job:  map 86% reduce 25%
13/12/03 11:50:16 INFO mapreduce.Job:  map 87% reduce 25%
13/12/03 11:50:21 INFO mapreduce.Job:  map 88% reduce 25%
13/12/03 11:50:30 INFO mapreduce.Job:  map 89% reduce 25%
13/12/03 11:50:42 INFO mapreduce.Job:  map 90% reduce 25%
13/12/03 11:50:57 INFO mapreduce.Job:  map 91% reduce 25%
13/12/03 11:51:10 INFO mapreduce.Job:  map 92% reduce 25%
13/12/03 11:51:18 INFO mapreduce.Job:  map 92% reduce 26%
13/12/03 11:51:20 INFO mapreduce.Job:  map 93% reduce 26%
13/12/03 11:51:25 INFO mapreduce.Job:  map 94% reduce 26%
13/12/03 11:51:31 INFO mapreduce.Job:  map 95% reduce 26%
13/12/03 11:51:43 INFO mapreduce.Job:  map 96% reduce 26%
13/12/03 11:51:50 INFO mapreduce.Job:  map 97% reduce 26%
13/12/03 11:51:59 INFO mapreduce.Job:  map 98% reduce 26%
13/12/03 11:52:19 INFO mapreduce.Job:  map 99% reduce 26%
13/12/03 11:52:29 INFO mapreduce.Job:  map 100% reduce 26%


On Tue, Dec 3, 2013 at 10:25 AM, ch huang <ju...@gmail.com> wrote:

> Another question: why does the map progress go backwards after it reaches
> 100%?
>
>
>
>
> On Tue, Dec 3, 2013 at 10:07 AM, ch huang <ju...@gmail.com> wrote:
>
>> hi, maillist:
>>               I run a job on my CDH4.4 YARN framework. Its map tasks
>> finish very fast, but the reduce is very slow. Checking with the ps
>> command, I found its working heap size is 200m, so I tried to increase
>> the heap size used by reduce tasks by adding "YARN_OPTS="$YARN_OPTS
>> -Dmapreduce.reduce.java.opts=-Xmx1024m -verbose:gc -XX:+PrintGCDetails
>> -XX:+PrintGCDateStamps
>> -Xloggc:$YARN_LOG_DIR/gc-$(hostname)-resourcemanager.log
>> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=15M
>> -XX:-UseGCOverheadLimit" to the yarn-env.sh file. But after I restart the
>> nodemanager, new reduce tasks still use a 200m heap. Why?
>>
>> # jps
>> 2853 DataNode
>> 19533 Jps
>> 10949 YarnChild
>> 10661 NodeManager
>> 15130 HRegionServer
>> # ps -ef|grep 10949
>> yarn     10949 10661 99 09:52 ?        00:19:31
>> /usr/java/jdk1.7.0_45/bin/java -Djava.net.preferIPv4Stack=true
>> -Dhadoop.metrics.log.level=WARN -Xmx200m
>> -Djava.io.tmpdir=/data/1/mrlocal/yarn/local/usercache/hdfs/appcache/application_1385983958793_0022/container_1385983958793_0022_01_005650/tmp
>> -Dlog4j.configuration=container-log4j.properties
>> -Dyarn.app.mapreduce.container.log.dir=/data/2/mrlocal/yarn/logs/application_1385983958793_0022/container_1385983958793_0022_01_005650
>> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
>> org.apache.hadoop.mapred.YarnChild 192.168.11.10 48936
>> attempt_1385983958793_0022_r_000000_14 5650
>>
>>
>>
>>
>>
>
>

Re: issure about MR job on yarn framework

Posted by ch huang <ju...@gmail.com>.
Another question: why does the map progress go backwards after it reaches
100%?




On Tue, Dec 3, 2013 at 10:07 AM, ch huang <ju...@gmail.com> wrote:

> hi, maillist:
>               I run a job on my CDH4.4 YARN framework. Its map tasks
> finish very fast, but the reduce is very slow. Checking with the ps
> command, I found its working heap size is 200m, so I tried to increase
> the heap size used by reduce tasks by adding "YARN_OPTS="$YARN_OPTS
> -Dmapreduce.reduce.java.opts=-Xmx1024m -verbose:gc -XX:+PrintGCDetails
> -XX:+PrintGCDateStamps
> -Xloggc:$YARN_LOG_DIR/gc-$(hostname)-resourcemanager.log
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=15M
> -XX:-UseGCOverheadLimit" to the yarn-env.sh file. But after I restart the
> nodemanager, new reduce tasks still use a 200m heap. Why?
>
> # jps
> 2853 DataNode
> 19533 Jps
> 10949 YarnChild
> 10661 NodeManager
> 15130 HRegionServer
> # ps -ef|grep 10949
> yarn     10949 10661 99 09:52 ?        00:19:31
> /usr/java/jdk1.7.0_45/bin/java -Djava.net.preferIPv4Stack=true
> -Dhadoop.metrics.log.level=WARN -Xmx200m
> -Djava.io.tmpdir=/data/1/mrlocal/yarn/local/usercache/hdfs/appcache/application_1385983958793_0022/container_1385983958793_0022_01_005650/tmp
> -Dlog4j.configuration=container-log4j.properties
> -Dyarn.app.mapreduce.container.log.dir=/data/2/mrlocal/yarn/logs/application_1385983958793_0022/container_1385983958793_0022_01_005650
> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
> org.apache.hadoop.mapred.YarnChild 192.168.11.10 48936
> attempt_1385983958793_0022_r_000000_14 5650
>
>
>
>
>
