Posted to user@hadoop.apache.org by francexo83 <fr...@gmail.com> on 2014/11/18 16:23:04 UTC

MR job fails with too many mappers

Hi All,

I have a small Hadoop cluster with three nodes and HBase 0.98.1 installed
on it.

The Hadoop version is 2.3.0; my use case is described below.

I wrote a MapReduce program that reads data from an HBase table and applies
some transformations to the data.
The jobs are very simple, so they do not need a reduce phase. I also wrote a
TableInputFormat extension in order to maximize the number of concurrent
map tasks on the cluster.
In other words, each row should be processed by its own map task.
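
For reference, the job is wired up roughly as in the sketch below. The names
RowTransformMapper, OneRowPerSplitTableInputFormat and "my_table" are
illustrative placeholders for my own classes and table, not the exact code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class PerRowJobDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "per-row transformation");
    job.setJarByClass(PerRowJobDriver.class);

    Scan scan = new Scan();
    scan.setCacheBlocks(false);  // don't pollute the region server block cache

    // Standard HBase wiring; RowTransformMapper does the per-row transformations.
    TableMapReduceUtil.initTableMapperJob(
        "my_table", scan, RowTransformMapper.class,
        ImmutableBytesWritable.class, Result.class, job);

    // Swap in the custom extension that emits one InputSplit per row.
    job.setInputFormatClass(OneRowPerSplitTableInputFormat.class);

    job.setNumReduceTasks(0);    // map-only job, no reduce phase
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}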

Everything goes well until the number of rows, and consequently the number
of mappers, exceeds 300000.

This is the only exception I see when the job fails:

Application application_1416304409718_0032 failed 2 times due to AM
Container for appattempt_1416304409718_0032_000002 exited with exitCode: 1
due to:


Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:511)
at org.apache.hadoop.util.Shell.run(Shell.java:424)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:656)
at
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1


Cluster configuration details:
Node1: 12 GB, 4 core
Node2: 6 GB, 4 core
Node3: 6 GB, 4 core

yarn.scheduler.minimum-allocation-mb=2048
yarn.scheduler.maximum-allocation-mb=4096
yarn.nodemanager.resource.memory-mb=6144
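
(Assuming yarn.nodemanager.resource.memory-mb=6144 applies to every
NodeManager, each node can run at most 6144 / 2048 = 3 minimum-size
containers, i.e. about 9 concurrent 2 GB containers in the whole cluster,
one of which is taken by the MR application master.)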



Regards

Re: MR job fails with too many mappers

Posted by Palaniappan Viswanathan <re...@gmail.com>.
Dear All,

I am fairly new to Hadoop but have a technical background. I know the Hadoop
architecture at a high level and want to start understanding the software
design of its different components so that I can start contributing to
development. Can somebody suggest where I should start? Are there any
software design documents, flow diagrams, or similar documentation that can
help a newbie like me eventually contribute to Hadoop development?

Thanks,

Palani.V

On Wed, Nov 19, 2014 at 4:14 AM, francexo83 <fr...@gmail.com> wrote:

> Thank you very much for your suggestion, it was very helpful.
>
> This is what I have after  turning off log aggregation:
>
> 2014-11-18 18:39:01,507 INFO [main]
> org.apache.hadoop.service.AbstractService: Service
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state STARTED;
> cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
> java.io.IOException: Split metadata size exceeded 10000000. Aborting job
> job_1416332245344_0004
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
> java.io.IOException: Split metadata size exceeded 10000000. Aborting job
> job_1416332245344_0004
>         at
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1551)
>         at
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1406)
>         at
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1373)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:986)
>         at
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
>         at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1249)
>         at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1049)
>         at
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>         at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1460)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>         at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
>         at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
> Caused by: java.io.IOException: Split metadata size exceeded 10000000.
> Aborting job job_1416332245344_0004
>         at
> org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:53)
>         at
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1546)
>
>
> I exceeded the split metadata size, so I added the following property to
> mapred-site.xml and it worked:
>
> <property>
>     <name>mapreduce.job.split.metainfo.maxsize</name>
>     <value>500000000</value>
> </property>
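>
> (For a single job, the same limit should also be adjustable from the driver,
> e.g. job.getConfiguration().setLong("mapreduce.job.split.metainfo.maxsize",
> 500000000L), since the application master reads it from the job
> configuration, and setting the value to -1 removes the check entirely; I
> have only verified the mapred-site.xml change above.)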
>
> thanks again.
>
>
>
>
>
>
>
>
> 2014-11-18 17:59 GMT+01:00 Rohith Sharma K S <ro...@huawei.com>:
>
>>  If log aggregation is enabled, the log folder will be deleted. So I suggest
>> disabling “yarn.log-aggregation-enable” and running the job again. All the logs
>> then remain in the log folder, where you can find the container logs.
>>
>>
>>
>> Thanks & Regards
>>
>> Rohith Sharma K S
>>
>>
>>
>>
>>
>>
>> *From:* francexo83 [mailto:francexo83@gmail.com]
>> *Sent:* 18 November 2014 22:15
>> *To:* user@hadoop.apache.org
>> *Subject:* Re: MR job fails with too many mappers
>>
>>
>>
>> Hi,
>>
>>
>>
>> thank you for your quick response, but I was not able to see the logs for
>> the container.
>>
>>
>>
>> I get a  "no such file or directory" when I try to access the logs of the
>> container from the shell:
>>
>>
>>
>> cd /var/log/hadoop-yarn/containers/application_1416304409718_0032
>>
>>
>>
>>
>>
>> It seems that the container has never been created.
>>
>>
>>
>>
>>
>>
>>
>> thanks
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> 2014-11-18 16:43 GMT+01:00 Rohith Sharma K S <ro...@huawei.com>:
>>
>> Hi
>>
>>
>>
>> Could you get the stderr and stdout logs for the container? These logs will be
>> available in the same location as the container's syslog:
>>
>> ${yarn.nodemanager.log-dirs}/<app-id>/<container-id>
>>
>> This should help find the problem!
>>
>>
>>
>>
>>
>> Thanks & Regards
>>
>> Rohith Sharma K S
>>
>>
>>
>> *From:* francexo83 [mailto:francexo83@gmail.com]
>> *Sent:* 18 November 2014 20:53
>> *To:* user@hadoop.apache.org
>> *Subject:* MR job fails with too many mappers
>>
>>
>>
>> Hi All,
>>
>>
>>
>> I have a small  hadoop cluster with three nodes and HBase 0.98.1
>> installed on it.
>>
>>
>>
>> The hadoop version is 2.3.0 and below my use case scenario.
>>
>>
>>
>> I wrote a map reduce program that reads data from an hbase table and does
>> some transformations on these data.
>>
>> Jobs are very simple so they didn't need the  reduce phase. I also wrote
>> a TableInputFormat  extension in order to maximize the number of concurrent
>> maps on the cluster.
>>
>> In other words, each  row should be processed by a single map task.
>>
>>
>>
>> Everything goes well until the number of rows and consequently  mappers
>> exceeds 300000 quota.
>>
>>
>>
>> This is the only exception I see when the job fails:
>>
>>
>>
>> Application application_1416304409718_0032 failed 2 times due to AM
>> Container for appattempt_1416304409718_0032_000002 exited with exitCode: 1
>> due to:
>>
>>
>>
>>
>>
>> Exception from container-launch:
>> org.apache.hadoop.util.Shell$ExitCodeException:
>>
>> org.apache.hadoop.util.Shell$ExitCodeException:
>>
>> at org.apache.hadoop.util.Shell.runCommand(Shell.java:511)
>>
>> at org.apache.hadoop.util.Shell.run(Shell.java:424)
>>
>> at
>> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:656)
>>
>> at
>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>>
>> at
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
>>
>> at
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>>
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>
>> at java.lang.Thread.run(Thread.java:745)
>>
>> Container exited with a non-zero exit code 1
>>
>>
>>
>>
>>
>> Cluster configuration details:
>>
>> Node1: 12 GB, 4 core
>>
>> Node2: 6 GB, 4 core
>>
>> Node3: 6 GB, 4 core
>>
>>
>>
>> yarn.scheduler.minimum-allocation-mb=2048
>>
>> yarn.scheduler.maximum-allocation-mb=4096
>>
>> yarn.nodemanager.resource.memory-mb=6144
>>
>>
>>
>>
>>
>>
>>
>> Regards
>>
>>
>>
>
>

Re: MR job fails with too many mappers

Posted by francexo83 <fr...@gmail.com>.
Hi,

as I said before, I wrote a TableInputFormat and RecordReader extension that
reads input data from an HBase table;

in my case every single row is associated with a single InputSplit.

For example, if I have 300000 rows to process, my custom TableInputFormat
will generate 300000 input splits and, as a result,

300000 mapper tasks in my MapReduce job.
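
Roughly speaking, each entry in the job's split meta-info file records the
split's offset, length and location hosts, so at a few tens of bytes per
split 300000 splits are enough to pass the 10 MB default that the
application master checks. A simplified sketch of the kind of getSplits()
override I am describing (class and variable names are illustrative, not my
exact code):

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.mapreduce.TableSplit;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;

public class OneRowPerSplitTableInputFormat extends TableInputFormat {

  @Override
  public List<InputSplit> getSplits(JobContext context) throws IOException {
    List<InputSplit> splits = new ArrayList<InputSplit>();

    Scan rowKeysOnly = new Scan();
    rowKeysOnly.setFilter(new FirstKeyOnlyFilter()); // fetch only the first cell of each row
    rowKeysOnly.setCaching(1000);
    rowKeysOnly.setCacheBlocks(false);

    HTable table = new HTable(context.getConfiguration(),
        context.getConfiguration().get(TableInputFormat.INPUT_TABLE));
    try {
      ResultScanner scanner = table.getScanner(rowKeysOnly);
      try {
        for (Result r : scanner) {
          byte[] row = r.getRow();
          // Stop row = row key plus a trailing 0x00 byte, the smallest key
          // greater than the row, so each split covers exactly one row.
          byte[] stop = Arrays.copyOf(row, row.length + 1);
          String host = table.getRegionLocation(row).getHostname();
          splits.add(new TableSplit(table.getName(), row, stop, host));
        }
      } finally {
        scanner.close();
      }
    } finally {
      table.close();
    }
    return splits;
  }
}

With the stock TableInputFormat there would be roughly one split per region
instead, which is why the split meta-info normally stays far below the limit.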

That's all.

Regards



2014-11-20 6:02 GMT+01:00 Susheel Kumar Gadalay <sk...@gmail.com>:

> In which case does the split metadata go beyond 10 MB?
> Can you give some details of your input file and splits?
>
> On 11/19/14, francexo83 <fr...@gmail.com> wrote:
> > Thank you very much for your suggestion, it was very helpful.
> >
> > This is what I have after  turning off log aggregation:
> >
> > 2014-11-18 18:39:01,507 INFO [main]
> > org.apache.hadoop.service.AbstractService: Service
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state STARTED;
> > cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
> > java.io.IOException: Split metadata size exceeded 10000000. Aborting job
> > job_1416332245344_0004
> > org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
> > java.io.IOException: Split metadata size exceeded 10000000. Aborting job
> > job_1416332245344_0004
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1551)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1406)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1373)
> >         at
> >
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
> >         at
> >
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> >         at
> >
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> >         at
> >
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:986)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1249)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1049)
> >         at
> > org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1460)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:422)
> >         at
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
> > Caused by: java.io.IOException: Split metadata size exceeded 10000000.
> > Aborting job job_1416332245344_0004
> >         at
> >
> org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:53)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1546)
> >
> >
> > I exceeded the split metadata size so I  added the following property
> into
> > the mapred-site.xml and it worked:
> >
> > <property>
> >     <name>mapreduce.job.split.metainfo.maxsize</name>
> >     <value>500000000</value>
> > </property>
> >
> > thanks again.
> >
> >
> >
> >
> >
> >
> >
> >
> > 2014-11-18 17:59 GMT+01:00 Rohith Sharma K S <rohithsharmaks@huawei.com
> >:
> >
> >>  If log aggregation is enabled, log folder will be deleted. So I suggest
> >> disable “yarn.log-aggregation-enable” and run job again. All the logs
> >> remains at log folder. Then you can find container logs
> >>
> >>
> >>
> >> Thanks & Regards
> >>
> >> Rohith Sharma K S
> >>
> >>
> >>
> >>
> >>
> >>
> >> *From:* francexo83 [mailto:francexo83@gmail.com]
> >> *Sent:* 18 November 2014 22:15
> >> *To:* user@hadoop.apache.org
> >> *Subject:* Re: MR job fails with too many mappers
> >>
> >>
> >>
> >> Hi,
> >>
> >>
> >>
> >> thank you for your quick response, but I was not able to see the logs
> for
> >> the container.
> >>
> >>
> >>
> >> I get a  "no such file or directory" when I try to access the logs of
> the
> >> container from the shell:
> >>
> >>
> >>
> >> cd /var/log/hadoop-yarn/containers/application_1416304409718_0032
> >>
> >>
> >>
> >>
> >>
> >> It seems that the container has never been created.
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> thanks
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> 2014-11-18 16:43 GMT+01:00 Rohith Sharma K S <rohithsharmaks@huawei.com
> >:
> >>
> >> Hi
> >>
> >>
> >>
> >> Could you get syserr and sysout log for contrainer.? These logs will be
> >> available in the same location  syslog for container.
> >>
> >> ${yarn.nodemanager.log-dirs}/<app-id>/<container-id>
> >>
> >> This helps to find problem!!
> >>
> >>
> >>
> >>
> >>
> >> Thanks & Regards
> >>
> >> Rohith Sharma K S
> >>
> >>
> >>
> >> *From:* francexo83 [mailto:francexo83@gmail.com]
> >> *Sent:* 18 November 2014 20:53
> >> *To:* user@hadoop.apache.org
> >> *Subject:* MR job fails with too many mappers
> >>
> >>
> >>
> >> Hi All,
> >>
> >>
> >>
> >> I have a small  hadoop cluster with three nodes and HBase 0.98.1
> >> installed
> >> on it.
> >>
> >>
> >>
> >> The hadoop version is 2.3.0 and below my use case scenario.
> >>
> >>
> >>
> >> I wrote a map reduce program that reads data from an hbase table and
> does
> >> some transformations on these data.
> >>
> >> Jobs are very simple so they didn't need the  reduce phase. I also wrote
> >> a
> >> TableInputFormat  extension in order to maximize the number of
> concurrent
> >> maps on the cluster.
> >>
> >> In other words, each  row should be processed by a single map task.
> >>
> >>
> >>
> >> Everything goes well until the number of rows and consequently  mappers
> >> exceeds 300000 quota.
> >>
> >>
> >>
> >> This is the only exception I see when the job fails:
> >>
> >>
> >>
> >> Application application_1416304409718_0032 failed 2 times due to AM
> >> Container for appattempt_1416304409718_0032_000002 exited with exitCode:
> >> 1
> >> due to:
> >>
> >>
> >>
> >>
> >>
> >> Exception from container-launch:
> >> org.apache.hadoop.util.Shell$ExitCodeException:
> >>
> >> org.apache.hadoop.util.Shell$ExitCodeException:
> >>
> >> at org.apache.hadoop.util.Shell.runCommand(Shell.java:511)
> >>
> >> at org.apache.hadoop.util.Shell.run(Shell.java:424)
> >>
> >> at
> >>
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:656)
> >>
> >> at
> >>
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> >>
> >> at
> >>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
> >>
> >> at
> >>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> >>
> >> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>
> >> at
> >>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >>
> >> at
> >>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >>
> >> at java.lang.Thread.run(Thread.java:745)
> >>
> >> Container exited with a non-zero exit code 1
> >>
> >>
> >>
> >>
> >>
> >> Cluster configuration details:
> >>
> >> Node1: 12 GB, 4 core
> >>
> >> Node2: 6 GB, 4 core
> >>
> >> Node3: 6 GB, 4 core
> >>
> >>
> >>
> >> yarn.scheduler.minimum-allocation-mb=2048
> >>
> >> yarn.scheduler.maximum-allocation-mb=4096
> >>
> >> yarn.nodemanager.resource.memory-mb=6144
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> Regards
> >>
> >>
> >>
> >
>

> >>
> >>
> >>
> >>
> >> Regards
> >>
> >>
> >>
> >
>

Re: MR job fails with too many mappers

Posted by francexo83 <fr...@gmail.com>.
Hi,

as I said before, I wrote a TableInputFormat and RecordReader extension that
reads its input data from an HBase table;

in my case every single row is associated with a single InputSplit.

For example, if I have 300000 rows to process, my custom TableInputFormat
will generate 300000 input splits and, as a result,

300000 mapper tasks in my MapReduce job.

That's all.

Regards
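
For illustration only (this is not the poster's actual code), a minimal sketch
of such a one-row-per-split extension against the HBase 0.98 MapReduce API
could look like the following. The class name, the keys-only scan, and the
placeholder split location are assumptions; it also presumes the table name
was already configured in the usual way (e.g. via TableMapReduceUtil).

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.mapreduce.TableSplit;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;

public class OneRowPerSplitTableInputFormat extends TableInputFormat {

  @Override
  public List<InputSplit> getSplits(JobContext context) throws IOException {
    List<InputSplit> splits = new ArrayList<InputSplit>();

    // Scan row keys only: we just need one split boundary per row.
    Scan keysOnly = new Scan();
    keysOnly.setFilter(new FirstKeyOnlyFilter());
    keysOnly.setCaching(1000);

    ResultScanner scanner = getTable().getScanner(keysOnly);
    try {
      for (Result r : scanner) {
        byte[] row = r.getRow();
        // row + 0x00 is the smallest key strictly after "row", so each
        // split covers exactly one row.
        byte[] endRow = Bytes.add(row, new byte[] { 0x00 });
        // The last argument is the split location; a real implementation
        // would look up the hosting region server to keep data locality.
        splits.add(new TableSplit(getTable().getName(), row, endRow,
            "unknown-location"));
      }
    } finally {
      scanner.close();
    }
    return splits;
  }
}

A rough back-of-envelope with the numbers from this thread: the job's split
meta-info file stores roughly one location string plus a couple of offsets
per split, on the order of a few tens of bytes each, so somewhere around
300000 single-row splits is about where the default limit of 10000000 bytes
(mapreduce.job.split.metainfo.maxsize) gets crossed, which lines up with the
"Split metadata size exceeded 10000000" failure discussed in this thread.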



2014-11-20 6:02 GMT+01:00 Susheel Kumar Gadalay <sk...@gmail.com>:

> In which case the split metadata go beyond 10MB?
> Can u give some details of your input file and splits.
>
> On 11/19/14, francexo83 <fr...@gmail.com> wrote:
> > Thank you very much for your suggestion, it was very helpful.
> >
> > This is what I have after  turning off log aggregation:
> >
> > 2014-11-18 18:39:01,507 INFO [main]
> > org.apache.hadoop.service.AbstractService: Service
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state STARTED;
> > cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
> > java.io.IOException: Split metadata size exceeded 10000000. Aborting job
> > job_1416332245344_0004
> > org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
> > java.io.IOException: Split metadata size exceeded 10000000. Aborting job
> > job_1416332245344_0004
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1551)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1406)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1373)
> >         at
> >
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
> >         at
> >
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> >         at
> >
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> >         at
> >
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:986)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1249)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1049)
> >         at
> > org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1460)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:422)
> >         at
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
> > Caused by: java.io.IOException: Split metadata size exceeded 10000000.
> > Aborting job job_1416332245344_0004
> >         at
> >
> org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:53)
> >         at
> >
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1546)
> >
> >
> > I exceeded the split metadata size so I  added the following property
> into
> > the mapred-site.xml and it worked:
> >
> > <property>
> >     <name>mapreduce.job.split.metainfo.maxsize</name>
> >     <value>500000000</value>
> > </property>
> >
> > thanks again.
> >
> >
> >
> >
> >
> >
> >
> >
> > 2014-11-18 17:59 GMT+01:00 Rohith Sharma K S <rohithsharmaks@huawei.com
> >:
> >
> >>  If log aggregation is enabled, log folder will be deleted. So I suggest
> >> disable “yarn.log-aggregation-enable” and run job again. All the logs
> >> remains at log folder. Then you can find container logs
> >>
> >>
> >>
> >> Thanks & Regards
> >>
> >> Rohith Sharma K S
> >>
> >>
> >>
> >> This e-mail and its attachments contain confidential information from
> >> HUAWEI, which is intended only for the person or entity whose address is
> >> listed above. Any use of the information contained herein in any way
> >> (including, but not limited to, total or partial disclosure,
> >> reproduction,
> >> or dissemination) by persons other than the intended recipient(s) is
> >> prohibited. If you receive this e-mail in error, please notify the
> sender
> >> by phone or email immediately and delete it!
> >>
> >>
> >>
> >> *From:* francexo83 [mailto:francexo83@gmail.com]
> >> *Sent:* 18 November 2014 22:15
> >> *To:* user@hadoop.apache.org
> >> *Subject:* Re: MR job fails with too many mappers
> >>
> >>
> >>
> >> Hi,
> >>
> >>
> >>
> >> thank you for your quick response, but I was not able to see the logs
> for
> >> the container.
> >>
> >>
> >>
> >> I get a  "no such file or directory" when I try to access the logs of
> the
> >> container from the shell:
> >>
> >>
> >>
> >> cd /var/log/hadoop-yarn/containers/application_1416304409718_0032
> >>
> >>
> >>
> >>
> >>
> >> It seems that the container has never been created.
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> thanks
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> 2014-11-18 16:43 GMT+01:00 Rohith Sharma K S <rohithsharmaks@huawei.com
> >:
> >>
> >> Hi
> >>
> >>
> >>
> >> Could you get syserr and sysout log for contrainer.? These logs will be
> >> available in the same location  syslog for container.
> >>
> >> ${yarn.nodemanager.log-dirs}/<app-id>/<container-id>
> >>
> >> This helps to find problem!!
> >>
> >>
> >>
> >>
> >>
> >> Thanks & Regards
> >>
> >> Rohith Sharma K S
> >>
> >>
> >>
> >> *From:* francexo83 [mailto:francexo83@gmail.com]
> >> *Sent:* 18 November 2014 20:53
> >> *To:* user@hadoop.apache.org
> >> *Subject:* MR job fails with too many mappers
> >>
> >>
> >>
> >> Hi All,
> >>
> >>
> >>
> >> I have a small  hadoop cluster with three nodes and HBase 0.98.1
> >> installed
> >> on it.
> >>
> >>
> >>
> >> The hadoop version is 2.3.0 and below my use case scenario.
> >>
> >>
> >>
> >> I wrote a map reduce program that reads data from an hbase table and
> does
> >> some transformations on these data.
> >>
> >> Jobs are very simple so they didn't need the  reduce phase. I also wrote
> >> a
> >> TableInputFormat  extension in order to maximize the number of
> concurrent
> >> maps on the cluster.
> >>
> >> In other words, each  row should be processed by a single map task.
> >>
> >>
> >>
> >> Everything goes well until the number of rows and consequently  mappers
> >> exceeds 300000 quota.
> >>
> >>
> >>
> >> This is the only exception I see when the job fails:
> >>
> >>
> >>
> >> Application application_1416304409718_0032 failed 2 times due to AM
> >> Container for appattempt_1416304409718_0032_000002 exited with exitCode:
> >> 1
> >> due to:
> >>
> >>
> >>
> >>
> >>
> >> Exception from container-launch:
> >> org.apache.hadoop.util.Shell$ExitCodeException:
> >>
> >> org.apache.hadoop.util.Shell$ExitCodeException:
> >>
> >> at org.apache.hadoop.util.Shell.runCommand(Shell.java:511)
> >>
> >> at org.apache.hadoop.util.Shell.run(Shell.java:424)
> >>
> >> at
> >>
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:656)
> >>
> >> at
> >>
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> >>
> >> at
> >>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
> >>
> >> at
> >>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> >>
> >> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>
> >> at
> >>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >>
> >> at
> >>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >>
> >> at java.lang.Thread.run(Thread.java:745)
> >>
> >> Container exited with a non-zero exit code 1
> >>
> >>
> >>
> >>
> >>
> >> Cluster configuration details:
> >>
> >> Node1: 12 GB, 4 core
> >>
> >> Node2: 6 GB, 4 core
> >>
> >> Node3: 6 GB, 4 core
> >>
> >>
> >>
> >> yarn.scheduler.minimum-allocation-mb=2048
> >>
> >> yarn.scheduler.maximum-allocation-mb=4096
> >>
> >> yarn.nodemanager.resource.memory-mb=6144
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> Regards
> >>
> >>
> >>
> >
>

Re: MR job fails with too many mappers

Posted by Susheel Kumar Gadalay <sk...@gmail.com>.
In which case does the split metadata go beyond 10 MB?
Can you give some details of your input file and splits?

On 11/19/14, francexo83 <fr...@gmail.com> wrote:
> Thank you very much for your suggestion, it was very helpful.
>
> This is what I have after  turning off log aggregation:
>
> 2014-11-18 18:39:01,507 INFO [main]
> org.apache.hadoop.service.AbstractService: Service
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state STARTED;
> cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
> java.io.IOException: Split metadata size exceeded 10000000. Aborting job
> job_1416332245344_0004
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
> java.io.IOException: Split metadata size exceeded 10000000. Aborting job
> job_1416332245344_0004
>         at
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1551)
>         at
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1406)
>         at
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1373)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:986)
>         at
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
>         at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1249)
>         at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1049)
>         at
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>         at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1460)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>         at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
>         at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
> Caused by: java.io.IOException: Split metadata size exceeded 10000000.
> Aborting job job_1416332245344_0004
>         at
> org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:53)
>         at
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1546)
>
>
> I exceeded the split metadata size so I  added the following property into
> the mapred-site.xml and it worked:
>
> <property>
>     <name>mapreduce.job.split.metainfo.maxsize</name>
>     <value>500000000</value>
> </property>
>
> thanks again.
>
>
>
>
>
>
>
>
> 2014-11-18 17:59 GMT+01:00 Rohith Sharma K S <ro...@huawei.com>:
>
>>  If log aggregation is enabled, log folder will be deleted. So I suggest
>> disable “yarn.log-aggregation-enable” and run job again. All the logs
>> remains at log folder. Then you can find container logs
>>
>>
>>
>> Thanks & Regards
>>
>> Rohith Sharma K S
>>
>>
>>
>> This e-mail and its attachments contain confidential information from
>> HUAWEI, which is intended only for the person or entity whose address is
>> listed above. Any use of the information contained herein in any way
>> (including, but not limited to, total or partial disclosure,
>> reproduction,
>> or dissemination) by persons other than the intended recipient(s) is
>> prohibited. If you receive this e-mail in error, please notify the sender
>> by phone or email immediately and delete it!
>>
>>
>>
>> *From:* francexo83 [mailto:francexo83@gmail.com]
>> *Sent:* 18 November 2014 22:15
>> *To:* user@hadoop.apache.org
>> *Subject:* Re: MR job fails with too many mappers
>>
>>
>>
>> Hi,
>>
>>
>>
>> thank you for your quick response, but I was not able to see the logs for
>> the container.
>>
>>
>>
>> I get a  "no such file or directory" when I try to access the logs of the
>> container from the shell:
>>
>>
>>
>> cd /var/log/hadoop-yarn/containers/application_1416304409718_0032
>>
>>
>>
>>
>>
>> It seems that the container has never been created.
>>
>>
>>
>>
>>
>>
>>
>> thanks
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> 2014-11-18 16:43 GMT+01:00 Rohith Sharma K S <ro...@huawei.com>:
>>
>> Hi
>>
>>
>>
>> Could you get syserr and sysout log for contrainer.? These logs will be
>> available in the same location  syslog for container.
>>
>> ${yarn.nodemanager.log-dirs}/<app-id>/<container-id>
>>
>> This helps to find problem!!
>>
>>
>>
>>
>>
>> Thanks & Regards
>>
>> Rohith Sharma K S
>>
>>
>>
>> *From:* francexo83 [mailto:francexo83@gmail.com]
>> *Sent:* 18 November 2014 20:53
>> *To:* user@hadoop.apache.org
>> *Subject:* MR job fails with too many mappers
>>
>>
>>
>> Hi All,
>>
>>
>>
>> I have a small  hadoop cluster with three nodes and HBase 0.98.1
>> installed
>> on it.
>>
>>
>>
>> The hadoop version is 2.3.0 and below my use case scenario.
>>
>>
>>
>> I wrote a map reduce program that reads data from an hbase table and does
>> some transformations on these data.
>>
>> Jobs are very simple so they didn't need the  reduce phase. I also wrote
>> a
>> TableInputFormat  extension in order to maximize the number of concurrent
>> maps on the cluster.
>>
>> In other words, each  row should be processed by a single map task.
>>
>>
>>
>> Everything goes well until the number of rows and consequently  mappers
>> exceeds 300000 quota.
>>
>>
>>
>> This is the only exception I see when the job fails:
>>
>>
>>
>> Application application_1416304409718_0032 failed 2 times due to AM
>> Container for appattempt_1416304409718_0032_000002 exited with exitCode:
>> 1
>> due to:
>>
>>
>>
>>
>>
>> Exception from container-launch:
>> org.apache.hadoop.util.Shell$ExitCodeException:
>>
>> org.apache.hadoop.util.Shell$ExitCodeException:
>>
>> at org.apache.hadoop.util.Shell.runCommand(Shell.java:511)
>>
>> at org.apache.hadoop.util.Shell.run(Shell.java:424)
>>
>> at
>> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:656)
>>
>> at
>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>>
>> at
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
>>
>> at
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>>
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>
>> at java.lang.Thread.run(Thread.java:745)
>>
>> Container exited with a non-zero exit code 1
>>
>>
>>
>>
>>
>> Cluster configuration details:
>>
>> Node1: 12 GB, 4 core
>>
>> Node2: 6 GB, 4 core
>>
>> Node3: 6 GB, 4 core
>>
>>
>>
>> yarn.scheduler.minimum-allocation-mb=2048
>>
>> yarn.scheduler.maximum-allocation-mb=4096
>>
>> yarn.nodemanager.resource.memory-mb=6144
>>
>>
>>
>>
>>
>>
>>
>> Regards
>>
>>
>>
>

Re: MR job fails with too many mappers

Posted by francexo83 <fr...@gmail.com>.
Thank you very much for your suggestion, it was very helpful.

This is what I have after  turning off log aggregation:

2014-11-18 18:39:01,507 INFO [main]
org.apache.hadoop.service.AbstractService: Service
org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state STARTED;
cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
java.io.IOException: Split metadata size exceeded 10000000. Aborting job
job_1416332245344_0004
org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
java.io.IOException: Split metadata size exceeded 10000000. Aborting job
job_1416332245344_0004
        at
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1551)
        at
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1406)
        at
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1373)
        at
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
        at
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
        at
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:986)
        at
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
        at
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1249)
        at
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1049)
        at
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1460)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
        at
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
        at
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
Caused by: java.io.IOException: Split metadata size exceeded 10000000.
Aborting job job_1416332245344_0004
        at
org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:53)
        at
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1546)


I had exceeded the split metadata size limit, so I added the following property to
mapred-site.xml and the job ran successfully:

<property>
    <name>mapreduce.job.split.metainfo.maxsize</name>
    <value>500000000</value>
</property>
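
For anyone else hitting this limit: it should also be possible to raise it per job from the driver instead of cluster-wide, since the check runs in the MR app master against the job configuration (see the stack trace above). A minimal sketch, assuming the standard Job API; the class and job names below are just placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class RaiseSplitMetaInfoLimit {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Same override as the mapred-site.xml property above, scoped to this job only.
        // It has to be in the job configuration at submit time so the app master sees it
        // when it reads job.splitmetainfo.
        conf.setLong("mapreduce.job.split.metainfo.maxsize", 500000000L);

        Job job = Job.getInstance(conf, "hbase-row-per-map-transform");
        job.setJarByClass(RaiseSplitMetaInfoLimit.class);
        // ... input format, mapper class and output setup as in the original driver ...
        job.setNumReduceTasks(0);   // map-only job, as described earlier in the thread
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}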

thanks again.










RE: MR job fails with too many mappers

Posted by Rohith Sharma K S <ro...@huawei.com>.
If log aggregation is enabled, the local log folder is deleted once the application finishes. So I suggest disabling “yarn.log-aggregation-enable” and running the job again. All the logs then remain in the local log folder, where you can find the container logs.

Thanks & Regards
Rohith Sharma K S

This e-mail and its attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!




Re: MR job fails with too many mappers

Posted by francexo83 <fr...@gmail.com>.
Hi,

thank you for your quick response, but I was not able to see the logs for
the container.

I get a "no such file or directory" error when I try to access the container's
logs from the shell:

cd /var/log/hadoop-yarn/containers/application_1416304409718_0032


It seems that the container has never been created.



thanks







RE: MR job fails with too many mappers

Posted by Rohith Sharma K S <ro...@huawei.com>.
Hi

Could you get the syserr and sysout logs for the container? These logs will be available in the same location as the syslog for the container:
${yarn.nodemanager.log-dirs}/<app-id>/<container-id>
This should help to find the problem!!
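
If it helps, the path above can also be expanded with a small utility run on the NodeManager host. This is a rough sketch only; it assumes the node's yarn-site.xml is on the classpath, and the class name and arguments are illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ContainerLogPaths {
    // Prints the local directories where a container's stdout/stderr/syslog should live,
    // given an application id and a container id on the command line.
    public static void main(String[] args) {
        String appId = args[0];        // e.g. application_1416304409718_0032
        String containerId = args[1];  // e.g. container_1416304409718_0032_01_000001
        Configuration conf = new YarnConfiguration();  // loads yarn-site.xml from the classpath
        String[] logDirs = conf.getTrimmedStrings("yarn.nodemanager.log-dirs");
        if (logDirs.length == 0) {
            System.out.println("yarn.nodemanager.log-dirs is not set in the configuration visible here");
            return;
        }
        for (String dir : logDirs) {
            System.out.println(dir + "/" + appId + "/" + containerId);
        }
    }
}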


Thanks & Regards
Rohith Sharma K S



Re: MR job fails with too many mappers

Posted by francexo83 <fr...@gmail.com>.
Hi Tsuyoshi,

these are the configurations you requested:

yarn.app.mapreduce.am.resource.mb=256

mapreduce.map.memory.mb=Not set
mapreduce.reduce.memory.mb=Not set
mapreduce.map.java.opts=Not set
mapreduce.reduce.java.opts=Not set
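
With around 300,000 map tasks those defaults look low, so one thing worth trying is raising them per job from the driver. A rough sketch of that idea (the property names are the standard MR2 ones; the values are illustrative and not tuned for this cluster):

import org.apache.hadoop.conf.Configuration;

public class MemoryOverrides {
    // Applies per-job memory overrides in the driver, before Job.getInstance(conf, ...).
    public static void apply(Configuration conf) {
        conf.setInt("yarn.app.mapreduce.am.resource.mb", 2048);       // AM container size
        conf.set("yarn.app.mapreduce.am.command-opts", "-Xmx1638m");  // AM JVM heap
        conf.setInt("mapreduce.map.memory.mb", 1024);                 // map container size
        conf.set("mapreduce.map.java.opts", "-Xmx820m");              // map task JVM heap
    }
}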


thanks

2014-11-18 17:01 GMT+01:00 Tsuyoshi OZAWA <oz...@gmail.com>:

> Hi,
>
> Could you share the following configurations? The failures could be caused
> by running out of memory on the mapper side.
>
> yarn.app.mapreduce.am.resource.mb
> mapreduce.map.memory.mb
> mapreduce.reduce.memory.mb
> mapreduce.map.java.opts
> mapreduce.reduce.java.opts
>
> --
> - Tsuyoshi
>


Re: MR job fails with too many mappers

Posted by Tsuyoshi OZAWA <oz...@gmail.com>.
Hi,

Could you share the following configurations? The failures could be caused by
out-of-memory errors on the mapper side.

yarn.app.mapreduce.am.resource.mb
mapreduce.map.memory.mb
mapreduce.reduce.memory.mb
mapreduce.map.java.opts
mapreduce.reduce.java.opts
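
A quick way to check whether any of these are already set on the cluster side,
sketched assuming a stock layout where $HADOOP_CONF_DIR (or /etc/hadoop/conf)
holds the site files:

grep -A1 -E 'am\.resource\.mb|memory\.mb|java\.opts' \
    "${HADOOP_CONF_DIR:-/etc/hadoop/conf}"/mapred-site.xml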

On Wed, Nov 19, 2014 at 12:23 AM, francexo83 <fr...@gmail.com> wrote:
> Hi All,
>
> I have a small  hadoop cluster with three nodes and HBase 0.98.1 installed
> on it.
>
> The hadoop version is 2.3.0 and below my use case scenario.
>
> I wrote a map reduce program that reads data from an hbase table and does
> some transformations on these data.
> Jobs are very simple so they didn't need the  reduce phase. I also wrote a
> TableInputFormat  extension in order to maximize the number of concurrent
> maps on the cluster.
> In other words, each  row should be processed by a single map task.
>
> Everything goes well until the number of rows and consequently  mappers
> exceeds 300000 quota.
>
> This is the only exception I see when the job fails:
>
> Application application_1416304409718_0032 failed 2 times due to AM
> Container for appattempt_1416304409718_0032_000002 exited with exitCode: 1
> due to:
>
>
> Exception from container-launch:
> org.apache.hadoop.util.Shell$ExitCodeException:
> org.apache.hadoop.util.Shell$ExitCodeException:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:511)
> at org.apache.hadoop.util.Shell.run(Shell.java:424)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:656)
> at
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Container exited with a non-zero exit code 1
>
>
> Cluster configuration details:
> Node1: 12 GB, 4 core
> Node2: 6 GB, 4 core
> Node3: 6 GB, 4 core
>
> yarn.scheduler.minimum-allocation-mb=2048
> yarn.scheduler.maximum-allocation-mb=4096
> yarn.nodemanager.resource.memory-mb=6144
>
>
>
> Regards



-- 
- Tsuyoshi

RE: MR job fails with too many mappers

Posted by Rohith Sharma K S <ro...@huawei.com>.
Hi

Could you get the stderr and stdout logs for the container? These logs are available in the same location as the container's syslog:
${yarn.nodemanager.log-dirs}/<app-id>/<container-id>
This should help to pinpoint the problem.
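
For example, on the NodeManager host that launched the failing AM attempt,
something along these lines will show them. The log root here is only an
assumption; use whatever yarn.nodemanager.log-dirs points to on your nodes:

LOG_ROOT=/var/log/hadoop-yarn/userlogs   # assumed value of yarn.nodemanager.log-dirs
APP=application_1416304409718_0032
ls "$LOG_ROOT/$APP"                      # one sub-directory per container
cat "$LOG_ROOT/$APP"/container_*/stderr
cat "$LOG_ROOT/$APP"/container_*/stdout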


Thanks & Regards
Rohith Sharma K S

From: francexo83 [mailto:francexo83@gmail.com]
Sent: 18 November 2014 20:53
To: user@hadoop.apache.org
Subject: MR job fails with too many mappers

Hi All,

I have a small  hadoop cluster with three nodes and HBase 0.98.1 installed on it.

The hadoop version is 2.3.0 and below my use case scenario.

I wrote a map reduce program that reads data from an hbase table and does some transformations on these data.
Jobs are very simple so they didn't need the  reduce phase. I also wrote a TableInputFormat  extension in order to maximize the number of concurrent maps on the cluster.
In other words, each  row should be processed by a single map task.

Everything goes well until the number of rows and consequently  mappers exceeds 300000 quota.

This is the only exception I see when the job fails:

Application application_1416304409718_0032 failed 2 times due to AM Container for appattempt_1416304409718_0032_000002 exited with exitCode: 1 due to:


Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:511)
at org.apache.hadoop.util.Shell.run(Shell.java:424)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:656)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1


Cluster configuration details:
Node1: 12 GB, 4 core
Node2: 6 GB, 4 core
Node3: 6 GB, 4 core

yarn.scheduler.minimum-allocation-mb=2048
yarn.scheduler.maximum-allocation-mb=4096
yarn.nodemanager.resource.memory-mb=6144



Regards
