Posted to mapreduce-dev@hadoop.apache.org by Bhallamudi kamesh <bh...@huawei.com> on 2011/02/22 12:18:58 UTC

Exception due to improper configuration

Hi All,

When we submit a job through the job client, the job's jar in the mapred.system.dir directory is replicated as per the configuration parameter mapred.submit.replication, which is present in mapred-default.xml. By default, this value is 10.

However, dfs.replication and dfs.replication.max, which are present in hdfs-site.xml, can be configured independently of mapred.submit.replication.

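Suppose the user has configured dfs.replication.max to a value lower than mapred.submit.replication, say 5. Job submission then fails with an exception like the following (in this trace the maximum was 2):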

org.apache.hadoop.ipc.RemoteException: java.io.IOException: file /home/kamesh/hadoop/hadoop-root/mapred/system/job_201102221545_0001/job.jar.
Requested replication 10 exceeds maximum 2
       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.verifyReplication(FSNamesystem.java:1179)
       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplicationInternal(FSNamesystem.java:1130)
       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:1115)
       at org.apache.hadoop.hdfs.server.namenode.NameNode.setReplication(NameNode.java:630)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
       at java.lang.reflect.Method.invoke(Method.java:597)
       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:514)
       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:991)
       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:987)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:396)
       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:985)

As per the code, this is absolutely correct, and it should be so.

However, I feel we could set the replication to the maximum replication when the requested replication exceeds the maximum. This would also ensure that the application still executes.

This behavior has both pros and cons.
The application executes. However, the jar is not replicated as per the user-given configuration.
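
To make the two behaviors concrete, here is a minimal, self-contained sketch. It is not the actual FSNamesystem code: the class ReplicationSketch and the methods verifyStrict and capToMax are hypothetical names used only for illustration, and the values 10 and 5 are the example values from this mail.

```java
import java.io.IOException;

// Illustrative sketch only; these names are hypothetical, not Hadoop APIs.
public class ReplicationSketch {

    // Behavior seen in the trace above: a request above the maximum is rejected.
    static short verifyStrict(String src, short requested, short max) throws IOException {
        if (requested > max) {
            throw new IOException("file " + src + ".\nRequested replication "
                    + requested + " exceeds maximum " + max);
        }
        return requested;
    }

    // Proposed behavior: fall back to the maximum instead of failing.
    static short capToMax(short requested, short max) {
        return (short) Math.min(requested, max);
    }

    public static void main(String[] args) {
        short submitReplication = 10; // default mapred.submit.replication
        short maxReplication = 5;     // example dfs.replication.max

        // With capping, the job jar would be replicated 5 times and the job would run.
        System.out.println("capped replication = " + capToMax(submitReplication, maxReplication));

        try {
            verifyStrict("job.jar", submitReplication, maxReplication);
        } catch (IOException e) {
            // With the strict check, submission fails as in the trace.
            System.out.println("strict check failed: " + e.getMessage());
        }
    }
}
```

If capping were adopted, logging a warning whenever the requested value is reduced would at least tell the user that the configured replication was not honored.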

What do you think?

Thanks and Regards,
Bh.V.S.Kamesh.