Posted to mapreduce-user@hadoop.apache.org by "Liu, Raymond" <ra...@intel.com> on 2013/07/12 09:26:53 UTC

Failed to run wordcount on YARN

Hi 

I have just started trying out Hadoop 2.0, using the 2.0.5-alpha package,

and followed

http://hadoop.apache.org/docs/r2.0.5-alpha/hadoop-project-dist/hadoop-common/ClusterSetup.html

to set up a cluster in non-secure mode. HDFS works fine with the client tools.

But when I run the wordcount example, I get errors:

./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar wordcount /tmp /out


13/07/12 15:05:53 INFO mapreduce.Job: Task Id : attempt_1373609123233_0004_m_000004_0, Status : FAILED
Error: java.io.FileNotFoundException: Path is not a file: /tmp/hadoop-yarn
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:42)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1317)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1276)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1252)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1225)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:403)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:239)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40728)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:986)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:974)
        at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:157)
        at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:124)
        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:117)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1131)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:244)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:77)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:713)
        at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:89)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:519)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)

I checked HDFS and found that /tmp/hadoop-yarn is there; the directory's owner is the same as the job user.

To rule that out, I also created /tmp/hadoop-yarn on the local fs. Neither helps.

Any idea what might be the problem? Thx!


Best Regards,
Raymond Liu
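
A quick way to confirm what trips the job here is to list the input directory and look for sub-directories, for example with hadoop fs -ls /tmp. Below is a minimal Java sketch of the same check against the Hadoop FileSystem API; the class name and the default path are illustrative only, not taken from the thread.

// ListInputDir.java - minimal sketch: list the job's input dir and flag
// entries that are directories rather than plain files.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListInputDir {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS from the core-site.xml on the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path input = new Path(args.length > 0 ? args[0] : "/tmp");
        for (FileStatus s : fs.listStatus(input)) {
            // Any "dir" entry here is enough to make the 2.0.5-alpha
            // FileInputFormat fail with "Path is not a file".
            System.out.println((s.isDirectory() ? "dir  " : "file ") + s.getPath());
        }
    }
}

Any directory entry under the input path, such as the staging area YARN typically keeps under /tmp/hadoop-yarn, triggers exactly the "Path is not a file" failure shown above.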


Re: unsubscribe

Posted by Nitin Pawar <ni...@gmail.com>.
Manal,

It's a user-based mailing list, not a topic-based one, so it's all or nothing:
if you are subscribed you will get all the mails, otherwise you will get none.


On Fri, Jul 12, 2013 at 4:24 PM, Manal Helal <ma...@gmail.com> wrote:

> how do I keep my subscription, but receive no notifications of threads I
> didn't initiate or subscribe to,
>
> thanks,
>
>
> On 12 July 2013 11:11, Devaraj k <de...@huawei.com> wrote:
>
>> You need to send a mail to user-unsubscribe@hadoop.apache.org for
>> unsubscribe.
>>
>> http://hadoop.apache.org/mailing_lists.html#User
>>
>> Thanks
>> Devaraj k
>>
>>
>> -----Original Message-----
>> From: Margusja [mailto:margus@roo.ee]
>> Sent: 12 July 2013 14:26
>> To: user@hadoop.apache.org
>> Subject: unsubscribe
>>
>>
>>
>
>
> --
> Kind Regards,
>
> Manal Helal
>



-- 
Nitin Pawar

Re: unsubscribe

Posted by Manal Helal <ma...@gmail.com>.
How do I keep my subscription but receive no notifications for threads I
didn't initiate or subscribe to?

Thanks,


On 12 July 2013 11:11, Devaraj k <de...@huawei.com> wrote:

> You need to send a mail to user-unsubscribe@hadoop.apache.org for
> unsubscribe.
>
> http://hadoop.apache.org/mailing_lists.html#User
>
> Thanks
> Devaraj k
>
>
> -----Original Message-----
> From: Margusja [mailto:margus@roo.ee]
> Sent: 12 July 2013 14:26
> To: user@hadoop.apache.org
> Subject: unsubscribe
>
>
>


-- 
Kind Regards,

Manal Helal

RE: unsubscribe

Posted by Devaraj k <de...@huawei.com>.
You need to send a mail to user-unsubscribe@hadoop.apache.org to unsubscribe.

http://hadoop.apache.org/mailing_lists.html#User

Thanks
Devaraj k


-----Original Message-----
From: Margusja [mailto:margus@roo.ee] 
Sent: 12 July 2013 14:26
To: user@hadoop.apache.org
Subject: unsubscribe



unsubscribe

Posted by Margusja <ma...@roo.ee>.

RE: Failed to run wordcount on YARN

Posted by "Liu, Raymond" <ra...@intel.com>.
Hi Devaraj

Thanks a lot for the detailed explanation.

Best Regards,
Raymond Liu

RE: Failed to run wordcount on YARN

Posted by Devaraj k <de...@huawei.com>.
Hi Raymond, 

	In Hadoop 2.0.5, the new FileInputFormat API doesn't support reading files recursively in the input dir. It only supports an input dir that contains files. If the input dir has any child dirs, it throws the error you are seeing.

Recursive listing has been added in trunk with this JIRA: https://issues.apache.org/jira/browse/MAPREDUCE-3193.

You can give the Job an input dir which doesn't have nested dirs, or you can make use of the old FileInputFormat API to read files recursively in the sub dirs.

Thanks
Devaraj k
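
To illustrate the first suggestion, here is a minimal driver sketch that points the job at a flat input directory containing only files. The HDFS paths /user/raymond/wc-input and /user/raymond/wc-out and the class name are made up for the example, and it assumes the stock example's TokenizerMapper / IntSumReducer (from the hadoop-mapreduce-examples jar) are on the classpath. The commented-out lines show the recursive switch that MAPREDUCE-3193 introduces in later releases (property name taken from that JIRA); it has no effect on 2.0.5-alpha.

// WordCountDriver.java - minimal sketch of a wordcount job whose input dir
// holds only files (no nested directories).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.examples.WordCount;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCount.TokenizerMapper.class);
        job.setCombinerClass(WordCount.IntSumReducer.class);
        job.setReducerClass(WordCount.IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Workaround 1: a flat input dir that contains only files.
        FileInputFormat.addInputPath(job, new Path("/user/raymond/wc-input"));
        FileOutputFormat.setOutputPath(job, new Path("/user/raymond/wc-out"));

        // Workaround 2 (only on builds that already carry MAPREDUCE-3193,
        // so not on 2.0.5-alpha):
        // job.getConfiguration().setBoolean(
        //     "mapreduce.input.fileinputformat.input.dir.recursive", true);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The flat input dir can be prepared with something like hadoop fs -mkdir -p /user/raymond/wc-input followed by hadoop fs -put <local text files> /user/raymond/wc-input, so that nothing under it is a directory.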
