Posted to common-user@hadoop.apache.org by Andrew Nguyen <an...@ucsfcti.org> on 2010/04/15 08:01:59 UTC

Running TestDFSIO on an EC2 instance

I'm running TestDFSIO on an EC2 instance, and I'm getting the following errors:

10/04/15 06:00:50 INFO mapred.JobClient: Task Id : attempt_201004150557_0001_m_000000_1, Status : FAILED
java.io.IOException: Cannot open filename /benchmarks/TestDFSIO/io_data/test_io_0

A bunch of these errors show up and then the job fails.  I'm running the job directly on the cluster as the hadoop user.

Any ideas?

Thanks,
Andrew

Re: Running TestDFSIO on an EC2 instance

Posted by alex kamil <al...@gmail.com>.
Andrew,

there was a bug in my answer below:
- if your job is failing to open the file for writing rather than reading
(which is not clear from the error below), then you will not see any files
under /benchmarks/TestDFSIO/io_data - in that case, step #2 is to check the
read/write permissions on the directory where you store the data (example
commands are sketched below)

- note that it is a good idea to install Hadoop as the same user you are
going to run MR jobs with
I remember having similar permissions issues when using a different user to
set it up (even though it was in the same group), so it's better to install
Hadoop as the *hadoop* user and run jobs with the same user
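
If permissions do turn out to be the problem, something like this should show
and then fix the ownership of the benchmark tree (just a sketch, and it
assumes your user and group are both named "hadoop" - adjust to your setup):

    # see who owns the target directory and whether your user can write to it
    hadoop fs -ls /benchmarks
    hadoop fs -ls /benchmarks/TestDFSIO

    # hand the whole benchmark tree to the hadoop user/group
    hadoop fs -chown -R hadoop:hadoop /benchmarks
    hadoop fs -chmod -R 775 /benchmarks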

Alex

On Thu, Apr 15, 2010 at 2:28 AM, alex kamil <al...@gmail.com> wrote:

> Andrew,
>
> 1. make sure you run the "TestDFSIO write" test before "TestDFSIO read"
> 2. run "hadoop fs -ls /benchmarks/" and see if the files are actually there
> 3. run "hadoop dfsadmin -report" and see if the cluster is alive with no dead nodes
> 4. try a simple copyFromLocal and see if it works
>
> if the answers to all of the above are "yes":
>  - check the file system; if you used the defaults it will probably write to
> /tmp (I'm not familiar with the specific Hadoop/EC2 package you use);
> otherwise, see if it writes into a directory where your user/group has enough
> permissions
>
> if you get stuck I would even try a different Hadoop image; I think there
> are a bunch of them on AWS and you can switch in a couple of minutes
> you can also try the Cloudera package with all the bells and whistles;
> if that causes problems I would try a clean install from the Apache website
>
> this is more of a survival guide; maybe there is a simpler fix that I'm
> not aware of, so please share your findings
>
> Cheers
> Alex
>
>
> On Thu, Apr 15, 2010 at 8:01 AM, Andrew Nguyen <
> andrew-lists-hadoop@ucsfcti.org> wrote:
>
>> I'm running TestDFSIO on an EC2 instance, and I'm getting the following errors:
>>
>> 10/04/15 06:00:50 INFO mapred.JobClient: Task Id :
>> attempt_201004150557_0001_m_000000_1, Status : FAILED
>> java.io.IOException: Cannot open filename
>> /benchmarks/TestDFSIO/io_data/test_io_0
>>
>> A bunch of these errors show up and then the job fails.  I'm running the
>> job directly on the cluster as the hadoop user.
>>
>> Any ideas?
>>
>> Thanks,
>> Andrew
>
>
>

Re: Running TestDFSIO on an EC2 instance

Posted by alex kamil <al...@gmail.com>.
Andrew,

1. make sure you run the "TestDFSIO write" test before "TestDFSIO read"
2. run "hadoop fs -ls /benchmarks/" and see if the files are actually there
3. run "hadoop dfsadmin -report" and see if the cluster is alive with no dead nodes
4. try a simple copyFromLocal and see if it works (example commands for all
four steps are sketched right after this list)
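
For example, something like this - a rough sketch only; the exact test jar
name and the -nrFiles/-fileSize values below are illustrative and vary by
Hadoop version/distribution:

    # 1. write first, then read (the read phase opens the files the write phase created)
    hadoop jar $HADOOP_HOME/hadoop-*-test.jar TestDFSIO -write -nrFiles 10 -fileSize 100
    hadoop jar $HADOOP_HOME/hadoop-*-test.jar TestDFSIO -read -nrFiles 10 -fileSize 100

    # 2. confirm the benchmark files actually landed in HDFS
    hadoop fs -ls /benchmarks/TestDFSIO/io_data

    # 3. check cluster health: live datanodes, dead nodes, remaining capacity
    hadoop dfsadmin -report

    # 4. sanity-check basic HDFS writes with a throwaway file
    echo probe > /tmp/probe.txt
    hadoop fs -copyFromLocal /tmp/probe.txt /probe.txt
    hadoop fs -cat /probe.txt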

if the answers to all of the above are "yes":
 - check the file system; if you used the defaults it will probably write to
/tmp (I'm not familiar with the specific Hadoop/EC2 package you use);
otherwise, see if it writes into a directory where your user/group has enough
permissions
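
A quick way to see where the defaults point - this assumes a standard conf/
directory under $HADOOP_HOME, so adjust the paths for whatever package you
are running:

    # hadoop.tmp.dir is the root that many storage defaults derive from
    grep -A 1 hadoop.tmp.dir $HADOOP_HOME/conf/core-site.xml

    # check top-level HDFS ownership/permissions for the directories the job touches
    hadoop fs -ls /
    hadoop fs -ls /benchmarks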

if you get stuck I would even try a different Hadoop image; I think there
are a bunch of them on AWS and you can switch in a couple of minutes
you can also try the Cloudera package with all the bells and whistles;
if that causes problems I would try a clean install from the Apache website

this is more of a survival guide; maybe there is a simpler fix that I'm not
aware of, so please share your findings

Cheers
Alex


On Thu, Apr 15, 2010 at 8:01 AM, Andrew Nguyen <
andrew-lists-hadoop@ucsfcti.org> wrote:

> I'm running TestDFSIO on an EC2 instance, and I'm getting the following errors:
>
> 10/04/15 06:00:50 INFO mapred.JobClient: Task Id :
> attempt_201004150557_0001_m_000000_1, Status : FAILED
> java.io.IOException: Cannot open filename
> /benchmarks/TestDFSIO/io_data/test_io_0
>
> A bunch of these errors show up and then the job fails.  I'm running the
> job directly on the cluster as the hadoop user.
>
> Any ideas?
>
> Thanks,
> Andrew