Posted to hdfs-user@hadoop.apache.org by Utkarsh Rathore <ut...@gmail.com> on 2012/01/24 11:59:55 UTC

Hadoop Terasort Error- "File _partition.lst does not exist"

I have a Hadoop cluster on which I have generated some data using Teragen.
But when I run Terasort on this data, it fails with the following error:

java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.MapTask$OldOutputCollector.<init>(MapTask.java:481)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
        ... 6 more

Caused by: java.lang.IllegalArgumentException: can't read paritions file
        at org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner.configure(TeraSort.java:213)
        ... 11 more
Caused by: java.io.FileNotFoundException: File _partition.lst does not exist.
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:383)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
        at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:776)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
        at org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner.readPartitions(TeraSort.java:153)
        at org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner.configure(TeraSort.java:210)
        ... 11 more

HDFS *does* show this file in its listing, and I cannot understand why
Terasort is unable to find it.

bash-3.2$ hadoop dfs -lsr /user/hduser/
drwxrwxrwx   - hdfs supergroup          0 2012-01-24 14:12 /user/hduser/terasort-input1
-rw-r--r--   1 hdfs supergroup             0 2012-01-24 00:38 /user/hduser/terasort-input1/_SUCCESS
-rw-r--r--   1 hdfs supergroup           129 2012-01-24 14:12 /user/hduser/terasort-input1/_partition.lst
-rw-r--r--   1 hdfs supergroup 1000000000000 2012-01-23 15:25 /user/hduser/terasort-input1/part-00000

I tried changing the file permissions and ownership, and copying the
_partition.lst file to the root of HDFS (so that the relative path does not
matter), but nothing seems to work. Online forums and mailing lists have not
helped either.

Any help/pointers on this will be appreciated.

~Utkarsh