Posted to common-user@hadoop.apache.org by "Erdong (Roger) CHEN" <ed...@csail.mit.edu> on 2007/06/02 06:10:21 UTC

[Hadoop DFS] hadoop does not see my input file

Hi all,

Could anyone help me to figure out why hadoop does not see my input file?

I have three computers: rosetta8, rosetta9, and rosetta10. rosetta8 is
listed in masters; rosetta9 and rosetta10 are listed in slaves. I run
bin/start-dfs.sh and bin/start-mapred.sh on rosetta8. My hadoop-site.xml
settings are shown below. I am pretty sure that I followed the online
installation and configuration instructions, and the folder /tmp/in-dir/
is not empty.

I tried the following two commands:
./bin/hadoop dfs -ls /tmp/in-dir/
Found 0 items
./bin/hadoop dfs -ls /tmp/
Found 0 items

I tried both settings for mapred.job.tracker, local and
rosetta8:50034. Neither works.

<property>
  <name>mapred.job.tracker</name>
  <value>local</value>
  <value>rosetta8:50034</value>
</property>

<property>
  <name>fs.default.name</name>
  <value>rosetta8:50033</value>
</property>

Command that I run:
./bin/hadoop jar hadoop-0.12.3-examples.jar wordcount -m 3 -r 2 /tmp/in-dir/ /tmp/out-dir/

Error message that I get:
org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : /tmp/in-dir
        at org.apache.hadoop.mapred.InputFormatBase.validateInput(InputFormatBase.java:138)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:326)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:543)
        at org.apache.hadoop.examples.WordCount.main(WordCount.java:148)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:143)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:40)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
edc@rosetta8:~/hadoop-install/hadoop$ ./bin/hadoop jar hadoop-0.12.3-examples.jar wordcount -m 3 -r 2/afs/csail.mit.edu/u/e/edc/hadoop-install/hadoop/in-dir/ /tmp/out-dir/
ERROR: Integer expected instead of 2/afs/csail.mit.edu/u/e/edc/hadoop-install/hadoop/in-dir/
wordcount [-m <maps>] [-r <reduces>] <input> <output>
edc@rosetta8:~/hadoop-install/hadoop$ ./bin/hadoop jar hadoop-0.12.3-examples.jar wordcount -m 3 -r 2 /afs/csail.mit.edu/u/e/edc/hadoop-install/hadoop/in-dir/ /tmp/out-dir/
org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : /afs/csail.mit.edu/u/e/edc/hadoop-install/hadoop/in-dir
        at org.apache.hadoop.mapred.InputFormatBase.validateInput(InputFormatBase.java:138)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:326)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:543)
        at org.apache.hadoop.examples.WordCount.main(WordCount.java:148)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:143)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:40)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)


Erdong Chen

RE: hadoop does not see my input file

Posted by Victor Gao <ga...@gmail.com>.
Hi, I think you should copy the source files into HDFS first, like this:
./bin/hadoop dfs -put <some file> /text

And remember that the path in your wordcount command should be a path in HDFS
rather than an ordinary path in your local filesystem.
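
To make the local-versus-HDFS distinction concrete, here is a minimal sketch that lists the same path in both places. The HADOOP variable and the compare_listings function name are assumptions made for illustration; only the "dfs -ls" subcommand comes from this thread.

```shell
#!/bin/sh
# Sketch only: HADOOP and compare_listings are hypothetical names.
# HADOOP points at the hadoop launcher script; override it if yours lives elsewhere.
HADOOP="${HADOOP:-./bin/hadoop}"

# Print the local listing and the DFS listing of the same path.
# The two can differ: a directory that exists on local disk is not
# visible in DFS until it has been copied there.
compare_listings() {
    path="$1"
    echo "== local filesystem: $path"
    ls "$path" 2>/dev/null || echo "(not found locally)"
    echo "== DFS: $path"
    "$HADOOP" dfs -ls "$path"
}
```

Calling compare_listings /tmp/in-dir from the hadoop install directory should show at a glance whether the input exists only on local disk.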

Besides, a tiny suggestion: turn off the firewall if possible. I have found
that the firewall can cause some trouble. Good luck.

Liqi Gao


Re: hadoop does not see my input file

Posted by "Erdong (Roger) CHEN" <ro...@gmail.com>.
Hi Devaraj & Victor, Thanks so much! It's working now.

Erdong (Roger) Chen
Electrical Engineering & Computer Science
MIT
Phone: (857) 998-2749
Addr: 32 Vassar St. 32-G369, Cambridge,MA, 02139


On 6/3/07, Devaraj Das <dd...@yahoo-inc.com> wrote:
> I would like to start from scratch on this one. Here are the steps I
> would like you to follow (you might already be doing all of them, but
> let's be sure we are on the same page):
> 1) Do "bin/hadoop namenode -format"
> 2) Run bin/start-dfs.sh
> 3) Check that dfs started up fine. Open http://rosetta8:50070/ in a
> browser and see whether the datanodes, rosetta9 and rosetta10, are listed.
> 4) Now run "bin/hadoop dfs -put <path-to-some-local-dir> <dfs-dir>". As an
> example, you could run "bin/hadoop dfs -put $HADOOP_HOME/conf /tmp/in-dir".
> This must not complain about anything.
> 5) Assuming that your local-dir is non-empty, you should see some entries if
> you do "bin/hadoop dfs -ls /tmp/in-dir", and that will mean that your dfs is
> working fine.
>
> If not, then do "tail <path-to-log-file-of-namenode>" and see what
> exceptions you see there. In the default settings, the log directory is
> $HADOOP_HOME/logs and the namenode log file is easily identifiable from the
> file names there. Let us know the exceptions.
>

RE: hadoop does not see my input file

Posted by Devaraj Das <dd...@yahoo-inc.com>.
I would like to start from scratch on this one. Here are the steps I
would like you to follow (you might already be doing all of them, but
let's be sure we are on the same page):
1) Do "bin/hadoop namenode -format"
2) Run bin/start-dfs.sh
3) Check that dfs started up fine. Open http://rosetta8:50070/ in a
browser and see whether the datanodes, rosetta9 and rosetta10, are listed.
4) Now run "bin/hadoop dfs -put <path-to-some-local-dir> <dfs-dir>". As an
example, you could run "bin/hadoop dfs -put $HADOOP_HOME/conf /tmp/in-dir".
This must not complain about anything.
5) Assuming that your local-dir is non-empty, you should see some entries if
you do "bin/hadoop dfs -ls /tmp/in-dir", and that will mean that your dfs is
working fine.

If not, then do "tail <path-to-log-file-of-namenode>" and see what
exceptions you see there. In the default settings, the log directory is
$HADOOP_HOME/logs and the namenode log file is easily identifiable from the
file names there. Let us know the exceptions.
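
The command-line steps above could be wrapped roughly as follows. This is only a sketch: the function names, the argument order, and the default of "." for HADOOP_HOME are assumptions made for illustration, and step 3 (the browser check) is left manual. Note that formatting the namenode discards any existing DFS metadata, so init_dfs is only appropriate on a fresh setup.

```shell
#!/bin/sh
# Sketch only: function names and the HADOOP_HOME default are hypothetical.
# The hadoop subcommands are the ones quoted in the steps above.
HADOOP_HOME="${HADOOP_HOME:-.}"

# Steps 1-2: format the namenode, then start DFS.
# WARNING: formatting discards any existing DFS metadata.
init_dfs() {
    "$HADOOP_HOME/bin/hadoop" namenode -format || return 1
    "$HADOOP_HOME/bin/start-dfs.sh"
}

# Steps 4-5: copy a local directory into DFS, then list it there.
put_and_list() {
    local_dir="$1"
    dfs_dir="$2"
    if [ ! -d "$local_dir" ]; then
        echo "local directory not found: $local_dir" >&2
        return 1
    fi
    "$HADOOP_HOME/bin/hadoop" dfs -put "$local_dir" "$dfs_dir" || return 1
    "$HADOOP_HOME/bin/hadoop" dfs -ls "$dfs_dir"
}

# Fallback: show the end of a namenode log file chosen by the caller
# (look in $HADOOP_HOME/logs to find it).
show_namenode_log() {
    tail "$1"
}
```

For example, put_and_list "$HADOOP_HOME/conf" /tmp/in-dir mirrors the example in step 4; if the listing comes back empty, pass the namenode log path to show_namenode_log.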

