Posted to user@nutch.apache.org by Dima Mazmanov <nu...@proservice.ge> on 2006/03/01 10:18:28 UTC
Crawl Exception(NullPointerException)
This is what I am doing:
./hadoop namenode &
./hadoop datanode &
./hadoop jobtracker &
./hadoop tasktracker &
(in the seeds directory I placed url.txt)
./hadoop dfs -put seeds seeds
./nutch crawl seeds -dir tmpdir -depth 3
And after all of this I get the following:
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/hadoop-default.xml
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/nutch-default.xml
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/crawl-tool.xml
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/mapred-default.xml
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/nutch-site.xml
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/hadoop-site.xml
060301 140839 Client connection to 127.0.0.1:9000: starting
060301 140839 crawl started in: crawled
060301 140839 rootUrlDir = seeds
060301 140839 threads = 10
060301 140839 depth = 3
060301 140839 Injector: starting
060301 140839 Injector: crawlDb: crawled/crawldb
060301 140839 Injector: urlDir: seeds
060301 140839 Injector: Converting injected urls to crawl db entries.
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/hadoop-default.xml
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/nutch-default.xml
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/crawl-tool.xml
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/mapred-default.xml
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/mapred-default.xml
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/nutch-site.xml
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/hadoop-site.xml
060301 140839 Client connection to 127.0.0.1:9001: starting
060301 140839 Client connection to 127.0.0.1:9000: starting
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/hadoop-default.xml
060301 140839 parsing file:/usr/home/duche/nutch-nightly/conf/hadoop-site.xml
060301 140844 Running job: job_oewx5k
060301 140845 map 0% reduce 0%
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask(JobInProgress.java:274)
    at org.apache.hadoop.mapred.JobTracker.pollForNewTask(JobTracker.java:534)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at org.apache.hadoop.ipc.RPC$1.call(RPC.java:208)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:200)
java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.mapred.$Proxy0.pollForNewTask(Unknown Source)
    at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:250)
    at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:313)
    at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:684)
Caused by: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.ipc.Client.call(Client.java:301)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:141)
    ... 4 more
060301 140844 Lost connection to JobTracker [localhost/127.0.0.1:9001]. Retrying...
060301 140849 parsing file:/usr/home/duche/nutch-nightly/conf/hadoop-default.xml
060301 140849 parsing file:/usr/home/duche/nutch-nightly/conf/mapred-default.xml
060301 140849 parsing /usr/home/duche/nutch-nightly/hadoop/mapred/local/jobTracker/job_oewx5k.xml
060301 140849 parsing file:/usr/home/duche/nutch-nightly/conf/hadoop-site.xml
java.io.IOException: Not a file: /user/root/seeds/seeds
    at org.apache.hadoop.mapred.InputFormatBase.getSplits(InputFormatBase.java:99)
    at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:125)
    at org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask(JobInProgress.java:256)
    at org.apache.hadoop.mapred.JobTracker.pollForNewTask(JobTracker.java:534)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at org.apache.hadoop.ipc.RPC$1.call(RPC.java:208)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:200)
060301 140849 Cannot create task split for job_oewx5k
060301 140849 Server handler 4 on 9001 call error: java.io.IOException: java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask(JobInProgress.java:274)
    at org.apache.hadoop.mapred.JobTracker.pollForNewTask(JobTracker.java:534)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at org.apache.hadoop.ipc.RPC$1.call(RPC.java:208)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:200)
java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.mapred.$Proxy0.pollForNewTask(Unknown Source)
    at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:250)
    at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:313)
    at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:684)
Caused by: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.ipc.Client.call(Client.java:301)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:141)
    ... 4 more
060301 140849 Lost connection to JobTracker [localhost/127.0.0.1:9001]. Retrying...
Any clues? Please help me.
I think I'm doing everything right.
If not, please explain what I'm doing wrong.
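The "Not a file: /user/root/seeds/seeds" line in the log points at a seeds directory nested inside seeds in DFS. One way that might come about, assuming "dfs -put" places the source inside a destination directory that already exists (the way cp -r does; this is an assumption and was not checked against this nightly build), is running the put more than once. A minimal local sketch of that nesting, using plain cp -r in place of hadoop and a made-up seed URL and directory name (dfs_seeds):

```shell
#!/bin/sh
# Local analogy only: cp -r stands in for "hadoop dfs -put" (assumption).
work=$(mktemp -d)
mkdir "$work/seeds"
echo "http://example.com/" > "$work/seeds/url.txt"   # hypothetical seed URL

# First copy: the destination does not exist yet, so it becomes a copy of seeds.
cp -r "$work/seeds" "$work/dfs_seeds"

# Second copy: the destination now exists, so seeds is nested inside it.
cp -r "$work/seeds" "$work/dfs_seeds"

# Shows url.txt both at the top level and under the nested seeds directory.
find "$work/dfs_seeds" -type f | sort
```

If the same thing happened in DFS, the input directory would contain a subdirectory next to url.txt, which would be consistent with the "Not a file" exception.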
----- Original Message -----
From: "Dima Mazmanov" <di...@proservice.ge>
To: <nu...@lucene.apache.org>
Sent: Tuesday, February 28, 2006 10:15 AM
Subject: Problems with hadoop
I have a problem when executing the hadoop scripts.
./start-all.sh
gives me the following:
source: not found
Password:
What does it mean? What kind of source wasn't found, and what password
must I type?
I configured ssh as described in the tutorial, but with no result.
Could you tell me what I am doing wrong?
Thanks.
Regards, Dima