Posted to dev@nutch.apache.org by "sreemanth pulagam (JIRA)" <ji...@apache.org> on 2014/05/13 08:22:15 UTC
[jira] [Created] (NUTCH-1774) Crawling from REST API giving NullPointerException
sreemanth pulagam created NUTCH-1774:
----------------------------------------
Summary: Crawling from REST API giving NullPointerException
Key: NUTCH-1774
URL: https://issues.apache.org/jira/browse/NUTCH-1774
Project: Nutch
Issue Type: Bug
Components: REST_api
Affects Versions: 2.2.1
Reporter: sreemanth pulagam
Crawling does not work when started from the REST API.
Steps to reproduce.
-----------------------
1. Start the Nutch server (port 9000).
2. Submit a PUT request to create/initiate a crawl job.
e.g.:
URL: http://localhost:9000/nutch/jobs
HTTP METHOD: PUT
Content:
{
  "crawl":"123",
  "type":"crawl",
  "conf":"default",
  "args":{
    "class":"org.apache.nutch.crawl.Crawler",
    "seed":"http://www.somesite.com",
    "seedDir":"runtime/local/url/url.txt",
    "depth":2
  }
}
3. The following exception is thrown in the Generator phase.
2014-05-13 11:37:57,863 WARN mapred.LocalJobRunner (LocalJobRunner.java:run(435)) - job_local1326997137_0002
java.lang.NullPointerException
    at org.apache.avro.util.Utf8.<init>(Utf8.java:37)
    at org.apache.nutch.crawl.GeneratorReducer.setup(GeneratorReducer.java:100)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
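
For anyone trying to reproduce this, the PUT request from step 2 can be sketched in Java roughly as follows. This is only an illustration, not part of Nutch: the class name NutchJobRequest and the --send flag are hypothetical, the Content-Type header is an assumption (the report does not say which headers were sent), and it needs Java 11 or later for java.net.http. The URL and the JSON body are taken from the steps above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper class; not part of the Nutch code base.
public class NutchJobRequest {
    // Endpoint as given in step 2 of the report.
    static final String JOBS_URL = "http://localhost:9000/nutch/jobs";

    // JSON body as given in step 2 of the report.
    static String payload() {
        return String.join("\n",
            "{",
            "  \"crawl\":\"123\",",
            "  \"type\":\"crawl\",",
            "  \"conf\":\"default\",",
            "  \"args\":{",
            "    \"class\":\"org.apache.nutch.crawl.Crawler\",",
            "    \"seed\":\"http://www.somesite.com\",",
            "    \"seedDir\":\"runtime/local/url/url.txt\",",
            "    \"depth\":2",
            "  }",
            "}");
    }

    // Builds the PUT request without sending it.
    static HttpRequest build() {
        return HttpRequest.newBuilder(URI.create(JOBS_URL))
            // Assumption: the server accepts a JSON content type.
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(payload()))
            .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = build();
        System.out.println(request.method() + " " + request.uri());

        // Only contact the server when explicitly asked to,
        // since it must already be running on port 9000 (step 1).
        if (args.length > 0 && args[0].equals("--send")) {
            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
            System.out.println(response.body());
        }
    }
}
```

Run without arguments it only prints the method and URL; run with --send against a started server it submits the job, after which the exception above would be expected in the server log if the problem reproduces.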
--
This message was sent by Atlassian JIRA
(v6.2#6252)