Posted to dev@nutch.apache.org by "sreemanth pulagam (JIRA)" <ji...@apache.org> on 2014/05/13 08:22:15 UTC

[jira] [Created] (NUTCH-1774) Crawling from REST API giving NullPointerException

sreemanth pulagam created NUTCH-1774:
----------------------------------------

             Summary: Crawling from REST API giving NullPointerException
                 Key: NUTCH-1774
                 URL: https://issues.apache.org/jira/browse/NUTCH-1774
             Project: Nutch
          Issue Type: Bug
          Components: REST_api
    Affects Versions: 2.2.1
            Reporter: sreemanth pulagam


Crawling does not work when the crawl job is started through the REST API.

Steps to reproduce.
-----------------------
1. Start the Nutch server (port 9000).
2. Submit a PUT request to create/initiate a crawl job.
   e.g.:
           URL: http://localhost:9000/nutch/jobs  
           HTTP METHOD: PUT
           Content: 
                {
                   "crawl":"123",
                   "type":"crawl",
                   "conf":"default",
                   "args":{
                      "class":"org.apache.nutch.crawl.Crawler",
                      "seed":"http://www.somesite.com",
                      "seedDir":"runtime/local/url/url.txt",
                      "depth":2
                   }
                }
3. The following exception is thrown in the Generator phase:
2014-05-13 11:37:57,863 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(435)) - job_local1326997137_0002
java.lang.NullPointerException
	at org.apache.avro.util.Utf8.<init>(Utf8.java:37)
	at org.apache.nutch.crawl.GeneratorReducer.setup(GeneratorReducer.java:100)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
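
For convenience, the PUT request from step 2 could be sent with a small Java client along the following lines. This is only a sketch: the class name CrawlJobRequest and its methods are hypothetical helpers, the Content-Type header is an assumption (the report does not say which headers were sent), and the URL and JSON body are copied from step 2.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for reproducing step 2; not part of Nutch.
public class CrawlJobRequest {

    // Endpoint taken from step 2 of the report.
    static final String JOBS_URL = "http://localhost:9000/nutch/jobs";

    // Builds the same JSON body as in step 2, without the whitespace.
    static String buildPayload() {
        return "{"
            + "\"crawl\":\"123\","
            + "\"type\":\"crawl\","
            + "\"conf\":\"default\","
            + "\"args\":{"
            + "\"class\":\"org.apache.nutch.crawl.Crawler\","
            + "\"seed\":\"http://www.somesite.com\","
            + "\"seedDir\":\"runtime/local/url/url.txt\","
            + "\"depth\":2"
            + "}}";
    }

    // Sends the body with HTTP PUT and returns the response status code.
    static int submit(String url, String json) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        // Assumption: the server accepts a JSON content type.
        conn.setRequestProperty("Content-Type", "application/json");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Requires the Nutch server from step 1 to be running on port 9000.
        int status = submit(JOBS_URL, buildPayload());
        System.out.println("HTTP status: " + status);
    }
}
```

Running main with the server from step 1 started should submit the same job as in step 2; the exception above is then expected in the server log rather than in the client output.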

--
This message was sent by Atlassian JIRA
(v6.2#6252)