Posted to dev@nutch.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/05/17 20:36:15 UTC

[jira] [Resolved] (NUTCH-1774) Crawling from REST API giving NullPointerException

     [ https://issues.apache.org/jira/browse/NUTCH-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney resolved NUTCH-1774.
-----------------------------------------

    Resolution: Won't Fix

For the time being we should not commit this patch as is. It effectively fixes a bug in the Crawler.class which no longer exists and changes GeneratorJob in order to do so.
Once we progress with work on the REST API over the summer we can reassess this issue and possibly re-open it if needed.

> Crawling from REST API giving NullPointerException
> --------------------------------------------------
>
>                 Key: NUTCH-1774
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1774
>             Project: Nutch
>          Issue Type: Bug
>          Components: REST_api
>    Affects Versions: 2.2.1
>            Reporter: sreemanth pulagam
>             Fix For: 2.3
>
>         Attachments: NUTCH-1774.patch
>
>
> Crawling is not working from REST API.
> Steps to reproduce.
> -----------------------
> 1. Start the Nutch server (port 9000).
> 2. Submit a PUT request to create/initiate the crawl job.
>    e.g.:
>            URL: http://localhost:9000/nutch/jobs  
>            HTTP METHOD: PUT
>            Content: 
>                 {
>                    "crawl":"123",
>                    "type":"crawl",
>                    "conf":"default",
>                    "args":{
>                       "class":"org.apache.nutch.crawl.Crawler",
>                       "seed":"http://www.somesite.com",
>                       "seedDir":"runtime/local/url/url.txt",
>                       "depth":2
>                    }
>                 }
> 3. The following exception is thrown in the Generator phase.
> 2014-05-13 11:37:57,863 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(435)) - job_local1326997137_0002
> java.lang.NullPointerException
> 	at org.apache.avro.util.Utf8.<init>(Utf8.java:37)
> 	at org.apache.nutch.crawl.GeneratorReducer.setup(GeneratorReducer.java:100)
> 	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
> 	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
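
For anyone wanting to retry step 2 of the report, the request could be sketched in Java roughly as below. This is only an illustration, not code from Nutch or from the attached patch: the class name NutchJobRequest, its two methods, and the Content-Type header are assumptions, while the endpoint and JSON fields are copied from the report. It uses only the JDK's HttpURLConnection.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Nutch: rebuilds the PUT request from the report.
public class NutchJobRequest {

    // Builds the JSON body shown in the report. Values are concatenated without
    // escaping, so this sketch assumes they contain no quotes or backslashes.
    public static String buildPayload(String crawlId, String seed, String seedDir, int depth) {
        return "{"
            + "\"crawl\":\"" + crawlId + "\","
            + "\"type\":\"crawl\","
            + "\"conf\":\"default\","
            + "\"args\":{"
            + "\"class\":\"org.apache.nutch.crawl.Crawler\","
            + "\"seed\":\"" + seed + "\","
            + "\"seedDir\":\"" + seedDir + "\","
            + "\"depth\":" + depth
            + "}}";
    }

    // Sends the body as an HTTP PUT and returns the response status code.
    // The Content-Type header is an assumption; the report does not mention it.
    public static int send(String endpoint, String json) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(endpoint).toURL().openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }

    public static void main(String[] args) throws IOException {
        String json = buildPayload("123", "http://www.somesite.com",
                "runtime/local/url/url.txt", 2);
        System.out.println(json);
        // Only contact a server when an endpoint is passed on the command line,
        // e.g. http://localhost:9000/nutch/jobs as in the report.
        if (args.length > 0) {
            System.out.println("HTTP " + send(args[0], json));
        }
    }
}
```

Run without arguments it only prints the JSON body; pass the jobs URL as the first argument to actually submit it to a running Nutch server.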



--
This message was sent by Atlassian JIRA
(v6.2#6252)