Posted to dev@nutch.apache.org by "Peter Lundberg (JIRA)" <ji...@apache.org> on 2010/09/17 06:17:34 UTC

[jira] Commented: (NUTCH-862) HttpClient null pointer exception

    [ https://issues.apache.org/jira/browse/NUTCH-862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910439#action_12910439 ] 

Peter Lundberg commented on NUTCH-862:
--------------------------------------

The same happens in 1.1 and 1.2rc2.
It is a simple, safe fix that should make it into 1.2. Please?

> HttpClient null pointer exception
> ---------------------------------
>
>                 Key: NUTCH-862
>                 URL: https://issues.apache.org/jira/browse/NUTCH-862
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.0.0
>         Environment: linux, java 6
>            Reporter: Sebastian Nagel
>            Priority: Minor
>         Attachments: NUTCH-862.patch
>
>
> When re-fetching a document (a continued crawl), HttpClient throws a null pointer exception, causing the document to be emptied:
> 2010-07-27 12:45:09,199 INFO  fetcher.Fetcher - fetching http://localhost/doc/selfhtml/html/index.htm
> 2010-07-27 12:45:09,203 ERROR httpclient.Http - java.lang.NullPointerException
> 2010-07-27 12:45:09,204 ERROR httpclient.Http - at org.apache.nutch.protocol.httpclient.HttpResponse.<init>(HttpResponse.java:138)
> 2010-07-27 12:45:09,204 ERROR httpclient.Http - at org.apache.nutch.protocol.httpclient.Http.getResponse(Http.java:154)
> 2010-07-27 12:45:09,204 ERROR httpclient.Http - at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:220)
> 2010-07-27 12:45:09,204 ERROR httpclient.Http - at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:537)
> 2010-07-27 12:45:09,204 INFO  fetcher.Fetcher - fetch of http://localhost/doc/selfhtml/html/index.htm failed with: java.lang.NullPointerException
> Because the document is re-fetched the server answers "304" (not modified):
> 127.0.0.1 - - [27/Jul/2010:12:45:09 +0200] "GET /doc/selfhtml/html/index.htm HTTP/1.0" 304 174 "-" "Nutch-1.0"
> No content is sent in this case (empty http body).
> Index: trunk/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/HttpResponse.java
> ===================================================================
> --- trunk/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/HttpResponse.java        (revision 979647)
> +++ trunk/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/HttpResponse.java        (working copy)
> @@ -134,7 +134,8 @@
>          if (code == 200) throw new IOException(e.toString());
>          // for codes other than 200 OK, we are fine with empty content
>        } finally {
> -        in.close();
> +        if (in != null)
> +          in.close();
>          get.abort();
>        }
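
For illustration, the effect of the null guard in the patch above can be reproduced outside Nutch. The following is only a minimal sketch, not the actual HttpResponse code: the class NullSafeCloseSketch and the helper readBody are hypothetical names, and it assumes the body stream is null when the server sends no content (as with the 304 reply above) and that the surrounding catch block swallows exceptions for codes other than 200.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class NullSafeCloseSketch {

    // Hypothetical helper, not Nutch code: reads a response body that may be
    // null (assumed here to be the case for a reply without content).
    static byte[] readBody(InputStream in, int code) throws IOException {
        byte[] content = new byte[0];
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            // Throws NullPointerException if in is null.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            content = out.toByteArray();
        } catch (Exception e) {
            // Mirrors the quoted context lines: only 200 OK must carry content.
            if (code == 200) throw new IOException(e.toString());
        } finally {
            // The guard from the patch: without it, close() on a null stream
            // throws a second NullPointerException that escapes the method.
            if (in != null) {
                in.close();
            }
        }
        return content;
    }

    public static void main(String[] args) throws IOException {
        // 304 with no body: empty content instead of an exception.
        System.out.println(readBody(null, 304).length);
        // 200 with a body: content is read as usual.
        InputStream body = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(readBody(body, 200).length);
    }
}
```

With the guard in place, a null stream on a non-200 code yields empty content, while a null stream on a 200 still surfaces as an IOException from the catch block.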
