Posted to dev@nutch.apache.org by "simone frenzel (JIRA)" <ji...@apache.org> on 2011/08/23 16:51:29 UTC
[jira] [Created] (NUTCH-1089) short compressed pages caused Exception
short compressed pages caused Exception
-----------------------------------------
Key: NUTCH-1089
URL: https://issues.apache.org/jira/browse/NUTCH-1089
Project: Nutch
Issue Type: Bug
Reporter: simone frenzel
Hi,
I tested Nutch on compressed pages, and on pages that use both Basic Auth and compression. On short compressed pages this exception is thrown:
2011-08-19 17:06:55,190 ERROR httpclient.Http - java.io.IOException: unzipBestEffort returned null
2011-08-19 17:06:55,190 ERROR httpclient.Http - at org.apache.nutch.protocol.http.api.HttpBase.processGzipEncoded(HttpBase.java:310)
2011-08-19 17:06:55,191 ERROR httpclient.Http - at org.apache.nutch.protocol.httpclient.HttpResponse.<init>(HttpResponse.java:163)
2011-08-19 17:06:55,191 ERROR httpclient.Http - at org.apache.nutch.protocol.httpclient.Http.getResponse(Http.java:154)
2011-08-19 17:06:55,191 ERROR httpclient.Http - at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:138)
2011-08-19 17:06:55,191 ERROR httpclient.Http - at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:628)
In some cases Basic Auth also fails.
Everything works fine with the patch.
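
For readers who want to see how a "returned null" failure of this kind can arise, here is a minimal sketch. It is not the Nutch code and not the attached patch: the class ShortGzipSketch and the helpers gzip and gunzipBestEffort are hypothetical names made up for illustration, built only on java.util.zip. The assumption is that a best-effort gunzip helper returns null when it cannot decompress any bytes, for example when the fetched body arrives empty.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ShortGzipSketch {

    /**
     * Hypothetical best-effort gunzip (an assumption, not Nutch's implementation):
     * returns whatever could be decompressed, or null if nothing could.
     */
    static byte[] gunzipBestEffort(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            // Best effort: keep the bytes decompressed before the failure.
        }
        return out.size() == 0 ? null : out.toByteArray();
    }

    /** Helper that gzips a short string so the sketch is self-contained. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // A complete short gzip body decompresses normally.
        byte[] body = gzip("short page");
        System.out.println(
            new String(gunzipBestEffort(body), StandardCharsets.UTF_8));

        // An empty body (nothing was read from the response) yields null,
        // which a caller might then report as an IOException.
        System.out.println(gunzipBestEffort(new byte[0]) == null);
    }
}
```

Under these assumptions the first call prints "short page" and the second prints "true", which suggests that checking how much of a short response body was actually read is a sensible first step when this exception shows up.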
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (NUTCH-1089) short compressed pages caused Exception
Posted by "simone frenzel (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
simone frenzel updated NUTCH-1089:
----------------------------------
Attachment: HttpResponsePatch.patch
[jira] [Resolved] (NUTCH-1089) short compressed pages caused Exception
Posted by "Julien Nioche (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julien Nioche resolved NUTCH-1089.
----------------------------------
Resolution: Fixed
1.4: committed revision 1160753.
trunk: committed revision 1160754.
Thanks Simone!
[jira] [Closed] (NUTCH-1089) short compressed pages caused Exception
Posted by "Julien Nioche (Closed) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julien Nioche closed NUTCH-1089.
--------------------------------
NUTCH-1089, NUTCH-990 and NUTCH-1112 were all related to the same issue, which has been fixed thanks to Simone's patch.