Posted to dev@nutch.apache.org by Rohit Kulkarni <ro...@gmail.com> on 2005/04/24 00:24:43 UTC

Possible bug in HttpResponse.java in protocol-http plugin

Hello everyone,

In the protocol-http plugin, in HttpResponse.java, line 148:

If the response is gzip-compressed, the code below uncompresses it, but I
think the "Content-Length" property is not being reset to the uncompressed
length. I found that the plugins that are called get the compressed file
size as the value of Content-Length.
I also observed that content.length is set to 0.

      String contentEncoding = getHeader("Content-Encoding");
      if ("gzip".equals(contentEncoding) || "x-gzip".equals(contentEncoding)) {
        Http.LOG.fine("uncompressing....");
        byte[] compressed = content;

        content = GZIPUtils.unzipBestEffort(compressed, Http.MAX_CONTENT);

        if (content == null)
          throw new HttpException("unzipBestEffort returned null");

        if (Http.LOG.isLoggable(Level.FINE))
          Http.LOG.fine("fetched " + compressed.length
                        + " bytes of compressed content (expanded to "
                        + content.length + " bytes) from " + url);
      } else {
        if (Http.LOG.isLoggable(Level.FINE))
          Http.LOG.fine("fetched " + content.length + " bytes from " + url);
      }
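
To make the suggestion concrete, here is a minimal sketch of resetting
Content-Length after uncompressing. It is not the Nutch code: the class
ContentLengthSketch, the headers map, and the gzip/gunzip helpers are
made up for this example, and plain java.util.zip stands in for
GZIPUtils.unzipBestEffort (so there is no size limit like Http.MAX_CONTENT).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for the response handling; names are assumptions.
class ContentLengthSketch {

  // Uncompress a gzip byte array using java.util.zip (assumed replacement
  // for GZIPUtils.unzipBestEffort, without the "best effort" behaviour).
  static byte[] gunzip(byte[] compressed) {
    try (GZIPInputStream in =
             new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[4096];
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
      return out.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Helper used only to build test input for this example.
  static byte[] gzip(byte[] raw) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(raw);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }

  // If the body is gzip-encoded, uncompress it and overwrite Content-Length
  // with the uncompressed size so later consumers see a consistent value.
  static byte[] decodeAndFixHeaders(byte[] content, Map<String, String> headers) {
    String contentEncoding = headers.get("Content-Encoding");
    if ("gzip".equals(contentEncoding) || "x-gzip".equals(contentEncoding)) {
      content = gunzip(content);
      headers.put("Content-Length", String.valueOf(content.length));
    }
    return content;
  }

  public static void main(String[] args) {
    byte[] raw = "hello hello hello hello".getBytes(StandardCharsets.UTF_8);
    byte[] compressed = gzip(raw);

    Map<String, String> headers = new HashMap<>();
    headers.put("Content-Encoding", "gzip");
    headers.put("Content-Length", String.valueOf(compressed.length));

    byte[] content = decodeAndFixHeaders(compressed, headers);
    System.out.println("uncompressed bytes: " + content.length);
    System.out.println("Content-Length header: " + headers.get("Content-Length"));
  }
}
```

The sketch leaves the Content-Encoding header untouched; whether it should
also be cleared once the body is uncompressed is a separate question.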

I hope I am not wrong. Please advise.

 Rohit