Posted to user@nutch.apache.org by Gal Nitzan <gn...@usa.net> on 2005/09/25 08:19:14 UTC

Response content length is not known

Hi,

I find many messages in my log file saying: Response content length is not known

When I check the headers, I see that the Content-Length header is 
missing.

Was the page fetched and parsed?

Regards,

Gal

Re: Response content length is not known

Posted by Gal Nitzan <gn...@usa.net>.
Andrzej Bialecki wrote:
> gekkokid wrote:
>> might have been a large file size, too much to download maybe
>
> No. This comes from the httpclient library, when the response length 
> is not known ;-) i.e. when it is not reported in the server response, 
> as it should. This means that there is a danger that the response size 
> may exceed available resources or limits.
>
>
What does Nutch do in that case? Does it save and index the page?

Besides that message, Nutch doesn't seem to be complaining.

Thanks,

Gal

Re: Response content length is not known

Posted by Andrzej Bialecki <ab...@getopt.org>.
gekkokid wrote:
> might have been a large file size, too much to download maybe

No. This message comes from the httpclient library when the response 
length is not known ;-) i.e. when the server does not report it in the 
response, as it should. This means there is a danger that the response 
size may exceed available resources or limits.
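
To illustrate the risk described above: when a response carries no Content-Length, the client cannot know the body size up front, so one common defence is to stop reading after a fixed number of bytes. The following is only a minimal sketch of that idea using plain java.io streams. It is not the code used by Nutch or by httpclient, and the names BoundedBodyReader, Result and readAtMost are hypothetical, chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of Nutch or httpclient): reads a body of
// unknown length but never keeps more than a caller-supplied byte limit.
final class BoundedBodyReader {

    /** Bytes that were kept, plus a flag telling whether the body was cut short. */
    static final class Result {
        final byte[] bytes;
        final boolean truncated;

        Result(byte[] bytes, boolean truncated) {
            this.bytes = bytes;
            this.truncated = truncated;
        }
    }

    /** Reads until end of stream or until maxBytes have been kept, whichever comes first. */
    static Result readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int remaining = maxBytes;
        while (remaining > 0) {
            // Never request more than the remaining budget.
            int n = in.read(buf, 0, Math.min(buf.length, remaining));
            if (n == -1) {
                // End of stream reached before the limit: the body is complete.
                return new Result(out.toByteArray(), false);
            }
            out.write(buf, 0, n);
            remaining -= n;
        }
        // Limit reached: probe one more byte to see whether data was left behind.
        boolean truncated = in.read() != -1;
        return new Result(out.toByteArray(), truncated);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a response body whose length was not announced.
        byte[] body = "0123456789".getBytes("US-ASCII");
        Result r = readAtMost(new ByteArrayInputStream(body), 4);
        System.out.println(r.bytes.length + " bytes kept, truncated=" + r.truncated);
    }
}
```

With a cap like this, a missing Content-Length no longer means unbounded memory use. The caller still has to decide what to do with a truncated body, for example skip it or process only the part that was read.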


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: Response content length is not known

Posted by gekkokid <me...@gekkokid.org.uk>.
It might have been a large file, maybe too much to download.

_gk
----- Original Message ----- 
From: "Gal Nitzan" <gn...@usa.net>
To: <nu...@lucene.apache.org>
Sent: Sunday, September 25, 2005 7:19 AM
Subject: Response content length is not known


> Hi,
> 
> I find many messages: Response content length is not known
> 
> in my log file. When checking the headers I see that content-length does 
> not exist.
> 
> Was the page fetched and parsed?
> 
> Regards,
> 
> Gal
>