Posted to user@nutch.apache.org by kazam <az...@gmail.com> on 2009/04/27 18:25:56 UTC
Nutch fetch creates too many http sessions
Hi there,
I am generating Nutch indexes for our site, which runs on a WebSphere
server. The indexing takes about 20 hours to complete. However, after about
15-16 hours the WebSphere server crashes because too many sessions have been
created.
It seems that each fetch creates a new session. Is there a way to make all
Nutch fetches go through a single session?
Has anyone else encountered this problem? All ideas are welcome.
Thanks.
--
View this message in context: http://www.nabble.com/Nutch-fetch-creates-too-many-http-sessions-tp23259993p23259993.html
Sent from the Nutch - User mailing list archive at Nabble.com.
Re: Nutch fetch creates too many http sessions
Posted by kazam <az...@gmail.com>.
Thanks Dennis, you are right. I have increased the RAM for the web server,
raised the number of allowed sessions, and reduced the session timeout.
Hopefully this will allow the indexing to complete.
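To see why a shorter timeout helps, here is a rough back-of-the-envelope sketch. It assumes every fetch opens a session that is never reused, so each session stays alive for exactly the timeout. The fetch rate and both timeout values are made-up numbers for illustration; the thread does not state the real ones.

```java
public class SessionLoadEstimate {

    /**
     * Sessions alive at steady state when every fetch opens a session that is
     * never reused: each one lives for exactly the timeout, so
     * live = fetchesPerSecond * timeoutSeconds.
     */
    public static long liveSessions(double fetchesPerSecond, long timeoutSeconds) {
        return Math.round(fetchesPerSecond * timeoutSeconds);
    }

    public static void main(String[] args) {
        double rate = 2.0; // hypothetical crawl rate, not taken from the thread

        // Hypothetical timeouts of 30 minutes and 5 minutes.
        System.out.println("30 min timeout: " + liveSessions(rate, 30 * 60) + " live sessions");
        System.out.println(" 5 min timeout: " + liveSessions(rate, 5 * 60) + " live sessions");
    }
}
```

With these assumed numbers the estimate drops from 3600 to 600 live sessions, which is why shortening the timeout reduces memory pressure even though a new session is still created on every fetch.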
Re: Nutch fetch creates too many http sessions
Posted by Dennis Kubes <ku...@apache.org>.
This seems to be more of a session-handling issue on the WebSphere
server than a Nutch fetching issue. Nutch doesn't actually create the
session; it just doesn't store cookies or session information, so
WebSphere creates a new session for every fetch.
While a single stored session for fetching the same domain in
Nutch seems like it might be interesting functionality, I don't believe
that currently exists. My suggestion is to look into tuning the WebSphere
session timeouts. My guess is that they are set to a very high value.
Dennis
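The mechanism described above can be illustrated with a small, self-contained simulation. This is only a sketch: ToyServer is a hypothetical stand-in for a session-tracking server and is neither WebSphere nor Nutch code. It hands out a new session whenever a request arrives without a known session cookie, which is the behaviour assumed in the explanation.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

public class SessionPerFetchSketch {

    /** Toy stand-in for a session-tracking server (hypothetical, not WebSphere code). */
    static class ToyServer {
        private final Set<String> liveSessions = new HashSet<>();
        private int nextId = 0;

        /** Returns the session id the server would send back as a cookie. */
        String handle(Optional<String> sessionCookie) {
            if (sessionCookie.isPresent() && liveSessions.contains(sessionCookie.get())) {
                return sessionCookie.get();      // known client: reuse its session
            }
            String id = "S" + (nextId++);
            liveSessions.add(id);                // no usable cookie: allocate a new session
            return id;
        }

        int liveSessionCount() {
            return liveSessions.size();
        }
    }

    /**
     * Simulates fetching the given number of pages from one server.
     * If keepCookie is false, the client forgets the cookie after every response.
     */
    public static int sessionsCreated(int pages, boolean keepCookie) {
        ToyServer server = new ToyServer();
        Optional<String> cookie = Optional.empty();
        for (int i = 0; i < pages; i++) {
            String id = server.handle(cookie);
            cookie = keepCookie ? Optional.of(id) : Optional.empty();
        }
        return server.liveSessionCount();
    }

    public static void main(String[] args) {
        System.out.println("cookie dropped: " + sessionsCreated(1000, false) + " sessions");
        System.out.println("cookie kept:    " + sessionsCreated(1000, true) + " sessions");
    }
}
```

In this toy model, a client that drops the cookie ends up with one session per page fetched, while a client that sends it back stays at a single session. Sessions never expire here, so the model corresponds to a very long timeout.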