Posted to users@cocoon.apache.org by Sal Mangano <sa...@into-technology.com> on 2004/08/05 21:44:39 UTC

Preventing session from timing out during search indexing

I am using Cocoon 2.1.5 and Tomcat 4.1.3

My site is constructed such that a user must be logged in to access old
content. A protected pipeline is set up using <map:match
type="regexp-session" .../> to control access. This all works fine.

However, when it comes time to build my Lucene search index, trouble begins.
On my dev box the search index can take an hour to build. Since indexing
requires access to these protected pipelines, the session must stay valid
until the indexing is done. I use an XSP to kick off the search indexing,
and the relevant part looks like:

      <!-- Make sure session does not expire before indexing is finished -->
      <xsp-session:set-max-inactive-interval interval="-1"/>
      <xsp-session:set-attribute name="role">USER PUBLISHER</xsp-session:set-attribute>
      createIndex(baseURL, create);

As I tail the access.log I can see the index building process going along
on its merry way for a time. Then all of a sudden I see all accesses
being redirected to the URL restricted.html, which is exactly what happens
when there is no session or the session has timed out.

Why is this not fixed by <xsp-session:set-max-inactive-interval
interval="-1"/> or <xsp-session:set-max-inactive-interval interval="8000"/>?

Any hints or alternate strategies would be appreciated. 

-Sal

---------------------------------------------------------
Sal Mangano
Into Technology Inc.
www.into-technology.com

Use XSLT? Try the XSLT Cookbook
http://www.oreilly.com/catalog/xsltckbk/  


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org


RE: Preventing session from timing out during search indexing

Posted by Sal Mangano <sm...@ureach.com>.
So as usual when I post a question to this list I end up finding the
solution myself, after much pain ... sigh.

The solution is nice because it solves both the Lucene problem and the
problem of external robots, like Google, trying to index the site.
Basically I treat the user-agent name of the robot as a user ID. (This
allows me to control which robots index the site and what they see, which has
other nice side effects that I won't get into here because they are peculiar
to my situation.) In any case, the robot is authenticated in the normal way
and the information is stored in the session. I then arrange for all URLs that
the robot will crawl to be URL-encoded with the session ID. This is done
using an XSLT transform on <a href="someurl"> elements.
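
A minimal sketch of what such a transform might look like, not the actual
stylesheet from my site. It assumes the session ID reaches the stylesheet as
a parameter (the name session-id is hypothetical, as is however the pipeline
supplies it), that the <a> elements are in no namespace, and that the servlet
container accepts the usual ;jsessionid= path parameter for URL rewriting.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: the current session ID, supplied by the pipeline. -->
  <xsl:param name="session-id"/>

  <!-- Identity template: copy everything else through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rewrite the href attribute of every <a> element. -->
  <xsl:template match="a/@href">
    <xsl:variable name="suffix" select="concat(';jsessionid=', $session-id)"/>
    <xsl:attribute name="href">
      <xsl:choose>
        <!-- Leave the link alone when there is no session ID, or when the link
             is absolute, fragment-only, a mailto:, or already rewritten. -->
        <xsl:when test="string($session-id) = ''
                        or contains(., '://')
                        or starts-with(., '#')
                        or starts-with(., 'mailto:')
                        or contains(., ';jsessionid=')">
          <xsl:value-of select="."/>
        </xsl:when>
        <!-- The session ID goes before any query string. -->
        <xsl:when test="contains(., '?')">
          <xsl:value-of select="concat(substring-before(., '?'), $suffix,
                                       '?', substring-after(., '?'))"/>
        </xsl:when>
        <!-- With no query string, it goes before any fragment identifier. -->
        <xsl:when test="contains(., '#')">
          <xsl:value-of select="concat(substring-before(., '#'), $suffix,
                                       '#', substring-after(., '#'))"/>
        </xsl:when>
        <!-- Plain relative link: just append it. -->
        <xsl:otherwise>
          <xsl:value-of select="concat(., $suffix)"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```

With a session ID of ABC123, a link such as <a href="archive/page.html?id=7">
would come out as <a href="archive/page.html;jsessionid=ABC123?id=7">, so a
crawler that ignores cookies still carries the session from page to page.
Absolute links are skipped on purpose so the session ID is not leaked to
other sites.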




RE: Preventing session from timing out during search indexing

Posted by Sal Mangano <sm...@ureach.com>.
Okay, further investigation shows that the session is not timing out. The
problem is that the indexer crawling the site is not attached to the session.
I am still not sure how to fix this but have some ideas. I would appreciate
help just the same, PLEASE!
