Posted to user@turbine.apache.org by Heiko Braun <he...@fork.de> on 2001/11/13 20:55:14 UTC

[OT] search engine & session id's

This might be slightly off topic,
but I am facing a problem with site-internal search engines
and I couldn't find an answer in the archives.

When crawling the pages with a standard search engine
(like ht:dig), the session IDs end up encoded in the search index,
which makes the indexed links unusable.

One possible solution would be to make Turbine deliver the session IDs
as a query parameter (like "?session=23847") instead of as a prefix in front
of the parameters, which is what it does right now. That would make it possible
to get rid of the ID on the search engine side.

Is there a way to change the mechanism that encodes the URLs?
If so, which classes does it affect? Or does someone already have a working solution?

Any help appreciated,

-- 

 heiko braun, fork unstable media
 http://www.unstablemedia.com

--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [OT] search engine & session id's

Posted by Heiko Braun <he...@fork.de>.
The problem is that the IDs show up in the search results.
Every indexed link carries the session ID from when the
engine crawled the pages.
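
One way to clean this up on the indexing side is sketched below. It assumes
the servlet container appends the session ID as a ";jsessionid=<id>" path
parameter, which is the name the Servlet specification uses for URL rewriting;
older or differently configured engines may use another name. The class name
SessionIdStripper and the method strip are hypothetical and are not part of
Turbine, Tomcat, or ht:dig.

```java
import java.util.regex.Pattern;

// Hypothetical helper: removes the session ID path parameter from a URL
// so that a crawler or indexer stores a session-neutral link.
final class SessionIdStripper {

    // Assumption: the ID is carried as ";jsessionid=<id>". The ID is taken
    // to run up to the next '?', '#', ';' or '/' (or the end of the URL).
    private static final Pattern SESSION_PARAM =
            Pattern.compile(";jsessionid=[^?#;/]*", Pattern.CASE_INSENSITIVE);

    private SessionIdStripper() {
        // static utility, not meant to be instantiated
    }

    // Returns the URL with every session ID path parameter removed.
    // URLs that carry no session ID are returned unchanged.
    static String strip(String url) {
        return SESSION_PARAM.matcher(url).replaceAll("");
    }
}
```

Such a filter would have to run wherever the crawled URLs are normalized
before they are written to the index; how to hook it into a particular
search engine is not covered here.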

I always thought that the servlet container gets into trouble
when an active session accesses a URL which carries
a foreign session ID.

heiko

On Tue, 13 Nov 2001 15:37:07 -0800
John McNally <jm...@collab.net> wrote:

> tomcat (or whatever servlet engine you are using) is encoding the urls. 
> Why is it a problem to save the session id?
> 
> john mcnally


Re: [OT] search engine & session id's

Posted by John McNally <jm...@collab.net>.
Tomcat (or whatever servlet engine you are using) is encoding the URLs.
Why is it a problem to save the session ID?

john mcnally
