Posted to user@manifoldcf.apache.org by Julien Massiera <ju...@francelabs.com> on 2019/06/03 14:26:16 UTC

Web connector empty session cookie cache

Hi all,

I have been testing the Web connector, and after trying several
configurations of my job to crawl a session-based website, I noticed
that one configuration was not working. I debugged the job and found
that the connector was using the wrong session cookie: it was the one
from the previous configuration of the job. I looked for a way to clear
the session cookie cache, but the only options I found were to delete
and recreate the connector and the job, or to manually empty the
PostgreSQL table containing the saved cookies.

Did I miss an easy way to clear the session cookie cache? If not,
wouldn't it make sense to add a button for it?

Regards,
Julien


RE: Web connector empty session cookie cache

Posted by Julien <ju...@francelabs.com>.
Hi Karl,

I understand, I’ll check that.

Thanks
Julien

From: Karl Wright
Sent: Monday, June 3, 2019 20:37
To: user@manifoldcf.apache.org
Subject: Re: Web connector empty session cookie cache

Hi Julien,

When the session-based web crawl detects entry into a login sequence, the session cookies are cleared at that point. Essentially, your symptom means that your login sequence setup is incomplete. If you make it detect the case where the session cookie is wrong, everything should work properly.

This is necessary in any case because sessions do expire and there is otherwise no way to recover from that during a crawl.

Karl


Re: Web connector empty session cookie cache

Posted by Karl Wright <da...@gmail.com>.
Hi Julien,

When the session-based web crawl detects entry into a login sequence, the
session cookies are cleared at that point. Essentially, your symptom means
that your login sequence setup is incomplete. If you make it detect the
case where the session cookie is wrong, everything should work properly.

This is necessary in any case because sessions do expire and there is
otherwise no way to recover from that during a crawl.

Karl
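
As a rough illustration of the behaviour described above, here is a minimal Python sketch. It is not ManifoldCF code: SessionCrawler, Response, fake_site, the /login URL, the SESSION cookie name and the login_page_pattern parameter are all hypothetical names invented for this example. It assumes the login page can be recognised by a regular expression applied to the response body.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Response:
    status: int
    body: str


# Hypothetical stand-in for a website: takes (url, cookies), returns a Response.
Site = Callable[[str, Dict[str, str]], Response]


@dataclass
class SessionCrawler:
    site: Site
    login_url: str
    login_page_pattern: str  # regex that identifies the login page (assumption)
    cookies: Dict[str, str] = field(default_factory=dict)

    def entered_login_sequence(self, resp: Response) -> bool:
        # The crawler only knows it was bounced to login if this pattern matches.
        return re.search(self.login_page_pattern, resp.body) is not None

    def login(self) -> None:
        # Entering the login sequence: drop stale cookies first, then log in.
        self.cookies.clear()
        self.site(self.login_url, self.cookies)

    def fetch(self, url: str) -> Response:
        resp = self.site(url, self.cookies)
        if self.entered_login_sequence(resp):
            self.login()
            resp = self.site(url, self.cookies)  # retry with the fresh session
        return resp


def fake_site(url: str, cookies: Dict[str, str]) -> Response:
    # Toy server: /login issues a fresh cookie, anything else requires it.
    if url == "/login":
        cookies["SESSION"] = "fresh"
        return Response(200, "Welcome")
    if cookies.get("SESSION") != "fresh":
        return Response(200, "<form id='login'>Please sign in</form>")
    return Response(200, "protected content")


# Login sequence that recognises the login page: a stale cookie is replaced.
complete = SessionCrawler(fake_site, "/login", r"id='login'", {"SESSION": "stale"})
complete_result = complete.fetch("/doc")

# Login sequence whose pattern never matches: the stale cookie is kept.
incomplete = SessionCrawler(fake_site, "/login", r"no-such-marker", {"SESSION": "stale"})
incomplete_result = incomplete.fetch("/doc")
```

With a pattern that matches the login page, a crawler that starts with a stale cookie recovers on its own. With a pattern that never matches, it keeps sending the stale cookie and only ever sees the login form, which is the symptom reported at the top of this thread.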

