Posted to users@tapestry.apache.org by an...@di.uoa.gr on 2005/11/16 00:17:53 UTC

Re: {SPAM?} Re: Best way to screenscrape tapestry

Original message from Ted Steen <te...@gmail.com>:

> There should not be any problems with the Cache component if one uses
> cookies, right?

True, but can you really make sure that your clients never disable them?

The fact is that, up to now, I have tried not to mix the cache with
components that generate links.
Imagine caching a contrib:Table component, which generates pagination links.
If you use a simple key like 'myTable' for the cache, then the link to the
second page of the table will work, but the data displayed will be the same
(since the cache was filled while rendering the first page, and it has no way
of knowing that you now want to see the second!)

In those cases, you'd have to use a key that somehow contains all the
parameters (that may change) of all the enclosed components.
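
To illustrate the composite-key idea, here is a minimal Java sketch. The
class and method names (CacheKeys, compositeKey) and the parameter names are
made up for illustration and are not part of TapFX or Tapestry. The only
point is that everything that can change the rendered output has to end up
in the key.

```java
import java.util.Map;
import java.util.TreeMap;

public class CacheKeys {
    // Hypothetical helper: builds a cache key from a base name plus every
    // parameter that can change the rendered output.
    public static String compositeKey(String base, Map<String, Object> params) {
        StringBuilder key = new StringBuilder(base);
        // Copying into a TreeMap sorts the parameter names, so the same set
        // of parameters always produces the same key.
        for (Map.Entry<String, Object> e : new TreeMap<String, Object>(params).entrySet()) {
            key.append('|').append(e.getKey()).append('=').append(e.getValue());
        }
        return key.toString();
    }
}
```

With parameters page=2 and sort=name, the key for 'myTable' becomes
myTable|page=2|sort=name, so the second page of the table no longer collides
with the first one in the cache.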

Even though these problems seem difficult to solve, I'm certain that a general
solution exists.
It would allow us to create a really great cache component.
(I have to admit that up to now I've used quick hacks, such as playing with
the cache key. They get the job done: complex page rendering has gone from
400ms to 3 or 4ms. BUT they are not general purpose and need lots of
tweaking.)
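
For anyone wondering where a speedup of that size comes from, the core
mechanism can be sketched in a few lines of Java: render once, store the
markup under a key, and return the stored markup afterwards. This is only an
illustration, not how the TapFX component is implemented; it uses a plain
in-memory map instead of EhCache, and FragmentCache, getOrRender, isCached
and clear are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

public class FragmentCache {
    // Rendered markup, indexed by cache key (in-memory stand-in for EhCache).
    private static final ConcurrentMap<String, String> CACHE =
            new ConcurrentHashMap<String, String>();

    // Returns the cached markup for the key; the renderer is only invoked
    // when nothing has been stored under that key yet.
    public static String getOrRender(String key, Supplier<String> renderer) {
        return CACHE.computeIfAbsent(key, k -> renderer.get());
    }

    // Reports whether markup is currently stored under the key.
    public static boolean isCached(String key) {
        return CACHE.containsKey(key);
    }

    // Drops the entry so that the next request renders again.
    public static void clear(String key) {
        CACHE.remove(key);
    }
}
```

The sketch also makes the pagination problem visible: as long as the key
stays the same, the renderer is never asked again, whatever the request
parameters are.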

> Problem is that sometimes I see a jsessionid=xxx even though I use cookies.
> Does anyone know why?
> 
> On 11/15/05, andyhot@di.uoa.gr <an...@di.uoa.gr> wrote:
> > > Wow!
> > >
> > > This is a really powerful component!
> > > Simple and powerful.
> >
> > Well, thanks, but you have to take care when using it.
> >
> > For instance, you'll run into problems if you want to support sessions
> > with URL rewriting (instead of cookies) and you also want to cache HTML
> > content that contains direct, page, or external links, because the
> > JSESSIONID parameter of every link will also be cached - and returned
> > to every user!
> >
> > I'm sure there's a way to create a smarter version of the cache which
> > will take these issues into account and regenerate (only) the links
> > every time it's used, but I'll have to investigate this further.
> >
> > If someone has done this before and wants to extend and/or contribute
> > to this component, he's more than welcome to do so :)
> >
> >
> > > I think it should be part of the Tapestry Framework.
> > >
> > >
> > > On 11/15/05, andyhot@di.uoa.gr <an...@di.uoa.gr> wrote:
> > > > I also needed to cache parts of some pages, and so I created the
> > > > cache component of TapFX (tapfx.sf.net), which uses EhCache. I use
> > > > this component a lot in my own apps.
> > > >
> > > > However, the latest version for Tapestry 3 contains a bug (so get
> > > > v0.30 instead), and the update for Tapestry beta-13 has broken the
> > > > Tapestry 4 version :)
> > > >
> > > > I have fixed both in CVS, so expect a new release later today or
> > > > tomorrow morning!
> > > >
> > > > The idea is that you surround the content to be cached with a
> > > > <span jwcid="@Cache" key="report1">
> > > > and then you define the report1 cache in the ehcache.xml
> > > > configuration file. The cache component also has public static
> > > > methods for clearing or querying a cache. Finally, take a look at
> > > > this FAQ:
> > > > http://tapfx.sourceforge.net/multiproject/tapfx-components/faq.html
> > > >
> > > > Hope this helps...
> > > >
> > > >
> > > > From Patrick Casey <pa...@adelphia.net>:
> > > >
> > > > >
> > > > >
> > > > >             I've got a couple of pages that change very rarely, but
> are
> > > > > rather expensive to generate (lots of conditional logic, db lookups,
> > > etc).
> > > > > I'd like to have tapestry generate them once, capture the output,
> and
> > > then
> > > > > either serve them as static pages, or at least serve them out of
> > > internal
> > > > > cache rather than going through the whole render cycle over again.
> > > > >
> > > > >
> > > > >
> > > > >             What's the best way to go about doing this in a Tapestry
> > > > > friendly fashion?
> > > > >
> > > > >
> > > > >
> > > > >             --- Pat
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: tapestry-user-unsubscribe@jakarta.apache.org
> > > > For additional commands, e-mail: tapestry-user-help@jakarta.apache.org




Re: {SPAM?} Re: Best way to screenscrape tapestry

Posted by Markus Joschko <ma...@gmail.com>.
Is the session ID really a problem? Shouldn't the web container strip
the jsessionid before passing the request to the web application? Only
the container is interested in the jsessionid, in order to assign the
session to a request.
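
On the outgoing side, as far as I know, the container is also the one that
adds the ID: with URL rewriting, HttpServletResponse.encodeURL appends
;jsessionid=... to links whenever it cannot yet tell that the client accepts
cookies, so the ID ends up in the rendered markup, and that markup is what
gets cached. One blunt workaround, sketched below in Java, is to strip the
path parameter from the markup before storing it. SessionIdStripper and
strip are hypothetical names, not part of Tapestry or TapFX, and the regular
expression assumes the ID is terminated by a quote, '?', '#', '>' or
whitespace.

```java
import java.util.regex.Pattern;

public class SessionIdStripper {
    // Matches the ";jsessionid=..." path parameter up to (not including)
    // the query string, fragment, closing quote, '>' or whitespace.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^?#\"'\\s>]*", Pattern.CASE_INSENSITIVE);

    // Returns the markup with every jsessionid path parameter removed.
    public static String strip(String html) {
        return JSESSIONID.matcher(html).replaceAll("");
    }
}
```

The catch is that links cleaned this way no longer carry the session for
clients without cookies, so this only helps when losing the session on those
links is acceptable.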

markus

On 11/16/05, Ted Steen <te...@gmail.com> wrote:
> Well, a general Cache component would be gold!
> Another thing that comes to mind is that if the cache component's body
> contains some kind of state (e.g. when persisting something
> client-side), things could get messed up.
> So I guess you should avoid both links and client-side state.
>
>

Re: {SPAM?} Re: Best way to screenscrape tapestry

Posted by Ted Steen <te...@gmail.com>.
Well, a general Cache component would be gold!
Another thing that comes to mind is that if the cache component's body
contains some kind of state (e.g. when persisting something
client-side), things could get messed up.
So I guess you should avoid both links and client-side state.




--
/ted