Posted to dev@sling.apache.org by Felix Meschberger <Fe...@day.com> on 2010/04/28 13:13:23 UTC

Caching Support Request Filter

Hi all,

I have been talking with a colleague about a request-level Filter
for Sling to support caching.

The idea (partly implemented in a prototype) is to have the
request filter set up default caching behaviour for the response (if the
response is cacheable at all, that is, the request method must be GET and
there are no request parameters):

* The Cache-Control header is preset with values from configuration
matching the request URI (or resource path)
* The Last-Modified header is preset with the jcr:lastModified
property of the request's resource
* Eager responding with 304/NOT MODIFIED if the If-Modified-Since
header is set and a last modification time of the resource can be
resolved.
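
A minimal sketch of the two checks described above, written in Python
rather than against the Sling API. The function names are hypothetical,
and the resource's last modification time is assumed to be a
timezone-aware datetime (for example, derived from jcr:lastModified).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def is_cacheable(method: str, query_string: str) -> bool:
    """Cacheable per the proposal: a GET request without request parameters."""
    return method.upper() == "GET" and not query_string


def not_modified(if_modified_since: Optional[str],
                 last_modified: Optional[datetime]) -> bool:
    """Return True if an eager 304 could be sent without rendering anything."""
    if not if_modified_since or last_modified is None:
        return False
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return False  # unparsable header: fall through to normal processing
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop sub-second precision.
    return last_modified.replace(microsecond=0) <= since
```

When either value is missing or unparsable the sketch declines to answer
304, so the request is simply processed as usual.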

All these settings can be overwritten/replaced by the request
processing scripts but it at least sets some defaults for caching:

* Caching of the response enabled at all
* Whether revalidation by clients and/or proxies is required
* etc...

Configuration could be something like:
* Enable/Disable caching support functionality completely
* List of URL (or resource path) regexps with their setup, e.g.
       ^/(apps|libs)\/.*=must-revalidate
* Enable/Disable support for eager 304 responses
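
A rough sketch of how such a "regexp=value" list could be evaluated,
in Python for illustration only. The entry syntax is taken from the
example above; splitting on the first '=' (so that values such as
max-age=3600 work, but regexps may not contain '=') and the
first-match-wins rule are assumptions, not something the prototype is
known to do.

```python
import re
from typing import List, Optional, Pattern, Tuple

Rule = Tuple[Pattern[str], str]


def parse_rules(entries: List[str]) -> List[Rule]:
    """Parse '<regexp>=<Cache-Control value>' entries, keeping their order."""
    rules = []
    for entry in entries:
        # Assumption: the first '=' separates the regexp from the header value.
        pattern, sep, value = entry.partition("=")
        if not sep or not pattern:
            raise ValueError(f"expected '<regexp>=<value>', got {entry!r}")
        rules.append((re.compile(pattern), value.strip()))
    return rules


def cache_control_for(path: str, rules: List[Rule]) -> Optional[str]:
    """Return the value of the first rule whose regexp matches the path."""
    for pattern, value in rules:
        if pattern.search(path):
            return value
    return None  # no rule matched: leave the header unset
```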

This would probably just be a first step in a new infrastructure
around supporting response caching.

WDYT ?

Regards
Felix

Re: Caching Support Request Filter

Posted by Vidar Ramdal <vi...@idium.no>.
On Wed, Apr 28, 2010 at 5:23 PM, Felix Meschberger <fm...@gmail.com> wrote:
> Hi,
>
> Wow !
>
> I didn't expect to have this discussion get in this direction, but excellent !
>
> To illustrate what I originally had in mind, I have committed my
> prototype in [1].
>
> Please note, that this *only* is about setting the Last-Modified and
> Cache-Control headers.
>
> Now, taking a step further: do we really want to build a cache into
> Sling ? Shouldn't we rather rely on some existing caching proxy for
> this, like Squid or mod_cache/mod_proxy ?

Again from our project, we use Varnish [1] in front of our Sling app,
and then remove objects from Varnish when there's a resource change
event in Sling.

Caching HTTP proxies are highly optimized to do just that, and trying
to implement a good enough system in Sling seems to be duplicating
work.

What *would* be useful, is more low-level caching. Without having
investigated any such possibilities, I can think of resource
resolution, script results, access control ...

[1] http://varnish-cache.org/

-- 
Vidar S. Ramdal <vi...@idium.no> - http://www.idium.no
Sommerrogata 13-15, N-0255 Oslo, Norway
+ 47 22 00 84 00 / +47 21 531941, ext 2070

Re: Caching Support Request Filter

Posted by Felix Meschberger <fm...@gmail.com>.
Hi,

On 06.05.2010 16:04, Vidar Ramdal wrote:
> On Wed, Apr 28, 2010 at 5:23 PM, Felix Meschberger <fm...@gmail.com> wrote:
>> Now, taking a step further: do we really want to build a cache into
>> Sling ? Shouldn't we rather rely on some existing caching proxy for
>> this, like Squid or mod_cache/mod_proxy ?
> 
> Thinking about this a bit more; instead of building a cache mechanism
> ourselves, how about providing a good integration point for caching
> proxies?
> 
> This component would be a filter or similar, which must track which
> resources are being accessed during a request. The component would
> register as a resource event listener, to be notified when resources
> are changed, so that the corresponding request URIs can be
> invalidated.
> When invalidating a request URI, the component should call a pluggable
> adapter (e.g. a SquidAdapter), which in turn will notify Squid of the
> invalidation.

Exactly what I had in mind ;-)

And since all resources are accessed through the request's
ResourceResolver, it would probably be easiest to track these accesses
in the resource resolver and make it available to the cache
infrastructure ...

Regards
Felix

Re: Caching Support Request Filter

Posted by Vidar Ramdal <vi...@idium.no>.
On Wed, Apr 28, 2010 at 5:23 PM, Felix Meschberger <fm...@gmail.com> wrote:
> Now, taking a step further: do we really want to build a cache into
> Sling ? Shouldn't we rather rely on some existing caching proxy for
> this, like Squid or mod_cache/mod_proxy ?

Thinking about this a bit more; instead of building a cache mechanism
ourselves, how about providing a good integration point for caching
proxies?

This component would be a filter or similar, which must track which
resources are being accessed during a request. The component would
register as a resource event listener, to be notified when resources
are changed, so that the corresponding request URIs can be
invalidated.
When invalidating a request URI, the component should call a pluggable
adapter (e.g. a SquidAdapter), which in turn will notify Squid of the
invalidation.
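
The idea above can be sketched roughly as follows (Python, for
illustration only). All names here are hypothetical: ProxyAdapter stands
in for the pluggable adapter, RecordingAdapter replaces a real
SquidAdapter so the sketch runs without a proxy, and
on_resource_changed is what a resource event listener would call.

```python
from collections import defaultdict
from typing import Dict, List, Protocol, Set


class ProxyAdapter(Protocol):
    """Pluggable adapter that tells a caching proxy to drop a URI."""

    def invalidate(self, request_uri: str) -> None: ...


class RecordingAdapter:
    """Stand-in adapter that only records what it was asked to invalidate."""

    def __init__(self) -> None:
        self.invalidated: List[str] = []

    def invalidate(self, request_uri: str) -> None:
        self.invalidated.append(request_uri)


class InvalidationTracker:
    """Remembers which request URIs used which resources."""

    def __init__(self, adapter: ProxyAdapter) -> None:
        self._adapter = adapter
        self._uris_by_resource: Dict[str, Set[str]] = defaultdict(set)

    def record_access(self, request_uri: str, resource_path: str) -> None:
        """Called whenever a resource is accessed while serving a request."""
        self._uris_by_resource[resource_path].add(request_uri)

    def on_resource_changed(self, resource_path: str) -> None:
        """Called by the resource event listener for a changed resource."""
        for uri in sorted(self._uris_by_resource.pop(resource_path, set())):
            self._adapter.invalidate(uri)
```

A URI stays registered under the other resources it used, so a later
change to one of those triggers a second, harmless invalidation.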

-- 
Vidar S. Ramdal <vi...@idium.no> - http://www.idium.no
Sommerrogata 13-15, N-0255 Oslo, Norway
+ 47 22 00 84 00 / +47 21 531941, ext 2070

Re: Caching Support Request Filter

Posted by Ian Boston <ie...@tfd.co.uk>.
On 29 Apr 2010, at 01:23, Felix Meschberger wrote:

> Hi,
> 
> Wow !
> 
> I didn't expect to have this discussion get in this direction, but excellent !
> 
> To illustrate what I originally had in mind, I have committed my
> prototype in [1].
> 
> Please note, that this *only* is about setting the Last-Modified and
> Cache-Control headers.
> 
> Now, taking a step further: do we really want to build a cache into
> Sling ? Shouldn't we rather rely on some existing caching proxy for
> this, like Squid or mod_cache/mod_proxy ?
> 
> As for what to cache (if we cache): I think we should not cache
> requests with Queries, such requests are by definition not cacheable.
> Having a multi-dimensional cache taking requesting users into account
> is also an interesting thing.

It *should* be possible to specify which responses can be cached, with a request attribute being set by the thing creating the response, caching off by default.
The key to the cache should be based on a subset of request headers (including cookies, since that will contain user entropy).
The key needs to be a multi level key so that all instances of a cached response can be invalidated to all users in 1 operation.

Only if those criteria are met should the headers and byte array representing the response be cached (I have been using ehcache, which can be configured to put the cache on disk).

It then becomes the responsibility of the thing creating the response to invalidate the cache for items that it "knows" about.
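
A minimal sketch of such a multi-level key, in Python and without ehcache. The first level is the response identity (here simply the path), the second level is a variant derived from a chosen subset of request headers. The class and function names are hypothetical, and header names are assumed to be normalized by the caller.

```python
import hashlib
from typing import Dict, Mapping, Optional, Sequence, Tuple

CachedResponse = Tuple[Dict[str, str], bytes]  # (headers, body bytes)


def variant_key(headers: Mapping[str, str], vary_on: Sequence[str]) -> str:
    """Second-level key built from a subset of request headers (e.g. Cookie)."""
    parts = [f"{name.lower()}={headers.get(name, '')}" for name in sorted(vary_on)]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


class TwoLevelCache:
    """path -> variant -> cached response."""

    def __init__(self) -> None:
        self._entries: Dict[str, Dict[str, CachedResponse]] = {}

    def put(self, path: str, variant: str, response: CachedResponse) -> None:
        self._entries.setdefault(path, {})[variant] = response

    def get(self, path: str, variant: str) -> Optional[CachedResponse]:
        return self._entries.get(path, {}).get(variant)

    def invalidate(self, path: str) -> None:
        """Drop every user's variant of this response in one operation."""
        self._entries.pop(path, None)
```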

 


> 
> My fear is, that we run into a performance drain just to manage the cache ....

Agreed, imho, caching should be selective and at the discretion of the application rather than a blanket operation. Certainly in Sakai Nakamura we have many Sling Servlets where the invalidation is complex and not something that could be automated.

> 
> Regards
> Felix
> 
> [1] http://svn.apache.org/repos/asf/sling/whiteboard/fmeschbe/cachecontrol
> 
> 
> On Wed, Apr 28, 2010 at 4:02 PM, Eric Norman <er...@gmail.com> wrote:
>> Hi all,
>> 
>> In general, I like the idea of a server side cache.  However, I agree with
>> Vidar that a cache without resource tracking has limited usefulness in a
>> real system.
>> 
>> In the past I had implemented something similar.
>> 
>> The key parts I remember were:
>> 
>>   - I used a (slightly) modified version of the OSCache library for
>>   managing the cache: http://www.opensymphony.com/oscache/
>>   - Cache only for GET requests
>>   - The cacheKey had to contain (at a minimum) the following information:
>>      1. Is the current user logged in? (anonymous vs. real user)
>>      2. What groups is the current user a member of (in case ACLs affect
>>      what is rendered).  Also, the ACEs for all the resources used to
>>      render the response would need to use group principals instead of
>>      individual userids to make the cache value reusable by more users.
>>      3. The current theme, language, or other options from the user
>>      preferences that may affect how the page is rendered.
>>      4. A version of the requested query string that has been sorted (in
>>      case the params come in a different order).
>>      5. Filter out "jsessionid" if it is present on the url
>>   - When rendering the page keep track of all the resources used to render
>>   the page.  Using the OSCache APIs, the resources were tracked by adding the
>>   resource path as a 'group' on the cache entry.
>>   - Special handling is needed for cache invalidation during ACL changes in
>>   case changing the ACL causes the content of the page to change.
>>   - Sometimes tracking resources used is not sufficient as you may have a
>>   page that is listing the children of a container.  Adding a new child to the
>>   container would also need to invalidate the cache entry.  To handle this,
>>   pages that do such things would need to add a container 'group' to the cache
>>   entry (cacheEntry.addGroup(container:[resourcePath])).
>>   - Use a (Synchronous) JCR Observer to listen for changes to resources.
>>    If a change is detected, invalidate any cache entries that reference the
>>   changed resource (or entries that track the parent container). In OSCache
>>   this is done by flushing the group (the resource path) to invalidate any
>>   entries that reference the group path
>>   - During the rendering of the page there should be some way for the
>>   script to indicate that it should not be cached.
>>   - Sometimes caching the whole page is not possible if the page contains
>>   user specific text (for example, username in the page header) but it may be
>>   possible to cache fragments of the page instead.
>> 
>> 
>> Anyways, that's my 2 cents.
>> 
>> Regards,
>> Eric
>> 
>> On Wed, Apr 28, 2010 at 4:35 AM, Vidar Ramdal <vi...@idium.no> wrote:
>> 
>>> On Wed, Apr 28, 2010 at 1:13 PM, Felix Meschberger
>>> <Fe...@day.com> wrote:
>>>> Hi all,
>>>> 
>>>> I have been talking with a colleague about a request-level Filter
>>>> for Sling to support caching.
>>>>
>>>> The idea (partly implemented in a prototype) is to have the
>>>> request filter set up default caching behaviour for the response (if the
>>>> response is cacheable at all, that is, the request method must be GET and
>>>> there are no request parameters):
>>>>
>>>> * The Cache-Control header is preset with values from configuration
>>>> matching the request URI (or resource path)
>>>> * The Last-Modified header is preset with the jcr:lastModified
>>>> property of the request's resource
>>>> * Eager responding with 304/NOT MODIFIED if the If-Modified-Since
>>>> header is set and a last modification time of the resource can be
>>>> resolved.
>>> 
>>> The question is how useful such a filter would be if only the
>>> last-modified date of the requested resource is used.
>>> 
>>> In our application at least, there is a large number of resources
>>> involved when serving a request. Most CMSs list out menus, for
>>> example, where the menu items are other resources. If one of those
>>> resources has changed, or if there has been a new menu item created,
>>> it means the menu will be out of date even if the requested resource itself
>>> is unmodified.
>>> 
>>> To solve this, we could introduce a resource tracker, which tracks
>>> which resources are being invoked on a request. The latest
>>> last-modified date of these resources will then be matched with the
>>> request's If-Modified-Since header.
>>> 
>>> --
>>> Vidar S. Ramdal <vi...@idium.no> - http://www.idium.no
>>> Sommerrogata 13-15, N-0255 Oslo, Norway
>>> + 47 22 00 84 00 / +47 21 531941, ext 2070
>>> 
>> 


Re: Caching Support Request Filter

Posted by Felix Meschberger <fm...@gmail.com>.
Hi,

Wow !

I didn't expect to have this discussion get in this direction, but excellent !

To illustrate what I originally had in mind, I have committed my
prototype in [1].

Please note, that this *only* is about setting the Last-Modified and
Cache-Control headers.

Now, taking a step further: do we really want to build a cache into
Sling ? Shouldn't we rather rely on some existing caching proxy for
this, like Squid or mod_cache/mod_proxy ?

As for what to cache (if we cache): I think we should not cache
requests with Queries, such requests are by definition not cacheable.
Having a multi-dimensional cache taking requesting users into account
is also an interesting thing.

My fear is, that we run into a performance drain just to manage the cache ....

Regards
Felix

[1] http://svn.apache.org/repos/asf/sling/whiteboard/fmeschbe/cachecontrol


On Wed, Apr 28, 2010 at 4:02 PM, Eric Norman <er...@gmail.com> wrote:
> Hi all,
>
> In general, I like the idea of a server side cache.  However, I agree with
> Vidar that a cache without resource tracking has limited usefulness in a
> real system.
>
> In the past I had implemented something similar.
>
> The key parts I remember were:
>
>   - I used a (slightly) modified version of the OSCache library for
>   managing the cache: http://www.opensymphony.com/oscache/
>   - Cache only for GET requests
>   - The cacheKey had to contain (at a minimum) the following information:
>      1. Is the current user logged in? (anonymous vs. real user)
>      2. What groups is the current user a member of (in case ACLs affect
>      what is rendered).  Also, the ACEs for all the resources used to
>      render the response would need to use group principals instead of
>      individual userids to make the cache value reusable by more users.
>      3. The current theme, language, or other options from the user
>      preferences that may affect how the page is rendered.
>      4. A version of the requested query string that has been sorted (in
>      case the params come in a different order).
>      5. Filter out "jsessionid" if it is present on the url
>   - When rendering the page keep track of all the resources used to render
>   the page.  Using the OSCache APIs, the resources were tracked by adding the
>   resource path as a 'group' on the cache entry.
>   - Special handling is needed for cache invalidation during ACL changes in
>   case changing the ACL causes the content of the page to change.
>   - Sometimes tracking resources used is not sufficient as you may have a
>   page that is listing the children of a container.  Adding a new child to the
>   container would also need to invalidate the cache entry.  To handle this,
>   pages that do such things would need to add a container 'group' to the cache
>   entry (cacheEntry.addGroup(container:[resourcePath])).
>   - Use a (Synchronous) JCR Observer to listen for changes to resources.
>    If a change is detected, invalidate any cache entries that reference the
>   changed resource (or entries that track the parent container). In OSCache
>   this is done by flushing the group (the resource path) to invalidate any
>   entries that reference the group path
>   - During the rendering of the page there should be some way for the
>   script to indicate that it should not be cached.
>   - Sometimes caching the whole page is not possible if the page contains
>   user specific text (for example, username in the page header) but it may be
>   possible to cache fragments of the page instead.
>
>
> Anyways, that's my 2 cents.
>
> Regards,
> Eric
>
> On Wed, Apr 28, 2010 at 4:35 AM, Vidar Ramdal <vi...@idium.no> wrote:
>
>> On Wed, Apr 28, 2010 at 1:13 PM, Felix Meschberger
>> <Fe...@day.com> wrote:
>> > Hi all,
>> >
>> > I have been talking with a colleague about a request-level Filter
>> > for Sling to support caching.
>> >
>> > The idea (partly implemented in a prototype) is to have the
>> > request filter set up default caching behaviour for the response (if the
>> > response is cacheable at all, that is, the request method must be GET and
>> > there are no request parameters):
>> >
>> > * The Cache-Control header is preset with values from configuration
>> > matching the request URI (or resource path)
>> > * The Last-Modified header is preset with the jcr:lastModified
>> > property of the request's resource
>> > * Eager responding with 304/NOT MODIFIED if the If-Modified-Since
>> > header is set and a last modification time of the resource can be
>> > resolved.
>>
>> The question is how useful such a filter would be if only the
>> last-modified date of the requested resource is used.
>>
>> In our application at least, there is a large number of resources
>> involved when serving a request. Most CMSs list out menus, for
>> example, where the menu items are other resources. If one of those
>> resources has changed, or if there has been a new menu item created,
>> it means the menu will be out of date even if the requested resource itself
>> is unmodified.
>>
>> To solve this, we could introduce a resource tracker, which tracks
>> which resources are being invoked on a request. The latest
>> last-modified date of these resources will then be matched with the
>> request's If-Modified-Since header.
>>
>> --
>> Vidar S. Ramdal <vi...@idium.no> - http://www.idium.no
>> Sommerrogata 13-15, N-0255 Oslo, Norway
>> + 47 22 00 84 00 / +47 21 531941, ext 2070
>>
>

Re: Caching Support Request Filter

Posted by Eric Norman <er...@gmail.com>.
Hi all,

In general, I like the idea of a server side cache.  However, I agree with
Vidar that a cache without resource tracking has limited usefulness in a
real system.

In the past I had implemented something similar.

The key parts I remember were:

   - I used a (slightly) modified version of the OSCache library for
   managing the cache: http://www.opensymphony.com/oscache/
   - Cache only for GET requests
   - The cacheKey had to contain (at a minimum) the following information:
      1. Is the current user logged in? (anonymous vs. real user)
      2. What groups is the current user a member of (in case ACLs affect
      what is rendered).  Also, the ACEs for all the resources used to
      render the response would need to use group principals instead of
      individual userids to make the cache value reusable by more users.
      3. The current theme, language, or other options from the user
      preferences that may affect how the page is rendered.
      4. A version of the requested query string that has been sorted (in
      case the params come in a different order).
      5. Filter out "jsessionid" if it is present on the url
   - When rendering the page keep track of all the resources used to render
   the page.  Using the OSCache APIs, the resources were tracked by adding the
   resource path as a 'group' on the cache entry.
   - Special handling is needed for cache invalidation during ACL changes in
   case changing the ACL causes the content of the page to change.
   - Sometimes tracking resources used is not sufficient as you may have a
   page that is listing the children of a container.  Adding a new child to the
   container would also need to invalidate the cache entry.  To handle this,
   pages that do such things would need to add a container 'group' to the cache
   entry (cacheEntry.addGroup(container:[resourcePath])).
   - Use a (Synchronous) JCR Observer to listen for changes to resources.
    If a change is detected, invalidate any cache entries that reference the
   changed resource (or entries that track the parent container). In OSCache
   this is done by flushing the group (the resource path) to invalidate any
   entries that reference the group path
   - During the rendering of the page there should be some way for the
   script to indicate that it should not be cached.
   - Sometimes caching the whole page is not possible if the page contains
   user specific text (for example, username in the page header) but it may be
   possible to cache fragments of the page instead.
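
To make the cache key points above concrete, here is a small sketch in
Python (not the OSCache API). The function name, the '|' separator and
the parameter names are all made up for illustration; the key covers
logged-in state, group membership, user preferences, a sorted query
string, and strips any ";jsessionid=..." from the URL.

```python
import re
from typing import Iterable, Mapping, Optional
from urllib.parse import parse_qsl, urlencode, urlsplit

# Matches a ";jsessionid=..." path parameter as appended by servlet containers.
_JSESSIONID = re.compile(r";jsessionid=[^;?#/]*", re.IGNORECASE)


def cache_key(url: str,
              user_id: Optional[str],
              groups: Iterable[str],
              preferences: Mapping[str, str]) -> str:
    """Build a cache key that is shared by all users who would see the same page."""
    parts = urlsplit(_JSESSIONID.sub("", url))
    # Sort parameters so that ?a=1&b=2 and ?b=2&a=1 map to the same entry.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return "|".join([
        parts.path,
        query,
        "anonymous" if user_id is None else "authenticated",
        ",".join(sorted(groups)),
        ",".join(f"{k}={v}" for k, v in sorted(preferences.items())),
    ])
```

Because the key contains the group list rather than the user id, two
users in the same groups with the same preferences share one entry.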


Anyways, that's my 2 cents.

Regards,
Eric

On Wed, Apr 28, 2010 at 4:35 AM, Vidar Ramdal <vi...@idium.no> wrote:

> On Wed, Apr 28, 2010 at 1:13 PM, Felix Meschberger
> <Fe...@day.com> wrote:
> > Hi all,
> >
> > I have been talking with a colleague about a request-level Filter
> > for Sling to support caching.
> >
> > The idea (partly implemented in a prototype) is to have the
> > request filter set up default caching behaviour for the response (if the
> > response is cacheable at all, that is, the request method must be GET and
> > there are no request parameters):
> >
> > * The Cache-Control header is preset with values from configuration
> > matching the request URI (or resource path)
> > * The Last-Modified header is preset with the jcr:lastModified
> > property of the request's resource
> > * Eager responding with 304/NOT MODIFIED if the If-Modified-Since
> > header is set and a last modification time of the resource can be
> > resolved.
>
> The question is how useful such a filter would be if only the
> last-modified date of the requested resource is used.
>
> In our application at least, there is a large number of resources
> involved when serving a request. Most CMSs list out menus, for
> example, where the menu items are other resources. If one of those
> resources has changed, or if there has been a new menu item created,
> it means the menu will be out of date even if the requested resource itself
> is unmodified.
>
> To solve this, we could introduce a resource tracker, which tracks
> which resources are being invoked on a request. The latest
> last-modified date of these resources will then be matched with the
> request's If-Modified-Since header.
>
> --
> Vidar S. Ramdal <vi...@idium.no> - http://www.idium.no
> Sommerrogata 13-15, N-0255 Oslo, Norway
> + 47 22 00 84 00 / +47 21 531941, ext 2070
>

Re: Caching Support Request Filter

Posted by Carsten Ziegeler <cz...@apache.org>.
Bertrand Delacretaz  wrote
> On Wed, Apr 28, 2010 at 5:00 PM, Felix Meschberger <fm...@gmail.com> wrote:
>> On Wed, Apr 28, 2010 at 12:35 PM, Vidar Ramdal <vi...@idium.no> wrote:
>>> ... The question is how useful such a filter would be if only the
>>> last-modified date of the requested resource is used.
>>
>> That *is* in fact a valid concern, which my proposal does not account
>> for yet. I think this also aligns with what Bertrand has in mind with
>> extensibility of the basic mechanism....
> 
> Cocoon has a similar problem when caching xml processing pipelines.
> 
> That's solved there by having the pipeline compute a cache key (etag
> in our case?) where each pipeline element contributes part of the
> cache key, from left to right. If the cache key is different once you
> reach the end of the pipeline, that means the content must not share
> the same cache location.
> 
> We could imagine something similar when several resources are
> aggregated to produce a response.
> 
What about an event based cache?

Just a rough idea...

During processing we know which resources are used to render the page,
this info is stored in the cache. We already have events when resources
change. The cache listens for the events and invalidates accordingly.

Carsten


-- 
Carsten Ziegeler
cziegeler@apache.org

Re: Caching Support Request Filter

Posted by Bertrand Delacretaz <bd...@apache.org>.
On Wed, Apr 28, 2010 at 5:00 PM, Felix Meschberger <fm...@gmail.com> wrote:
> On Wed, Apr 28, 2010 at 12:35 PM, Vidar Ramdal <vi...@idium.no> wrote:
>>... The question is how useful such a filter would be if only the
>> last-modified date of the requested resource is used.
>
> That *is* in fact a valid concern, which my proposal does not account
> for yet. I think this also aligns with what Bertrand has in mind with
> extensibility of the basic mechanism....

Cocoon has a similar problem when caching xml processing pipelines.

That's solved there by having the pipeline compute a cache key (etag
in our case?) where each pipeline element contributes part of the
cache key, from left to right. If the cache key is different once you
reach the end of the pipeline, that means the content must not share
the same cache location.

We could imagine something similar when several resources are
aggregated to produce a response.
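
A sketch of that idea in Python, not taken from Cocoon: each element
(here, each aggregated resource) contributes a string, and the
contributions are folded into one key in order. The function names and
the "path@version" contribution format used in examples are
assumptions; the result could serve as an ETag value.

```python
import hashlib
from typing import Iterable


def pipeline_cache_key(contributions: Iterable[str]) -> str:
    """Fold each element's contribution into one key, from left to right."""
    digest = hashlib.sha256()
    for part in contributions:
        data = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") differ.
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()


def etag(contributions: Iterable[str]) -> str:
    """Wrap the key in double quotes, as a strong ETag value would be."""
    return '"' + pipeline_cache_key(contributions) + '"'
```

If any single contribution changes, for example
"/content/menu@v1" becoming "/content/menu@v2", the key changes and the
old cache location is no longer used.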

-Bertrand

Re: Caching Support Request Filter

Posted by Felix Meschberger <fm...@gmail.com>.
On Wed, Apr 28, 2010 at 12:35 PM, Vidar Ramdal <vi...@idium.no> wrote:
> On Wed, Apr 28, 2010 at 1:13 PM, Felix Meschberger
> <Fe...@day.com> wrote:
>> Hi all,
>>
>> I have been talking with a colleague about a request-level Filter
>> for Sling to support caching.
>>
>> The idea (partly implemented in a prototype) is to have the
>> request filter set up default caching behaviour for the response (if the
>> response is cacheable at all, that is, the request method must be GET and
>> there are no request parameters):
>>
>> * The Cache-Control header is preset with values from configuration
>> matching the request URI (or resource path)
>> * The Last-Modified header is preset with the jcr:lastModified
>> property of the request's resource
>> * Eager responding with 304/NOT MODIFIED if the If-Modified-Since
>> header is set and a last modification time of the resource can be
>> resolved.
>
> The question is how useful such a filter would be if only the
> last-modified date of the requested resource is used.

That *is* in fact a valid concern, which my proposal does not account
for yet. I think this also aligns with what Bertrand has in mind with
extensibility of the basic mechanism.

>
> In our application at least, there is a large number of resources
> involved when serving a request. Most CMSs list out menus, for
> example, where the menu items are other resources. If one of those
> resources has changed, or if there has been a new menu item created,
> it means the menu will be out of date even if the requested resource itself
> is unmodified.
>
> To solve this, we could introduce a resource tracker, which tracks
> which resources are being invoked on a request. The latest
> last-modified date of these resources will then be matched with the
> request's If-Modified-Since header.

If you have to actually run all scripts to check the If-Modified-Since
header before actually processing the request, this would effectively
double the request processing time. I would say in this case it would
probably be better to switch off the eager If-Modified-Since check and just
process the request (albeit with Cache-Control and Last-Modified
preset).

Regards
Felix

>
> --
> Vidar S. Ramdal <vi...@idium.no> - http://www.idium.no
> Sommerrogata 13-15, N-0255 Oslo, Norway
> + 47 22 00 84 00 / +47 21 531941, ext 2070
>

Re: Caching Support Request Filter

Posted by Vidar Ramdal <vi...@idium.no>.
On Wed, Apr 28, 2010 at 1:13 PM, Felix Meschberger
<Fe...@day.com> wrote:
> Hi all,
>
> I have been talking with a colleague about a request-level Filter
> for Sling to support caching.
>
> The idea (partly implemented in a prototype) is to have the
> request filter set up default caching behaviour for the response (if the
> response is cacheable at all, that is, the request method must be GET and
> there are no request parameters):
>
> * The Cache-Control header is preset with values from configuration
> matching the request URI (or resource path)
> * The Last-Modified header is preset with the jcr:lastModified
> property of the request's resource
> * Eager responding with 304/NOT MODIFIED if the If-Modified-Since
> header is set and a last modification time of the resource can be
> resolved.

The question is how useful such a filter would be if only the
last-modified date of the requested resource is used.

In our application at least, there is a large number of resources
involved when serving a request. Most CMSs list out menus, for
example, where the menu items are other resources. If one of those
resources has changed, or if there has been a new menu item created,
it means the menu will be out of date even if the requested resource itself
is unmodified.

To solve this, we could introduce a resource tracker, which tracks
which resources are being invoked on a request. The latest
last-modified date of these resources will then be matched with the
request's If-Modified-Since header.
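
As a rough sketch (Python, hypothetical function name), the tracker
would reduce the modification times of all resources used for a
response to a single value. Treating an unknown time as "cannot tell"
is an assumption made here for safety. Note that this alone does not
detect a newly created menu item, since that resource was never
tracked.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional


def effective_last_modified(
        timestamps: Iterable[Optional[datetime]]) -> Optional[datetime]:
    """Return the latest modification time of all tracked resources.

    Returns None if nothing was tracked or if any resource has no known
    modification time, in which case no freshness claim should be made.
    """
    latest: Optional[datetime] = None
    for ts in timestamps:
        if ts is None:
            return None
        if latest is None or ts > latest:
            latest = ts
    return latest
```

The result is what would be compared against the If-Modified-Since
header and sent as Last-Modified.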

-- 
Vidar S. Ramdal <vi...@idium.no> - http://www.idium.no
Sommerrogata 13-15, N-0255 Oslo, Norway
+ 47 22 00 84 00 / +47 21 531941, ext 2070

Re: Caching Support Request Filter

Posted by Felix Meschberger <fm...@gmail.com>.
Hi,

On Wed, Apr 28, 2010 at 12:32 PM, Bertrand Delacretaz
<bd...@apache.org> wrote:
> Hi Felix,
>
> On Wed, Apr 28, 2010 at 1:13 PM, Felix Meschberger
> <Fe...@day.com> wrote:
>> ...Configuration could be something like:
>> * Enable/Disable caching support functionality completely
>> * List of URL (or resource path) regexps with their setup, e.g.
>>       ^/(apps|libs)\/.*=must-revalidate
>> * Enable/Disable support for eager 304 responses...
>
> Wondering if that info should rather be provided by an OSGi service -
> the default implementation could be config-based as you suggest, and
> people could then plug in their own caching strategies.

Well, there is of course room for improvement. One thing certainly is
to provide some API to override the Cache-Control header other than
calling setHeader or addHeader (which are overridden to merge the
script-provided Cache-Control setting with the preset settings).
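
How the prototype merges the two settings is not spelled out here. One
plausible rule, sketched in Python purely as an assumption, is that a
directive set by the script replaces a preset directive of the same
name and everything else is kept. The function names are hypothetical,
and the simple comma split does not handle quoted directive arguments.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        directives[name.strip().lower()] = arg.strip() if sep else None
    return directives


def merge_cache_control(preset: str, from_script: str) -> str:
    """Merge two values; directives set by the script win over the preset."""
    merged = parse_cache_control(preset)
    merged.update(parse_cache_control(from_script))
    return ", ".join(
        name if arg is None else f"{name}={arg}" for name, arg in merged.items()
    )
```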

Regards
Felix

>
> Apart from that I like the idea, and like Carsten think that if that
> can be done without using a Filter that might be easier.
>
> -Bertrand
>

Re: Caching Support Request Filter

Posted by Bertrand Delacretaz <bd...@apache.org>.
Hi Felix,

On Wed, Apr 28, 2010 at 1:13 PM, Felix Meschberger
<Fe...@day.com> wrote:
> ...Configuration could be something like:
> * Enable/Disable caching support functionality completely
> * List of URL (or resource path) regexps with their setup, e.g.
>       ^/(apps|libs)\/.*=must-revalidate
> * Enable/Disable support for eager 304 responses...

Wondering if that info should rather be provided by an OSGi service -
the default implementation could be config-based as you suggest, and
people could then plug in their own caching strategies.

Apart from that I like the idea, and like Carsten think that if that
can be done without using a Filter that might be easier.

-Bertrand

Re: Caching Support Request Filter

Posted by Felix Meschberger <fm...@gmail.com>.
Hi,

Why filter ? This is the easiest and least intrusive way to add this
functionality ;-)

We don't have to extend anything in the Engine, we can add it and
remove it at will.

Regards
Felix

On Wed, Apr 28, 2010 at 12:28 PM, Carsten Ziegeler <cz...@apache.org> wrote:
> Felix Meschberger  wrote
>> Hi all,
>>
>> I have been talking with a colleague about a request-level Filter
>> for Sling to support caching.
>>
>> The idea (partly implemented in a prototype) is to have the
>> request filter set up default caching behaviour for the response (if the
>> response is cacheable at all, that is, the request method must be GET and
>> there are no request parameters):
>>
>> * The Cache-Control header is preset with values from configuration
>> matching the request URI (or resource path)
>> * The Last-Modified header is preset with the jcr:lastModified
>> property of the request's resource
>> * Eager responding with 304/NOT MODIFIED if the If-Modified-Since
>> header is set and a last modification time of the resource can be
>> resolved.
>>
>> All these settings can be overwritten/replaced by the request
>> processing scripts but it at least sets some defaults for caching:
>>
>> * Caching of the response enabled at all
>> * Whether revalidation by clients and/or proxies is required
>> * etc...
>>
>> Configuration could be something like:
>> * Enable/Disable caching support functionality completely
>> * List of URL (or resource path) regexps with their setup, e.g.
>>        ^/(apps|libs)\/.*=must-revalidate
>> * Enable/Disable support for eager 304 responses
>>
>> This would probably just be a first step in a new infrastructure
>> around supporting response caching.
>>
> Sounds cool to me, especially the ootb if-modified-since support
> (we have the same in Cocoon and it works pretty well).
>
> The only question I have, why a filter? :) I know this can be added
> transparently, but do we need that?
> I'm not against this, just curious - especially as I hate debugging a
> large filter chain :)
>
> Regards
> Carsten
> --
> Carsten Ziegeler
> cziegeler@apache.org
>

Re: Caching Support Request Filter

Posted by Carsten Ziegeler <cz...@apache.org>.
Felix Meschberger  wrote
> Hi all,
> 
> I have been talking with a colleague about a request-level Filter
> for Sling to support caching.
>
> The idea (partly implemented in a prototype) is to have the
> request filter set up default caching behaviour for the response (if the
> response is cacheable at all, that is, the request method must be GET and
> there are no request parameters):
>
> * The Cache-Control header is preset with values from configuration
> matching the request URI (or resource path)
> * The Last-Modified header is preset with the jcr:lastModified
> property of the request's resource
> * Eager responding with 304/NOT MODIFIED if the If-Modified-Since
> header is set and a last modification time of the resource can be
> resolved.
>
> All these settings can be overwritten/replaced by the request
> processing scripts but it at least sets some defaults for caching:
> 
> * Caching of the response enabled at all
> * Whether revalidation by clients and/or proxies is required
> * etc...
> 
> Configuration could be something like:
> * Enable/Disable caching support functionality completely
> * List of URL (or resource path) regexps with their setup, e.g.
>        ^/(apps|libs)\/.*=must-revalidate
> * Enable/Disable support for eager 304 responses
> 
> This would probably just be a first step in a new infrastructure
> around supporting response caching.
> 
Sounds cool to me, especially the ootb if-modified-since support
(we have the same in Cocoon and it works pretty well).

The only question I have, why a filter? :) I know this can be added
transparently, but do we need that?
I'm not against this, just curious - especially as I hate debugging a
large filter chain :)

Regards
Carsten
-- 
Carsten Ziegeler
cziegeler@apache.org