Posted to users@jena.apache.org by Martynas Jusevičius <ma...@atomgraph.com> on 2020/03/19 11:05:47 UTC

Fuseki's Cache-Control

Hi,

Why is Fuseki sending these headers by default?

Cache-Control: must-revalidate,no-cache,no-store
Pragma: no-cache

As I understand it, they effectively disable HTTP caching.

Can they be configured somewhere?
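
A minimal Python sketch of how a cache would read those directives, to make the "effectively disable" reading concrete. The function names are hypothetical, and the comma split is a simplification that ignores quoted-string arguments containing commas.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directives without an argument (e.g. no-store) map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def may_store(directives):
    """A cache must not store a response that carries no-store."""
    return "no-store" not in directives


def may_reuse_without_revalidation(directives):
    """no-cache permits storing but forbids reuse without revalidation."""
    return may_store(directives) and "no-cache" not in directives


fuseki_default = parse_cache_control("must-revalidate,no-cache,no-store")
```

With the value quoted above, `may_store(fuseki_default)` is false, so a compliant cache keeps nothing. `no-cache` on its own would still allow storing, as long as the cache revalidates before reuse.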

Re: Fuseki's Cache-Control

Posted by Andy Seaborne <an...@apache.org>.
I've been told caching proxies do not normally cache URLs with a query
string. Sometimes that is configurable, but only if you control the
intermediary. And an intermediary may be remapping from a resource URL to
a query string.

That gets us to Adrian's point about being up-to-date and that the ideal 
cache setup is going to involve application semantics.

Whatever cache control Fuseki has must work for everyone by default.
Conditional GET would be better, using epochs to note updates (that is,
ETags). That isn't there at the moment, and in any case it assumes a more
sophisticated client.
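
A rough Python sketch of the conditional GET idea, assuming the dataset keeps an epoch counter that is bumped on every update and exposed as the ETag. `EpochStore` and `conditional_get` are hypothetical names, not Fuseki APIs, and the comparison ignores weak validators (`W/` prefixes).

```python
class EpochStore:
    """Hypothetical dataset wrapper: the epoch is bumped on every update."""

    def __init__(self):
        self.epoch = 0

    def update(self):
        self.epoch += 1

    def etag(self):
        # ETags are quoted strings on the wire.
        return f'"epoch-{self.epoch}"'


def conditional_get(store, request_headers, run_query):
    """Return (status, headers, body) for a GET, honouring If-None-Match."""
    current = store.etag()
    presented = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in presented.split(",")]
    if current in candidates or "*" in candidates:
        # The client's copy is still valid: skip the query entirely.
        return 304, {"ETag": current}, b""
    # no-cache lets caches store the result but forces revalidation.
    return 200, {"ETag": current, "Cache-Control": "no-cache"}, run_query()
```

The client has to remember the ETag and send it back in `If-None-Match`, which is the "more sophisticated client" assumption.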

Also, knowing that a data service is read-only could influence the
caching, provided the application use case tolerates stale data should
the backing data be updated out-of-band.

     Andy

On 19/03/2020 14:08, Martynas Jusevičius wrote:
> Adrian,
> 
> indeed, I'm asking because I'm looking at using Varnish as a proxy
> cache in front of Fuseki.
> 
> However, best practices [1] say:
> 
> 6.2 Cache policy
> 
> Define a cache policy
> A cache / expiration policy is the rationale behind cache control for
> every resource served by HTTP/1.1 servers. Content managers should
> decide, globally and/or locally, what can or can not be cached, how
> long caches should keep the document before trying to get a new
> version, etc. These decisions may be made depending on the frequency
> at which the documents may be updated.
> 
> Allow the Content Manager to set up cache control according to a Cache Policy
> The content manager should be able to set the max-age parameter for
> any resource served according to a cache policy.
> 
> 6.3: Caching generated content
> 
> Provide actual caching information for content generated dynamically
> Most dynamic content generation systems act as if the documents they
> generate and serve were "fresh" (i.e. as if the resource was last
> modified at the date it is served), whether the information itself is,
> or not.
> This is a harmful lie for caching engines and should be avoided.
> Regardless of the technology used, it should be possible to provide
> age information by retrieving the actual information from whatever
> source is used to generate the dynamic content: file, database, etc.
> 
> [1] https://www.w3.org/TR/chips/#gl6
> 

Re: Fuseki's Cache-Control

Posted by Martynas Jusevičius <ma...@atomgraph.com>.
Adrian,

indeed, I'm asking because I'm looking at using Varnish as a proxy
cache in front of Fuseki.

However, best practices [1] say:

6.2 Cache policy

Define a cache policy
A cache / expiration policy is the rationale behind cache control for
every resource served by HTTP/1.1 servers. Content managers should
decide, globally and/or locally, what can or can not be cached, how
long caches should keep the document before trying to get a new
version, etc. These decisions may be made depending on the frequency
at which the documents may be updated.

Allow the Content Manager to set up cache control according to a Cache Policy
The content manager should be able to set the max-age parameter for
any resource served according to a cache policy.

6.3: Caching generated content

Provide actual caching information for content generated dynamically
Most dynamic content generation systems act as if the documents they
generate and serve were "fresh" (i.e. as if the resource was last
modified at the date it is served), whether the information itself is,
or not.
This is a harmful lie for caching engines and should be avoided.
Regardless of the technology used, it should be possible to provide
age information by retrieving the actual information from whatever
source is used to generate the dynamic content: file, database, etc.

[1] https://www.w3.org/TR/chips/#gl6
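
As a rough illustration of guidelines 6.2 and 6.3, a Python sketch that derives Last-Modified from the data source's real modification time rather than the time of serving, and takes max-age from a policy value. The function name and parameters are hypothetical; how a server or proxy would obtain the source timestamp is an assumption.

```python
from email.utils import formatdate


def freshness_headers(source_mtime, max_age_seconds):
    """Build caching headers from the source's modification time.

    source_mtime: seconds since the Unix epoch at which the backing data
    (file, database, etc.) last changed.
    max_age_seconds: lifetime chosen by the content manager's cache policy.
    """
    return {
        # HTTP dates are always expressed in GMT.
        "Last-Modified": formatdate(source_mtime, usegmt=True),
        "Cache-Control": f"max-age={max_age_seconds}",
    }
```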

On Thu, Mar 19, 2020 at 2:50 PM Adrian Gschwend <ml...@netlabs.org> wrote:
>
> On 19.03.20 12:05, Martynas Jusevičius wrote:
>
> > Cache-Control: must-revalidate,no-cache,no-store
> > Pragma: no-cache
>
> YMMV, but my take here is:
>
> - a SPARQL endpoint should always return the latest results
> - if caching is needed, it should be transparent to the user, as in the
> SPARQL endpoint can have its caching indexes internally
> - it is the middleware/developer's job to add HTTP caching layers where
> appropriate
>
> We do that with our SPARQL proxy, for example: the middleware there sets
> caching headers that are configurable.
>
> And as usual, cache invalidation is the hard part :)
>
> regards
>
> Adrian

Re: Fuseki's Cache-Control

Posted by Adrian Gschwend <ml...@netlabs.org>.
On 19.03.20 12:05, Martynas Jusevičius wrote:

> Cache-Control: must-revalidate,no-cache,no-store
> Pragma: no-cache

YMMV, but my take here is:

- a SPARQL endpoint should always return the latest results
- if caching is needed, it should be transparent to the user, as in the
SPARQL endpoint can have its caching indexes internally
- it is the middleware/developer's job to add HTTP caching layers where
appropriate

We do that with our SPARQL proxy, for example: the middleware there sets
caching headers that are configurable.
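
A minimal sketch of what such a layer could look like, written as Python WSGI middleware. This is not the proxy mentioned above; the class name and the default policy value are assumptions made for illustration.

```python
class CacheHeaderMiddleware:
    """WSGI middleware that replaces upstream caching headers (hypothetical)."""

    # Headers from the upstream response that this layer takes over.
    OVERRIDDEN = {"cache-control", "pragma", "expires"}

    def __init__(self, app, cache_control="public, max-age=60"):
        self.app = app
        self.cache_control = cache_control  # the configurable policy

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            cacheable = (
                environ.get("REQUEST_METHOD") in ("GET", "HEAD")
                and status.startswith("200")
            )
            if cacheable:
                # Drop the upstream caching headers, then add our own.
                headers = [
                    (name, value)
                    for name, value in headers
                    if name.lower() not in self.OVERRIDDEN
                ]
                headers.append(("Cache-Control", self.cache_control))
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)
```

Only successful GET and HEAD responses are rewritten; everything else, including SPARQL updates sent by POST, passes through with the upstream headers intact.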

And as usual, cache invalidation is the hard part :)

regards

Adrian