Posted to dev@jena.apache.org by Andy Seaborne <an...@epimorphics.com> on 2011/08/08 16:35:42 UTC
Re: SPARQL queries and paging
On 08/08/11 14:35, Simon Helsen wrote:
> On this topic, I'd like to point out that we have a separate outside
> mechanism for paging which behaves like you suggest, however, the
> difference with OFFSET/LIMIT is that the next time someone makes the query
> e.g. to obtain the next page, we do expect the query to be recalculated
> since the state of the store may have changed.
>
> So, if you plan to change the behavior by introducing a caching model, you
> may actually alter the behavior unless you are able to determine that a
> subsequent execution of a query would not have changed results (e.g. by
> having the actions isolated in a transaction?)
>
> Simon
Transactions, or just Fuseki noting updates (language or graph store
protocol), can be used to give a version id to each state and then ETags
can be used to drive cache invalidation.
It's a three-layer model:
client
SPARQL cache
core DB.
ETags are used between the SPARQL cache and the core DB.
The protocol between each layer is the SPARQL protocol.
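To make the version id and ETag idea concrete, here is a minimal sketch of the cache and core DB layers. It is only an illustration: CoreDB, SparqlCache, queries_run and the "v<n>" ETag format are hypothetical names, not Fuseki or Jena APIs, and the stand-in DB returns all rows instead of evaluating the query.

```python
class CoreDB:
    """Stand-in for the core DB: every update produces a new version id."""

    def __init__(self):
        self.version = 0
        self.rows = []
        self.queries_run = 0  # counts real query executions

    def update(self, row):
        self.rows.append(row)
        self.version += 1  # new store state, so a new ETag

    def etag(self):
        return f'"v{self.version}"'

    def query(self, query, if_none_match=None):
        """Return (status, etag, rows); 304 means the cached copy is still valid."""
        if if_none_match == self.etag():
            return 304, self.etag(), None
        self.queries_run += 1
        # Assumption: a real store would evaluate `query`; here we return all rows.
        return 200, self.etag(), list(self.rows)


class SparqlCache:
    """Middle layer: keeps whole result sets, revalidated by ETag."""

    def __init__(self, db):
        self.db = db
        self.entries = {}  # query string -> (etag, rows)

    def query(self, query):
        cached = self.entries.get(query)
        etag = cached[0] if cached else None
        status, new_etag, rows = self.db.query(query, if_none_match=etag)
        if status == 304:
            return cached[1]  # store unchanged: serve the cached result set
        self.entries[query] = (new_etag, rows)
        return rows
```

With this shape the cache never decides on its own that a result is fresh; it always asks the core DB, and only the cheap version check runs when nothing has changed.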
The SPARQL cache can keep whole result sets for pseudo paging using
ORDER/OFFSET/LIMIT with different policies on
The protocol between each layer is the SPARQL protocol but it could also
augment the SPARQL protocol with parameters like ?page= or
?liveness=uselastquery for better control (in addition to trying to
intuit from requests and version ids).
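A rough sketch of how a cache might interpret such parameters follows. The parameter semantics are my assumption: here liveness=uselastquery is read as "reuse the result set from the last execution of this query even if the store has moved on", and without it the cache recomputes whenever the version id differs. PAGE_SIZE, fetch_page, run_query and the dict-based store are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

PAGE_SIZE = 2  # hypothetical fixed page size


def run_query(store):
    """Stand-in for executing an ORDER BY query against the core DB."""
    return sorted(store["rows"])


def fetch_page(url, store, cache):
    """Serve one page, slicing a cached whole result set where allowed."""
    params = parse_qs(urlsplit(url).query)
    query = params["query"][0]
    page = int(params.get("page", ["0"])[0])  # assumed zero-based
    liveness = params.get("liveness", [""])[0]

    cached = cache.get(query)  # (version id, rows) or None
    reuse = cached is not None and (
        liveness == "uselastquery" or cached[0] == store["version"]
    )
    if not reuse:
        # Store changed (or first request): recompute the whole result set.
        cached = (store["version"], run_query(store))
        cache[query] = cached

    start = page * PAGE_SIZE
    return cached[1][start:start + PAGE_SIZE]
```

A client paging through a stable snapshot would pass liveness=uselastquery on every page after the first; a client that wants OFFSET/LIMIT-style recalculation would leave it off.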
Being able to set different consistency/cache efficiency tradeoffs in
client or cache server might be useful.
Andy
Re: SPARQL queries and paging
Posted by Simon Helsen <sh...@ca.ibm.com>.
Andy,
yes. We don't use Fuseki, but have a similar 3-tier architecture. I guess
my issue was more that the core DB should not employ an implicit cache
which is uncontrollable for its clients, but from your explanation below,
I think we are on the same page. In fact, we also use ETags and augment
SPARQL with query parameters for cached paging. The only difference
is that we move cached paging entirely outside of SPARQL. For example,
we have something like POST ...?query&pageSize=... (where the query is in
the body)
and the answer provides a unique token which can be used to browse the
pages until they expire, e.g.
GET ....?token=<myToken>&pageSize=...&page=...
If our clients do not want the caching behavior, we tell them to use
OFFSET/LIMIT instead.
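For illustration only, the token scheme could look roughly like the in-memory sketch below. This is not our actual implementation: PagedResultStore, post_query, get_page, the TTL value, zero-based page numbers and returning the first page together with the token are all assumptions made for the example.

```python
import time
import uuid


class PagedResultStore:
    """Snapshot a query result once, then serve pages by token until expiry."""

    def __init__(self, run_query, ttl_seconds=300.0, clock=time.monotonic):
        self.run_query = run_query  # callable: query string -> iterable of rows
        self.ttl = ttl_seconds
        self.clock = clock
        self.snapshots = {}  # token -> (expiry time, rows)

    def post_query(self, query, page_size):
        """Models POST ...?query&pageSize=... : run once, return a token."""
        rows = list(self.run_query(query))  # copy, so later updates are not seen
        token = uuid.uuid4().hex
        self.snapshots[token] = (self.clock() + self.ttl, rows)
        return token, rows[:page_size]

    def get_page(self, token, page_size, page):
        """Models GET ....?token=<myToken>&pageSize=...&page=..."""
        entry = self.snapshots.get(token)
        if entry is None or entry[0] <= self.clock():
            self.snapshots.pop(token, None)  # drop expired snapshot
            raise KeyError("unknown or expired token")
        start = page * page_size
        return entry[1][start:start + page_size]
```

The point of the token is that every page comes from the same snapshot, so pages stay consistent with each other even if the store is updated in between.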
Simon