Posted to users@jena.apache.org by Sarven Capadisli <in...@csarven.ca> on 2012/10/05 20:08:55 UTC

Evaluation of variables prior to SERVICE

Hi, I'm trying to understand the way federated queries are executed 
in Fuseki/TDB [1].

My query at http://transparency.270a.info/sparql is as follows:

SELECT *
WHERE {
   ?country a dbo:Country .
   ?country owl:sameAs ?wbcountry .
   FILTER (regex(str(?wbcountry), "^http://worldbank.270a.info/"))

   SERVICE <http://worldbank.270a.info/sparql> {
       ?wbcountry skos:prefLabel ?label
   }
}

I'm expecting to get all of the labels from the WB SPARQL endpoint.

This query works fine; however, I'm puzzled about the way ?wbcountry is 
evaluated before it gets sent over SERVICE. What happens is that the 
FILTER is not applied prior to SERVICE. On the http://worldbank.270a.info/ 
server side, I can see that all possible values of ?wbcountry are being 
matched for skos:prefLabel ?label.

My intuition tells me that the FILTER should have prevented that by 
restricting ?wbcountry to IRIs which start with 
"http://worldbank.270a.info/". I think this is in line with 
http://www.w3.org/TR/sparql11-federated-query/#variableService - even 
though it is not an official way of evaluating.

Any feedback is appreciated.

[1] $ java tdb.tdbquery --version
Jena:       VERSION: 2.7.4-SNAPSHOT
Jena:       BUILD_DATE: 20121005-1119
ARQ:        VERSION: 2.9.4-SNAPSHOT
ARQ:        BUILD_DATE: 20121005-1119
TDB:        VERSION: 0.9.4-SNAPSHOT
TDB:        BUILD_DATE: 20121005-1119

-Sarven

Re: Evaluation of variables prior to SERVICE

Posted by Andy Seaborne <an...@apache.org>.

The filter applies to the whole group - everything between the {...}, 
including the SERVICE block.

The optimizer does not do anything clever with the filter in this 
situation (it fails to push the filter into the pattern to put it where 
?wbcountry is first defined - the BGP).
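
As a rough illustration of that scoping, the original query corresponds 
to an algebra expression along the following lines. This is a 
hand-written sketch in ARQ's s-expression style, not output captured 
from a real run, and it assumes the usual rdf:, owl:, skos: and dbo: 
prefixes are declared:

```
(filter (regex (str ?wbcountry) "^http://worldbank.270a.info/")
  (join
    (bgp
      (triple ?country rdf:type dbo:Country)
      (triple ?country owl:sameAs ?wbcountry))
    (service <http://worldbank.270a.info/sparql>
      (bgp (triple ?wbcountry skos:prefLabel ?label)))))
```

The filter sits outside the join, so the SERVICE part can be evaluated 
before any ?wbcountry value has been tested against the regex.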

You can control it by putting the first part in its own group:

SELECT *
WHERE {
   {
     ?country a dbo:Country .
     ?country owl:sameAs ?wbcountry .
     FILTER (regex(str(?wbcountry), "^http://worldbank.270a.info/"))
   }

   SERVICE <http://worldbank.270a.info/sparql> {
       ?wbcountry skos:prefLabel ?label
   }
}

You can see the effect of the extra {} using

http://www.sparql.org/query-validator.html

and ticking "SPARQL algebra (general optimizations)", although the 
further step of placing filters within BGPs happens in the storage 
layer, so it does not show up there.
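
For orientation, the query with the extra group should translate to 
something shaped roughly like the sketch below. It is written by hand in 
ARQ's s-expression style rather than copied from the validator, and it 
assumes the rdf:, owl:, skos: and dbo: prefixes are declared; the 
optimizer may rewrite the join further (for example, into a sequence):

```
(join
  (filter (regex (str ?wbcountry) "^http://worldbank.270a.info/")
    (bgp
      (triple ?country rdf:type dbo:Country)
      (triple ?country owl:sameAs ?wbcountry)))
  (service <http://worldbank.270a.info/sparql>
    (bgp (triple ?wbcountry skos:prefLabel ?label))))
```

Here the filter wraps only the local BGP, so it applies to the local 
part alone. Without the extra group, the filter instead wraps the whole 
join, SERVICE included.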

	Andy
