Posted to users@jena.apache.org by Olivier Rossel <ol...@gmail.com> on 2012/08/01 17:50:24 UTC
Re: Basic federation in Jena
> 4/ You can use a subselect to restrict the remote query part:
>
>
> SERVICE <...> {
> SELECT * {
> ...
> } LIMIT 300
> }
I tried this query:
SELECT DISTINCT ?comment WHERE {
SERVICE
<http://api.talis.com/stores/bbc-backstage/services/sparql>
{ ?thCenturyClassicalComposers0
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://dbpedia.org/class/yago/20thCenturyClassicalComposers>
}
SERVICE <http://dbpedia.org/sparql> {SELECT
?thCenturyClassicalComposers0 ?comment WHERE {
?thCenturyClassicalComposers0
<http://www.w3.org/2000/01/rdf-schema#comment> ?comment } }
}
It returns results in a reasonable time.
Then I remove ?thCenturyClassicalComposers0 from the sub-SELECT:
SELECT DISTINCT ?comment WHERE {
SERVICE
<http://api.talis.com/stores/bbc-backstage/services/sparql>
{ ?thCenturyClassicalComposers0
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://dbpedia.org/class/yago/20thCenturyClassicalComposers>
}
SERVICE <http://dbpedia.org/sparql> {SELECT ?comment WHERE {
?thCenturyClassicalComposers0
<http://www.w3.org/2000/01/rdf-schema#comment> ?comment } }
}
This query now takes MUCH MUCH longer, and eventually fails with a 509
HttpException.
Any idea why the query plan goes so wrong when
?thCenturyClassicalComposers0 is absent from the sub-SELECT?
Re: Basic federation in Jena
Posted by Andy Seaborne <an...@apache.org>.
On 01/08/12 16:50, Olivier Rossel wrote:
>> 4/ You can use a subselect to restrict the remote query part:
>>
>>
>> SERVICE <...> {
>> SELECT * {
>> ...
>> } LIMIT 300
>> }
>
> I tried this query:
> SELECT DISTINCT ?comment WHERE {
> SERVICE
> <http://api.talis.com/stores/bbc-backstage/services/sparql>
> { ?thCenturyClassicalComposers0
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://dbpedia.org/class/yago/20thCenturyClassicalComposers>
> }
> SERVICE <http://dbpedia.org/sparql> {SELECT
> ?thCenturyClassicalComposers0 ?comment WHERE {
> ?thCenturyClassicalComposers0
> <http://www.w3.org/2000/01/rdf-schema#comment> ?comment } }
> }
>
> It returns results in a very correct time.
>
> Then I remove ?thCenturyClassicalComposers0 from the sub-SELECT:
>
>
> SELECT DISTINCT ?comment WHERE {
> SERVICE
> <http://api.talis.com/stores/bbc-backstage/services/sparql>
> { ?thCenturyClassicalComposers0
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://dbpedia.org/class/yago/20thCenturyClassicalComposers>
> }
> SERVICE <http://dbpedia.org/sparql> {SELECT ?comment WHERE {
> ?thCenturyClassicalComposers0
> <http://www.w3.org/2000/01/rdf-schema#comment> ?comment } }
> }
>
> This query now takes MUCH MUCH longer. And eventually fizzles in a 509
> HttpException.
>
> Any idea why the query plan goes so wrong when
> ?thCenturyClassicalComposers0 is absent of the sub-SELECT.
> ?
>
Because in the second query you are joining the intermediate results of
SERVICE 1:
?thCenturyClassicalComposers0
with
SERVICE 2:
?comment
i.e. an unconstrained join (the two sides share no variable, so it is a
cross product) which happens to be done inefficiently.
The ?thCenturyClassicalComposers0 inside SERVICE 2 is not the same
variable as the one in SERVICE 1 if you remove it from the sub-select.
Try looking at it with
http://www.sparql.org/query-validator.html
and select "SPARQL algebra (general optimizations)". You will see
?/thCenturyClassicalComposers0 (note the ?/), which is a
renamed-because-it's-hidden variable.
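To illustrate the scoping point, here is a sketch of how the second query effectively behaves. Because the sub-SELECT projects only ?comment, its ?thCenturyClassicalComposers0 is local to the sub-query and could carry any other name without changing the results. The name ?hidden and the prefix labels rdf:, rdfs: and yago: are made up here for illustration and do not appear in the original queries.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX yago: <http://dbpedia.org/class/yago/>

SELECT DISTINCT ?comment WHERE {
  # SERVICE 1 binds only ?thCenturyClassicalComposers0.
  SERVICE <http://api.talis.com/stores/bbc-backstage/services/sparql> {
    ?thCenturyClassicalComposers0 rdf:type yago:20thCenturyClassicalComposers
  }
  # SERVICE 2 exposes only ?comment. ?hidden (an illustrative name) stands in
  # for the non-projected variable, so the two sides share no variable and
  # the remote pattern asks for every rdfs:comment in the store.
  SERVICE <http://dbpedia.org/sparql> {
    SELECT ?comment WHERE { ?hidden rdfs:comment ?comment }
  }
}
```

Written this way, it is easier to see that nothing ties the comments to the composers found by the first SERVICE block.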
Any chance of readable queries? A few prefixes perhaps?
Andy
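
Taking up the request for prefixes, the first (fast) query from the thread might be written roughly as follows. This is only a sketch: the prefix labels rdf:, rdfs: and yago: are chosen here and are not part of the original messages. The endpoints, variable names and triple patterns are unchanged.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX yago: <http://dbpedia.org/class/yago/>

SELECT DISTINCT ?comment WHERE {
  # Find the composers in the first store.
  SERVICE <http://api.talis.com/stores/bbc-backstage/services/sparql> {
    ?thCenturyClassicalComposers0 rdf:type yago:20thCenturyClassicalComposers
  }
  # Projecting ?thCenturyClassicalComposers0 keeps it visible outside the
  # sub-SELECT, so it is the same variable as in the first SERVICE block
  # and the join is constrained by it.
  SERVICE <http://dbpedia.org/sparql> {
    SELECT ?thCenturyClassicalComposers0 ?comment WHERE {
      ?thCenturyClassicalComposers0 rdfs:comment ?comment
    }
  }
}
```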