Posted to users@jena.apache.org by Neubert Joachim <J....@zbw.eu> on 2012/07/10 07:05:51 UTC

Variables not bound in subquery

In the following query 
 
PREFIX gnd:     <http://d-nb.info/standards/elementset/gnd#>
 
SELECT ?uri
WHERE {
  BIND (<http://d-nb.info/gnd/10244669> AS ?uri1)
  BIND (<http://d-nb.info/gnd/1024466-9> AS ?uri2)
  { {
      SELECT (?uri1 AS ?uri)
      WHERE {
        ?uri1 a gnd:CorporateBody .
      }
    } UNION {
      SELECT (?uri2 AS ?uri)
      WHERE {
        ?uri2 a gnd:CorporateBody .
      }
  } }
}

I'd expect that the ?uri1 and ?uri2 variables are bound in the
subqueries, and as a result to get zero, one or two values for ?uri.
However, I get every possible gnd:CorporateBody (more than a million).
 
It would be nice if somebody could point out why this happens, and how I
could work around it. (Duplicating the BIND part and moving it into the
subquery works, but since it involves a query to a remote service and
some function calls, I'd prefer not to.)
 
Cheers, Joachim
 

Re: Variables not bound in subquery

Posted by Andy Seaborne <an...@apache.org>.
On 10/07/12 06:05, Neubert Joachim wrote:
> In the following query
>
> PREFIX gnd:     <http://d-nb.info/standards/elementset/gnd#>
>
> SELECT ?uri
> WHERE {
>    BIND (<http://d-nb.info/gnd/10244669> AS ?uri1)
>    BIND (<http://d-nb.info/gnd/1024466-9> AS ?uri2)
>    { {
>        SELECT (?uri1 AS ?uri)
>        WHERE {
>          ?uri1 a gnd:CorporateBody .
>        }
>      } UNION {
>        SELECT (?uri2 AS ?uri)
>        WHERE {
>          ?uri2 a gnd:CorporateBody .
>        }
>    } }
> }
>
> I'd expect that the ?uri1 and ?uri2 variables are bound in the
> subqueries, and as a result to get zero, one or two values for ?uri.
> However, I get every possible gnd:CorporateBody (more than a million).
>
> It would be nice if somebody could point out why this happens, and how I
> could work around it. (Duplicating the BIND part and moving it into the
> subquery works, but since it involves a query to a remote service and
> some function calls, I'd prefer not to.)
>
> Cheers, Joachim
>

Evaluation is bottom-up: subparts are evaluated on their own, then combined.

SELECT (?uri1 AS ?uri) exposes only ?uri. Any mention of ?uri1 inside the
sub-SELECT is hidden from the outside (it's a different ?uri1; strictly
it's the same name, but it will never meet the ?uri1 from the BIND).

So the only thing coming out of SELECT (?uri1 AS ?uri) is a result row
of one variable, ?uri.  There is no ?uri1 outside the projection.

You have the structure:

BIND ... ?uri1
BIND ... ?uri2
{
SELECT ... ?uri
    union
SELECT ... ?uri
}
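
To see the effect in isolation, one branch of the UNION can be run on its
own. This is only a sketch against the same data; nothing here is new
except that the outer BINDs are left out, which is how the subquery sees
things anyway:

```sparql
PREFIX gnd:     <http://d-nb.info/standards/elementset/gnd#>

# One UNION branch evaluated by itself. ?uri1 is an ordinary unbound
# variable here, so the pattern matches every gnd:CorporateBody.
SELECT (?uri1 AS ?uri)
WHERE {
  ?uri1 a gnd:CorporateBody .
}
```

The rows from the outer BINDs carry ?uri1 and ?uri2, the rows from the
union carry only ?uri. They share no variable, so joining them restricts
nothing and every corporate body comes through.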


This query

PREFIX gnd:     <http://d-nb.info/standards/elementset/gnd#>

SELECT ?uri
WHERE {
   ?uri a gnd:CorporateBody .
   FILTER ( <http://d-nb.info/gnd/10244669> = ?uri ||
            <http://d-nb.info/gnd/1024466-9> = ?uri )
}


finds the ?uri that are one of the two URIs.

Or

PREFIX gnd:     <http://d-nb.info/standards/elementset/gnd#>

SELECT ?uri
WHERE {
   ?uri a gnd:CorporateBody .
   FILTER ( ?uri IN (<http://d-nb.info/gnd/10244669>,
                     <http://d-nb.info/gnd/1024466-9> ))
}

which gets to the same execution plan.

It's optimized as well:

(project (?uri)
   (disjunction
     (assign ((?uri <http://d-nb.info/gnd/10244669>))
       (bgp (triple <http://d-nb.info/gnd/10244669>
                    rdf:type gnd:CorporateBody)))
     (assign ((?uri <http://d-nb.info/gnd/1024466-9>))
       (bgp (triple <http://d-nb.info/gnd/1024466-9>
                    rdf:type gnd:CorporateBody)))))

i.e. it tries one case

{ <http://d-nb.info/gnd/10244669> rdf:type gnd:CorporateBody }

then tries the other

{ <http://d-nb.info/gnd/1024466-9> rdf:type gnd:CorporateBody }

which is two probes of the database, not filtering a million items.
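
If the two URIs really come from a remote service and function calls
rather than constants, a possible variant is to keep the BINDs at the
outer level but bind both to the same name, ?uri, so that they join with
the triple pattern. A sketch only: the constant IRIs below stand in for
the real expressions, and whether the optimizer reduces this to two
probes as well has not been checked here.

```sparql
PREFIX gnd:     <http://d-nb.info/standards/elementset/gnd#>

SELECT ?uri
WHERE {
   # Each branch yields one row for ?uri; the constants are placeholders
   # for the service call / function expressions.
   { BIND (<http://d-nb.info/gnd/10244669> AS ?uri) }
   UNION
   { BIND (<http://d-nb.info/gnd/1024466-9> AS ?uri) }
   # Joined on ?uri, so at most two rows can survive.
   ?uri a gnd:CorporateBody .
}
```

Each expression appears once, and the result is zero, one or two values
for ?uri.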

	Andy