Posted to dev@jena.apache.org by Holger Knublauch <ho...@knublauch.com> on 2015/06/16 05:20:06 UTC

Definition of SPARQL variable pre-binding

Hi,

(this question is motivated by the ongoing Data Shapes WG, but I don't 
speak on their behalf).

Jena and other APIs such as Sesame support the concept of pre-binding 
variables prior to SPARQL execution, using 
QueryExecution.setInitialBinding(). This is convenient to reuse 
parameterized queries, especially with blank nodes.

Question: is there any formal basis of this functionality, formulated so 
that it can be implemented by other platforms too? I can see that it 
populates the original bindings that are passed through the algebra 
objects, but what would be the best way to explain this by means of 
concepts from the SPARQL 1.1 spec?

Thanks
Holger

Re: Definition of SPARQL variable pre-binding

Posted by Andy Seaborne <an...@apache.org>.

On 17/06/15 00:18, Holger Knublauch wrote:
> On 6/16/2015 22:03, Osma Suominen wrote:
>> Here's a slightly relevant discussion about how to support something
>> like pre-bound variables / parametrized queries in YASQE, a graphical
>> SPARQL editor component in the YASGUI suite (and used by Fuseki among
>> others): https://github.com/YASGUI/YASQE/issues/24
>
> Thanks for the pointer.
>
>>
>> I'm not sure I understand all the issues here very deeply, but it
>> would seem useful to have a standard way of expressing and executing
>> parametrized SPARQL queries, which could then be applied by YASQE and
>> SHACL among others.
>
> Indeed. Maybe the SHACL templates [1] could be one solution to that,
> assuming SHACL becomes a W3C standard. In the current draft you would
> specify a template as
>
> ex:MyTemplate
>      a sh:Template ;
>      rdfs:label "My template" ;
>      rdfs:comment "Gets a list of all people born in a given country" ;
>      sh:argument [
>          sh:predicate ex:country ;
>          sh:valueType schema:Country ;
>          rdfs:comment "The country to get all people for" ;
>      ] ;
>      sh:sparql """
>          SELECT ?person
>          WHERE {
>              ?person ex:bornIn ?country .
>          } """ ;
> .
>
> This structure provides enough metadata to drive user interfaces, e.g.
> input forms where users select a country from a list. The semantics in
> the current draft are that variables become pre-bound (ex:country ->
> ?country). This approach has the advantage that each query can be
> instantiated as a naturally valid RDF instance, e.g.
>
> ex:ExampleQuery
>      a ex:MyTemplate ;
>      ex:country ex:Germany .
>
> This can then be used as a high level language for all kinds of query
> calls as constraints, rules or whatever - experts can prepare the SPARQL
> while end users just fill in the blanks.
>
> The semantics are intended to be like inserting a VALUES clause into the
> "beginning" of the query, i.e. they wouldn't be visible in sub-selects
> etc. In contrast to text-substitution algorithms, this also makes sure
> that queries are always syntactically valid and can be pre-compiled.

"prepared queries" are certainly the way to go. The phrase 
"text-substitution algorithms" is probably a placeholder for a spectrum 
of possibilities from string bashing (i.e. before parsing) to modifying 
the post-parser AST (so no firing up the parser - good for the 
high-frequency queries I'm guessing that SHACL will be doing).  In ARQ 
you can also modify the query plan (the algebra) in the same style as 
AST modification, which makes the substitution post-optimizer.
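
As a rough illustration of the two ends of that spectrum, the Python
sketch below contrasts string bashing with rewriting a parsed structure.
It is a toy under stated assumptions: the tuple-based pattern list and
the names Var, substitute_text and substitute_ast are invented for this
example and are not ARQ or Jena APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A query variable in a toy AST (hypothetical, not a Jena class)."""
    name: str


def substitute_text(query: str, var: str, value: str) -> str:
    # String bashing: replace every textual occurrence before parsing.
    return query.replace("?" + var, value)


def substitute_ast(patterns, var: str, value):
    # AST rewriting: swap only Var nodes whose name matches exactly.
    return [
        tuple(value if term == Var(var) else term for term in triple)
        for triple in patterns
    ]


query_text = (
    "SELECT * WHERE { ?p ex:bornIn ?country . ?country ex:code ?countryCode }"
)
query_ast = [
    (Var("p"), "ex:bornIn", Var("country")),
    (Var("country"), "ex:code", Var("countryCode")),
]

bashed = substitute_text(query_text, "country", "ex:Germany")
rewritten = substitute_ast(query_ast, "country", "ex:Germany")
```

In this toy, the textual version also turns ?countryCode into
ex:GermanyCode, while the structural version only touches the variable
that is actually named country.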

Is anyone looking at a language (based on JSON, YAML whatever) as a DSL 
targeting the RDF formal language?

	Andy

>
> Holger
>
> [1] http://w3c.github.io/data-shapes/shacl/#templates
>

Re: Definition of SPARQL variable pre-binding

Posted by Holger Knublauch <ho...@knublauch.com>.

On 6/16/2015 22:03, Osma Suominen wrote:
> Here's a slightly relevant discussion about how to support something 
> like pre-bound variables / parametrized queries in YASQE, a graphical 
> SPARQL editor component in the YASGUI suite (and used by Fuseki among 
> others): https://github.com/YASGUI/YASQE/issues/24

Thanks for the pointer.

>
> I'm not sure I understand all the issues here very deeply, but it 
> would seem useful to have a standard way of expressing and executing 
> parametrized SPARQL queries, which could then be applied by YASQE and 
> SHACL among others.

Indeed. Maybe the SHACL templates [1] could be one solution to that, 
assuming SHACL becomes a W3C standard. In the current draft you would 
specify a template as

ex:MyTemplate
     a sh:Template ;
     rdfs:label "My template" ;
     rdfs:comment "Gets a list of all people born in a given country" ;
     sh:argument [
         sh:predicate ex:country ;
         sh:valueType schema:Country ;
         rdfs:comment "The country to get all people for" ;
     ] ;
     sh:sparql """
         SELECT ?person
         WHERE {
             ?person ex:bornIn ?country .
         } """ ;
.

This structure provides enough metadata to drive user interfaces, e.g. 
input forms where users select a country from a list. The semantics in 
the current draft are that variables become pre-bound (ex:country -> 
?country). This approach has the advantage that each query can be 
instantiated as a naturally valid RDF instance, e.g.

ex:ExampleQuery
     a ex:MyTemplate ;
     ex:country ex:Germany .

This can then be used as a high level language for all kinds of query 
calls as constraints, rules or whatever - experts can prepare the SPARQL 
while end users just fill in the blanks.

The semantics are intended to be like inserting a VALUES clause into the 
"beginning" of the query, i.e. they wouldn't be visible in sub-selects 
etc. In contrast to text-substitution algorithms, this also makes sure 
that queries are always syntactically valid and can be pre-compiled.
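
To make the intended reading concrete, here is a minimal Python sketch
of "pre-binding as a one-row VALUES table joined at the top of the
query". Everything in it is an assumption for illustration: the toy
triples, the helper names (match_pattern, join, instantiate) and the
rule that ex:country maps to ?country by taking the local name. It is
not how SHACL or Jena implement this.

```python
# Toy data: (subject, predicate, object) triples. All names are illustrative.
DATA = [
    ("ex:Anna", "ex:bornIn", "ex:Germany"),
    ("ex:Ben", "ex:bornIn", "ex:France"),
    ("ex:Cara", "ex:bornIn", "ex:Germany"),
]


def match_pattern(pattern, data):
    """Return one solution (dict of variable -> term) per matching triple."""
    solutions = []
    for triple in data:
        solution = {}
        for p_term, d_term in zip(pattern, triple):
            if p_term.startswith("?"):
                # A variable must bind consistently within a single triple.
                if solution.setdefault(p_term, d_term) != d_term:
                    break
            elif p_term != d_term:
                break
        else:
            solutions.append(solution)
    return solutions


def compatible(a, b):
    """Two solutions are compatible if shared variables have equal values."""
    return all(b[k] == v for k, v in a.items() if k in b)


def join(left, right):
    """SPARQL-style join: merge every compatible pair of solutions."""
    return [{**l, **r} for l in left for r in right if compatible(l, r)]


def instantiate(template_args, instance):
    """Map template arguments (ex:country -> ?country) to a one-row table."""
    row = {"?" + pred.split(":", 1)[1]: instance[pred] for pred in template_args}
    return [row]


# ex:ExampleQuery a ex:MyTemplate ; ex:country ex:Germany .
prebound = instantiate(["ex:country"], {"ex:country": "ex:Germany"})
body = match_pattern(("?person", "ex:bornIn", "?country"), DATA)
people = [s["?person"] for s in join(prebound, body)]
```

The query body is never edited as text; the instance only contributes a
row that is joined with the solutions of the unchanged pattern.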

Holger

[1] http://w3c.github.io/data-shapes/shacl/#templates

Re: Definition of SPARQL variable pre-binding

Posted by Osma Suominen <os...@helsinki.fi>.

Here's a slightly relevant discussion about how to support something 
like pre-bound variables / parametrized queries in YASQE, a graphical 
SPARQL editor component in the YASGUI suite (and used by Fuseki among 
others): https://github.com/YASGUI/YASQE/issues/24

I'm not sure I understand all the issues here very deeply, but it would 
seem useful to have a standard way of expressing and executing 
parametrized SPARQL queries, which could then be applied by YASQE and 
SHACL among others.

-Osma





-- 
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.suominen@helsinki.fi
http://www.nationallibrary.fi

Re: Definition of SPARQL variable pre-binding

Posted by Andy Seaborne <an...@apache.org>.

On 16/06/15 09:33, Holger Knublauch wrote:
> Thanks, Andy.
>
> On 6/16/15 6:03 PM, Andy Seaborne wrote:
>> On 16/06/15 04:20, Holger Knublauch wrote:
>>> Hi,
>>>
>>> (this question is motivated by the ongoing Data Shapes WG, but I don't
>>> speak on their behalf).
>>
>> Ptr?
> http://w3c.github.io/data-shapes/shacl/
>
> esp http://w3c.github.io/data-shapes/shacl/#sparql-constraints-prebound
>
> http://www.w3.org/2014/data-shapes/track/issues/68

Thanks.

>
>
>>
>>>
>>> Jena and other APIs such as Sesame support the concept of pre-binding
>>> variables prior to SPARQL execution, using
>>> QueryExecution.setInitialBinding(). This is convenient to reuse
>>> parameterized queries, especially with blank nodes.
>>>
>>> Question: is there any formal basis of this functionality, formulated so
>>> that it can be implemented by other platforms too? I can see that it
>>> populates the original bindings that are passed through the algebra
>>> objects, but what would be the best way to explain this by means of
>>> concepts from the SPARQL 1.1 spec?
>>>
>>> Thanks
>>> Holger
>>>
>>
>> There are two possible explanations - they are not quite the same.
>>
>> 1/ It's a substitution of a value for the variable before execution.
>> This is very like parameterized queries. It's a pre-execution step.
>
> Do you mean syntactic insertion like the ParameterizedQuery class? This
> would not support bnodes, and the shapes and focus nodes of a SHACL
> constraint will frequently be bnodes. It should also avoid repeated
> query parsing; for performance reasons it would be better to operate on
> Query objects and their general equivalents (Algebra objects).

Substitution does not have to be in syntax - it's rewriting the AST with 
the real, actual bnode.

>> 2/ VALUES
>>
>> There is a binding as a one-row VALUES table and it's joined into the
>> query as usual.
>
> I guess inserting a VALUES clause into the beginning would work, but
> then again what about bnodes? I guess instead of the VALUES keyword (as
> a string), it would need to rely on the equivalent algebra object?
>
> Just to be clear, this only needs to work in local datasets, not
> necessarily with SPARQL endpoints where all we have is an HTTP string
> interface. I am looking for a couple of sentences that would provide a
> generic implementation strategy that most SPARQL engines either already
> have, or could easily add to support SHACL.
>
> Thanks
> Holger
>

Firstly - I'm talking about principles and execution, not syntax. 
VALUES is the way to get a data table into a SPARQL execution. 
setInitialBinding happens after parsing - injecting the preset row into 
execution.

The real (first) issue with blank nodes isn't putting them back in a 
query; it's getting them in the first place.

As soon as a blank node is serialized in any W3C format (RDF, any 
SPARQL results), it isn't the same blank node.  There is an equivalent 
one in the document.

If you are thinking of local API use, where the results are never 
serialized, then it's not an issue - like setInitialBinding, it's an API 
issue.  setInitialBinding is working after parsing.

I'm afraid that section 12.1.1 is sliding towards mixing up syntax 
issues with abstraction and execution.  To keep to standards, you have 
to talk about SPARQL as a syntax.  You may get away with something like
"?this has the value from <how you found it>" or
"SPARQL execution must ensure that ?this has a value XXX in the 
answers". Though XXX and blank nodes will cause the usual reactions. 
You and I can probably macro-generate the debate ahead of time. 
Perma-thread-37.

The perfect answer is (might be) to repeat the pattern that found ?this 
in the first place.  Obvious efficiency issues if done naively.  But 
otherwise, there is no way to connect the results of one SPARQL query to 
another query within the standards only.

[Now - may I do a 50% rules, 50% procedural language that wires together 
multiple SPARQL queries and updates, please?]

ARQ's solution to this is <_:...> URIs.  They name the bnode and the 
parser replaces them with the real blank node.

In fact, RDF 1.1 says:
[[
3.4 Blank Nodes

Blank nodes are disjoint from IRIs and literals. Otherwise, the set of 
possible blank nodes is arbitrary. RDF makes no reference to any 
internal structure of blank nodes.
]]
so you could say, for the RDF abstract syntax, there is a 1-1 labelling 
of all bnodes in use (i.e. finite - none of this axiom-of-choice stuff) 
by UUID and just be done with it.  Given the UUID, you can find the 
blank node.

Some people mix RDF abstract syntax with meaning of blank nodes 
(entailment) but they are different.  abstract syntax == data structure.

As a data structure, blank nodes are just nodes in a graph.  So invent a 
reference for them (not a URI, not a literal).  Every RDF system does 
anyway even if it is implicitly there like a java object reference (not 
Jena - blank nodes are the same by .equals, not ==; usual java stuff here).
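
A minimal Python sketch of such a labelling, purely as an illustration:
the classes BlankNode and BlankNodeLabels and their methods are
hypothetical names, not anything from Jena, and the registry only works
inside one process where the nodes are never serialized.

```python
import uuid


class BlankNode:
    """A blank node as a plain graph node: identity only, no structure."""


class BlankNodeLabels:
    """Hypothetical 1-1 labelling between blank nodes in use and UUIDs."""

    def __init__(self):
        self._by_node = {}
        self._by_label = {}

    def label_for(self, node):
        # Reuse an existing label so the mapping stays 1-1.
        if node not in self._by_node:
            label = str(uuid.uuid4())
            self._by_node[node] = label
            self._by_label[label] = node
        return self._by_node[node]

    def node_for(self, label):
        # Given the UUID, find the very same blank node again.
        return self._by_label[label]


labels = BlankNodeLabels()
b1, b2 = BlankNode(), BlankNode()
ref = labels.label_for(b1)
```

The label is a reference in the data-structure sense only; it says
nothing about what the blank node means under entailment.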

	Andy

>> Differences in these viewpoints can occur in nested patterns -
>> sub-queries (you can have different variables with the same name - a
>> textual substitution viewpoint breaks that) and OPTIONALs inside
>> OPTIONALs (bottom up execution is not the same as top down execution).
>>
>> This has existed in ARQ for a very long time.  ARQ actually takes the
>> initial binding and seeds the execution from there so it's like (2)
>> but not exactly; it does respect non-projected variables inside nested
>> SELECTS; it does not completely respect certain cases of
>> OPTIONAL-inside-OPTIONAL.

[[
Actually - it isn't even as simple as that as the optimizer is aware of 
these tricky OPTIONAL-OPTIONAL cases and may do the right thing.

The case of nested optionals, with a variable being mentioned only in 
the innermost and outermost patterns but not in intermediate ones, is 
rare even for generated queries from compositions in my experience.
]]

>>
>>     Andy
>>
>

Re: Definition of SPARQL variable pre-binding

Posted by Holger Knublauch <ho...@knublauch.com>.

Thanks, Andy.

On 6/16/15 6:03 PM, Andy Seaborne wrote:
> On 16/06/15 04:20, Holger Knublauch wrote:
>> Hi,
>>
>> (this question is motivated by the ongoing Data Shapes WG, but I don't
>> speak on their behalf).
>
> Ptr?
http://w3c.github.io/data-shapes/shacl/

esp http://w3c.github.io/data-shapes/shacl/#sparql-constraints-prebound

http://www.w3.org/2014/data-shapes/track/issues/68

>
>>
>> Jena and other APIs such as Sesame support the concept of pre-binding
>> variables prior to SPARQL execution, using
>> QueryExecution.setInitialBinding(). This is convenient to reuse
>> parameterized queries, especially with blank nodes.
>>
>> Question: is there any formal basis of this functionality, formulated so
>> that it can be implemented by other platforms too? I can see that it
>> populates the original bindings that are passed through the algebra
>> objects, but what would be the best way to explain this by means of
>> concepts from the SPARQL 1.1 spec?
>>
>> Thanks
>> Holger
>>
>
> There are two possible explanations - they are not quite the same.
>
> 1/ It's a substitution of a value for the variable before execution.
> This is very like parameterized queries. It's a pre-execution step.

Do you mean syntactic insertion like the ParameterizedQuery class? This 
would not support bnodes, and the shapes and focus nodes of a SHACL 
constraint will frequently be bnodes. It should also avoid repeated 
query parsing; for performance reasons it would be better to operate on 
Query objects and their general equivalents (Algebra objects).

>
>
> 2/ VALUES
>
> There is a binding as a one-row VALUES table and it's joined into the 
> query as usual.

I guess inserting a VALUES clause into the beginning would work, but 
then again what about bnodes? I guess instead of the VALUES keyword (as 
a string), it would need to rely on the equivalent algebra object?

Just to be clear, this only needs to work in local datasets, not 
necessarily with SPARQL endpoints where all we have is an HTTP string 
interface. I am looking for a couple of sentences that would provide a 
generic implementation strategy that most SPARQL engines either already 
have, or could easily add to support SHACL.

Thanks
Holger

>
> Differences in these viewpoints can occur in nested patterns - 
> sub-queries (you can have different variables with the same name - a 
> textual substitution viewpoint breaks that) and OPTIONALs inside 
> OPTIONALs (bottom up execution is not the same as top down execution).
>
> This has existed in ARQ for a very long time.  ARQ actually takes the 
> initial binding and seeds the execution from there so it's like (2) 
> but not exactly; it does respect non-projected variables inside nested 
> SELECTS; it does not completely respect certain cases of 
> OPTIONAL-inside-OPTIONAL.
>
>     Andy
>

Re: Definition of SPARQL variable pre-binding

Posted by Andy Seaborne <an...@apache.org>.

On 16/06/15 04:20, Holger Knublauch wrote:
> Hi,
>
> (this question is motivated by the ongoing Data Shapes WG, but I don't
> speak on their behalf).

Ptr?

>
> Jena and other APIs such as Sesame support the concept of pre-binding
> variables prior to SPARQL execution, using
> QueryExecution.setInitialBinding(). This is convenient to reuse
> parameterized queries, especially with blank nodes.
>
> Question: is there any formal basis of this functionality, formulated so
> that it can be implemented by other platforms too? I can see that it
> populates the original bindings that are passed through the algebra
> objects, but what would be the best way to explain this by means of
> concepts from the SPARQL 1.1 spec?
>
> Thanks
> Holger
>

There are two possible explanations - they are not quite the same.

1/ It's a substitution of a value for the variable before execution.  
This is very like parameterized queries. It's a pre-execution step.

2/ VALUES

There is a binding as a one-row VALUES table and it's joined into the 
query as usual.

Differences in these viewpoints can occur in nested patterns - 
sub-queries (you can have different variables with the same name - a 
textual substitution viewpoint breaks that) and OPTIONALs inside 
OPTIONALs (bottom up execution is not the same as top down execution).

This has existed in ARQ for a very long time.  ARQ actually takes the 
initial binding and seeds the execution from there so it's like (2) but 
not exactly; it does respect non-projected variables inside nested 
SELECTS; it does not completely respect certain cases of 
OPTIONAL-inside-OPTIONAL.
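
The sub-query difference between the two views can be sketched with a
toy evaluator in Python. This is only an illustration under stated
assumptions: the data, the helper names (match, project, join,
substitute) and the dict-based solutions are invented here and are not
ARQ's algebra. The query modelled is { SELECT ?y WHERE { ?x ex:p ?y } }
with ?x pre-bound to ex:a.

```python
DATA = [("ex:a", "ex:p", "ex:1"), ("ex:b", "ex:p", "ex:2")]


def match(pattern, data):
    """Solutions (dicts) for one triple pattern; '?' terms are variables."""
    out = []
    for triple in data:
        sol = {}
        if all(
            sol.setdefault(p, d) == d if p.startswith("?") else p == d
            for p, d in zip(pattern, triple)
        ):
            out.append(sol)
    return out


def project(solutions, keep):
    """Sub-SELECT projection: variables not in `keep` are hidden outside."""
    return [{v: s[v] for v in keep if v in s} for s in solutions]


def join(left, right):
    """Merge every pair of solutions that agree on their shared variables."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(r[k] == v for k, v in l.items() if k in r)
    ]


def substitute(pattern, binding):
    """View 1: replace bound variables by their values before execution."""
    return tuple(binding.get(t, t) for t in pattern)


inner = ("?x", "ex:p", "?y")  # body of the sub-SELECT
binding = {"?x": "ex:a"}      # the pre-bound variable

# View 1 rewrites the inner ?x too, although it is not projected.
by_substitution = project(match(substitute(inner, binding), DATA), ["?y"])

# View 2 joins a one-row table at the top level; the inner ?x stays hidden.
by_values = join([binding], project(match(inner, DATA), ["?y"]))
```

Under view 1 only ex:1 comes back for ?y, because the hidden inner ?x
was replaced; under view 2 both ex:1 and ex:2 come back, because the
inner ?x is a different variable that merely shares the name.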

	Andy