Posted to dev@jena.apache.org by Holger Knublauch <ho...@knublauch.com> on 2015/06/16 05:20:06 UTC
Definition of SPARQL variable pre-binding
Hi,
(this question is motivated by the ongoing Data Shapes WG, but I don't
speak on their behalf).
Jena and other APIs such as Sesame support the concept of pre-binding
variables prior to SPARQL execution, using
QueryExecution.setInitialBinding(). This is convenient to reuse
parameterized queries, especially with blank nodes.
Question: is there any formal basis of this functionality, formulated so
that it can be implemented by other platforms too? I can see that it
populates the original bindings that are passed through the algebra
objects, but what would be the best way to explain this by means of
concepts from the SPARQL 1.1 spec?
Thanks
Holger
Re: Definition of SPARQL variable pre-binding
Posted by Andy Seaborne <an...@apache.org>.
On 17/06/15 00:18, Holger Knublauch wrote:
> On 6/16/2015 22:03, Osma Suominen wrote:
>> Here's a slightly relevant discussion about how to support something
>> like pre-bound variables / parametrized queries in YASQE, a graphical
>> SPARQL editor component in the YASGUI suite (and used by Fuseki among
>> others): https://github.com/YASGUI/YASQE/issues/24
>
> Thanks for the pointer.
>
>>
>> I'm not sure I understand all the issues here very deeply, but it
>> would seem useful to have a standard way of expressing and executing
>> parametrized SPARQL queries, which could then be applied by YASQE and
>> SHACL among others.
>
> Indeed. Maybe the SHACL templates [1] could be one solution to that,
> assuming SHACL becomes a W3C standard. In the current draft you would
> specify a template as
>
> ex:MyTemplate
>     a sh:Template ;
>     rdfs:label "My template" ;
>     rdfs:comment "Gets a list of all people born in a given country" ;
>     sh:argument [
>         sh:predicate ex:country ;
>         sh:valueType schema:Country ;
>         rdfs:comment "The country to get all people for" ;
>     ] ;
>     sh:sparql """
>         SELECT ?person
>         WHERE {
>             ?person ex:bornIn ?country .
>         } """ ;
> .
>
> This structure provides enough metadata to drive user interfaces, e.g.
> input forms where users select a country from a list. The semantics in
> the current draft are that variables become pre-bound (ex:country ->
> ?country). This approach has the advantage that each query can be
> instantiated as a naturally valid RDF instance, e.g.
>
> ex:ExampleQuery
>     a ex:MyTemplate ;
>     ex:country ex:Germany .
>
> This can then be used as a high level language for all kinds of query
> calls as constraints, rules or whatever - experts can prepare the SPARQL
> while end users just fill in the blanks.
>
> The semantics are intended to be like inserting a VALUES clause into the
> "beginning" of the query, i.e. they wouldn't be visible in sub-selects
> etc. In contrast to text-substitution algorithms, this also makes sure
> that queries are always syntactically valid and can be pre-compiled.
"prepared queries" are certainly the way to go. The phrase
"text-substitution algorithms" is probably a placeholder for a spectrum
of possibilities from string bashing (i.e. before parsing) to modifying
the post-parser AST (so no firing up the parser - good for the
high-frequency queries I'm guessing that SHACL will be doing). In ARQ
you can modify the query plan (the algebra) in the same style as AST
modification making it post-optimizer.
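As a rough illustration of the "modify the post-parser AST" end of that spectrum, here is a minimal Python sketch. It is not ARQ code: the `Var`, `BGP`, `Join` and `Project` classes and the `substitute` function are hypothetical stand-ins for a parsed query. The query is built once and each call to `substitute` returns a rewritten copy, so no re-parsing is needed. The sketch also assumes that only projected variables are visible inside a sub-select, which is one way to keep same-named inner variables apart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BGP:
    triples: tuple  # tuple of (subject, predicate, object) tuples


@dataclass(frozen=True)
class Join:
    left: object
    right: object


@dataclass(frozen=True)
class Project:
    vars: tuple  # names of the projected variables
    sub: object  # the pattern inside the sub-select


def substitute(node, binding):
    """Return a copy of the toy AST with pre-bound variables replaced."""
    if isinstance(node, BGP):
        return BGP(tuple(
            tuple(binding.get(t.name, t) if isinstance(t, Var) else t
                  for t in triple)
            for triple in node.triples))
    if isinstance(node, Join):
        return Join(substitute(node.left, binding),
                    substitute(node.right, binding))
    if isinstance(node, Project):
        # Assumption: a non-projected inner variable is a different variable
        # from an outer one with the same name, so it is left untouched.
        visible = {k: v for k, v in binding.items() if k in node.vars}
        return Project(node.vars, substitute(node.sub, visible))
    raise TypeError(f"unknown node type: {type(node).__name__}")


# The "parsed" query is built once and reused for every binding.
query = Join(
    BGP(((Var("person"), "ex:bornIn", Var("country")),)),
    Project(("person",),
            BGP(((Var("person"), "ex:knows", Var("country")),))))

rewritten = substitute(query, {"country": "ex:Germany"})
print(rewritten.left.triples)       # outer ?country is replaced
print(rewritten.right.sub.triples)  # inner, non-projected ?country is kept
```

A real engine would also have to decide what happens when a projected variable is substituted away; the sketch leaves the projection list unchanged.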
Is anyone looking at a language (based on JSON, YAML whatever) as a DSL
targeting the RDF formal language?
Andy
>
> Holger
>
> [1] http://w3c.github.io/data-shapes/shacl/#templates
>
Re: Definition of SPARQL variable pre-binding
Posted by Holger Knublauch <ho...@knublauch.com>.
On 6/16/2015 22:03, Osma Suominen wrote:
> Here's a slightly relevant discussion about how to support something
> like pre-bound variables / parametrized queries in YASQE, a graphical
> SPARQL editor component in the YASGUI suite (and used by Fuseki among
> others): https://github.com/YASGUI/YASQE/issues/24
Thanks for the pointer.
>
> I'm not sure I understand all the issues here very deeply, but it
> would seem useful to have a standard way of expressing and executing
> parametrized SPARQL queries, which could then be applied by YASQE and
> SHACL among others.
Indeed. Maybe the SHACL templates [1] could be one solution to that,
assuming SHACL becomes a W3C standard. In the current draft you would
specify a template as
ex:MyTemplate
    a sh:Template ;
    rdfs:label "My template" ;
    rdfs:comment "Gets a list of all people born in a given country" ;
    sh:argument [
        sh:predicate ex:country ;
        sh:valueType schema:Country ;
        rdfs:comment "The country to get all people for" ;
    ] ;
    sh:sparql """
        SELECT ?person
        WHERE {
            ?person ex:bornIn ?country .
        } """ ;
.
This structure provides enough metadata to drive user interfaces, e.g.
input forms where users select a country from a list. The semantics in
the current draft are that variables become pre-bound (ex:country ->
?country). This approach has the advantage that each query can be
instantiated as a naturally valid RDF instance, e.g.
ex:ExampleQuery
    a ex:MyTemplate ;
    ex:country ex:Germany .
This can then be used as a high level language for all kinds of query
calls as constraints, rules or whatever - experts can prepare the SPARQL
while end users just fill in the blanks.
The semantics are intended to be like inserting a VALUES clause into the
"beginning" of the query, i.e. they wouldn't be visible in sub-selects
etc. In contrast to text-substitution algorithms, this also makes sure
that queries are always syntactically valid and can be pre-compiled.
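To make the mapping from a template instance to pre-bound variables concrete, here is a small Python sketch. It is only an illustration, not SHACL or Jena code: the dictionaries stand in for RDF data, and `local_name` and `prebinding` are hypothetical helpers. The rule that the variable name is the local name of the argument predicate is inferred from the (ex:country -> ?country) example above.

```python
def local_name(iri: str) -> str:
    """Return the part of an IRI or prefixed name after the last '#', '/' or ':'."""
    for sep in ("#", "/", ":"):
        if sep in iri:
            iri = iri.rsplit(sep, 1)[1]
    return iri


def prebinding(template_args, instance):
    """Build {variable name: value} from the argument values on an instance."""
    binding = {}
    for predicate in template_args:
        if predicate in instance:
            binding[local_name(predicate)] = instance[predicate]
    return binding


# Argument predicates declared by ex:MyTemplate (via sh:argument).
template_args = ["ex:country"]

# Property values of ex:ExampleQuery.
instance = {"rdf:type": "ex:MyTemplate", "ex:country": "ex:Germany"}

print(prebinding(template_args, instance))  # {'country': 'ex:Germany'}
```

The resulting dictionary is the single row that would be handed to the engine as the initial binding, with the SPARQL string itself left untouched.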
Holger
[1] http://w3c.github.io/data-shapes/shacl/#templates
Re: Definition of SPARQL variable pre-binding
Posted by Osma Suominen <os...@helsinki.fi>.
Here's a slightly relevant discussion about how to support something
like pre-bound variables / parametrized queries in YASQE, a graphical
SPARQL editor component in the YASGUI suite (and used by Fuseki among
others): https://github.com/YASGUI/YASQE/issues/24
I'm not sure I understand all the issues here very deeply, but it would
seem useful to have a standard way of expressing and executing
parametrized SPARQL queries, which could then be applied by YASQE and
SHACL among others.
-Osma
On 16/06/15 12:51, Andy Seaborne wrote:
> On 16/06/15 09:33, Holger Knublauch wrote:
>> Thanks, Andy.
>>
>> On 6/16/15 6:03 PM, Andy Seaborne wrote:
>>> On 16/06/15 04:20, Holger Knublauch wrote:
>>>> Hi,
>>>>
>>>> (this question is motivated by the ongoing Data Shapes WG, but I don't
>>>> speak on their behalf).
>>>
>>> Ptr?
>> http://w3c.github.io/data-shapes/shacl/
>>
>> esp http://w3c.github.io/data-shapes/shacl/#sparql-constraints-prebound
>>
>> http://www.w3.org/2014/data-shapes/track/issues/68
>
> Thanks.
>
>>
>>
>>>
>>>>
>>>> Jena and other APIs such as Sesame support the concept of pre-binding
>>>> variables prior to SPARQL execution, using
>>>> QueryExecution.setInitialBinding(). This is convenient to reuse
>>>> parameterized queries, especially with blank nodes.
>>>>
>>>> Question: is there any formal basis of this functionality,
>>>> formulated so
>>>> that it can be implemented by other platforms too? I can see that it
>>>> populates the original bindings that are passed through the algebra
>>>> objects, but what would be the best way to explain this by means of
>>>> concepts from the SPARQL 1.1 spec?
>>>>
>>>> Thanks
>>>> Holger
>>>>
>>>
>>> There are two possible explanations - they are not quite the same.
>>>
>>> 1/ It's a substitution of a value for a variable before execution. This is
>>> very like parameterized queries. It's a pre-execution step.
>>
>> Do you mean syntactic insertion like the ParameterizedQuery class? This
>> would not support bnodes, and the shapes and focus nodes of a SHACL
>> constraint will frequently be bnodes. It should also avoid repeated
>> query parsing, for performance reasons it would be better to operate on
>> Query objects and their general equivalents (Algebra objects).
>
> Substitution does not have to be in syntax - it's rewriting the AST with
> the real, actual bnode.
>
>>> 2/ VALUES
>>>
>>> There is a binding as a one row VALUES table and it's join'ed into the
>>> query as usual.
>>
>> I guess inserting a VALUES clause into the beginning would work, but
>> then again what about bnodes? I guess instead of the VALUES keyword (as
>> a string), it would need to rely on the equivalent algebra object?
>>
>> Just to be clear, this only needs to work in local datasets, not
>> necessarily with SPARQL endpoints where all we have is a http string
>> interface. I am looking for a couple of sentences that would provide a
>> generic implementation strategy that most SPARQL engines either already
>> have, or could easily add to support SHACL.
>>
>> Thanks
>> Holger
>>
>
> Firstly - I'm talking about principles and execution, not syntax. VALUES
> is the way to get a data table into a SPARQL execution.
> setInitialBinding happens after parsing - injecting the preset row into
> execution.
>
> The real (first) issue with blank nodes isn't putting them back in a
> query; it's getting them in the first place.
>
> As soon as a blank node is serialized in all W3C formats (RDF, any
> SPARQL results), it isn't the same blank node. There is an equivalent
> one in the document.
>
> If you are thinking of local API use, where the results are never
> serialized, then it's not an issue - like setInitialBinding, it's an API
> issue. setInitialBinding is working after parsing.
>
> I'm afraid that section 12.1.1 is sliding towards mixing up syntax
> issues with abstraction and execution. To keep to standards, you have
> to talk about SPARQL as a syntax. You may get away with something like
> "?this has the value from <how you found it>" or
> "SPARQL execution must ensure that ?this has a value XXX in the
> answers". Though XXX and blank nodes will cause the usual reactions. You
> and I can probably macro-generate the debate ahead of time.
> Perma-thread-37.
>
> The perfect answer is (might be) to repeat the pattern that found ?this
> in the first place. Obvious efficiency issues if done naively. But
> otherwise, there is no way to connect the results of one SPARQL query to
> another query within the standards only.
>
> [Now - may I do a 50% rules, 50% procedural language that wires together
> multiple SPARQL queries and updates, please?]
>
> ARQ's solution to this is <_:...> URIs. They name the bnode and the
> parser replaces them with the real blank node.
>
> In fact, RDF 1.1 says:
> [[
> 3.4 Blank Nodes
>
> Blank nodes are disjoint from IRIs and literals. Otherwise, the set of
> possible blank nodes is arbitrary. RDF makes no reference to any
> internal structure of blank nodes.
> ]]
> so you could say, for the RDF abstract syntax, there is a 1-1 labelling
> of all bnodes in use (i.e. finite - none of this axiom-of-choice stuff)
> by UUID and just be done with it. Given the UUID, you can find the
> blank node.
>
> Some people mix RDF abstract syntax with meaning of blank nodes
> (entailment) but they are different. abstract syntax == data structure.
>
> As a data structure, blank nodes are just nodes in a graph. So invent a
> reference for them (not a URI, not a literal). Every RDF system does
> anyway even if it is implicitly there like a java object reference (not
> Jena - blank nodes are the same by .equals, not ==; usual java stuff here).
>
> Andy
>
>>> Differences in these viewpoints can occur in nested patterns -
>>> sub-queries (you can have different variables with the same name - a
>>> textual substitution viewpoint breaks that) and OPTIONALs inside
>>> OPTIONALs (bottom up execution is not the same as top down execution).
>>>
>>> This has existed in ARQ for a very long time. ARQ actually takes the
>>> initial binding and seeds the execution from there so it's like (2)
>>> but not exactly; it does respect non-projected variables inside nested
>>> SELECTs; it does not completely respect certain cases of
>>> OPTIONAL-inside-OPTIONAL.
>
> [[
> Actually - it isn't even as simple as that as the optimizer is aware of
> these tricky OPTIONAL-OPTIONAL cases and may do the right thing.
>
> Cases of nested optionals, with a variable mentioned only in the
> innermost and outermost patterns but not in intermediate ones, are
> rare even for generated queries from compositions, in my experience.
> ]]
>
>>>
>>> Andy
>>>
>>
>
--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.suominen@helsinki.fi
http://www.nationallibrary.fi
Re: Definition of SPARQL variable pre-binding
Posted by Andy Seaborne <an...@apache.org>.
On 16/06/15 09:33, Holger Knublauch wrote:
> Thanks, Andy.
>
> On 6/16/15 6:03 PM, Andy Seaborne wrote:
>> On 16/06/15 04:20, Holger Knublauch wrote:
>>> Hi,
>>>
>>> (this question is motivated by the ongoing Data Shapes WG, but I don't
>>> speak on their behalf).
>>
>> Ptr?
> http://w3c.github.io/data-shapes/shacl/
>
> esp http://w3c.github.io/data-shapes/shacl/#sparql-constraints-prebound
>
> http://www.w3.org/2014/data-shapes/track/issues/68
Thanks.
>
>
>>
>>>
>>> Jena and other APIs such as Sesame support the concept of pre-binding
>>> variables prior to SPARQL execution, using
>>> QueryExecution.setInitialBinding(). This is convenient to reuse
>>> parameterized queries, especially with blank nodes.
>>>
>>> Question: is there any formal basis of this functionality, formulated so
>>> that it can be implemented by other platforms too? I can see that it
>>> populates the original bindings that are passed through the algebra
>>> objects, but what would be the best way to explain this by means of
>>> concepts from the SPARQL 1.1 spec?
>>>
>>> Thanks
>>> Holger
>>>
>>
>> There are two possible explanations - they are not quite the same.
>>
>> 1/ It's a substitution of a value for a variable before execution. This is
>> very like parameterized queries. It's a pre-execution step.
>
> Do you mean syntactic insertion like the ParameterizedQuery class? This
> would not support bnodes, and the shapes and focus nodes of a SHACL
> constraint will frequently be bnodes. It should also avoid repeated
> query parsing, for performance reasons it would be better to operate on
> Query objects and their general equivalents (Algebra objects).
Substitution does not have to be in syntax - it's rewriting the AST with
the real, actual bnode.
>> 2/ VALUES
>>
>> There is a binding as a one row VALUES table and it's join'ed into the
>> query as usual.
>
> I guess inserting a VALUES clause into the beginning would work, but
> then again what about bnodes? I guess instead of the VALUES keyword (as
> a string), it would need to rely on the equivalent algebra object?
>
> Just to be clear, this only needs to work in local datasets, not
> necessarily with SPARQL endpoints where all we have is a http string
> interface. I am looking for a couple of sentences that would provide a
> generic implementation strategy that most SPARQL engines either already
> have, or could easily add to support SHACL.
>
> Thanks
> Holger
>
Firstly - I'm talking about principles and execution, not syntax.
VALUES is the way to get a data table into a SPARQL execution.
setInitialBinding happens after parsing - injecting the preset row into
execution.
The real (first) issue with blank nodes isn't putting them back in a
query; it's getting them in the first place.
As soon as a blank node is serialized in all W3C formats (RDF, any
SPARQL results), it isn't the same blank node. There is an equivalent
one in the document.
If you are thinking of local API use, where the results are never
serialized, then it's not an issue - like setInitialBinding, it's an API
issue. setInitialBinding is working after parsing.
I'm afraid that section 12.1.1 is sliding towards mixing up syntax
issues with abstraction and execution. To keep to standards, you have
to talk about SPARQL as a syntax. You may get away with something like
"?this has the value from <how you found it>" or
"SPARQL execution must ensure that ?this has a value XXX in the
answers". Though XXX and blank nodes will cause the usual reactions.
You and I can probably macro-generate the debate ahead of time.
Perma-thread-37.
The perfect answer is (might be) to repeat the pattern that found ?this
in the first place. Obvious efficiency issues if done naively. But
otherwise, there is no way to connect the results of one SPARQL query to
another query within the standards only.
[Now - may I do a 50% rules, 50% procedural language that wires together
multiple SPARQL queries and updates, please?]
ARQ's solution to this is <_:...> URIs. They name the bnode and the
parser replaces them with the real blank node.
In fact, RDF 1.1 says:
[[
3.4 Blank Nodes
Blank nodes are disjoint from IRIs and literals. Otherwise, the set of
possible blank nodes is arbitrary. RDF makes no reference to any
internal structure of blank nodes.
]]
so you could say, for the RDF abstract syntax, there is a 1-1 labelling
of all bnodes in use (i.e. finite - none of this axiom-of-choice stuff)
by UUID and just be done with it. Given the UUID, you can find the
blank node.
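A minimal Python sketch of that 1-1 labelling idea follows. It is an illustration only, not how Jena or ARQ store blank nodes: `BlankNode` and `BlankNodeLabels` are hypothetical names, and the sketch keys the table on Python object identity.

```python
import uuid


class BlankNode:
    """A node with no internal structure; only its identity matters."""


class BlankNodeLabels:
    """Keeps a 1-1 mapping between blank nodes in use and UUID labels."""

    def __init__(self):
        self._label_by_node = {}
        self._node_by_label = {}

    def label_for(self, node):
        """Return the node's label, allocating a fresh UUID on first use."""
        label = self._label_by_node.get(node)
        if label is None:
            label = str(uuid.uuid4())
            self._label_by_node[node] = label
            self._node_by_label[label] = node
        return label

    def node_for(self, label):
        """Given the UUID, find the blank node."""
        return self._node_by_label[label]


labels = BlankNodeLabels()
b1, b2 = BlankNode(), BlankNode()
ref = labels.label_for(b1)

print(labels.node_for(ref) is b1)    # the label leads back to the same node
print(labels.label_for(b2) != ref)   # distinct nodes get distinct labels
```

The label is a reference into a finite table, so it can be carried from one query to the next inside a single system without saying anything about what the blank node means.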
Some people mix RDF abstract syntax with meaning of blank nodes
(entailment) but they are different. abstract syntax == data structure.
As a data structure, blank nodes are just nodes in a graph. So invent a
reference for them (not a URI, not a literal). Every RDF system does
anyway even if it is implicitly there like a java object reference (not
Jena - blank nodes are the same by .equals, not ==; usual java stuff here).
Andy
>> Differences in these viewpoints can occur in nested patterns -
>> sub-queries (you can have different variables with the same name - a
>> textual substitution viewpoint breaks that) and OPTIONALs inside
>> OPTIONALs (bottom up execution is not the same as top down execution).
>>
>> This has existed in ARQ for a very long time. ARQ actually takes the
>> initial binding and seeds the execution from there so it's like (2)
>> but not exactly; it does respect non-projected variables inside nested
>> SELECTs; it does not completely respect certain cases of
>> OPTIONAL-inside-OPTIONAL.
[[
Actually - it isn't even as simple as that as the optimizer is aware of
these tricky OPTIONAL-OPTIONAL cases and may do the right thing.
Cases of nested optionals, with a variable mentioned only in the
innermost and outermost patterns but not in intermediate ones, are
rare even for generated queries from compositions, in my experience.
]]
>>
>> Andy
>>
>
Re: Definition of SPARQL variable pre-binding
Posted by Holger Knublauch <ho...@knublauch.com>.
Thanks, Andy.
On 6/16/15 6:03 PM, Andy Seaborne wrote:
> On 16/06/15 04:20, Holger Knublauch wrote:
>> Hi,
>>
>> (this question is motivated by the ongoing Data Shapes WG, but I don't
>> speak on their behalf).
>
> Ptr?
http://w3c.github.io/data-shapes/shacl/
esp http://w3c.github.io/data-shapes/shacl/#sparql-constraints-prebound
http://www.w3.org/2014/data-shapes/track/issues/68
>
>>
>> Jena and other APIs such as Sesame support the concept of pre-binding
>> variables prior to SPARQL execution, using
>> QueryExecution.setInitialBinding(). This is convenient to reuse
>> parameterized queries, especially with blank nodes.
>>
>> Question: is there any formal basis of this functionality, formulated so
>> that it can be implemented by other platforms too? I can see that it
>> populates the original bindings that are passed through the algebra
>> objects, but what would be the best way to explain this by means of
>> concepts from the SPARQL 1.1 spec?
>>
>> Thanks
>> Holger
>>
>
> There are two possible explanations - they are not quite the same.
>
> 1/ It's a substitution of a value for a variable before execution. This is
> very like parameterized queries. It's a pre-execution step.
Do you mean syntactic insertion like the ParameterizedQuery class? This
would not support bnodes, and the shapes and focus nodes of a SHACL
constraint will frequently be bnodes. It should also avoid repeated
query parsing, for performance reasons it would be better to operate on
Query objects and their general equivalents (Algebra objects).
>
>
> 2/ VALUES
>
> There is a binding as a one row VALUES table and it's join'ed into the
> query as usual.
I guess inserting a VALUES clause into the beginning would work, but
then again what about bnodes? I guess instead of the VALUES keyword (as
a string), it would need to rely on the equivalent algebra object?
Just to be clear, this only needs to work in local datasets, not
necessarily with SPARQL endpoints where all we have is a http string
interface. I am looking for a couple of sentences that would provide a
generic implementation strategy that most SPARQL engines either already
have, or could easily add to support SHACL.
Thanks
Holger
>
> Differences in these viewpoints can occur in nested patterns -
> sub-queries (you can have different variables with the same name - a
> textual substitution viewpoint breaks that) and OPTIONALs inside
> OPTIONALs (bottom up execution is not the same as top down execution).
>
> This has existed in ARQ for a very long time. ARQ actually takes the
> initial binding and seeds the execution from there so it's like (2)
> but not exactly; it does respect non-projected variables inside nested
> SELECTs; it does not completely respect certain cases of
> OPTIONAL-inside-OPTIONAL.
>
> Andy
>
Re: Definition of SPARQL variable pre-binding
Posted by Andy Seaborne <an...@apache.org>.
On 16/06/15 04:20, Holger Knublauch wrote:
> Hi,
>
> (this question is motivated by the ongoing Data Shapes WG, but I don't
> speak on their behalf).
Ptr?
>
> Jena and other APIs such as Sesame support the concept of pre-binding
> variables prior to SPARQL execution, using
> QueryExecution.setInitialBinding(). This is convenient to reuse
> parameterized queries, especially with blank nodes.
>
> Question: is there any formal basis of this functionality, formulated so
> that it can be implemented by other platforms too? I can see that it
> populates the original bindings that are passed through the algebra
> objects, but what would be the best way to explain this by means of
> concepts from the SPARQL 1.1 spec?
>
> Thanks
> Holger
>
There are two possible explanations - they are not quite the same.
1/ It's a substitution of a value for a variable before execution. This is
very like parameterized queries. It's a pre-execution step.
2/ VALUES
There is a binding as a one row VALUES table and it's join'ed into the
query as usual.
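The two explanations can be compared on a toy example. The Python sketch below is not ARQ code and covers only a single flat triple pattern, where both viewpoints give the same answer; `match`, `by_substitution` and `by_values_join` are hypothetical helper names, and variables are written as strings starting with "?".

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match(pattern, graph):
    """Evaluate one triple pattern against a list of triples."""
    solutions = []
    for triple in graph:
        row = {}
        for p, value in zip(pattern, triple):
            if is_var(p):
                # A repeated variable must match the same value each time.
                if row.setdefault(p, value) != value:
                    break
            elif p != value:
                break
        else:
            solutions.append(row)
    return solutions


def by_substitution(pattern, graph, binding):
    """Viewpoint 1: replace the variable before execution."""
    rewritten = tuple(binding.get(t, t) for t in pattern)
    # Add the pre-bound values back so both results have the same columns.
    return [{**binding, **row} for row in match(rewritten, graph)]


def compatible(a, b):
    return all(a[k] == b[k] for k in a.keys() & b.keys())


def by_values_join(pattern, graph, binding):
    """Viewpoint 2: join a one-row table with the pattern's solutions."""
    table = [binding]
    return [{**left, **right}
            for left in table
            for right in match(pattern, graph)
            if compatible(left, right)]


graph = [
    ("ex:alice", "ex:bornIn", "ex:Germany"),
    ("ex:bob", "ex:bornIn", "ex:France"),
]
pattern = ("?person", "ex:bornIn", "?country")
binding = {"?country": "ex:Germany"}

print(by_substitution(pattern, graph, binding))
print(by_values_join(pattern, graph, binding))
```

The sketch deliberately leaves out sub-queries and OPTIONAL, which is where the two viewpoints can diverge.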
Differences in these viewpoints can occur in nested patterns -
sub-queries (you can have different variables with the same name - a
textual substitution viewpoint breaks that) and OPTIONALs inside
OPTIONALs (bottom up execution is not the same as top down execution).
This has existed in ARQ for a very long time. ARQ actually takes the
initial binding and seeds the execution from there so it's like (2) but
not exactly; it does respect non-projected variables inside nested
SELECTs; it does not completely respect certain cases of
OPTIONAL-inside-OPTIONAL.
Andy