Posted to dev@jena.apache.org by Andy Seaborne <an...@apache.org> on 2015/06/28 12:08:04 UTC

Query parameterization.

(info / discussion / ...)

In working on JENA-963 (OpAsQuery; reworked handling of SPARQL modifiers 
for GROUP BY), it was easier/better to add the code I had for rewriting 
syntax by transformation, much like the algebra is rewritten by the 
optimizer.  The use case is rewriting the output of OpAsQuery to remove 
unnecessary nesting of levels of "{}" which arise during translation for 
the safety of the translation.

Hence putting in package oaj.sparql.syntax.syntaxtransform, a general 
framework for rewriting syntax, like we have for the SPARQL+ algebra.

It is also capable of being a parameterized query system (PQ).  We 
already have ParameterizedSparqlString (PSS), so how do they compare?

Work-in-progress:

https://github.com/afs/jena-workspace/blob/master/src/main/java/syntaxtransform/ParameterizedQuery.java

PQ is a rewrite of a Query object (the template) with a map of variables 
to constants. That is, it works on the syntax tree after parsing and 
produces a syntax tree.

PSS is a builder with substitution. It builds a string, carefully 
(injection attacks) and is neutral as to what it is working with - query 
or update or something weird.
http://jena.apache.org/documentation/query/parameterized-sparql-strings.html

Summary:

PQ is only for replacement of a variable in a template.
PSS is a builder that can do that as part of building.

PQ covers cases PSS doesn't - neither is perfect.

PSS works with INSERT DATA.
PQ would use the INSERT { ... } WHERE {} form.

Details:

PSS:
   Can build query, update strings and fragments
   Supports JDBC style positional parameters (a '?')
     These must be bound to get a valid query.
     Can generate illegal syntax.
   Tests the type of the injected value (string, iri, double etc).
   Has corner cases
      Looks for ?x as a string so ...
        "This is not a ?x as a variable"
        <http://example/foo?x=123>
        "SELECT ?x"
        ns:local\?x (a legal local part)
   Protects against injection by checking.
   Works on INSERT DATA.
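
To make the corner cases above concrete, here is a minimal sketch of naive text substitution. It is not how ParameterizedSparqlString is implemented; the class NaiveSubstitution and its method are hypothetical and exist only to show why matching ?x as a string is fragile.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: NOT the ParameterizedSparqlString implementation.
class NaiveSubstitution {
    // Replace every textual occurrence of ?name that is not followed by another
    // name character. The match has no idea whether it sits inside a string
    // literal or an IRI.
    static String replaceVar(String text, String name, String value) {
        Pattern p = Pattern.compile("\\?" + Pattern.quote(name) + "(?![A-Za-z0-9_])");
        return p.matcher(text).replaceAll(Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        String value = "<http://example/v>";
        // Intended use: a real variable in a triple pattern.
        System.out.println(replaceVar("SELECT * { ?s ?p ?x }", "x", value));
        // Unintended: the query-string part of an IRI is rewritten.
        System.out.println(replaceVar("<http://example/foo?x=123>", "x", value));
        // Unintended: text inside a string literal is rewritten.
        System.out.println(replaceVar("\"This is not a ?x as a variable\"", "x", value));
    }
}
```

A string-based substituter therefore needs extra checks to tell real variables apart from look-alikes, which is where the corner cases come from.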

PQ:
   Replaces SPARQL variables where identified as variables.
     (no extra-syntax positional '?')
   Legal query to legal syntax query.
     The query may violate scope rules (example below).
     Not a query builder.
   Post parser, so no reparsing to use the query
     (for large updates and queries)
   Injection is meaningless - can only inject values, not syntax.
   Can rewrite structurally: "SELECT ?x" => "SELECT  (:value AS ?x)"
     which is useful to record the injection variables.
   Works with "INSERT {?s ?p ?o } WHERE { }"

PQ example:

   Query template = QueryFactory.create(.. valid query ..) ;
   Map<String, RDFNode> map = new HashMap<>() ;
   map.put("y", ResourceFactory.createPlainLiteral("Bristol")) ;
   Query query = ParameterizedQuery.setVariables(template, map) ;
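
The idea of substituting on the parsed tree rather than on text can be sketched without Jena. The types below (Term, var, constant) are hypothetical stand-ins, not Jena classes, and the pattern is reduced to a flat list of terms. The point is only that constants are never inspected, so a "?y" inside an IRI or literal cannot be mistaken for a variable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of a parsed pattern; these are NOT Jena classes.
class ToySyntaxSubstitution {
    // A term is either a variable or a constant whose text is opaque.
    static final class Term {
        final boolean isVar;
        final String text;   // variable name without '?', or a serialized constant
        Term(boolean isVar, String text) { this.isVar = isVar; this.text = text; }
        @Override public String toString() { return isVar ? "?" + text : text; }
    }

    static Term var(String name) { return new Term(true, name); }
    static Term constant(String text) { return new Term(false, text); }

    // Replace variables that appear as keys in the map. Constants are copied
    // through untouched, whatever characters they contain.
    static List<Term> substitute(List<Term> pattern, Map<String, Term> values) {
        List<Term> out = new ArrayList<>();
        for (Term t : pattern) {
            Term replacement = t.isVar ? values.get(t.text) : null;
            out.add(replacement != null ? replacement : t);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Term> triple = new ArrayList<>();
        triple.add(var("s"));
        triple.add(constant("<http://example/foo?y=123>"));
        triple.add(var("y"));

        Map<String, Term> values = new HashMap<>();
        values.put("y", constant("\"Bristol\""));

        // Only the variable ?y is replaced; the IRI containing "?y" is kept.
        System.out.println(substitute(triple, values));
    }
}
```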


A perfect system probably needs a "template language": SPARQL 
extended with a new "template variable" which is only allowed in certain 
places in the query and must be bound before use.

Some examples of hard templates:

(1) Not variables:
<http://example/foo?x=123>
"This is not a ?x as a variable"
ns:local\?x

(2) Some places ?x can not be replaced with a value directly.
    SELECT ?x { ?s ?p ?x }



A possible output is:
   SELECT  (:X AS ?x) { ?s ?p :X }
which is nice as it records the substitution, but it fails when nested again.

SELECT ?x { {SELECT ?x { ?s ?p ?x } } ?s ?p ?o }

This is a bad query:
SELECT (:X AS ?x) { {SELECT (:X AS ?x) { ...

(3) Other places:
SELECT ?x { BIND(1 AS ?x) }
SELECT ?x { VALUES ?x { 123 } }

	Andy

Re: Query parameterization.

Posted by Andy Seaborne <an...@apache.org>.
For completeness:

A 3rd option is injecting a VALUES clause at the start of the query pattern:

SELECT  ?x
WHERE
   { ?s  :p  ?x
     FILTER ( ?x > 57 )
   }

and ?x = :X

====>

PREFIX  :     <http://example/>

SELECT  ?x
WHERE
   { VALUES ?x { :X }
     ?s  :p  ?x
     FILTER ( ?x > 57 )
   }


Like PQ, it's template+params based. A feature is that it can insert 
multiple possibilities.
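
As a rough sketch of the "multiple possibilities" point, the VALUES clause for one variable could be rendered like this. The class and method names are hypothetical, and the terms are assumed to be already-serialized, trusted SPARQL terms; a real implementation would add the clause on the syntax tree or validate each term first.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
class ValuesClauseSketch {
    // Render "VALUES ?var { t1 t2 ... }" from already-serialized terms.
    // Assumption: each term is a trusted, well-formed SPARQL term.
    static String valuesClause(String varName, List<String> terms) {
        StringBuilder sb = new StringBuilder("VALUES ?").append(varName).append(" {");
        for (String term : terms) {
            sb.append(' ').append(term);
        }
        return sb.append(" }").toString();
    }

    public static void main(String[] args) {
        // One candidate value, as in the example above.
        System.out.println(valuesClause("x", Arrays.asList(":X")));
        // Several candidate values for the same variable.
        System.out.println(valuesClause("x", Arrays.asList(":X", ":Y")));
    }
}
```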

	Andy



Re: Query parameterization.

Posted by Andy Seaborne <an...@apache.org>.
On 01/07/15 05:27, Holger Knublauch wrote:
> Hi Andy,
>
> this looks great, and is just in time for the ongoing discussions in the
> SHACL group. I apologize in advance for not having the bandwidth yet to
> try this out from your branch, but this topic will definitely bubble up
> in the priorities soon...
>
> I have not fully understood how the semantics of this are different from
> the setInitialBinding feature that we currently use in SPIN, and which
> seems to do a pretty good job. However, having a facility to do further
> pre-processing in advance may improve performance and provide a more
> formal definition of what setInitialBinding is doing. I am personally
> not enthusiastic about approaches based on text-substitution, so working
> on the parsed syntax tree looks good to me. There are some (rare) cases
> where text-substitution would be more powerful, e.g. dynamic path
> properties

If you can insert compound syntax, then injection attacks need to be 
considered.

> and some solution modifiers, but as you say no approach is
> perfect.

Better done on the algebra?  Especially around the SELECT clause, as it 
is several modifiers in a tangle.

(See recent OpAsQuery discussion and changes)

>
> Questions:
>
> - would this also pre-bind variables inside of nested SELECTs?

Yes (it's a choice - it could avoid doing so, given some analysis of the 
inner projection as it passes through).

> - I assume this can handle blank nodes (e.g. rdf:Lists) as bindings?

Probably! (it's tricky and needs more testing)
...
Yes - the replacement with bnodes-are-variables in SPARQL is done during 
parsing and this is post parse (different to all string based approaches).

If the substituted query is turned into a string, the value will become a 
bnode in SPARQL syntax, which then reparses as a variable.  The printing 
code (specifically NodeToLabelMapBNode.asString) handles it and would 
need a tweak.

The <_:label> form would be better but needs implementing.

> - What about bound(?var) and ?var is pre-bound?

?var in bound(?var) is replaced (as ?var in all expressions).  This is 
syntax.

	Andy

>
> Thanks
> Holger
>
>


Re: Query parameterization.

Posted by Andy Seaborne <an...@apache.org>.
On 03/07/15 09:35, Andy Seaborne wrote:
> On 01/07/15 07:17, Claude Warren wrote:
>> SelectBuilder sb = new SelectBuilder()
>>      .addVar( "*" )
>>      .addWhere( "?s", "?p", "?o" );
>> sb.setVar( Var.alloc( "?o" ), NodeFactory.createURI(
>> "http://xmlns.com/foaf/0.1/Person"  ) ) ;
>> Query q = sb.build();
>
> Hi Claude,
>
> Should that be one of
>    Var.alloc( "o" )
>    Var.alloc(Var.canonical("?o"))
>
> How does it compare to the corner cases in my first message?
>
>
> There is at least one injection attack:
>
> NodeFactory.createURI of
>
> "http://xmlns.com/foaf/0.1/Person> . ?s ?q <http://example/ns"
>
> Because it is string inclusion, jena-querybuilder needs to do the same
> checks that ParameterizedSparqlString does for URIs.  A check is also
> needed on literals, but it is a different kind of test.
>
> BTW:
>
> and how do I add
>
> OPTIONAL {
>     ?s <q> 123 .
>     ?s <v> ?x .
>     FILTER(?x>56)
> }
> ?
>
> And for UNION, there seems to be a confusion because it takes a
> SelectBuilder (a subquery) but that's an SQL-ism, not SPARQL.
>
> It seems to cause problems:
>
>          SelectBuilder sb = new SelectBuilder().addVar("*") ;
>          sb.addWhere("?s", "?p", "?o") ;
>          SelectBuilder sb1 = new SelectBuilder().addVar("*") ;
>          sb1.addWhere("?s", "?p", "?o") ;
>          sb1.addUnion(sb1) ;
>          Query q1 = sb1.build() ;
>          String s1 = q1.toString() ;
>          System.out.println(s1) ;
>
> I get stack overflow.

Silly mistake on my part.

         SelectBuilder sb = new SelectBuilder().addVar("*") ;
         sb.addWhere("?s", "?p", "?o") ;
         SelectBuilder sb1 = new SelectBuilder().addVar("*") ;
         sb1.addWhere("?s1", "?p1", "?o1") ;
         sb.addUnion(sb1) ;
         Query q1 = sb.build() ;
         String s1 = q1.toString() ;
         System.out.println(s1) ;


>
> UNION and OPTIONAL are similar - they take graph patterns.
>
But I now get this illegal query:

SELECT  *
WHERE
   { ?s  ?p  ?o
     UNION
       { SELECT  ?s ?p ?o
         WHERE
           { ?s  ?p  ?o }
       }
   }

which should be:

SELECT  *
WHERE
   { { ?s  ?p  ?o }
     UNION
       { SELECT  ?s ?p ?o
         WHERE
           { ?s  ?p  ?o }
       }
   }

Each side of the UNION is an ElementGroup.

>      Andy
>


Re: Query parameterization.

Posted by Andy Seaborne <an...@apache.org>.
On 01/07/15 07:17, Claude Warren wrote:
> SelectBuilder sb = new SelectBuilder()
>      .addVar( "*" )
>      .addWhere( "?s", "?p", "?o" );
> sb.setVar( Var.alloc( "?o" ), NodeFactory.createURI(
> "http://xmlns.com/foaf/0.1/Person"  ) ) ;
> Query q = sb.build();

Hi Claude,

Should that be one of
   Var.alloc( "o" )
   Var.alloc(Var.canonical("?o"))

How does it compare to the corner cases in my first message?


There is at least one injection attack:

NodeFactory.createURI of

"http://xmlns.com/foaf/0.1/Person> . ?s ?q <http://example/ns"

Because it is string inclusion, jena-querybuilder needs to do the same 
checks that ParameterizedSparqlString does for URIs.  A check is also 
needed on literals, but it is a different kind of test.
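
The kind of check meant here can be sketched as follows. This is an illustration under assumptions, not the check ParameterizedSparqlString actually performs: it rejects the characters that the SPARQL 1.1 grammar excludes from an IRIREF (angle brackets, double quote, braces, vertical bar, caret, backquote, backslash, and anything at or below U+0020). The class name is hypothetical.

```java
// Hypothetical illustration; not the actual Jena check.
class IriInjectionCheck {
    // Characters the SPARQL 1.1 IRIREF production does not allow between '<' and '>'.
    private static final String FORBIDDEN = "<>\"{}|^`\\";

    // Returns true only if the string could be written inside <...> unchanged.
    static boolean isSafeIri(String iri) {
        for (int i = 0; i < iri.length(); i++) {
            char c = iri.charAt(i);
            if (c <= 0x20 || FORBIDDEN.indexOf(c) >= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // An ordinary IRI passes.
        System.out.println(isSafeIri("http://xmlns.com/foaf/0.1/Person"));
        // The injection string from this message is rejected: it contains
        // '>', spaces and '<'.
        System.out.println(isSafeIri("http://xmlns.com/foaf/0.1/Person> . ?s ?q <http://example/ns"));
    }
}
```

Rejecting the value outright is one design choice; escaping is not an option for these characters, since an IRIREF has no escape for them other than \u sequences.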

BTW:

and how do I add

OPTIONAL {
    ?s <q> 123 .
    ?s <v> ?x .
    FILTER(?x>56)
}
?

And for UNION, there seems to be a confusion because it takes a 
SelectBuilder (a subquery) but that's an SQL-ism, not SPARQL.

It seems to cause problems:

         SelectBuilder sb = new SelectBuilder().addVar("*") ;
         sb.addWhere("?s", "?p", "?o") ;
         SelectBuilder sb1 = new SelectBuilder().addVar("*") ;
         sb1.addWhere("?s", "?p", "?o") ;
         sb1.addUnion(sb1) ;
         Query q1 = sb1.build() ;
         String s1 = q1.toString() ;
         System.out.println(s1) ;

I get stack overflow.

UNION and OPTIONAL are similar - they take graph patterns.

	Andy


Re: Query parameterization.

Posted by Claude Warren <cl...@xenei.com>.
The QueryBuilder also has parameterized variables, of a sort.

Basically you can construct the query with variables and then replace the
variable with a value by calling setVar()  just before calling build.

SelectBuilder sb = new SelectBuilder()
    .addVar( "*" )
    .addWhere( "?s", "?p", "?o" );
sb.setVar( Var.alloc( "?o" ), NodeFactory.createURI(
"http://xmlns.com/foaf/0.1/Person" ) ) ;
Query q = sb.build();

produces

SELECT * WHERE
  { ?s ?p <http://xmlns.com/foaf/0.1/Person> }






Re: Query parameterization.

Posted by Holger Knublauch <ho...@knublauch.com>.
Hi Andy,

this looks great, and is just in time for the ongoing discussions in the 
SHACL group. I apologize in advance for not having the bandwidth yet to 
try this out from your branch, but this topic will definitely bubble up 
in the priorities soon...

I have not fully understood how the semantics of this are different from 
the setInitialBinding feature that we currently use in SPIN, and which 
seems to do a pretty good job. However, having a facility to do further 
pre-processing in advance may improve performance and provide a more 
formal definition of what setInitialBinding is doing. I am personally 
not enthusiastic about approaches based on text-substitution, so working 
on the parsed syntax tree looks good to me. There are some (rare) cases 
where text-substitution would be more powerful, e.g. dynamic path 
properties and some solution modifiers, but as you say no approach is 
perfect.

Questions:

- would this also pre-bind variables inside of nested SELECTs?
- I assume this can handle blank nodes (e.g. rdf:Lists) as bindings?
- What about bound(?var) and ?var is pre-bound?

Thanks
Holger

