Posted to users@jena.apache.org by Holger Knublauch <ho...@knublauch.com> on 2012/08/10 03:12:30 UTC

Change in execution order between Jena 2.7.2 and 2.7.3

Andy,

we are evaluating the move to 2.7.3 and have been immediately hit by 
what looks like a change of SPARQL semantics in ARQ. See the attached 
Java test which returns "Test" in 272 but null in 273. The query is 
really simple:

     SELECT *
     WHERE {
         {
             BIND ("Test" AS ?label) .
         } .
         BIND (?label AS ?result) .
     }

but ?label is no longer visible in the outer BIND. The same happens if 
you replace the inner BIND with a BGP that binds ?label, but I wanted to 
make the example model independent.
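
A rough way to picture the two behaviours is a toy model of solution sets, sketched here in Python rather than taken from ARQ. It assumes the algebra reading given later in this thread: under 2.7.3 the outer BIND extends an empty pattern and is then joined with the inner block, while under 2.7.2 it extends the solutions of the inner block. The helper names `UNIT`, `join` and `extend` are made up for illustration.

```python
# Toy model of SPARQL solution sets: a solution is a dict of variable -> value.
# This is an illustrative sketch, not ARQ's implementation.

UNIT = [{}]  # result of the empty pattern {}: one solution with no bindings


def compatible(a, b):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())


def join(left, right):
    """Join: merge every compatible pair of solutions."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]


def extend(solutions, var, expr):
    """BIND (expr AS var): add var when expr yields a value, else leave it unbound."""
    out = []
    for s in solutions:
        value = expr(s)
        out.append({**s, var: value} if value is not None else dict(s))
    return out


inner = extend(UNIT, "label", lambda s: "Test")  # { BIND ("Test" AS ?label) }
copy_label = lambda s: s.get("label")            # the expression ?label

# 2.7.3 reading: the outer BIND extends an empty pattern, then the two are joined.
v273 = join(inner, extend(UNIT, "result", copy_label))

# 2.7.2 reading: the outer BIND extends the solutions of the inner block.
v272 = extend(join(UNIT, inner), "result", copy_label)

print(v273)  # [{'label': 'Test'}]
print(v272)  # [{'label': 'Test', 'result': 'Test'}]
```

In the first reading ?label is simply not in scope when the outer expression is evaluated, so ?result stays unbound even though the join still carries ?label through.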

So my obvious question: is this the intended behavior, why the change etc?

Thanks,
Holger

Re: Change in execution order between Jena 2.7.2 and 2.7.3

Posted by Elli Schwarz <el...@yahoo.com>.

Andy,


Great, thanks! Using 0.2.5-SNAPSHOT is working for me in my queries, with the original BIND interpretation.


-Elli



________________________________
 From: Andy Seaborne <an...@apache.org>
To: users@jena.apache.org 
Sent: Wednesday, September 19, 2012 2:37 PM
Subject: Re: Change in execution order between Jena 2.7.2 and 2.7.3
 
On 19/09/12 18:55, Elli Schwarz wrote:
> I've been following this email thread with great interest, as well as the emails to the SPARQL Working Group comments regarding BIND semantics. I use BINDs heavily in my queries, and when I upgraded to Jena 2.7.3 I noticed that many of my queries no longer worked properly, so I reverted back to using Jena 2.7.2 (and Fuseki 2.3.0) until the issue was sorted out.
>
>
> I see that the SPARQL working group has now clarified how BIND should work, and that it reverted the changes made in Last Call 3, so my understanding is BIND will work similar to the way it worked in Jena 2.7.2. Is there a timeline for when a new version of Jena/Fuseki will be released that will contain the fix for BIND? There are several bug fixes I needed that were fixed in Jena 2.7.3, but since I had to revert I don't have these fixes anymore.
>

You can use the development SNAPSHOT build to ensure that what you 
expect is there.  Testing development builds helps prepare for releases.

https://repository.apache.org/content/repositories/snapshots/org/apache/jena/

    Andy


Re: Change in execution order between Jena 2.7.2 and 2.7.3

Posted by Andy Seaborne <an...@apache.org>.

On 19/09/12 18:55, Elli Schwarz wrote:
> I've been following this email thread with great interest, as well as the emails to the SPARQL Working Group comments regarding BIND semantics. I use BINDs heavily in my queries, and when I upgraded to Jena 2.7.3 I noticed that many of my queries no longer worked properly, so I reverted back to using Jena 2.7.2 (and Fuseki 2.3.0) until the issue was sorted out.
>
>
> I see that the SPARQL working group has now clarified how BIND should work, and that it reverted the changes made in Last Call 3, so my understanding is BIND will work similar to the way it worked in Jena 2.7.2. Is there a timeline for when a new version of Jena/Fuseki will be released that will contain the fix for BIND? There are several bug fixes I needed that were fixed in Jena 2.7.3, but since I had to revert I don't have these fixes anymore.
>

You can use the development SNAPSHOT build to ensure that what you 
expect is there.  Testing development builds helps prepare for releases.

https://repository.apache.org/content/repositories/snapshots/org/apache/jena/

	Andy


Re: Change in execution order between Jena 2.7.2 and 2.7.3

Posted by Elli Schwarz <el...@yahoo.com>.

I've been following this email thread with great interest, as well as the emails to the SPARQL Working Group comments regarding BIND semantics. I use BINDs heavily in my queries, and when I upgraded to Jena 2.7.3 I noticed that many of my queries no longer worked properly, so I reverted back to using Jena 2.7.2 (and Fuseki 2.3.0) until the issue was sorted out.

I see that the SPARQL working group has now clarified how BIND should work, and that it reverted the changes made in Last Call 3, so my understanding is BIND will work similar to the way it worked in Jena 2.7.2. Is there a timeline for when a new version of Jena/Fuseki will be released that will contain the fix for BIND? There are several bug fixes I needed that were fixed in Jena 2.7.3, but since I had to revert I don't have these fixes anymore.

Thank you,

Elli


Re: Change in execution order between Jena 2.7.2 and 2.7.3

Posted by Andy Seaborne <an...@apache.org>.

On 15/08/12 08:20, Holger Knublauch wrote:
...
> This would be good, but would be limited to cases in which the Jena
> API itself remains stable. Usually there are always some API changes
> that won't even make our stuff compile without changes. This plus the
>  overhead of setting up the infrastructure has prevented us from
> doing continuous testing.

The API changes very rarely, with deprecation cycles usually over
several versions.

In the past, TQ have called into internal code - surely, knowing in
advance when that changes is important to maintaining your software.
Even then, such events are rare.

It's TopQuadrant's choice - test early or test late and risk exposure to
unintentional changes, or changes made for the better, that other people
may now depend on.

     Andy

Re: Change in execution order between Jena 2.7.2 and 2.7.3

Posted by Holger Knublauch <ho...@knublauch.com>.

On 8/13/2012 5:20, Andy Seaborne wrote:
> On 12/08/12 02:46, Holger Knublauch wrote:
> ...
>> but we and our customers have an unknown number of queries in
>> production
> ...
>
> TQ gets Jena for free.

Yes and this is greatly appreciated. You guys are doing an amazing job. 
We have built quite an empire on top of that. You will understand why I 
am especially sensitive to surprising changes to the foundation of our 
software stack. Please accept my apologies if I sounded too frustrated.

> What would help is if TQ tested against the nightly snapshot builds 
> especially just before a release.  The project makes available 
> development builds at all times so we can deal with such issues early, 
> before a release.

This would be good, but would be limited to cases in which the Jena API 
itself remains stable. Usually there are always some API changes that 
won't even make our stuff compile without changes. This plus the 
overhead of setting up the infrastructure has prevented us from doing 
continuous testing.

>
>> I believe TQ will need to raise this issue with the SPARQL 1.1 WG
>> again, although it seems we are very late in the process.
>
> You are, of course, welcome to.
>
> Referring to specification text would strengthen your case. Referring 
> to implementation bugs is, IMO, not a strong case.  They happen, 
> that's life.
>
> Using the sub-query form will remove duplicating BIND statements. 
> Sub-queries allow applying BIND after FILTER.

As you have seen I have written to the WG. From a user's perspective, I 
believe

{
     [Anything]
     BIND (... AS ?x)
}

should be equivalent to

{
     SELECT (... AS ?x)
     WHERE {
         [Anything]
     }
}

But let's see what comes out of the WG mailing list discussion.

Thanks
Holger

>
>> And yes, optimizing the FILTER placement would be great and would
>> remove some of the pain and allow query authors to improve query
>> performance.
>
> I've raised JENA-293 to track this optimization. Please submit a patch.
>
> https://issues.apache.org/jira/browse/JENA-293
>
>>
>> Thanks, Holger
>
>     Andy

Re: Change in execution order between Jena 2.7.2 and 2.7.3

Posted by Andy Seaborne <an...@apache.org>.

On 12/08/12 02:46, Holger Knublauch wrote:
...
> but we and our customers have an unknown number of queries in
> production
...

TQ gets Jena for free.  What would help is if TQ tested against the 
nightly snapshot builds especially just before a release.  The project 
makes available development builds at all times so we can deal with such 
issues early, before a release.

> I believe TQ will need to raise this issue with the SPARQL 1.1 WG
> again, although it seems we are very late in the process.

You are, of course, welcome to.

Referring to specification text would strengthen your case.  Referring 
to implementation bugs is, IMO, not a strong case.  They happen, that's 
life.

Using the sub-query form will remove duplicating BIND statements. 
Sub-queries allow applying BIND after FILTER.

> And yes, optimizing the FILTER placement would be great and would
> remove some of the pain and allow query authors to improve query
> performance.

I've raised JENA-293 to track this optimization. Please submit a patch.

https://issues.apache.org/jira/browse/JENA-293

>
> Thanks, Holger

	Andy

Re: Change in execution order between Jena 2.7.2 and 2.7.3

Posted by Holger Knublauch <ho...@knublauch.com>.

Hi Andy,

oh my, this is really a bigger issue than I thought. The following query 
pattern also no longer works

SELECT *
WHERE {
     GRAPH <http://spinrdf.org/spin> {
         ?x rdfs:label ?label .
     }
     BIND (?label AS ?result) .
}

The above is again an artificial test case that makes no sense, but we 
and our customers have an unknown number of queries in production that 
use values from other { ... } blocks in BIND steps, often multiple BINDs 
where intermediate values are sliced and diced by succeeding BINDs. Here 
is a typical example (?graph is pre-bound from the outside):

SELECT ?result
WHERE {
     {
         BIND (xsd:string(?graph) AS ?str) .
         FILTER fn:starts-with(?str, "urn:x-evn-master:") .
     } .
     BIND (fn:substring(?str, 18) AS ?a) .
     BIND (spif:lastIndexOf(?a, ":") AS ?last) .
     BIND (IF(bound(?last), fn:substring(?a, 1, ?last), ?a) AS ?result) .
}

In another example from our queries

SELECT ?node ?label ?leaf ?icon ?movable
WHERE {
     {
         ?node skos:broader ?parent .
         BIND ((!swa:isReadOnlyTriple(?node, skos:broader, ?parent)) AS ?movable) .
     }
     UNION
     {
         ?parent skos:hasTopConcept ?node .
         BIND ((!swa:isReadOnlyTriple(?parent, skos:hasTopConcept, ?node)) AS ?movable) .
     } .
     BIND (NOT EXISTS  {
         ?child skos:broader ?node .
     } AS ?leaf) .
     BIND (ui:label(?node) AS ?label) .
     BIND ("evn-icon-concept" AS ?icon) .
}
ORDER BY (?label)

I have to move the computation of ?label into each branch of the UNION, 
and move the computation of ?leaf into the SELECT projection. The latter 
isn't a big problem except for readability, but the double appearance of 
?label is really bad. The new query is

SELECT ?node ?label ((NOT EXISTS  {
     ?child skos:broader ?node .
}) AS ?leaf) ?icon ?movable
WHERE {
     {
         ?node skos:broader ?parent .
         BIND ((!swa:isReadOnlyTriple(?node, skos:broader, ?parent)) AS ?movable) .
         BIND (ui:label(?node) AS ?label) .
     }
     UNION
     {
         ?parent skos:hasTopConcept ?node .
         BIND ((!swa:isReadOnlyTriple(?parent, skos:hasTopConcept, ?node)) AS ?movable) .
         BIND (ui:label(?node) AS ?label) .
     } .
     BIND ("evn-icon-concept" AS ?icon) .
}
ORDER BY (?label)

Others are harder to refactor. For example I have to reformulate this query

SELECT ?actionName ?onSelect ?enabled ?group ?label ?iconClass
WHERE {
     GRAPH ui:unionGraph {
         {
             ?action a swa:ResourceAction .
             ?action rdfs:label ?label .
             ?action arg:condition ?condition .
             BIND (ui:encodeNode(?action) AS ?actionName) .
             BIND (spl:object(?action, arg:onSelect) AS ?onSelectRaw) .
             BIND (COALESCE(?onSelectRaw, IF(swa:hasOtherArgument(?action), CONCAT("swa.openHandlerDialog(\"", ui:escapeJSON(?label), "\", \"<", xsd:string(?action), ">\", \"", ui:escapeJSON(xsd:string(?resource)), "\")"), ?none)) AS ?onSelect) .
             BIND (COALESCE(spl:object(?action, arg:group), "") AS ?group) .
             BIND (spl:object(?action, arg:iconClass) AS ?iconClass) .
             FILTER (((!bound(?appName)) || (?appName = "")) || swa:actionHasAppName(?action, ?appName)) .
         } .
         BIND (spin:eval(?condition, arg:resource, ?resource) AS ?enabled) .
         FILTER bound(?enabled) .
     } .
}
ORDER BY (?group) (?label)

because the FILTER depends on the previous BIND, but the BIND can't use 
the values from the upper block. I really don't want the spin:eval to be 
called if the FILTER above it is false - it's an expensive operation. I 
guess it has to become

SELECT ?actionName ?onSelect ?enabled ?group ?label ?iconClass
WHERE {
     GRAPH ui:unionGraph {
         ?action a swa:ResourceAction .
         ?action rdfs:label ?label .
         ?action arg:condition ?condition .
         BIND (ui:encodeNode(?action) AS ?actionName) .
         BIND (spl:object(?action, arg:onSelect) AS ?onSelectRaw) .
         BIND (COALESCE(?onSelectRaw, IF(swa:hasOtherArgument(?action), CONCAT("swa.openHandlerDialog(\"", ui:escapeJSON(?label), "\", \"<", xsd:string(?action), ">\", \"", ui:escapeJSON(xsd:string(?resource)), "\")"), ?none)) AS ?onSelect) .
         BIND (COALESCE(spl:object(?action, arg:group), "") AS ?group) .
         BIND (spl:object(?action, arg:iconClass) AS ?iconClass) .
         BIND ((((!bound(?appName)) || (?appName = "")) || swa:actionHasAppName(?action, ?appName)) AS ?app) .
         BIND (IF(?app, spin:eval(?condition, arg:resource, ?resource), ?none) AS ?enabled) .
         FILTER bound(?enabled) .
     } .
}
ORDER BY (?group) (?label)

i.e. the trick is to replace the upper FILTER with the intermediate 
helper variable ?app, and use this to prevent the spin:eval call with an 
IF. This trick obviously doesn't scale if there is a chain of other BINDs.
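
The guard trick can be pictured with a short Python sketch. It assumes that IF only evaluates the branch it selects; `spin_eval` here is a counting stand-in, not the real spin:eval, and the rows are invented for illustration.

```python
# Toy illustration of guarding an expensive call with a helper flag.
# All names and data are hypothetical stand-ins for the SPARQL above.

eval_calls = 0


def spin_eval(condition):
    """Counting stand-in for the expensive spin:eval call."""
    global eval_calls
    eval_calls += 1
    return condition == "ok"


def bind_enabled(row):
    """BIND (IF(?app, spin:eval(...), ?none) AS ?enabled)"""
    # A conditional expression evaluates only the branch it selects,
    # so spin_eval is skipped whenever the guard ?app is false.
    enabled = spin_eval(row["condition"]) if row["app"] else None
    return {**row, "enabled": enabled}


rows = [
    {"action": "a1", "app": True, "condition": "ok"},
    {"action": "a2", "app": False, "condition": "ok"},
    {"action": "a3", "app": True, "condition": "no"},
]

# FILTER bound(?enabled): keep only rows where the BIND produced a value.
result = [r for r in map(bind_enabled, rows) if r["enabled"] is not None]

print([r["action"] for r in result])  # ['a1', 'a3']
print(eval_calls)                     # 2
```

Each further BIND that depends on the filtered rows would need its own guard of this kind, which is why the approach gets unwieldy with longer chains.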


While I don't understand all the technical details, I believe BIND has 
become unnecessarily limited and unintuitive with this spec. If your 
previous implementation (that you had for many years including LET) was 
indeed just a bug then it was a very useful bug. What ever happened to 
the nice mantra that SPARQL is executed from the inside out, if it 
becomes impossible to use the produced values in BIND statements? It 
seems that the baby has been thrown out with the bath water here.

I believe TQ will need to raise this issue with the SPARQL 1.1 WG again, 
although it seems we are very late in the process.

BTW in the future it would be helpful to see such changes listed in the 
release notes.

And yes, optimizing the FILTER placement would be great and would remove 
some of the pain and allow query authors to improve query performance.

Thanks,
Holger



Re: Change in execution order between Jena 2.7.2 and 2.7.3

Posted by Andy Seaborne <an...@apache.org>.

On 11/08/12 00:50, Holger Knublauch wrote:
> On 8/10/2012 19:40, Andy Seaborne wrote:
>> On 10/08/12 02:12, Holger Knublauch wrote:
>>> Andy,
>>>
>>> we are evaluating the move to 2.7.3 and have been immediately hit by
>>> what looks like a change of SPARQL semantics in ARQ. See the attached
>>> Java test which returns "Test" in 272 but null in 273. The query is
>>> really simple:
>>>
>>>      SELECT *
>>>      WHERE {
>>>          {
>>>              BIND ("Test" AS ?label) .
>>>          } .
>>>          BIND (?label AS ?result) .
>>>      }
>>>
>>> but ?label is no longer visible in the outer BIND. The same happens if
>>> you replace the inner BIND with a BGP that binds ?label, but I wanted to
>>> make the example model independent.
>>>
>>> So my obvious question: is this the intended behavior, why the change
>>> etc?
>>
>> 2.7.3 is right - 2.7.2 is wrong (plain old bug, fixed due to having
>> to clarify scoping in the SPARQL spec so I went back and checked ARQ).
>>
>> >          {
>> >              BIND ("Test" AS ?label) .
>> >          } .
>> >          BIND (?label AS ?result) .
>>
>> That's a join of the inner, first BIND and the outer BIND.
>>
>> The Outer BIND applies to the immediately preceding BGP.  BIND binds
>> quite tightly (if you'll forgive the pun).
>>
>> The preceding BGP is actually empty - it's between the "}" and
>> "BIND (?label AS ?result) ."
>>
>> Think of it as :
>>
>>     {
>>         { BIND ("Test" AS ?label) . }
>>         {} BIND (?label AS ?result) .
>>     }
>>
>> technically, that's structurally different but it stresses the empty
>> part before second BIND.
>>
>> The important factor is the scope of ?label.
>>
>> The query joins "BIND ("Test" AS ?label)" and
>> "BIND (?label AS ?result)".  So it evals "BIND (?label AS ?result)"
>> not in the context of the "BIND ("Test" AS ?label)" i.e. the use of
>> ?label in "BIND (?label AS ?result)" is unbound.
>
> Thanks Andy. I cannot claim that I understand this yet. Nor do I believe
> many of our users will. Where does the "hidden {}" come from?
>
> The pattern that I don't see how to solve with the new design is as
> follows:

It's not a new design ... it's what the spec has said all along although 
it was a bit of a mess.  The descriptive section was clear; the formal 
section was open to "multiple interpretations" at best, including none 
:-(  Any spec changes are to make it clear.  Also, ARQ was just plain 
wrong and had a bug regardless of the spec.

>      {
>          ?x ex:prop ?value .
>          FILTER (?value some condition) .
>      }
>      BIND (my:function(?value) AS ?result) .
>
> I only want my:function to execute if the FILTER is passed. Therefore I
> cannot simply write
>
>      ?x ex:prop ?value .
>      FILTER (?value some condition) .
>      BIND (my:function(?value) AS ?result) .
>
> because 2.7.2 moves the FILTER to the end and makes it effectively
>
>      ?x ex:prop ?value .
>      BIND (my:function(?value) AS ?result) .
>      FILTER (?value some condition) .
>
> I had introduced the inner { ... } block to ensure that the FILTER is
> grouped together with the previous line. The mantra "SPARQL executes
> from the inside out" was just easy enough to explain, but now inner
> blocks seem to have become useless.
>
> How would I have to rewrite the first query to make sure that the BIND
> is only executed after the FILTER, but with ?value bound?

So this will do exactly what you want - use the SELECT expression form:

SELECT ?value (my:function(?value) AS ?result)
{
    ?x ex:prop ?value .
    FILTER (?value some condition) .
}

It is regrettable that * isn't allowed in this position.

Then it really is like (pseudo-syntax, not legal SPARQL):

BIND (my:function(?value) AS ?result WHERE
       {
          ?x ex:prop ?value .
          FILTER (?value some condition) .
       })

The other way to approach it is to write:

   {
     ?x ex:prop ?value .
     BIND (my:function(?value) AS ?result) .
     FILTER (?value some condition) .
   }

Any function should really cope with anything passed to it - it can 
return an error (an exception) and ?result is then not bound.

The optimizer can push the filter through the (extend) - the algebra 
operator for BIND - so the execution is more efficient.

BGP -> extend -> filter

becomes

BGP -> filter -> extend

It can do this because the extend variable ?result is not used in the 
filter.  The code (TransformFilterPlacement) does not currently do this.
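
To make the reordering concrete, here is a toy Python model of the two 
pipelines. It is only a sketch, not ARQ code: extend, filter_ and 
my_function are made-up stand-ins, the two BGP matches are invented data, 
and "?value > 0" stands in for "?value some condition".

```python
calls = []  # records every argument my_function is invoked with

def my_function(value):
    """Hypothetical stand-in for my:function; rejects negative input."""
    calls.append(value)
    if value < 0:
        raise ValueError("negative input")
    return value * 2

def extend(solutions, var, expr):
    """Toy 'extend' (BIND): add var to each row; on error leave it unbound."""
    out = []
    for mu in solutions:
        nu = dict(mu)
        try:
            nu[var] = expr(mu)
        except (KeyError, ValueError):
            pass  # expression error: the row is kept, var stays unbound
        out.append(nu)
    return out

def filter_(solutions, cond):
    """Toy 'filter': keep only the rows for which cond holds."""
    return [mu for mu in solutions if cond(mu)]

bgp = [{"x": "a", "value": 5}, {"x": "b", "value": -1}]  # invented BGP matches
bind = lambda mu: my_function(mu["value"])
cond = lambda mu: mu["value"] > 0  # stands in for "?value some condition"

# BGP -> extend -> filter
calls.clear()
extend_then_filter = filter_(extend(bgp, "result", bind), cond)
calls_unoptimized = list(calls)

# BGP -> filter -> extend
calls.clear()
filter_then_extend = extend(filter_(bgp, cond), "result", bind)
calls_optimized = list(calls)
```

Both orders give the same rows, because the filter never looks at 
?result; the only difference is that the second order calls the function 
on fewer rows. In the first order the function fails on the row the 
filter would have removed, and that row is dropped afterwards anyway.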

I'd file a JIRA for it but JIRA@ASF is undergoing maintenance at the 
moment.  They are having to move it to a bigger machine due to too much 
load.

	Andy

>
> Thanks
> Holger
>

Re: Change in execution order between Jena 2.7.2 and 2.7.3

Posted by Holger Knublauch <ho...@knublauch.com>.

On 8/10/2012 19:40, Andy Seaborne wrote:
> On 10/08/12 02:12, Holger Knublauch wrote:
>> Andy,
>>
>> we are evaluating the move to 2.7.3 and have been immediately hit by
>> what looks like a change of SPARQL semantics in ARQ. See the attached
>> Java test which returns "Test" in 272 but null in 273. The query is
>> really simple:
>>
>>      SELECT *
>>      WHERE {
>>          {
>>              BIND ("Test" AS ?label) .
>>          } .
>>          BIND (?label AS ?result) .
>>      }
>>
>> but ?label is no longer visible in the outer BIND. The same happens if
>> you replace the inner BIND with a BGP that binds ?label, but I wanted to
>> make the example model independent.
>>
>> So my obvious question: is this the intended behavior, why the change 
>> etc?
>
> 2.7.3 is right - 2.7.2 is wrong (plain old bug, fixed due to having 
> to clarify scoping in the SPARQL spec so I went back and checked ARQ).
>
> >          {
> >              BIND ("Test" AS ?label) .
> >          } .
> >          BIND (?label AS ?result) .
>
> That's a join of the inner, first BIND and the outer BIND.
>
> The outer BIND applies to the immediately preceding BGP.  BIND binds 
> quite tightly (if you'll forgive the pun).
>
> The preceding BGP is actually empty - it's between the "}" and
> "BIND (?label AS ?result) ."
>
> Think of it as:
>
>     {
>         { BIND ("Test" AS ?label) . }
>         {} BIND (?label AS ?result) .
>     }
>
> Technically, that's structurally different but it stresses the empty 
> part before the second BIND.
>
> The important factor is the scope of ?label.
>
> The query joins "BIND ("Test" AS ?label)" and
> "BIND (?label AS ?result)".  So it evals "BIND (?label AS ?result)" 
> not in the context of the "BIND ("Test" AS ?label)" i.e. the use of 
> ?label in "BIND (?label AS ?result)" is unbound.

Thanks Andy. I cannot claim that I understand this yet. Nor do I believe 
many of our users will. Where does the "hidden {}" come from?

The pattern that I don't see how to solve with the new design is as follows:

     {
         ?x ex:prop ?value .
         FILTER (?value some condition) .
     }
     BIND (my:function(?value) AS ?result) .

I only want my:function to execute if the FILTER is passed. Therefore I 
cannot simply write

     ?x ex:prop ?value .
     FILTER (?value some condition) .
     BIND (my:function(?value) AS ?result) .

because 2.7.2 moves the FILTER to the end and makes it effectively

     ?x ex:prop ?value .
     BIND (my:function(?value) AS ?result) .
     FILTER (?value some condition) .

I had introduced the inner { ... } block to ensure that the FILTER is 
grouped together with the previous line. The mantra "SPARQL executes 
from the inside out" was just easy enough to explain, but now inner 
blocks seem to have become useless.

How would I have to rewrite the first query to make sure that the BIND 
is only executed after the FILTER, but with ?value bound?

Thanks
Holger

Re: Change in execution order between Jena 2.7.2 and 2.7.3

Posted by Andy Seaborne <an...@apache.org>.

On 10/08/12 02:12, Holger Knublauch wrote:
> Andy,
>
> we are evaluating the move to 2.7.3 and have been immediately hit by
> what looks like a change of SPARQL semantics in ARQ. See the attached
> Java test which returns "Test" in 272 but null in 273. The query is
> really simple:
>
>      SELECT *
>      WHERE {
>          {
>              BIND ("Test" AS ?label) .
>          } .
>          BIND (?label AS ?result) .
>      }
>
> but ?label is no longer visible in the outer BIND. The same happens if
> you replace the inner BIND with a BGP that binds ?label, but I wanted to
> make the example model independent.
>
> So my obvious question: is this the intended behavior, why the change etc?

2.7.3 is right - 2.7.2 is wrong (plain old bug, fixed due to having to 
clarify scoping in the SPARQL spec so I went back and checked ARQ).

 >          {
 >              BIND ("Test" AS ?label) .
 >          } .
 >          BIND (?label AS ?result) .

That's a join of the inner, first BIND and the outer BIND.

The outer BIND applies to the immediately preceding BGP.  BIND binds 
quite tightly (if you'll forgive the pun).

The preceding BGP is actually empty - it's between the "}" and
"BIND (?label AS ?result) ."

Think of it as:

     {
         { BIND ("Test" AS ?label) . }
         {} BIND (?label AS ?result) .
     }

Technically, that's structurally different but it stresses the empty 
part before the second BIND.

The important factor is the scope of ?label.

The query joins "BIND ("Test" AS ?label)" and
"BIND (?label AS ?result)".  So it evals "BIND (?label AS ?result)" not 
in the context of the "BIND ("Test" AS ?label)"  i.e. the use of ?label 
in "BIND (?label AS ?result)" is unbound.
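
As a rough illustration of the join reading described here, the toy 
Python model below evaluates each BIND over its own empty pattern and 
then joins the two results. It is a sketch only, not ARQ code: extend, 
join, compatible and UNIT are names invented for this example, and the 
"sequential" line models the 2.7.2 behaviour for comparison.

```python
def extend(solutions, var, expr):
    """Toy 'extend' (BIND): add var to each row; on error leave it unbound."""
    out = []
    for mu in solutions:
        nu = dict(mu)
        try:
            nu[var] = expr(mu)
        except KeyError:
            pass  # the expression used an unbound variable: var stays unbound
        out.append(nu)
    return out

def compatible(a, b):
    """Two rows are compatible if they agree on every shared variable."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())

def join(left, right):
    """Toy 'join': merge every compatible pair of rows."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]

UNIT = [{}]  # the empty pattern: exactly one row with no bindings

# { BIND ("Test" AS ?label) }
inner = extend(UNIT, "label", lambda mu: "Test")

# {} BIND (?label AS ?result) - evaluated over the empty pattern,
# so ?label is not in scope and ?result is left unbound
outer = extend(UNIT, "result", lambda mu: mu["label"])

joined = join(inner, outer)

# 2.7.2-style evaluation: the outer BIND sees the inner block's rows
sequential = extend(inner, "result", lambda mu: mu["label"])
```

Here joined holds a single row that binds only ?label, while sequential 
also binds ?result to "Test", which matches the difference Holger 
reported between the two releases.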

	Andy

>
> Thanks,
> Holger
>