Posted to dev@jena.apache.org by Rob Vesse <rv...@yarcdata.com> on 2012/08/29 18:51:09 UTC

Slashes in variable names in generated algebra?

Andy

Just got a question from one of our devs about the algebra ARQ is generating.  The original query is sq-08.rq from the subquery tests of the SPARQL 1.1 test suite; the following is the fragment of the algebra he was asking about:

(extend ((?max ?/.0))
          (group () ((?/.0 (max ?/y)))
            (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?/x <http://www.example.org/schema#p> ?/y)

Under what circumstances will ARQ insert slashes into variable names?

Thanks,

Rob

Re: Slashes in variable names in generated algebra?

Posted by Andy Seaborne <an...@apache.org>.
On 30/08/12 17:32, Rob Vesse wrote:
>
> On 8/30/12 2:45 AM, "Andy Seaborne" <an...@apache.org> wrote:
>
>> On 30/08/12 00:51, Rob Vesse wrote:
>>> Is there a particular class responsible for this renaming?
>>
>> The constants are in ARQConstants.
>>
>> The rename engine is in com.hp.hpl.jena.sparql.engine.Rename
>>
>>> Our developers would like to know exactly what forms Jena will use for
>>> variable names in the generated algebra as they've seen some other
>>> strange ones like ?*x as well as the ?/x
>>
>> IIRC ?*gNNN get allocated for some quad conversion.  Can't remember what
>> exactly (non-simplified property paths in GRAPHs?)
>>
>> Is there a specific reason?  (This is internal stuff that the systems
>> may change without notice.)
>
> A decision was made at some point to communicate queries internally
> within our system using the Jena algebra representation, so people
> complain every time they see something from Jena that they hadn't seen
> before or didn't expect (like the OpDisjunction I also asked about).

Ah - thank you for the background.  Makes sense.

So you're passing round algebra after optimization?

(disjunction) and ?/wacky only show up after the optimizer has run.  The 
algebra-from-query is as in the SPARQL spec.  ARQ introduces new 
operators to note certain features (e.g. (sequence) is a bunch of joins 
without scope issues).

	Andy

>
>>
>> If the execution strategy is not the constant-memory one the released
>> OpExecutor uses (and TDB[released]), then renaming isn't necessary as it
>> "just happens" as a result of bottom up evaluation.  Converting to
>> global variables has some convenience though, such as debugging and not
>> needing to do internal projection.
>
> I'm not sure if we could do away with renaming and as you said it does
> have some advantages.
>
> Rob
>
>>
>> 	Andy


Re: Slashes in variable names in generated algebra?

Posted by Rob Vesse <rv...@yarcdata.com>.
On 8/30/12 2:45 AM, "Andy Seaborne" <an...@apache.org> wrote:

>On 30/08/12 00:51, Rob Vesse wrote:
>> Is there a particular class responsible for this renaming?
>
>The constants are in ARQConstants.
>
>The rename engine is in com.hp.hpl.jena.sparql.engine.Rename
>
>> Our developers would like to know exactly what forms Jena will use for
>> variable names in the generated algebra as they've seen some other
>> strange ones like ?*x as well as the ?/x
>
>IIRC ?*gNNN get allocated for some quad conversion.  Can't remember what
>exactly (non-simplified property paths in GRAPHs?)
>
>Is there a specific reason?  (This is internal stuff that the systems
>may change without notice.)

A decision was made at some point to communicate queries internally
within our system using the Jena algebra representation, so people
complain every time they see something from Jena that they hadn't seen
before or didn't expect (like the OpDisjunction I also asked about).

>
>If the execution strategy is not the constant-memory one the released
>OpExecutor uses (and TDB[released]), then renaming isn't necessary as it
>"just happens" as a result of bottom up evaluation.  Converting to
>global variables has some convenience though, such as debugging and not
>needing to do internal projection.

I'm not sure if we could do away with renaming and as you said it does
have some advantages.

Rob

>
>	Andy
>
>>
>> Rob
>>
>> On 8/29/12 10:24 AM, "Andy Seaborne" <an...@apache.org> wrote:
>>
>>> On 29/08/12 17:51, Rob Vesse wrote:
>>>> Andy
>>>>
>>>> Just got a question from one of our devs about algebra ARQ is
>>>> generating.  The original query is sq-08.rq from the subquery tests of
>>>> the 1.1 test suite, the following is the fragment of the algebra he
>>>> was asking about:
>>>>
>>>> (extend ((?max ?/.0))
>>>>             (group () ((?/.0 (max ?/y)))
>>>>               (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?/x <http://www.example.org/schema#p> ?/y)
>>>>
>>>> Under what circumstances will ARQ insert slashes into variable names?
>>>
>>> When it needs to rename them to make them different from another use of
>>> the same name in a different scope.  It renames all variables that do
>>> not show in the projection of subqueries if it determines any renaming
>>> is needed.
>>>
>>> The GROUP BY isn't the trigger: this query is simpler and uses
>>> renaming:
>>>
>>> PREFIX : <http://example/>
>>> SELECT *
>>> {
>>>     { SELECT ?x { ?x :p ?y } }
>>>     ?y :q ?r
>>> }
>>>
>>> There are two separate uses of ?y in that query.
>>>
>>>     { SELECT ?x { ?x :p ?y } }
>>>
>>> does not expose ?y so the use of ?y there is independent of the
>>> " ?y :q ?r".
>>>
>>> If ARQ determines some renaming is necessary, it systematically renames
>>> everything hidden by project scoping - it renames to only leave
>>> variables that are exposed.  Hence the ?/.0 from the group aggregate.
>>>
>>> 	Andy
>>>
>>>>
>>>> Thanks,
>>>>
>>>> Rob
>>>>
>>>
>>
>


Re: Slashes in variable names in generated algebra?

Posted by Andy Seaborne <an...@apache.org>.
On 30/08/12 00:51, Rob Vesse wrote:
> Is there a particular class responsible for this renaming?

The constants are in ARQConstants.

The rename engine is in com.hp.hpl.jena.sparql.engine.Rename

> Our developers would like to know exactly what forms Jena will use for
> variable names in the generated algebra as they've seen some other strange
> ones like ?*x as well as the ?/x

IIRC ?*gNNN get allocated for some quad conversion.  Can't remember what 
exactly (non-simplified property paths in GRAPHs?)

Is there a specific reason?  (This is internal stuff that the systems 
may change without notice.)

If the execution strategy is not the constant-memory one the released 
OpExecutor uses (and TDB[released]), then renaming isn't necessary as it 
"just happens" as a result of bottom up evaluation.  Converting to 
global variables has some convenience though, such as debugging and not 
needing to do internal projection.
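
A toy illustration of the bottom-up point, using made-up data and a
hand-written join rather than anything from ARQ.  The helper names
(project, join) and the bindings are hypothetical.  The subquery is
evaluated first and projected to ?x, so its ?y is gone before the outer
"?y :q ?r" is joined in, and no renaming is needed:

```python
def project(solutions, keep):
    """Keep only the named variables in each solution."""
    return [{v: s[v] for v in keep if v in s} for s in solutions]


def join(left, right):
    """Join two solution lists; rows must agree on every shared variable."""
    out = []
    for l in left:
        for r in right:
            if all(l[v] == r[v] for v in l.keys() & r.keys()):
                out.append({**l, **r})
    return out


# Made-up matches for the inner pattern { ?x :p ?y }
inner = [{"x": "a", "y": "b"}]
# Made-up matches for the outer pattern ?y :q ?r
outer = [{"y": "c", "r": "d"}]

sub = project(inner, ["x"])       # SELECT ?x drops the inner ?y
result = join(sub, outer)         # the two ?y never meet
unscoped = join(inner, outer)     # without the projection they clash

print(result)    # [{'x': 'a', 'y': 'c', 'r': 'd'}]
print(unscoped)  # []
```

The second join shows what goes wrong if the inner ?y is left visible under
the same name: the two uses are forced to agree and the row is lost.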

	Andy

>
> Rob
>
> On 8/29/12 10:24 AM, "Andy Seaborne" <an...@apache.org> wrote:
>
>> On 29/08/12 17:51, Rob Vesse wrote:
>>> Andy
>>>
>>> Just got a question from one of our devs about algebra ARQ is
>>> generating.  The original query is sq-08.rq from the subquery tests of
>>> the 1.1 test suite, the following is the fragment of the algebra he was
>>> asking about:
>>>
>>> (extend ((?max ?/.0))
>>>             (group () ((?/.0 (max ?/y)))
>>>               (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?/x <http://www.example.org/schema#p> ?/y)
>>>
>>> Under what circumstances will ARQ insert slashes into variable names?
>>
>> When it needs to rename them to make them different from another use of
>> the same name in a different scope.  It renames all variables that do
>> not show in the projection of subqueries if it determines any renaming is
>> needed.
>>
>> The GROUP BY isn't the trigger: this query is simpler and uses renaming:
>>
>> PREFIX : <http://example/>
>> SELECT *
>> {
>>     { SELECT ?x { ?x :p ?y } }
>>     ?y :q ?r
>> }
>>
>> There are two separate uses of ?y in that query.
>>
>>     { SELECT ?x { ?x :p ?y } }
>>
>> does not expose ?y so the use of ?y there is independent of the
>> " ?y :q ?r".
>>
>> If ARQ determines some renaming is necessary, it systematically renames
>> everything hidden by project scoping - it renames to only leave
>> variables that are exposed.  Hence the ?/.0 from the group aggregate.
>>
>> 	Andy
>>
>>>
>>> Thanks,
>>>
>>> Rob
>>>
>>
>


Re: Slashes in variable names in generated algebra?

Posted by Rob Vesse <rv...@yarcdata.com>.
Is there a particular class responsible for this renaming?


Our developers would like to know exactly what forms Jena will use for
variable names in the generated algebra as they've seen some other strange
ones like ?*x as well as the ?/x

Rob

On 8/29/12 10:24 AM, "Andy Seaborne" <an...@apache.org> wrote:

>On 29/08/12 17:51, Rob Vesse wrote:
>> Andy
>>
>> Just got a question from one of our devs about algebra ARQ is
>> generating.  The original query is sq-08.rq from the subquery tests of
>> the 1.1 test suite, the following is the fragment of the algebra he was
>> asking about:
>>
>> (extend ((?max ?/.0))
>>            (group () ((?/.0 (max ?/y)))
>>              (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?/x <http://www.example.org/schema#p> ?/y)
>>
>> Under what circumstances will ARQ insert slashes into variable names?
>
>When it needs to rename them to make them different from another use of
>the same name in a different scope.  It renames all variables that do
>not show in the projection of subqueries if it determines any renaming is
>needed.
>
>The GROUP BY isn't the trigger: this query is simpler and uses renaming:
>
>PREFIX : <http://example/>
>SELECT *
>{
>    { SELECT ?x { ?x :p ?y } }
>    ?y :q ?r
>}
>
>There are two separate uses of ?y in that query.
>
>    { SELECT ?x { ?x :p ?y } }
>
>does not expose ?y so the use of ?y there is independent of the
>" ?y :q ?r".
>
>If ARQ determines some renaming is necessary, it systematically renames
>everything hidden by project scoping - it renames to only leave
>variables that are exposed.  Hence the ?/.0 from the group aggregate.
>
>	Andy
>
>>
>> Thanks,
>>
>> Rob
>>
>


Re: Slashes in variable names in generated algebra?

Posted by Andy Seaborne <an...@apache.org>.
On 29/08/12 17:51, Rob Vesse wrote:
> Andy
>
> Just got a question from one of our devs about algebra ARQ is generating.  The original query is sq-08.rq from the subquery tests of the 1.1 test suite, the following is the fragment of the algebra he was asking about:
>
> (extend ((?max ?/.0))
>            (group () ((?/.0 (max ?/y)))
>              (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?/x <http://www.example.org/schema#p> ?/y)
>
> Under what circumstances will ARQ insert slashes into variable names?

When it needs to rename them to make them different from another use of
the same name in a different scope.  It renames all variables that do
not show in the projection of subqueries if it determines any renaming is
needed.

The GROUP BY isn't the trigger: this query is simpler and uses renaming:

PREFIX : <http://example/>
SELECT *
{
    { SELECT ?x { ?x :p ?y } }
    ?y :q ?r
}

There are two separate uses of ?y in that query.

    { SELECT ?x { ?x :p ?y } }

does not expose ?y so the use of ?y there is independent of the
" ?y :q ?r".

If ARQ determines some renaming is necessary, it systematically renames
everything hidden by project scoping - it renames to only leave
variables that are exposed.  Hence the ?/.0 from the group aggregate.
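
To make the rule concrete, here is a toy sketch in Python.  It is not the
ARQ rename code: the function name and the data model (a projection set
plus a list of the variables used in the subquery pattern) are made up for
illustration.  It only shows the naming effect, where anything not in the
subquery's projection gets a "/" inserted after the "?":

```python
def hide_unprojected(projected, pattern_vars, marker="/"):
    """Map each variable used inside a subquery to its renamed form.

    Variables listed in the projection keep their names; every other
    variable gets the marker inserted after the leading '?'.
    """
    renamed = {}
    for var in pattern_vars:
        if var in projected:
            renamed[var] = var                     # visible outside: keep
        else:
            renamed[var] = "?" + marker + var[1:]  # hidden: ?y -> ?/y
    return renamed


# Inner subquery from the example above: SELECT ?x { ?x :p ?y }
mapping = hide_unprojected(projected={"?x"}, pattern_vars=["?x", "?y"])
print(mapping)  # {'?x': '?x', '?y': '?/y'}
```

With the inner ?y renamed to ?/y, the outer "?y :q ?r" can no longer be
confused with it, even when every variable is treated as global.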

	Andy

>
> Thanks,
>
> Rob
>