Posted to dev@jena.apache.org by "Rob Vesse (JIRA)" <ji...@apache.org> on 2015/07/07 16:08:04 UTC

[jira] [Comment Edited] (JENA-780) Single use extend expressions could be substituted directly for their later usage

    [ https://issues.apache.org/jira/browse/JENA-780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616748#comment-14616748 ] 

Rob Vesse edited comment on JENA-780 at 7/7/15 2:07 PM:
--------------------------------------------------------

Regression for JENA-779 is now addressed.  This optimisation is currently disabled by default and must be explicitly enabled like so:

{noformat}
ARQ.getContext().set(ARQ.optInlineAssignments, true);
// Optionally make it aggressive
ARQ.getContext().set(ARQ.optInlineAssignmentsAggressive, true);
{noformat}

Or with the command line tools:

{noformat}
--set arq:optInlineAssignments=true
{noformat}

There are some potential corner cases I have not yet explored which need more consideration.

For example, what happens if the single use is via a {{HAVING}}, which of course becomes a {{filter}} in the algebra?  Is moving the expression from within the aggregation to outside of it likely to cause any issues?  I don't know that it would, since the optimisation only inlines assignments which are currently in scope, and anything that an assignment references would either be valid in the same scope or be unbound, in which case the value of the expression is not going to change depending on where it is used.

The other possible corner case I've thought of is around assignments that occur in a specific branch of an operator (e.g. {{union}}, {{join}}, {{leftjoin}}, {{minus}} etc.), since the assignment could yield a different value when moved.  For example:

{noformat}
SELECT ?s
WHERE
{
  { ?s ?p ?o }
  UNION
  { BIND(true AS ?x) }
  FILTER(?x)
}
{noformat}

Inlining would appear to change the semantics of this query if we rewrite it as we do currently.  As written, the query should return only a single empty row, because the solutions from the LHS of the {{UNION}} leave {{?x}} unbound and so can never satisfy the {{FILTER}}.  If inlined, however, the {{FILTER}} becomes {{FILTER(true)}}, which keeps all solutions.  This implies that we need further checking to validate whether an assignment can be safely moved.
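
To make the row counts concrete, here is a minimal sketch of the {{UNION}} example in plain Java.  It is a toy model and does not use Jena: a solution is just a map from variable name to value, and the class and method names ({{UnionInlineDemo}}, {{asWritten}}, {{inlined}}) are made up for illustration.  It assumes the usual SPARQL rule that a {{FILTER}} over an unbound variable raises an error and so rejects the solution.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Toy model only (not Jena): a solution is a map from variable name to value.
final class UnionInlineDemo {

    // Solutions of: { ?s ?p ?o } UNION { BIND(true AS ?x) }
    static List<Map<String, Object>> unionSolutions(List<String> subjects) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (String s : subjects) {
            Map<String, Object> row = new HashMap<>();
            row.put("s", s); // left branch binds ?s (?p and ?o omitted); ?x stays unbound
            rows.add(row);
        }
        Map<String, Object> bindRow = new HashMap<>();
        bindRow.put("x", Boolean.TRUE); // right branch binds only ?x
        rows.add(bindRow);
        return rows;
    }

    // Keep only the solutions accepted by the predicate.
    static List<Map<String, Object>> filter(List<Map<String, Object>> rows,
                                            Predicate<Map<String, Object>> keep) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            if (keep.test(row)) {
                out.add(row);
            }
        }
        return out;
    }

    // FILTER(?x): an unbound ?x is an error, which FILTER treats as "reject".
    static List<Map<String, Object>> asWritten(List<String> subjects) {
        return filter(unionSolutions(subjects), row -> Boolean.TRUE.equals(row.get("x")));
    }

    // FILTER(true): what the filter becomes if the BIND expression is inlined.
    static List<Map<String, Object>> inlined(List<String> subjects) {
        return filter(unionSolutions(subjects), row -> true);
    }

    public static void main(String[] args) {
        List<String> subjects = Arrays.asList("a", "b", "c");
        System.out.println("as written: " + asWritten(subjects).size()); // 1
        System.out.println("inlined:    " + inlined(subjects).size());   // 4
    }
}
```

With three matching triples on the left-hand side, the query as written keeps one solution while the inlined form keeps four, which is the difference described above.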


> Single use extend expressions could be substituted directly for their later usage
> ---------------------------------------------------------------------------------
>
>                 Key: JENA-780
>                 URL: https://issues.apache.org/jira/browse/JENA-780
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ, Optimizer
>    Affects Versions: Jena 2.12.0
>            Reporter: Rob Vesse
>            Assignee: Rob Vesse
>            Priority: Minor
>         Attachments: JENA-780.patch
>
>
> This RFE is a follow-on from JENA-779.  The query with a sub-optimal plan there uses a {{BIND}} to create a value which is then only used once in a subsequent filter.
> Actually that query uses it twice, but I think the general approach I am trying to describe in this RFE bears consideration.  In this case it seems like it would be possible to substitute the extend expression for the bound variable in the filter expression.
> Simplified variant of original query such that the bound value is only used once:
> {noformat}
> SELECT DISTINCT ?uri
> {
>   { ?uri ?p ?o }
>   UNION
>   {
>     ?sub ?p ?uri
>     FILTER(isIRI(?uri))
>   }
>   BIND(str(?uri) as ?s)
>   FILTER(STRSTARTS(?s, "http://"))
> }
> {noformat}
> Rewritten query:
> {noformat}
> SELECT DISTINCT ?uri
> {
>   { ?uri ?p ?o }
>   UNION
>   {
>     ?sub ?p ?uri
>     FILTER(isIRI(?uri))
>   }
>   FILTER(STRSTARTS(str(?uri), "http://"))
> }
> {noformat}
> Which avoids an extend expression whose value is only used once and will ultimately be thrown away.
> From a {{Transform}} standpoint this is likely awkward to implement as a pure transform, since it requires knowledge about the query structure above the {{FILTER}}, i.e. whether the bound variable is used elsewhere.  It would therefore need before and after visitors to track that additional state, but I think this is a feasible optimisation.
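
The substitution described in the issue can be sketched on a toy expression tree in plain Java.  This is not the ARQ {{Transform}} API or the attached patch: {{InlineAssignmentSketch}}, {{Expr}} and {{inlineIfSingleUse}} are hypothetical names, and the "used elsewhere" check is reduced to an assumed list of projected variables.  It also does no scope checking, so it would not catch an assignment that sits in only one branch of an operator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy sketch only (not ARQ): inline a single-use assignment into a filter expression.
final class InlineAssignmentSketch {

    // Minimal expression tree: a variable, a constant, or a function call.
    static final class Expr {
        final String label;    // variable name, constant text, or function name
        final boolean isVar;
        final List<Expr> args; // empty for variables and constants

        private Expr(String label, boolean isVar, List<Expr> args) {
            this.label = label;
            this.isVar = isVar;
            this.args = args;
        }

        static Expr var(String name) { return new Expr(name, true, new ArrayList<>()); }
        static Expr constant(String text) { return new Expr(text, false, new ArrayList<>()); }
        static Expr call(String fn, Expr... args) { return new Expr(fn, false, Arrays.asList(args)); }

        @Override
        public String toString() {
            if (isVar) return "?" + label;
            if (args.isEmpty()) return label;
            StringBuilder sb = new StringBuilder(label).append('(');
            for (int i = 0; i < args.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(args.get(i));
            }
            return sb.append(')').toString();
        }
    }

    // Count how many times the variable occurs in the expression.
    static int countUses(Expr e, String var) {
        if (e.isVar) return e.label.equals(var) ? 1 : 0;
        int n = 0;
        for (Expr a : e.args) n += countUses(a, var);
        return n;
    }

    // Return a copy of the expression with the variable replaced by its definition.
    static Expr substitute(Expr e, String var, Expr replacement) {
        if (e.isVar) return e.label.equals(var) ? replacement : e;
        if (e.args.isEmpty()) return e;
        Expr[] newArgs = new Expr[e.args.size()];
        for (int i = 0; i < newArgs.length; i++) {
            newArgs[i] = substitute(e.args.get(i), var, replacement);
        }
        return Expr.call(e.label, newArgs);
    }

    // Inline only when the variable is used exactly once in the filter and is not projected.
    static Expr inlineIfSingleUse(String var, Expr definition, Expr filter, List<String> projected) {
        boolean singleUse = countUses(filter, var) == 1 && !projected.contains(var);
        return singleUse ? substitute(filter, var, definition) : filter;
    }

    public static void main(String[] args) {
        Expr definition = Expr.call("str", Expr.var("uri"));                           // BIND(str(?uri) AS ?s)
        Expr filter = Expr.call("STRSTARTS", Expr.var("s"), Expr.constant("\"http://\""));
        Expr result = inlineIfSingleUse("s", definition, filter, Arrays.asList("uri"));
        System.out.println(result); // STRSTARTS(str(?uri), "http://")
    }
}
```

A real implementation would work on the algebra rather than on a standalone expression, which is where the before and after visitors mentioned in the issue come in.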


