You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jena.apache.org by Rob Battle <rb...@bbn.com> on 2011/01/26 20:36:26 UTC

TransformFilterPlacement optimization

Hi,

I have been looking at the com.hp.hpl.jena.sparql.algebra.Optimize class in order to add some specific optimizations for BBN's Parliament [1] triple store.  I noticed that the Filter Placement algorithm executed by the TransformFilterPlacement class does not take into account conditionals.

For instance, suppose you have the following query:
PREFIX : <http://example.org/>
SELECT ?s ?name
WHERE {
  ?s :age ?age .
   OPTIONAL {
      ?s :name ?name .
   }
   FILTER (?age < 20) .
}

Couldn't this be optimized to move the filter above the optional block?  Currently, the TransformFilterPlacement class examines a filter's sub operation and checks to see if it is either an OpBGP, OpSequence, or OpQuadPattern.

It seems like this could be extended to also look at the OpConditional.  For instance, the following method will move the filter into the left side of the conditional if it applies.

   private static Op transformFilterConditional(ExprList exprs, Set<Var> varScope, OpConditional opConditional) {
      
      // Any filters that depend on no variables. 
      Op op = insertAnyFilter(exprs, varScope, null) ;
      
      Op left = opConditional.getLeft();
      
      left = transform(exprs, varScope, left);
      
      Op right = opConditional.getRight();
      
      op = new OpConditional(left, right);
      
      op = insertAnyFilter(exprs, varScope, op);
      return op;
      
   }

For the above query, the original TransformFilterPlacement class generates the following algebra expression:

(project (?s ?name)
  (filter (< ?age 20)
    (conditional
      (bgp (triple ?s <http://example.org/age> ?age))
      (bgp (triple ?s <http://example.org/name> ?name)))))

while the modified code generates the following:

(project (?s ?name)
  (conditional
    (filter (< ?age 20)
      (bgp (triple ?s <http://example.org/age> ?age)))
    (bgp (triple ?s <http://example.org/name> ?name))))


Unless I'm mistaken, this will cause the filter to be applied before the optional is considered.  Is there something I'm missing where this optimization is invalid?

-rob

[1] http://parliament.semwebcentral.org

Re: TransformFilterPlacement optimization

Posted by Andy Seaborne <an...@epimorphics.com>.
Hi Rob,

 > Unless I'm mistaken, this will cause the filter to be applied before 
the optional is considered.  Is there something I'm missing where this 
optimization is invalid?

No, I don't think you missing anything - looks good to me. There are 
probably other placements that works as well.

I've added your suggestion to the codebase - it's in today's SNAPSHOT 
build with a couple of tests (TestFilterTransform). Have you got any 
tests for it?

	Andy

On 26/01/11 19:36, Rob Battle wrote:
> Hi,
>
> I have been looking at the com.hp.hpl.jena.sparql.algebra.Optimize class in order to add some specific optimizations for BBN's Parliament [1] triple store.  I noticed that the Filter Placement algorithm executed by the TransformFilterPlacement class does not take into account conditionals.
>
> For instance, suppose you have the following query:
> PREFIX :<http://example.org/>
> SELECT ?s ?name
> WHERE {
>    ?s :age ?age .
>     OPTIONAL {
>        ?s :name ?name .
>     }
>     FILTER (?age<  20) .
> }
>
> Couldn't this be optimized to move the filter above the optional block?  Currently, the TransformFilterPlacement class examines a filter's sub operation and checks to see if it is either an OpBGP, OpSequence, or OpQuadPattern.
>
> It seems like this could be extended to also look at the OpConditional.  For instance, the following method will move the filter into the left side of the conditional if it applies.
>
>     private static Op transformFilterConditional(ExprList exprs, Set<Var>  varScope, OpConditional opConditional) {
>
>        // Any filters that depend on no variables.
>        Op op = insertAnyFilter(exprs, varScope, null) ;
>
>        Op left = opConditional.getLeft();
>
>        left = transform(exprs, varScope, left);
>
>        Op right = opConditional.getRight();
>
>        op = new OpConditional(left, right);
>
>        op = insertAnyFilter(exprs, varScope, op);
>        return op;
>
>     }
>
> For the above query, the original TransformFilterPlacement class generates the following algebra expression:
>
> (project (?s ?name)
>    (filter (<  ?age 20)
>      (conditional
>        (bgp (triple ?s<http://example.org/age>  ?age))
>        (bgp (triple ?s<http://example.org/name>  ?name)))))
>
> while the modified code generates the following:
>
> (project (?s ?name)
>    (conditional
>      (filter (<  ?age 20)
>        (bgp (triple ?s<http://example.org/age>  ?age)))
>      (bgp (triple ?s<http://example.org/name>  ?name))))
>
>
> Unless I'm mistaken, this will cause the filter to be applied before the optional is considered.  Is there something I'm missing where this optimization is invalid?
>
> -rob
>
> [1] http://parliament.semwebcentral.org