Posted to dev@jena.apache.org by Kashif Rabbani <ka...@cs.aau.dk> on 2020/03/03 11:56:07 UTC

Order of triple patterns in Where Clause

Hi awesome community,

I have a question. I am working on optimizing SPARQL query plans, and I wonder whether the order of triple patterns in the WHERE clause affects the query plan.

For example, given the following query:

PREFIX  bio:  <http://purl.org/vocab/bio/0.1/>
PREFIX  mo:   <http://purl.org/ontology/mo/>
PREFIX  mbz:  <http://dbtune.org/musicbrainz/resource/vocab/>
PREFIX  cmno: <http://purl.org/ontology/classicalmusicnav#>

SELECT  ?a ?b ?c
WHERE
  { ?a  mbz:alias           "Amy Beach" .
    ?b  cmno:hasInfluenced  ?a .
    ?c  mo:composer         ?b ;
        bio:date            ?d
  }



Let's generate its algebra. Op op = Algebra.compile(query); results in this:

(project (?a ?b ?c)
  (bgp
    (triple ?a <http://dbtune.org/musicbrainz/resource/vocab/alias> "Amy Beach")
    (triple ?b <http://purl.org/ontology/classicalmusicnav#hasInfluenced> ?a)
    (triple ?c <http://purl.org/ontology/mo/composer> ?b)
    (triple ?c <http://purl.org/vocab/bio/0.1/date> ?d)
  ))

The BGP in the algebra follows exactly the same order as specified in the WHERE clause of the query. To be precise: does Jena construct the query plan as it is, or will it change the order at some other level?

I would be happy if someone could guide me on how Jena's plan is actually constructed. If I use some statistics of the actual RDF graph to change the order of triple patterns in the BGP based on selectivity, would that optimize the plan somehow?

Many Thanks,


Best Regards,
Kashif Rabbani.

Re: Order of triple patterns in Where Clause

Posted by Marco Neumann <ma...@gmail.com>.

Yes, that's more like it, Andy. I will come back to you on this later in a
new thread.

On Wed 11. Mar 2020 at 08:53, Andy Seaborne <an...@apache.org> wrote:

>
>
> On 09/03/2020 13:10, Marco Neumann wrote:
> > Ok granted yes not functional programming correct, but I think we are
> > getting sidetracked here by the specific meaning of the terms "purely
> > functional" and "pure function" . I am referring here only to the feature
> > of counting reductions in haskell.
> >
> > So let's take a basic query plus filter in ARQ on the following data set:
> >
> >   :a :b :c
> >   :c :d :e
> >   :f :g :h
> >
> > with this query:
> >
> > (filter (! (sameTerm ?x ?c))
> >    (bgp
> >      (triple ?x ?y ?z)
> >      (triple ?c ?d ?e)
> >    ))
> >
> > how many total "evaluations/operations" are performed over the data set to
> > arrive at a result set of 6?
>
> One join, generating 9 bindings. Then filter 9 things.
>
> One access for 3 results, then 3 accesses of three results.
> (c.f. LFJ)
>    If it were a hash join, then 3+3+hash work.
>
> Substitution join (index join) works well when one side is smaller than the
> other. A not untypical example is inverse functional properties; one
> side is one match, which then leads to a single, more grounded access.
>
> ?s :id "12345"
>     :property1 ?v1
>
>      Andy
>
> >
> >
> > On Mon, Mar 9, 2020 at 11:50 AM Andy Seaborne <an...@apache.org> wrote:
> >
> >> I don't see how it applies to the ARQ evaluator.
> >>
> >> That's not how it works.
> >>
> >> Just because the algebra is functional, it's not functional programming
> >> and not reduction evaluation.  It has executable statements and external
> >> data.
> >>
> >>       Andy
> >>
> >> On 08/03/2020 17:02, Marco Neumann wrote:
> >>> Sorry, my bad, that was a typo; it should be "reductions". A very basic
> >>> concept in functional languages like Haskell, along with heap size measured in cells.
> >>>
> >>>
> >>> "Reduction is the process of converting an expression to a simpler form.
> >>> Conceptually, an expression is reduced by simplifying one reducible
> >>> expression (called “redex”) at a time."
> >>>
> >>> https://www.futurelearn.com/courses/functional-programming-haskell/0/steps/27197
> >>>
> >>>
> >>> On Sun, Mar 8, 2020 at 4:44 PM Andy Seaborne <an...@apache.org> wrote:
> >>>
> >>>> Then I don't understand what you are looking for.
> >>>>
> >>>> What's a "deduction"? What's a "cell"?
> >>>>
> >>>> On 08/03/2020 14:22, Marco Neumann wrote:
> >>>>> thank you for the hint Andy, but not quite what I was looking for.
> >>>>>
> >>>>> I was aiming more for a type of feature I am familiar with from purely
> >>>>> functional programming languages like Haskell, Hugs, Miranda etc. to display
> >>>>> deductions and cells used during execution.
> >>>>>
> >>>>> Marco
> >>>>>
> >>>>> On Sun, Mar 8, 2020 at 10:42 AM Andy Seaborne <an...@apache.org> wrote:
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 06/03/2020 17:40, Marco Neumann wrote:
> >>>>>>> is there statistical data available for the number of deductions /
> >>>>>>> joins performed for each SPARQL query of a QueryExecution object?
> >>>>>>
> >>>>>> If you run with "explain" you can find out but there isn't a specific
> >>>>>> record kept by the code.
> >>>>>>
> >>>>>>>
> >>>>>>> On Fri, Mar 6, 2020 at 3:16 PM Andy Seaborne <an...@apache.org> wrote:
> >>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 05/03/2020 08:32, Kashif Rabbani wrote:
> >>>>>>>>> Hi Andy,
> >>>>>>>>>
> >>>>>>>>> Thanks for your response. I was wondering if there is any detailed
> >>>>>>>>> documentation of the Jena optimization (rewriting & reordering) available
> >>>>>>>>> online? If yes, can you please send me the reference?
> >>>>>>>>
> >>>>>>>> The code mainly.
> >>>>>>>>
> >>>>>>>> The TDB stats-based optimizer is documented.
> >>>>>>>>
> >>>>>>>>> Also, if I create my own query plan (in algebraic form), is it possible
> >>>>>>>>> to make Jena execute it as it is? I mean, how do I turn off Jena's
> >>>>>>>>> optimization (rewriting & reordering) and force my query plan for
> >>>>>>>>> execution?
> >>>>>>>>
> >>>>>>>> Yes - two parts - algebra rewrites and BGP reordering.
> >>>>>>>>
> >>>>>>>> The context is a mapping of settings. There is:
> >>>>>>>> a global context (ARQ.getContext()),
> >>>>>>>> one per dataset (DatasetGraph.getContext()),
> >>>>>>>> one per query execution (QueryExecution.getContext()),
> >>>>>>>>
> >>>>>>>> and it is treated hierarchically:
> >>>>>>>>
> >>>>>>>> Lookup is in QueryExecution, then DatasetGraph, then Global.
> >>>>>>>>
> >>>>>>>> :: Algebra rewrite
> >>>>>>>>
> >>>>>>>> Some algebra rewrites have to be done - property functions, and rewriting
> >>>>>>>> some variables due to scoping. These aren't really "optimization steps"
> >>>>>>>> but happen in that phase. There is OptimizerMinimal for those.
> >>>>>>>>
> >>>>>>>> To turn off the optimizer and still do the minimum steps:
> >>>>>>>>
> >>>>>>>> context.set(ARQ.optimization, false)
> >>>>>>>>
> >>>>>>>> Alternatively, Algebra.exec(op, dsg) executes the algebra as given - that's a
> >>>>>>>> very low-level way of doing it.
> >>>>>>>>
> >>>>>>>> Turning the optimizer off is better because all the APIs work, e.g.
> >>>>>>>> QueryExecution.
> >>>>>>>>
> >>>>>>>> :: BGP reordering
> >>>>>>>>
> >>>>>>>> The reordering of triple patterns is separate.
> >>>>>>>> BGP steps are performed by a StageGenerator.
> >>>>>>>>
> >>>>>>>> To set up to use a custom StageGenerator:
> >>>>>>>>
> >>>>>>>> StageBuilder.setGenerator(ARQ.getContext(), stageGenerator) ;
> >>>>>>>>
> >>>>>>>> That's really only a call of
> >>>>>>>>         context.set(ARQ.stageGenerator, myStageGenerator) ;
> >>>>>>>>
> >>>>>>>> The default is StageGeneratorGeneric, which does ReorderFixed.
> >>>>>>>> It is used if there is no other setting in the context.
> >>>>>>>>
> >>>>>>>>          Andy
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks again for your help.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>>
> >>>>>>>>> Kashif Rabbani,
> >>>>>>>>> Research Assistant,
> >>>>>>>>> Department of Computer Science,
> >>>>>>>>> Aalborg University, Denmark.
> >>>>>>>>>
> >>>>>>>>>> On 3 Mar 2020, at 13.43, Andy Seaborne <an...@apache.org> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi Kashif,
> >>>>>>>>>>
> >>>>>>>>>> Optimization happens in two stages:
> >>>>>>>>>>
> >>>>>>>>>> 1. Rewrite of the algebra
> >>>>>>>>>> 2. Reordering of the BGPs
> >>>>>>>>>>
> >>>>>>>>>> BGPs can be implemented in different ways - and they are an inference
> >>>>>>>>>> extension point in SPARQL.
> >>>>>>>>>>
> >>>>>>>>>> What you see is the first. BGPs are reordered during execution.
> >>>>>>>>>>
> >>>>>>>>>> The algorithm can be stats driven for TDB and TDB2 storage:
> >>>>>>>>>>       https://jena.apache.org/documentation/tdb/optimizer.html
> >>>>>>>>>>
> >>>>>>>>>> The interface is
> >>>>>>>>>> org.apache.jena.sparql.engine.optimizer.reorder.ReorderTransformation
> >>>>>>>>>>
> >>>>>>>>>> and a general purpose reordering is done for in-memory and is the
> >>>>>>>>>> default for TDB.
> >>>>>>>>>>
> >>>>>>>>>> The default reorder is "grounded triples first, leave equal weights
> >>>>>>>>>> alone". It cascades whether a term is bound by an earlier step.
> >>>>>>>>>>
> >>>>>>>>>>>         { ?a  mbz:alias           "Amy Beach" .
> >>>>>>>>>>>           ?b  cmno:hasInfluenced  ?a .
> >>>>>>>>>>>           ?c  mo:composer         ?b ;
> >>>>>>>>>>>               bio:date            ?d
> >>>>>>>>>>>         }
> >>>>>>>>>>
> >>>>>>>>>> That's actually the default order -
> >>>>>>>>>>
> >>>>>>>>>> ?a  mbz:alias           "Amy Beach" .
> >>>>>>>>>>
> >>>>>>>>>> has two bound terms so is done first.
> >>>>>>>>>>
> >>>>>>>>>> and now ?a is bound so
> >>>>>>>>>> ?b  cmno:hasInfluenced  ?a .
> >>>>>>>>>>
> >>>>>>>>>> etc.
> >>>>>>>>>>
> >>>>>>>>>> Given the boundedness of the pattern, and (a guess) that mbz:alias "Amy
> >>>>>>>>>> Beach" is quite selective, with stats, ? <property> ? would have to be less
> >>>>>>>>>> numerous than ? mbz:alias "Amy Beach" for the order to change.
> >>>>>>>>>>
> >>>>>>>>>> There's no algebra optimization for your example, only BGP reordering.
> >>>>>>>>>>
> >>>>>>>>>> qparse --print=opt shows stage 1 optimizations.
> >>>>>>>>>>
> >>>>>>>>>> Executing with "explain" shows BGP execution.
> >>>>>>>>>>
> >>>>>>>>>>         Andy
-- 


---
Marco Neumann
KONA
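
The default reordering quoted above ("grounded triples first, leave equal weights alone", cascading as variables become bound) can be illustrated with a small self-contained sketch. This is not Jena's ReorderFixed code: the class ReorderSketch, the TriplePattern type and the "count the grounded positions" weight are assumptions made purely for illustration, and Jena's real weighting rules are more detailed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from Jena.
public class ReorderSketch {

    /** A triple pattern; a term starting with '?' is treated as a variable. */
    static final class TriplePattern {
        final String s, p, o;
        TriplePattern(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
        String[] terms() { return new String[] { s, p, o }; }
        @Override public String toString() { return s + " " + p + " " + o; }
    }

    static boolean isVar(String term) { return term.startsWith("?"); }

    /** Assumed weight: positions that are constants or already-bound variables. */
    static int groundedness(TriplePattern t, Set<String> bound) {
        int n = 0;
        for (String term : t.terms()) {
            if (!isVar(term) || bound.contains(term)) n++;
        }
        return n;
    }

    /** Greedy reorder: most grounded pattern first, ties keep the written order. */
    static List<TriplePattern> reorder(List<TriplePattern> bgp) {
        List<TriplePattern> remaining = new ArrayList<>(bgp);
        List<TriplePattern> ordered = new ArrayList<>();
        Set<String> bound = new HashSet<>();
        while (!remaining.isEmpty()) {
            int best = 0;
            for (int i = 1; i < remaining.size(); i++) {
                // Strict '>' leaves equal weights alone (earlier pattern wins).
                if (groundedness(remaining.get(i), bound)
                        > groundedness(remaining.get(best), bound)) {
                    best = i;
                }
            }
            TriplePattern chosen = remaining.remove(best);
            ordered.add(chosen);
            // Cascade: variables of the chosen pattern count as bound from now on.
            for (String term : chosen.terms()) {
                if (isVar(term)) bound.add(term);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        // The BGP from the original question, in the order it was written.
        List<TriplePattern> bgp = Arrays.asList(
            new TriplePattern("?a", "mbz:alias", "\"Amy Beach\""),
            new TriplePattern("?b", "cmno:hasInfluenced", "?a"),
            new TriplePattern("?c", "mo:composer", "?b"),
            new TriplePattern("?c", "bio:date", "?d"));
        for (TriplePattern t : reorder(bgp)) {
            System.out.println(t);
        }
    }
}
```

Under this simplified weight, the query from the original post comes out in the order it was written, which agrees with the remark that it is "actually the default order"; feeding the same four patterns in reverse yields the same result.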

Re: Order of triple patterns in Where Clause

Posted by Andy Seaborne <an...@apache.org>.


On 09/03/2020 13:10, Marco Neumann wrote:
> Ok granted yes not functional programming correct, but I think we are
> getting sidetracked here by the specific meaning of the terms "purely
> functional" and "pure function" . I am referring here only to the feature
> of counting reductions in haskell.
> 
> So let's take a basic query plus filter in ARQ on the following data set:
> 
>   :a :b :c
>   :c :d :e
>   :f :g :h
> 
> with this query:
> 
> (filter (! (sameTerm ?x ?c))
>    (bgp
>      (triple ?x ?y ?z)
>      (triple ?c ?d ?e)
>    ))
> 
> how many total "evaluations/operations" are performed over the data set to
> arrive at a result set of 6?

One join, generating 9 bindings. Then filter 9 things.

One access for 3 results, then 3 accesses of three results.
(c.f. LFJ)
   If it were a hash join, then 3+3+hash work.

Substitution join (index join) works well when one side is smaller than the
other. A not untypical example is inverse functional properties; one
side is one match, which then leads to a single, more grounded access.

?s :id "12345"
    :property1 ?v1

     Andy
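
The counts in the reply above (one access for 3 results, then 3 accesses of three results, 9 bindings, 6 after the filter) can be reproduced with a small simulation. This is only a sketch of a nested-loop style substitution join over the three example triples; the class IndexJoinCount and the in-memory array standing in for the store are assumptions for illustration, not ARQ's evaluator.

```java
// Hypothetical sketch; it imitates the access pattern, not ARQ's code.
public class IndexJoinCount {

    // The three triples from the example data set.
    static final String[][] DATA = {
        { ":a", ":b", ":c" },
        { ":c", ":d", ":e" },
        { ":f", ":g", ":h" }
    };

    /** Returns { accesses, joinBindings, resultsAfterFilter }. */
    static int[] evaluate(String[][] data) {
        int accesses = 0;
        int bindings = 0;
        int results = 0;

        accesses++; // one access for (triple ?x ?y ?z)
        for (String[] left : data) {
            accesses++; // one access for (triple ?c ?d ?e) per left-hand binding
            for (String[] right : data) {
                bindings++; // one joined binding (no shared variables, so all pairs)
                // filter (! (sameTerm ?x ?c)): keep rows whose subjects differ
                if (!left[0].equals(right[0])) {
                    results++;
                }
            }
        }
        return new int[] { accesses, bindings, results };
    }

    public static void main(String[] args) {
        int[] r = evaluate(DATA);
        System.out.println("accesses=" + r[0] + " bindings=" + r[1] + " results=" + r[2]);
        // prints "accesses=4 bindings=9 results=6"
    }
}
```

Because the two triple patterns share no variables, the join is a cross product here; with a selective first pattern (as in the ?s :id "12345" example) the outer loop would run once and the inner access would be far more grounded.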


Re: Order of triple patterns in Where Clause

Posted by Marco Neumann <ma...@gmail.com>.

OK, granted, it is not functional programming, but I think we are
getting sidetracked here by the specific meaning of the terms "purely
functional" and "pure function". I am referring here only to the feature
of counting reductions in Haskell.

So let's take a basic query plus filter in ARQ on the following data set:

 :a :b :c
 :c :d :e
 :f :g :h

with this query:

(filter (! (sameTerm ?x ?c))
  (bgp
    (triple ?x ?y ?z)
    (triple ?c ?d ?e)
  ))

how many total "evaluations/operations" are performed over the data set to
arrive at a result set of 6?




-- 


---
Marco Neumann
KONA

Re: Order of triple patterns in Where Clause

Posted by Andy Seaborne <an...@apache.org>.

I don't see how it applies to the ARQ evaluator.

That's not how it works.

Just because the algebra is functional, it's not functional programming 
and not reduction evaluation.  It has executable statements and external 
data.

     Andy

On 08/03/2020 17:02, Marco Neumann wrote:
> Sorry, my bad, that was a typo; it should be "reductions". Reduction is a very
> basic concept in functional languages like Haskell, and heap size is measured in cells.
> 
> 
> "Reduction is the process of converting an expression to a simpler form.
> Conceptually, an expression is reduced by simplifying one reducible
> expression (called “redex”) at a time."
> https://www.futurelearn.com/courses/functional-programming-haskell/0/steps/27197

Re: Order of triple patterns in Where Clause

Posted by Marco Neumann <ma...@gmail.com>.

Sorry, my bad, that was a typo; it should be "reductions". Reduction is a very
basic concept in functional languages like Haskell, and heap size is measured in cells.


"Reduction is the process of converting an expression to a simpler form.
Conceptually, an expression is reduced by simplifying one reducible
expression (called “redex”) at a time."
https://www.futurelearn.com/courses/functional-programming-haskell/0/steps/27197


On Sun, Mar 8, 2020 at 4:44 PM Andy Seaborne <an...@apache.org> wrote:

> Then I don't understand what you are looking for.
>
> What's a "deduction"? What's a "cell"?


-- 


---
Marco Neumann
KONA

Re: Order of triple patterns in Where Clause

Posted by Andy Seaborne <an...@apache.org>.

Then I don't understand what you are looking for.

What's a "deduction"? What's a "cell"?

On 08/03/2020 14:22, Marco Neumann wrote:
> thank you for the hint Andy, but not quite what I was looking for.
> 
> I was aiming more for a type of feature I am familiar with from purely
> functional programming languages like haskell, hugs, miranda etc to display
> deductions and cells used during execution.
> 
> Marco

Re: Order of triple patterns in Where Clause

Posted by Marco Neumann <ma...@gmail.com>.

Thank you for the hint, Andy, but that's not quite what I was looking for.

I was aiming more for the type of feature I am familiar with from purely
functional programming languages like Haskell, Hugs, Miranda etc. that displays
the deductions and cells used during execution.

Marco

On Sun, Mar 8, 2020 at 10:42 AM Andy Seaborne <an...@apache.org> wrote:

>
>
> On 06/03/2020 17:40, Marco Neumann wrote:
> > is there statistical data available for the number of deductions /
> > joins performed for each SPARQL query of a QueryExecution object?
>
> If you run with "explain" you can find out but there isn't a specific
> record kept by the code.


-- 


---
Marco Neumann
KONA

Re: Order of triple patterns in Where Clause

Posted by Andy Seaborne <an...@apache.org>.


On 06/03/2020 17:40, Marco Neumann wrote:
> is there statistical data available for the number of deductions /
> joins performed for each SPARQL query of a QueryExecution object?

If you run with "explain" you can find out but there isn't a specific 
record kept by the code.

> 
> On Fri, Mar 6, 2020 at 3:16 PM Andy Seaborne <an...@apache.org> wrote:
> 
>>
>>
>> On 05/03/2020 08:32, Kashif Rabbani wrote:
>>> Hi Andy,
>>>
>>> Thanks for your response. I was wondering if there is any detailed documentation of the Jena optimization (rewriting & reordering) available online? If yes, can you please send me the reference?
>>
>> The code mainly.
>>
>> The TDB stats are documented.
>>
>>> Also, if I create my own query plan (in algebraic form), is it possible to make Jena execute it as it is? I mean how to turn off Jena’s optimization (rewriting & reordering) and force my query plan for execution.
>>
>> Yes - two parts - algebra rewrites and BGP reordering.
>>
>> The context is a mapping of settings.
>> There is a global context (ARQ.getContext()),
>> one per dataset (DatasetGraph.getContext()), and
>> one per query execution (QueryExecution.getContext()),
>>
>> and it is treated hierarchically:
>>
>> Lookup is in the QueryExecution, then the DatasetGraph, then the global context.
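
To make the lookup order concrete, here is a toy model in plain Java. It is only a sketch of the order described above (query execution, then dataset, then global) and is not Jena's Context class; the class name ContextLookupSketch, the lookup method and the "optimization" key are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of hierarchical settings lookup. This is NOT Jena's Context class;
// all names here are hypothetical and chosen for illustration only.
class ContextLookupSketch {

    // Return the first value found, checking the most specific scope first.
    static Object lookup(String key,
                         Map<String, Object> queryExecution,
                         Map<String, Object> dataset,
                         Map<String, Object> global) {
        if (queryExecution.containsKey(key)) return queryExecution.get(key);
        if (dataset.containsKey(key)) return dataset.get(key);
        return global.get(key); // null when the setting is absent everywhere
    }

    public static void main(String[] args) {
        Map<String, Object> global = new HashMap<>();
        Map<String, Object> dataset = new HashMap<>();
        Map<String, Object> queryExecution = new HashMap<>();

        global.put("optimization", Boolean.TRUE);          // system-wide default
        queryExecution.put("optimization", Boolean.FALSE); // override for one execution

        System.out.println(lookup("optimization", queryExecution, dataset, global));
    }
}
```

Run as written, this prints false, because the per-execution setting shadows the global default.
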
>>
>> :: Algebra rewrite
>>
>> Some algebra rewrites have to be done - property functions, and rewriting
>> some variables due to scoping. These aren't really "optimization steps"
>> but happen in that phase. There is OptimizerMinimal for those.
>>
>> To turn off the optimizer and still do the minimum steps:
>>
>> context.set(ARQ.optimization, false)
>>
>> Alternatively, Algebra.exec(op, dsg) executes the algebra as given - that's a
>> very low-level way of doing it.
>>
>> Turning the optimizer off is better because all the APIs work, e.g.
>> QueryExecution.
>>
>> :: BGP reordering
>>
>> The reordering of triple patterns is separate.
>> BGP steps are performed by a StageGenerator.
>>
>> To set up to use a custom StageGenerator:
>>
>> StageBuilder.setGenerator(ARQ.getContext(), stageGenerator) ;
>>
>> That's really only a call of
>>      context.set(ARQ.stageGenerator, myStageGenerator) ;
>>
>> The default is StageGeneratorGeneric, which does ReorderFixed.
>> It is used if there is no other setting in the context.
>>
>>       Andy
>>

Re: Order of triple patterns in Where Clause

Posted by Marco Neumann <ma...@gmail.com>.

Is there statistical data available for the number of deductions /
joins performed for each SPARQL query of a QueryExecution object?

On Fri, Mar 6, 2020 at 3:16 PM Andy Seaborne <an...@apache.org> wrote:

>
>
> On 05/03/2020 08:32, Kashif Rabbani wrote:
> > Hi Andy,
> >
> > Thanks for your response. I was wondering if there is any detailed
> > documentation of the Jena optimization (rewriting & reordering) available
> > online? If yes, can you please send me the reference?
>
> The code, mainly.
>
> The TDB stats are documented.
>
> > Also, if I create my own query plan (in algebraic form), is it possible
> > to make Jena execute it as it is? I mean, how do I turn off Jena's
> > optimization (rewriting & reordering) and force my query plan to be
> > executed?
>
> Yes - two parts - algebra rewrites and BGP reordering.
>
> The context is a mapping of settings. There is:
> a global context (ARQ.getContext())
> one per dataset (DatasetGraph.getContext())
> one per query execution (QueryExecution.getContext())
>
> and it is treated hierarchically:
>
> Lookup is in the QueryExecution context, then the DatasetGraph context,
> then the global context.
>
> :: Algebra rewrite
>
> Some algebra rewrites have to be done - property functions, and rewriting
> some variables due to scoping. These aren't really "optimization steps"
> but happen in that phase. There is OptimizerMinimal for those.
>
> To turn off the optimizer and still do the minimum steps:
>
> context.set(ARQ.optimization, false)
>
> Alternatively, Algebra.exec(op, dsg) executes the algebra as given - that's a
> very low-level way of doing it.
>
> Turning the optimizer off is better because all the APIs work, e.g.
> QueryExecution.
>
> :: BGP reordering
>
> The reordering of triple patterns is separate.
> BGP steps are performed by a StageGenerator.
>
> To set up to use a custom StageGenerator:
>
> StageBuilder.setGenerator(ARQ.getContext(), stageGenerator) ;
>
> That's really only a call of
>     context.set(ARQ.stageGenerator, myStageGenerator) ;
>
> The default is StageGeneratorGeneric, which does ReorderFixed.
> It is used if there is no other setting in the context.
>
>      Andy
>


-- 


---
Marco Neumann
KONA

Re: Order of triple patterns in Where Clause

Posted by Andy Seaborne <an...@apache.org>.


On 05/03/2020 08:32, Kashif Rabbani wrote:
> Hi Andy,
> 
> Thanks for your response. I was wondering if there is any detailed documentation of the Jena optimization (rewriting & reordering) available online? If yes, can you please send me the reference?

The code, mainly.

The TDB stats are documented.

> Also, if I create my own query plan (in algebraic form), is it possible to make Jena execute it as it is? I mean, how do I turn off Jena's optimization (rewriting & reordering) and force my query plan to be executed?

Yes - two parts - algebra rewrites and BGP reordering.

The context is a mapping of settings. There is:
a global context (ARQ.getContext())
one per dataset (DatasetGraph.getContext())
one per query execution (QueryExecution.getContext())

and it is treated hierarchically:

Lookup is in the QueryExecution context, then the DatasetGraph context,
then the global context.
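
The lookup precedence described above can be sketched in plain Java. This is only a toy model of the ordering (query execution, then dataset, then global); the class name LayeredSettings and the string keys are made up for illustration, and this is not Jena's Context class, which may well implement the same precedence differently (for example by copying settings between contexts).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model of a layered settings lookup. NOT Jena's Context class;
// the names here are hypothetical and exist only for this sketch.
final class LayeredSettings {
    private final Map<String, Object> values = new HashMap<>();
    private final LayeredSettings parent; // null for the global layer

    LayeredSettings(LayeredSettings parent) {
        this.parent = parent;
    }

    void set(String key, Object value) {
        values.put(key, value);
    }

    // Look in this layer first, then walk up towards the global layer.
    Optional<Object> get(String key) {
        for (LayeredSettings s = this; s != null; s = s.parent) {
            if (s.values.containsKey(key)) {
                return Optional.ofNullable(s.values.get(key));
            }
        }
        return Optional.empty();
    }
}

final class LayeredSettingsDemo {
    public static void main(String[] args) {
        LayeredSettings global = new LayeredSettings(null);
        LayeredSettings dataset = new LayeredSettings(global);
        LayeredSettings execution = new LayeredSettings(dataset);

        global.set("optimization", true);
        // Override for one execution only; the other layers are untouched.
        execution.set("optimization", false);

        System.out.println(execution.get("optimization")); // prints Optional[false]
        System.out.println(dataset.get("optimization"));   // prints Optional[true]
    }
}
```

The practical consequence is that a setting placed on a single query execution affects only that execution, while a setting placed on the global context applies wherever nothing more specific overrides it.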

:: Algebra rewrite

Some algebra rewrites have to be done - property functions, and rewriting 
some variables due to scoping. These aren't really "optimization steps" 
but happen in that phase. There is OptimizerMinimal for those.

To turn off the optimizer and still do the minimum steps:

context.set(ARQ.optimization, false)

Alternatively, Algebra.exec(op, dsg) executes the algebra as given - that's a 
very low-level way of doing it.

Turning the optimizer off is better because all the APIs work, e.g. 
QueryExecution.

:: BGP reordering

The reordering of triple patterns is separate.
BGP steps are performed by a StageGenerator.

To set up to use a custom StageGenerator:

StageBuilder.setGenerator(ARQ.getContext(), stageGenerator) ;

That's really only a call of
    context.set(ARQ.stageGenerator, myStageGenerator) ;

The default is StageGeneratorGeneric, which does ReorderFixed.
It is used if there is no other setting in the context.

     Andy


Re: Order of triple patterns in Where Clause

Posted by Kashif Rabbani <ka...@cs.aau.dk>.

Hi Andy, 

Thanks for your response. I was wondering if there is any detailed documentation of the Jena optimization (rewriting & reordering) available online? If yes, can you please send me the reference?

Also, if I create my own query plan (in algebraic form), is it possible to make Jena execute it as it is? I mean, how do I turn off Jena's optimization (rewriting & reordering) and force my query plan to be executed?

Thanks again for your help. 

Regards,

Kashif Rabbani, 
Research Assistant, 
Department of Computer Science,
Aalborg University, Denmark.

> On 3 Mar 2020, at 13.43, Andy Seaborne <an...@apache.org> wrote:
> 
> Hi Kashif,
> 
> Optimization happens in two stages:
> 
> 1. Rewrite of the algebra
> 2. Reordering of the BGPs
> 
> BGPs can be implemented in different ways - and they are an inference extension point in SPARQL.
> 
> What you see is the first stage. BGPs are reordered during execution.
> 
> The algorithm can be stats driven for TDB and TDB2 storage:
>  https://jena.apache.org/documentation/tdb/optimizer.html
> 
> The interface is org.apache.jena.sparql.engine.optimizer.reorder.ReorderTransformation
> 
> and a general purpose reordering is done for in-memory and is the default for TDB.
> 
> The default reorder is "grounded triples first, leave equal weights alone". It cascades whether a term is bound by an earlier step.
> 
> >    { ?a  mbz:alias           "Amy Beach" .
> >      ?b  cmno:hasInfluenced  ?a .
> >      ?c  mo:composer         ?b ;
> >          bio:date            ?d
> >    }
> 
> That's actually the default order -
> 
> ?a  mbz:alias           "Amy Beach" .
> 
> has two bound terms so is done first.
> 
> and now ?a is bound so
> ?b  cmno:hasInfluenced  ?a .
> 
> etc.
> 
> Given the boundedness of the pattern, and (a guess) that mbz:alias "Amy Beach" is quite selective, stats are unlikely to change this: for a different order, some ? <property> ? pattern would have to be less numerous than ? mbz:alias "Amy Beach".
> 
> There's no algebra optimization for your example, only BGP reordering.
> 
> qparse --print=opt shows stage 1 optimizations.
> 
> Executing with "explain" shows BGP execution.
> 
>    Andy
> 

Re: Order of triple patterns in Where Clause

Posted by Andy Seaborne <an...@apache.org>.

Hi Kashif,

Optimization happens in two stages:

1. Rewrite of the algebra
2. Reordering of the BGPs

BGPs can be implemented in different ways - and they are an inference 
extension point in SPARQL.

What you see is the first stage. BGPs are reordered during execution.

The algorithm can be stats driven for TDB and TDB2 storage:
   https://jena.apache.org/documentation/tdb/optimizer.html

The interface is 
org.apache.jena.sparql.engine.optimizer.reorder.ReorderTransformation

and a general purpose reordering is done for in-memory and is the 
default for TDB.

The default reorder is "grounded triples first, leave equal weights 
alone". It cascades whether a term is bound by an earlier step.

 >    { ?a  mbz:alias           "Amy Beach" .
 >      ?b  cmno:hasInfluenced  ?a .
 >      ?c  mo:composer         ?b ;
 >          bio:date            ?d
 >    }

That's actually the default order -

?a  mbz:alias           "Amy Beach" .

has two bound terms so is done first.

and now ?a is bound so
?b  cmno:hasInfluenced  ?a .

etc.
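
The walk-through above can be sketched as a small greedy procedure in plain Java. This is a simplified illustration of "most bound terms first, cascade bindings, leave ties alone", assuming terms are plain strings and variables start with "?". The class names TriplePattern and GreedyReorderSketch are hypothetical; this is not Jena's ReorderFixed or its ReorderTransformation interface, which use their own weighting.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a triple pattern; terms are plain strings.
final class TriplePattern {
    final String s, p, o;

    TriplePattern(String s, String p, String o) {
        this.s = s;
        this.p = p;
        this.o = o;
    }

    @Override
    public String toString() {
        return s + " " + p + " " + o;
    }
}

// Simplified sketch only; not Jena's reordering code.
final class GreedyReorderSketch {
    static boolean isVar(String term) {
        return term.startsWith("?");
    }

    // Count positions that are concrete terms or variables bound by an earlier step.
    static int boundCount(TriplePattern t, Set<String> bound) {
        int n = 0;
        for (String term : new String[] { t.s, t.p, t.o }) {
            if (!isVar(term) || bound.contains(term)) {
                n++;
            }
        }
        return n;
    }

    static List<TriplePattern> reorder(List<TriplePattern> bgp) {
        List<TriplePattern> remaining = new ArrayList<>(bgp);
        List<TriplePattern> ordered = new ArrayList<>();
        Set<String> bound = new HashSet<>();
        while (!remaining.isEmpty()) {
            int best = 0;
            for (int i = 1; i < remaining.size(); i++) {
                // Strict '>' keeps the earlier pattern when weights are equal.
                if (boundCount(remaining.get(i), bound) > boundCount(remaining.get(best), bound)) {
                    best = i;
                }
            }
            TriplePattern chosen = remaining.remove(best);
            ordered.add(chosen);
            // Variables of the chosen pattern count as bound for later steps.
            for (String term : new String[] { chosen.s, chosen.p, chosen.o }) {
                if (isVar(term)) {
                    bound.add(term);
                }
            }
        }
        return ordered;
    }
}
```

Given the four patterns of the example query in any input order, this sketch picks the mbz:alias pattern first (two concrete terms), then cmno:hasInfluenced (since ?a is now bound), then mo:composer, then bio:date, which is the order written in the query.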

Given the boundedness of the pattern, and (a guess) that mbz:alias "Amy Beach" 
is quite selective, stats are unlikely to change this: for a different order, 
some ? <property> ? pattern would have to be less numerous than 
? mbz:alias "Amy Beach".

There's no algebra optimization for your example, only BGP reordering.

qparse --print=opt shows stage 1 optimizations.

Executing with "explain" shows BGP execution.

     Andy


