You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@asterixdb.apache.org by Ildar Absalyamov <il...@gmail.com> on 2016/02/03 21:58:49 UTC

Abstract\Join\Select-IntroduceAccessMethodRule refactoring

Hi all,

In a process of adding cardinality estimates for index probes I have been trying to appropriately modify AbstractIntroduceAccessMethod rule, but encountered long overdue problem that it is really hard to modify that rule. Over the years the rule has grown to almost 1000 lines, but what is even worse is that it tries to be very generic: it introduces index access methods for all index types + tries to work for both select and joins cases. While the former might be more or less abstracted the logic for the latter (if select to something, if join do the other) spans the whole rule, making it very messy. There are non-abstract subtypes of this rule for Select and Join respectively, but they don’t have much logic in them.
I believe this rule needs a major refactoring, clearly separating the join\select logic. Here is the short list of actions, which ideally should be properly abstracted (items with * mark the actions needed for statistics-based cardinality estimation):
1) Parse the plan and extract optimizableSubTree(s) out of it
2) Enumerate all possible accessMethods for particular optSubTree
3) Filter out accessMethods based on predicate and\or type information
4) Create searchArgument for index probe
5*) Find the selectivity for given pair of accessMethod and searchArgument
6*) Calculate the accessMethod cost, given selectivity and cost model
7*) Pick the AM with the lowest cost
8) Apply the plan transformation, needed for particular accessMethod.

Although I do see the need for the refactoring, I don’t quite have a clear picture in my head how would it look like. Maybe @Taewoo or @Jianfeng, who are also working with this rule have some thoughts on that issue?
We can schedule a meeting to discuss the problem and come up with potential solution.

Best regards,
Ildar





Re: Abstract\Join\Select-IntroduceAccessMethodRule refactoring

Posted by Jianfeng Jia <ji...@gmail.com>.
+1!
I’d like to join in the meeting.

One idea for No.2) Enumerate all possible accessMethods for particular optSubTree, we should have a functional stateless interface to produce plans for different access methods, then we can compare/analysis/intersect them much easier. 

> On Feb 3, 2016, at 1:46 PM, Taewoo Kim <wa...@gmail.com> wrote:
> 
> @Ildar: that is a good point. We have a few places -
> AbstractIntroduceAccessMethod rule, IntroduceJoinAccessMethod rule,
> IntroduceSelectAccessMethod rule, AccessMethodUtils, BTreeAccessMethod,
> RTreeAccessMethod, InvertedIndexAccessMethod. All classes are related to
> transform a plan into an index-utilized plan. I don't see a clear picture,
> either. Perhaps, we need to first analyze the current methods and can have
> a meeting.
> 
> Best,
> Taewoo
> 
> On Wed, Feb 3, 2016 at 12:58 PM, Ildar Absalyamov <
> ildar.absalyamov@gmail.com> wrote:
> 
>> Hi all,
>> 
>> In a process of adding cardinality estimates for index probes I have been
>> trying to appropriately modify AbstractIntroduceAccessMethod rule, but
>> encountered long overdue problem that it is really hard to modify that
>> rule. Over the years the rule has grown to almost 1000 lines, but what is
>> even worse is that it tries to be very generic: it introduces index access
>> methods for all index types + tries to work for both select and joins
>> cases. While the former might be more or less abstracted the logic for the
>> latter (if select to something, if join do the other) spans the whole rule,
>> making it very messy. There are non-abstract subtypes of this rule for
>> Select and Join respectively, but they don’t have much logic in them.
>> I believe this rule needs a major refactoring, clearly separating the
>> join\select logic. Here is the short list of actions, which ideally should
>> be properly abstracted (items with * mark the actions needed for
>> statistics-based cardinality estimation):
>> 1) Parse the plan and extract optimizableSubTree(s) out of it
>> 2) Enumerate all possible accessMethods for particular optSubTree
>> 3) Filter out accessMethods based on predicate and\or type information
>> 4) Create searchArgument for index probe
>> 5*) Find the selectivity for given pair of accessMethod and searchArgument
>> 6*) Calculate the accessMethod cost, given selectivity and cost model
>> 7*) Pick the AM with the lowest cost
>> 8) Apply the plan transformation, needed for particular accessMethod.
>> 
>> Although I do see the need for the refactoring, I don’t quite have a clear
>> picture in my head how would it look like. Maybe @Taewoo or @Jianfeng, who
>> are also working with this rule have some thoughts on that issue?
>> We can schedule a meeting to discuss the problem and come up with
>> potential solution.
>> 
>> Best regards,
>> Ildar
>> 
>> 
>> 
>> 
>> 



Best,

Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine


Re: Abstract\Join\Select-IntroduceAccessMethodRule refactoring

Posted by Taewoo Kim <wa...@gmail.com>.
@Ildar: that is a good point. We have a few places -
AbstractIntroduceAccessMethod rule, IntroduceJoinAccessMethod rule,
IntroduceSelectAccessMethod rule, AccessMethodUtils, BTreeAccessMethod,
RTreeAccessMethod, InvertedIndexAccessMethod. All classes are related to
transform a plan into an index-utilized plan. I don't see a clear picture,
either. Perhaps, we need to first analyze the current methods and can have
a meeting.

Best,
Taewoo

On Wed, Feb 3, 2016 at 12:58 PM, Ildar Absalyamov <
ildar.absalyamov@gmail.com> wrote:

> Hi all,
>
> In a process of adding cardinality estimates for index probes I have been
> trying to appropriately modify AbstractIntroduceAccessMethod rule, but
> encountered long overdue problem that it is really hard to modify that
> rule. Over the years the rule has grown to almost 1000 lines, but what is
> even worse is that it tries to be very generic: it introduces index access
> methods for all index types + tries to work for both select and joins
> cases. While the former might be more or less abstracted the logic for the
> latter (if select to something, if join do the other) spans the whole rule,
> making it very messy. There are non-abstract subtypes of this rule for
> Select and Join respectively, but they don’t have much logic in them.
> I believe this rule needs a major refactoring, clearly separating the
> join\select logic. Here is the short list of actions, which ideally should
> be properly abstracted (items with * mark the actions needed for
> statistics-based cardinality estimation):
> 1) Parse the plan and extract optimizableSubTree(s) out of it
> 2) Enumerate all possible accessMethods for particular optSubTree
> 3) Filter out accessMethods based on predicate and\or type information
> 4) Create searchArgument for index probe
> 5*) Find the selectivity for given pair of accessMethod and searchArgument
> 6*) Calculate the accessMethod cost, given selectivity and cost model
> 7*) Pick the AM with the lowest cost
> 8) Apply the plan transformation, needed for particular accessMethod.
>
> Although I do see the need for the refactoring, I don’t quite have a clear
> picture in my head how would it look like. Maybe @Taewoo or @Jianfeng, who
> are also working with this rule have some thoughts on that issue?
> We can schedule a meeting to discuss the problem and come up with
> potential solution.
>
> Best regards,
> Ildar
>
>
>
>
>