You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@flink.apache.org by yh z <zh...@gmail.com> on 2022/12/29 06:43:33 UTC

[DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Hi, devs,

I'd like to start a discuss about adding an option called
"table.oprimizer.busy-join-reorder-threshold" for planner rule while we try
to introduce a new busy join reorder rule[1] into Flink.

This join reorder rule is based on dynamic programing[2], which can store
all possible intermediate results, and the cost model can be used to select
the optimal join reorder result. Compare with the existing Lopt join
reorder rule, the new rule can give more possible results and the result
can be more accurate. However, the search space of this rule will become
very large as the number of tables increases. So we should introduce an
option to limit the expansion of search space, if the number of table can
be reordered less than the threshold, the new busy join reorder rule is
used. On the contrary, the Lopt rule is used.

The default threshold intended to be set to 12. One reason is that in the
tpc-ds benchmark test, when the number of tables exceeds 12, the
optimization time will be very long. The other reason is that it refers to
relevant engines, like Spark, whose recommended setting is 12.[3]

Looking forward to your feedback.

[1]  https://issues.apache.org/jira/browse/FLINK-30376
[2]
https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
[3]
https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration

Best regards,
Yunhong Zheng

Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Posted by yh z <zh...@gmail.com>.

Hi Jark,

Thanks for your reply.

We are going to use bushy join reorder rule and Lopt join reorder rule at
the same time. By setting the threshold
"table.optimizer.bushy-join-reorder-threshold", when the number of tables
need to be reordered is less than/equals this threshold, bushy join reorder
rule will be used. On the contrary, when the number of tables need to be
reordered is greater than this threshold, Lopt join reorder will be used.
Since for most queries, the number of tables need to be reordered is less
than/equals this threshold, so bushy join reorder rule can be regarded as
the default join reorder rule.

I'm really sorry. Because I didn't carefully check the contents of the
first email, I wrote the wrong words in that email. I have made sure that
the correct word "bushy" is used in pr[1]. The threshold name indeed is
"table.optimizer.bushy-join-reorder-threshold".

[1] https://github.com/apache/flink/pull/21530

Best regards,
Yunhong Zheng

Jark Wu <im...@gmail.com> 于2023年1月3日周二 20:06写道：

> Hi Yuhong,
>
> Thanks for driving the feature.
>
> I just have one question. Is the bushy join reorder optimization enabled
> by default? Does the bushy join reorder will replace the existing Lopt join
> reorder rule?
>
> Besides, I guess the option "table.oprimizer.busy-join-reorder-threshold”
> should be "table.optimizer.bushy-join-reorder-threshold”?  (I guess they
> are just typos, as your last email said, but I just want to clarify as it
> is a public API).
>
> Best,
> Jark
>
>
> > 2023年1月3日 12:53，Benchao Li <li...@apache.org> 写道：
> >
> > Hi Yunhong,
> >
> > Thanks for driving this~
> >
> > I haven't gone deep into the implementation details yet. Regarding the
> > general description, I would ask a few questions firstly:
> >
> > #1, Is there any benchmark results about the optimization latency change
> > compared to current approach? In OLAP scenario, query optimization
> latency
> > is more crucial.
> >
> > #2, About the term "busy join reorder", is there any others systems which
> > also use this term? I know Calcite has a rule[1] which uses the term
> "bushy
> > join".
> >
> > #3, About the implementation, if this does the same work as Calcite
> > MultiJoinOptimizeBushyRule, is it possible to use the Calcite version
> > directly or extend it in some way?
> >
> > [1]
> >
> https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78
> >
> > yh z <zh...@gmail.com> 于2022年12月29日周四 14:44写道：
> >
> >> Hi, devs,
> >>
> >> I'd like to start a discuss about adding an option called
> >> "table.oprimizer.busy-join-reorder-threshold" for planner rule while we
> try
> >> to introduce a new busy join reorder rule[1] into Flink.
> >>
> >> This join reorder rule is based on dynamic programing[2], which can
> store
> >> all possible intermediate results, and the cost model can be used to
> select
> >> the optimal join reorder result. Compare with the existing Lopt join
> >> reorder rule, the new rule can give more possible results and the result
> >> can be more accurate. However, the search space of this rule will become
> >> very large as the number of tables increases. So we should introduce an
> >> option to limit the expansion of search space, if the number of table
> can
> >> be reordered less than the threshold, the new busy join reorder rule is
> >> used. On the contrary, the Lopt rule is used.
> >>
> >> The default threshold intended to be set to 12. One reason is that in
> the
> >> tpc-ds benchmark test, when the number of tables exceeds 12, the
> >> optimization time will be very long. The other reason is that it refers
> to
> >> relevant engines, like Spark, whose recommended setting is 12.[3]
> >>
> >> Looking forward to your feedback.
> >>
> >> [1]  https://issues.apache.org/jira/browse/FLINK-30376
> >> [2]
> >>
> >>
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> >> [3]
> >>
> >>
> https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
> >>
> >> Best regards,
> >> Yunhong Zheng
> >>
> >
> >
> > --
> >
> > Best,
> > Benchao Li
>
>

Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Posted by Jark Wu <im...@gmail.com>.

Hi Yuhong,

Thanks for driving the feature. 

I just have one question. Is the bushy join reorder optimization enabled by default? Does the bushy join reorder will replace the existing Lopt join
reorder rule? 

Besides, I guess the option "table.oprimizer.busy-join-reorder-threshold” should be "table.optimizer.bushy-join-reorder-threshold”?  (I guess they are just typos, as your last email said, but I just want to clarify as it is a public API).

Best,
Jark


> 2023年1月3日 12:53，Benchao Li <li...@apache.org> 写道：
> 
> Hi Yunhong,
> 
> Thanks for driving this~
> 
> I haven't gone deep into the implementation details yet. Regarding the
> general description, I would ask a few questions firstly:
> 
> #1, Is there any benchmark results about the optimization latency change
> compared to current approach? In OLAP scenario, query optimization latency
> is more crucial.
> 
> #2, About the term "busy join reorder", is there any others systems which
> also use this term? I know Calcite has a rule[1] which uses the term "bushy
> join".
> 
> #3, About the implementation, if this does the same work as Calcite
> MultiJoinOptimizeBushyRule, is it possible to use the Calcite version
> directly or extend it in some way?
> 
> [1]
> https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78
> 
> yh z <zh...@gmail.com> 于2022年12月29日周四 14:44写道：
> 
>> Hi, devs,
>> 
>> I'd like to start a discuss about adding an option called
>> "table.oprimizer.busy-join-reorder-threshold" for planner rule while we try
>> to introduce a new busy join reorder rule[1] into Flink.
>> 
>> This join reorder rule is based on dynamic programing[2], which can store
>> all possible intermediate results, and the cost model can be used to select
>> the optimal join reorder result. Compare with the existing Lopt join
>> reorder rule, the new rule can give more possible results and the result
>> can be more accurate. However, the search space of this rule will become
>> very large as the number of tables increases. So we should introduce an
>> option to limit the expansion of search space, if the number of table can
>> be reordered less than the threshold, the new busy join reorder rule is
>> used. On the contrary, the Lopt rule is used.
>> 
>> The default threshold intended to be set to 12. One reason is that in the
>> tpc-ds benchmark test, when the number of tables exceeds 12, the
>> optimization time will be very long. The other reason is that it refers to
>> relevant engines, like Spark, whose recommended setting is 12.[3]
>> 
>> Looking forward to your feedback.
>> 
>> [1]  https://issues.apache.org/jira/browse/FLINK-30376
>> [2]
>> 
>> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
>> [3]
>> 
>> https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
>> 
>> Best regards,
>> Yunhong Zheng
>> 
> 
> 
> -- 
> 
> Best,
> Benchao Li

Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Posted by Jark Wu <im...@gmail.com>.

+1 for the option and the default value. 

Best,
Jark

> 2023年1月10日 16:24，yh z <zh...@gmail.com> 写道：
> 
> Hi all,
> 
> Thanks for yours reply.
> 
> After receiving your comments and making targeted modifications. The
> conclusion is that the option
> "table.optimizer.bushy-join-reorder-threshold" can be added. Relevant PR
> [1] has been submitted. Sincerely welcome to review it. Thank you.
> 
> This discussion will be closed soon. Thanks for your comments.
> 
> [1] https://github.com/apache/flink/pull/21530
> 
> Best regards,
> Yunhong Zheng
> 
> godfrey he <go...@gmail.com> 于2023年1月9日周一 10:26写道：
> 
>> Hi Yunhong,
>> 
>> Thanks for driving this discuss!
>> 
>> This option looks good to me,
>> and looking forward to contributing this rule back to Apache Calcite.
>> 
>> Best,
>> Godfrey
>> 
>> 
>> 
>> yh z <zh...@gmail.com> 于2023年1月5日周四 15:32写道：
>>> 
>>> Hi Benchao,
>>> 
>>> Thanks for your reply.
>>> 
>>> Since our existing test results are based on multiple performance
>>> optimization points on the TPC-DS benchmark[1][2], we haven't separately
>>> tested the performance improvement brought by new bushy join reorder
>>> rule. I will complete this test recently and update the results to this
>>> email.
>>> 
>>> I am very happy to contribute to Calcite. Later, I will push the PR of
>> the
>>> bushy join reorder rule to Calcite.
>>> 
>>> [1] https://issues.apache.org/jira/browse/FLINK-27583
>>> [2] https://issues.apache.org/jira/browse/FLINK-29942
>>> 
>>> Best regards,
>>> Yunhong Zheng
>>> 
>>> Benchao Li <li...@apache.org> 于2023年1月4日周三 19:03写道：
>>> 
>>>> Hi Yunhong,
>>>> 
>>>> Thanks for the updating. And introducing the new bushy join reorder
>>>> algorithm would be great. And I also agree with the newly added config
>>>> option "table.optimizer.bushy-join-reorder-threshold" and 12 as the
>> default
>>>> value.
>>>> 
>>>> 
>>>>> As for optimization
>>>>> latency, this is the problem to be solved by the parameters to be
>>>>> introduced in this discussion. When there are many tables need to be
>>>>> reordered, the optimization latency will increase greatly. But when
>> the
>>>>> table numbers less than the threshold, the latency is the same as the
>>>>> LoptOptimizeJoinRule.
>>>> 
>>>> 
>>>> This sounds great. If possible, could you share more numbers to us?
>> E.g.,
>>>> what's the latency of optimization when there are 11/12 tables for both
>>>> approach?
>>>> 
>>>> For question #3: The implementation of Calcite
>> MultiJoinOptimizeBushyRule
>>>>> is very simple, and it will not store the intermediate results at
>> all.
>>>> So,
>>>>> the implementation of Calcite cannot get all possible join reorder
>>>> results
>>>>> and it cannot combine with the current cost model to get more
>> reasonable
>>>>> join reorder results.
>>>> 
>>>> 
>>>> It's ok to do it in Flink as the first step. It would be great to also
>>>> contribute it to Calcite later if possible, it depends on you.
>>>> 
>>>> yh z <zh...@gmail.com> 于2023年1月3日周二 15:27写道：
>>>> 
>>>>> Hi Benchao,
>>>>> 
>>>>> Thanks for your reply.
>>>>> 
>>>>> Actually,  I mistakenly wrote the name "bushy join reorder" to "busy
>> join
>>>>> reorder". I'm sorry for the trouble bring to you. "Bushy join
>> reorder"
>>>>> means we can build a bushy join tree based on cost model, but now
>> Flink
>>>> can
>>>>> only build a left-deep tree using Calcite LoptOptimizeJoinRule. I
>> hope my
>>>>> answers can help you solve the following questions:
>>>>> 
>>>>> For question #1: The biggest advantage of this "bushy join reorder"
>>>>> strategy over the default Flink left-deep tree strategy is that it
>> can
>>>>> retail all possible join reorder plans, and then select the optimal
>> plan
>>>>> according to the cost model. This means that the busy join reorder
>>>> strategy
>>>>> can be better combined with the current cost model to get more
>> reasonable
>>>>> join reorder results. We verified it on the TPC-DS benchmark, with
>> the
>>>>> spark plan as a reference, the new busy join reorder strategy can
>> make
>>>> more
>>>>> TPC-DS query plans be adjusted to be consistent with the Spark plan,
>> and
>>>>> the execution time is signifcantly reduced.  As for optimization
>>>>> latency, this is the problem to be solved by the parameters to be
>>>>> introduced in this discussion. When there are many tables need to be
>>>>> reordered, the optimization latency will increase greatly. But when
>> the
>>>>> table numbers less than the threshold, the latency is the same as the
>>>>> LoptOptimizeJoinRule.
>>>>> 
>>>>> For question #2: According to my research, many compute or database
>>>> systems
>>>>> have the "bushy join reorder" strategies based on dynamic
>> programming.
>>>> For
>>>>> example, Spark and PostgresSql use the same strategy, and the
>> threshold
>>>> be
>>>>> set to 12. Also, some papers, like [1] and [2], have also researched
>> this
>>>>> strategy, and [2] set the threshold to 14.
>>>>> 
>>>>> For question #3: The implementation of Calcite
>> MultiJoinOptimizeBushyRule
>>>>> is very simple, and it will not store the intermediate results at
>> all.
>>>> So,
>>>>> the implementation of Calcite cannot get all possible join reorder
>>>> results
>>>>> and it cannot combine with the current cost model to get more
>> reasonable
>>>>> join reorder results.
>>>>> 
>>>>> 
>>>>> [1]
>>>>> 
>>>>> 
>>>> 
>> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
>>>>> [2] https://db.in.tum.de/~radke/papers/hugejoins.pdf
>>>>> 
>>>>> 
>>>>> 
>>>>> Benchao Li <li...@apache.org> 于2023年1月3日周二 12:54写道：
>>>>> 
>>>>>> Hi Yunhong,
>>>>>> 
>>>>>> Thanks for driving this~
>>>>>> 
>>>>>> I haven't gone deep into the implementation details yet. Regarding
>> the
>>>>>> general description, I would ask a few questions firstly:
>>>>>> 
>>>>>> #1, Is there any benchmark results about the optimization latency
>>>> change
>>>>>> compared to current approach? In OLAP scenario, query optimization
>>>>> latency
>>>>>> is more crucial.
>>>>>> 
>>>>>> #2, About the term "busy join reorder", is there any others systems
>>>> which
>>>>>> also use this term? I know Calcite has a rule[1] which uses the
>> term
>>>>> "bushy
>>>>>> join".
>>>>>> 
>>>>>> #3, About the implementation, if this does the same work as Calcite
>>>>>> MultiJoinOptimizeBushyRule, is it possible to use the Calcite
>> version
>>>>>> directly or extend it in some way?
>>>>>> 
>>>>>> [1]
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78
>>>>>> 
>>>>>> yh z <zh...@gmail.com> 于2022年12月29日周四 14:44写道：
>>>>>> 
>>>>>>> Hi, devs,
>>>>>>> 
>>>>>>> I'd like to start a discuss about adding an option called
>>>>>>> "table.oprimizer.busy-join-reorder-threshold" for planner rule
>> while
>>>> we
>>>>>> try
>>>>>>> to introduce a new busy join reorder rule[1] into Flink.
>>>>>>> 
>>>>>>> This join reorder rule is based on dynamic programing[2], which
>> can
>>>>> store
>>>>>>> all possible intermediate results, and the cost model can be
>> used to
>>>>>> select
>>>>>>> the optimal join reorder result. Compare with the existing Lopt
>> join
>>>>>>> reorder rule, the new rule can give more possible results and the
>>>>> result
>>>>>>> can be more accurate. However, the search space of this rule will
>>>>> become
>>>>>>> very large as the number of tables increases. So we should
>> introduce
>>>> an
>>>>>>> option to limit the expansion of search space, if the number of
>> table
>>>>> can
>>>>>>> be reordered less than the threshold, the new busy join reorder
>> rule
>>>> is
>>>>>>> used. On the contrary, the Lopt rule is used.
>>>>>>> 
>>>>>>> The default threshold intended to be set to 12. One reason is
>> that in
>>>>> the
>>>>>>> tpc-ds benchmark test, when the number of tables exceeds 12, the
>>>>>>> optimization time will be very long. The other reason is that it
>>>> refers
>>>>>> to
>>>>>>> relevant engines, like Spark, whose recommended setting is 12.[3]
>>>>>>> 
>>>>>>> Looking forward to your feedback.
>>>>>>> 
>>>>>>> [1]  https://issues.apache.org/jira/browse/FLINK-30376
>>>>>>> [2]
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
>>>>>>> [3]
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> Yunhong Zheng
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> 
>>>>>> Best,
>>>>>> Benchao Li
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> 
>>>> Best,
>>>> Benchao Li
>>>> 
>>

Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Posted by yh z <zh...@gmail.com>.

Hi all,

Thanks for yours reply.

After receiving your comments and making targeted modifications. The
conclusion is that the option
"table.optimizer.bushy-join-reorder-threshold" can be added. Relevant PR
[1] has been submitted. Sincerely welcome to review it. Thank you.

This discussion will be closed soon. Thanks for your comments.

[1] https://github.com/apache/flink/pull/21530

Best regards,
Yunhong Zheng

godfrey he <go...@gmail.com> 于2023年1月9日周一 10:26写道：

> Hi Yunhong,
>
> Thanks for driving this discuss!
>
> This option looks good to me,
> and looking forward to contributing this rule back to Apache Calcite.
>
> Best,
> Godfrey
>
>
>
> yh z <zh...@gmail.com> 于2023年1月5日周四 15:32写道：
> >
> > Hi Benchao,
> >
> > Thanks for your reply.
> >
> > Since our existing test results are based on multiple performance
> > optimization points on the TPC-DS benchmark[1][2], we haven't separately
> > tested the performance improvement brought by new bushy join reorder
> > rule. I will complete this test recently and update the results to this
> > email.
> >
> > I am very happy to contribute to Calcite. Later, I will push the PR of
> the
> > bushy join reorder rule to Calcite.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-27583
> > [2] https://issues.apache.org/jira/browse/FLINK-29942
> >
> > Best regards,
> > Yunhong Zheng
> >
> > Benchao Li <li...@apache.org> 于2023年1月4日周三 19:03写道：
> >
> > > Hi Yunhong,
> > >
> > > Thanks for the updating. And introducing the new bushy join reorder
> > > algorithm would be great. And I also agree with the newly added config
> > > option "table.optimizer.bushy-join-reorder-threshold" and 12 as the
> default
> > > value.
> > >
> > >
> > > > As for optimization
> > > > latency, this is the problem to be solved by the parameters to be
> > > > introduced in this discussion. When there are many tables need to be
> > > > reordered, the optimization latency will increase greatly. But when
> the
> > > > table numbers less than the threshold, the latency is the same as the
> > > > LoptOptimizeJoinRule.
> > >
> > >
> > > This sounds great. If possible, could you share more numbers to us?
> E.g.,
> > > what's the latency of optimization when there are 11/12 tables for both
> > > approach?
> > >
> > >  For question #3: The implementation of Calcite
> MultiJoinOptimizeBushyRule
> > > > is very simple, and it will not store the intermediate results at
> all.
> > > So,
> > > > the implementation of Calcite cannot get all possible join reorder
> > > results
> > > > and it cannot combine with the current cost model to get more
> reasonable
> > > > join reorder results.
> > >
> > >
> > > It's ok to do it in Flink as the first step. It would be great to also
> > > contribute it to Calcite later if possible, it depends on you.
> > >
> > > yh z <zh...@gmail.com> 于2023年1月3日周二 15:27写道：
> > >
> > > > Hi Benchao,
> > > >
> > > > Thanks for your reply.
> > > >
> > > > Actually,  I mistakenly wrote the name "bushy join reorder" to "busy
> join
> > > > reorder". I'm sorry for the trouble bring to you. "Bushy join
> reorder"
> > > > means we can build a bushy join tree based on cost model, but now
> Flink
> > > can
> > > > only build a left-deep tree using Calcite LoptOptimizeJoinRule. I
> hope my
> > > > answers can help you solve the following questions:
> > > >
> > > > For question #1: The biggest advantage of this "bushy join reorder"
> > > > strategy over the default Flink left-deep tree strategy is that it
> can
> > > > retail all possible join reorder plans, and then select the optimal
> plan
> > > > according to the cost model. This means that the busy join reorder
> > > strategy
> > > > can be better combined with the current cost model to get more
> reasonable
> > > > join reorder results. We verified it on the TPC-DS benchmark, with
> the
> > > > spark plan as a reference, the new busy join reorder strategy can
> make
> > > more
> > > > TPC-DS query plans be adjusted to be consistent with the Spark plan,
> and
> > > > the execution time is signifcantly reduced.  As for optimization
> > > > latency, this is the problem to be solved by the parameters to be
> > > > introduced in this discussion. When there are many tables need to be
> > > > reordered, the optimization latency will increase greatly. But when
> the
> > > > table numbers less than the threshold, the latency is the same as the
> > > > LoptOptimizeJoinRule.
> > > >
> > > > For question #2: According to my research, many compute or database
> > > systems
> > > > have the "bushy join reorder" strategies based on dynamic
> programming.
> > > For
> > > > example, Spark and PostgresSql use the same strategy, and the
> threshold
> > > be
> > > > set to 12. Also, some papers, like [1] and [2], have also researched
> this
> > > > strategy, and [2] set the threshold to 14.
> > > >
> > > > For question #3: The implementation of Calcite
> MultiJoinOptimizeBushyRule
> > > > is very simple, and it will not store the intermediate results at
> all.
> > > So,
> > > > the implementation of Calcite cannot get all possible join reorder
> > > results
> > > > and it cannot combine with the current cost model to get more
> reasonable
> > > > join reorder results.
> > > >
> > > >
> > > > [1]
> > > >
> > > >
> > >
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> > > > [2] https://db.in.tum.de/~radke/papers/hugejoins.pdf
> > > >
> > > >
> > > >
> > > > Benchao Li <li...@apache.org> 于2023年1月3日周二 12:54写道：
> > > >
> > > > > Hi Yunhong,
> > > > >
> > > > > Thanks for driving this~
> > > > >
> > > > > I haven't gone deep into the implementation details yet. Regarding
> the
> > > > > general description, I would ask a few questions firstly:
> > > > >
> > > > > #1, Is there any benchmark results about the optimization latency
> > > change
> > > > > compared to current approach? In OLAP scenario, query optimization
> > > > latency
> > > > > is more crucial.
> > > > >
> > > > > #2, About the term "busy join reorder", is there any others systems
> > > which
> > > > > also use this term? I know Calcite has a rule[1] which uses the
> term
> > > > "bushy
> > > > > join".
> > > > >
> > > > > #3, About the implementation, if this does the same work as Calcite
> > > > > MultiJoinOptimizeBushyRule, is it possible to use the Calcite
> version
> > > > > directly or extend it in some way?
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > >
> https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78
> > > > >
> > > > > yh z <zh...@gmail.com> 于2022年12月29日周四 14:44写道：
> > > > >
> > > > > > Hi, devs,
> > > > > >
> > > > > > I'd like to start a discuss about adding an option called
> > > > > > "table.oprimizer.busy-join-reorder-threshold" for planner rule
> while
> > > we
> > > > > try
> > > > > > to introduce a new busy join reorder rule[1] into Flink.
> > > > > >
> > > > > > This join reorder rule is based on dynamic programing[2], which
> can
> > > > store
> > > > > > all possible intermediate results, and the cost model can be
> used to
> > > > > select
> > > > > > the optimal join reorder result. Compare with the existing Lopt
> join
> > > > > > reorder rule, the new rule can give more possible results and the
> > > > result
> > > > > > can be more accurate. However, the search space of this rule will
> > > > become
> > > > > > very large as the number of tables increases. So we should
> introduce
> > > an
> > > > > > option to limit the expansion of search space, if the number of
> table
> > > > can
> > > > > > be reordered less than the threshold, the new busy join reorder
> rule
> > > is
> > > > > > used. On the contrary, the Lopt rule is used.
> > > > > >
> > > > > > The default threshold intended to be set to 12. One reason is
> that in
> > > > the
> > > > > > tpc-ds benchmark test, when the number of tables exceeds 12, the
> > > > > > optimization time will be very long. The other reason is that it
> > > refers
> > > > > to
> > > > > > relevant engines, like Spark, whose recommended setting is 12.[3]
> > > > > >
> > > > > > Looking forward to your feedback.
> > > > > >
> > > > > > [1]  https://issues.apache.org/jira/browse/FLINK-30376
> > > > > > [2]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> > > > > > [3]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
> > > > > >
> > > > > > Best regards,
> > > > > > Yunhong Zheng
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Best,
> > > > > Benchao Li
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Best,
> > > Benchao Li
> > >
>

Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Posted by godfrey he <go...@gmail.com>.

Hi Yunhong,

Thanks for driving this discuss!

This option looks good to me,
and looking forward to contributing this rule back to Apache Calcite.

Best,
Godfrey



yh z <zh...@gmail.com> 于2023年1月5日周四 15:32写道：
>
> Hi Benchao,
>
> Thanks for your reply.
>
> Since our existing test results are based on multiple performance
> optimization points on the TPC-DS benchmark[1][2], we haven't separately
> tested the performance improvement brought by new bushy join reorder
> rule. I will complete this test recently and update the results to this
> email.
>
> I am very happy to contribute to Calcite. Later, I will push the PR of the
> bushy join reorder rule to Calcite.
>
> [1] https://issues.apache.org/jira/browse/FLINK-27583
> [2] https://issues.apache.org/jira/browse/FLINK-29942
>
> Best regards,
> Yunhong Zheng
>
> Benchao Li <li...@apache.org> 于2023年1月4日周三 19:03写道：
>
> > Hi Yunhong,
> >
> > Thanks for the updating. And introducing the new bushy join reorder
> > algorithm would be great. And I also agree with the newly added config
> > option "table.optimizer.bushy-join-reorder-threshold" and 12 as the default
> > value.
> >
> >
> > > As for optimization
> > > latency, this is the problem to be solved by the parameters to be
> > > introduced in this discussion. When there are many tables need to be
> > > reordered, the optimization latency will increase greatly. But when the
> > > table numbers less than the threshold, the latency is the same as the
> > > LoptOptimizeJoinRule.
> >
> >
> > This sounds great. If possible, could you share more numbers to us? E.g.,
> > what's the latency of optimization when there are 11/12 tables for both
> > approach?
> >
> >  For question #3: The implementation of Calcite MultiJoinOptimizeBushyRule
> > > is very simple, and it will not store the intermediate results at all.
> > So,
> > > the implementation of Calcite cannot get all possible join reorder
> > results
> > > and it cannot combine with the current cost model to get more reasonable
> > > join reorder results.
> >
> >
> > It's ok to do it in Flink as the first step. It would be great to also
> > contribute it to Calcite later if possible, it depends on you.
> >
> > yh z <zh...@gmail.com> 于2023年1月3日周二 15:27写道：
> >
> > > Hi Benchao,
> > >
> > > Thanks for your reply.
> > >
> > > Actually,  I mistakenly wrote the name "bushy join reorder" to "busy join
> > > reorder". I'm sorry for the trouble bring to you. "Bushy join reorder"
> > > means we can build a bushy join tree based on cost model, but now Flink
> > can
> > > only build a left-deep tree using Calcite LoptOptimizeJoinRule. I hope my
> > > answers can help you solve the following questions:
> > >
> > > For question #1: The biggest advantage of this "bushy join reorder"
> > > strategy over the default Flink left-deep tree strategy is that it can
> > > retail all possible join reorder plans, and then select the optimal plan
> > > according to the cost model. This means that the busy join reorder
> > strategy
> > > can be better combined with the current cost model to get more reasonable
> > > join reorder results. We verified it on the TPC-DS benchmark, with the
> > > spark plan as a reference, the new busy join reorder strategy can make
> > more
> > > TPC-DS query plans be adjusted to be consistent with the Spark plan, and
> > > the execution time is signifcantly reduced.  As for optimization
> > > latency, this is the problem to be solved by the parameters to be
> > > introduced in this discussion. When there are many tables need to be
> > > reordered, the optimization latency will increase greatly. But when the
> > > table numbers less than the threshold, the latency is the same as the
> > > LoptOptimizeJoinRule.
> > >
> > > For question #2: According to my research, many compute or database
> > systems
> > > have the "bushy join reorder" strategies based on dynamic programming.
> > For
> > > example, Spark and PostgresSql use the same strategy, and the threshold
> > be
> > > set to 12. Also, some papers, like [1] and [2], have also researched this
> > > strategy, and [2] set the threshold to 14.
> > >
> > > For question #3: The implementation of Calcite MultiJoinOptimizeBushyRule
> > > is very simple, and it will not store the intermediate results at all.
> > So,
> > > the implementation of Calcite cannot get all possible join reorder
> > results
> > > and it cannot combine with the current cost model to get more reasonable
> > > join reorder results.
> > >
> > >
> > > [1]
> > >
> > >
> > https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> > > [2] https://db.in.tum.de/~radke/papers/hugejoins.pdf
> > >
> > >
> > >
> > > Benchao Li <li...@apache.org> 于2023年1月3日周二 12:54写道：
> > >
> > > > Hi Yunhong,
> > > >
> > > > Thanks for driving this~
> > > >
> > > > I haven't gone deep into the implementation details yet. Regarding the
> > > > general description, I would ask a few questions firstly:
> > > >
> > > > #1, Is there any benchmark results about the optimization latency
> > change
> > > > compared to current approach? In OLAP scenario, query optimization
> > > latency
> > > > is more crucial.
> > > >
> > > > #2, About the term "busy join reorder", is there any others systems
> > which
> > > > also use this term? I know Calcite has a rule[1] which uses the term
> > > "bushy
> > > > join".
> > > >
> > > > #3, About the implementation, if this does the same work as Calcite
> > > > MultiJoinOptimizeBushyRule, is it possible to use the Calcite version
> > > > directly or extend it in some way?
> > > >
> > > > [1]
> > > >
> > > >
> > >
> > https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78
> > > >
> > > > yh z <zh...@gmail.com> 于2022年12月29日周四 14:44写道：
> > > >
> > > > > Hi, devs,
> > > > >
> > > > > I'd like to start a discuss about adding an option called
> > > > > "table.oprimizer.busy-join-reorder-threshold" for planner rule while
> > we
> > > > try
> > > > > to introduce a new busy join reorder rule[1] into Flink.
> > > > >
> > > > > This join reorder rule is based on dynamic programing[2], which can
> > > store
> > > > > all possible intermediate results, and the cost model can be used to
> > > > select
> > > > > the optimal join reorder result. Compare with the existing Lopt join
> > > > > reorder rule, the new rule can give more possible results and the
> > > result
> > > > > can be more accurate. However, the search space of this rule will
> > > become
> > > > > very large as the number of tables increases. So we should introduce
> > an
> > > > > option to limit the expansion of search space, if the number of table
> > > can
> > > > > be reordered less than the threshold, the new busy join reorder rule
> > is
> > > > > used. On the contrary, the Lopt rule is used.
> > > > >
> > > > > The default threshold intended to be set to 12. One reason is that in
> > > the
> > > > > tpc-ds benchmark test, when the number of tables exceeds 12, the
> > > > > optimization time will be very long. The other reason is that it
> > refers
> > > > to
> > > > > relevant engines, like Spark, whose recommended setting is 12.[3]
> > > > >
> > > > > Looking forward to your feedback.
> > > > >
> > > > > [1]  https://issues.apache.org/jira/browse/FLINK-30376
> > > > > [2]
> > > > >
> > > > >
> > > >
> > >
> > https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> > > > > [3]
> > > > >
> > > > >
> > > >
> > >
> > https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
> > > > >
> > > > > Best regards,
> > > > > Yunhong Zheng
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Best,
> > > > Benchao Li
> > > >
> > >
> >
> >
> > --
> >
> > Best,
> > Benchao Li
> >

Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Posted by yh z <zh...@gmail.com>.

Hi Benchao,

Thanks for your reply.

Since our existing test results are based on multiple performance
optimization points on the TPC-DS benchmark[1][2], we haven't separately
tested the performance improvement brought by new bushy join reorder
rule. I will complete this test recently and update the results to this
email.

I am very happy to contribute to Calcite. Later, I will push the PR of the
bushy join reorder rule to Calcite.

[1] https://issues.apache.org/jira/browse/FLINK-27583
[2] https://issues.apache.org/jira/browse/FLINK-29942

Best regards,
Yunhong Zheng

Benchao Li <li...@apache.org> 于2023年1月4日周三 19:03写道：

> Hi Yunhong,
>
> Thanks for the updating. And introducing the new bushy join reorder
> algorithm would be great. And I also agree with the newly added config
> option "table.optimizer.bushy-join-reorder-threshold" and 12 as the default
> value.
>
>
> > As for optimization
> > latency, this is the problem to be solved by the parameters to be
> > introduced in this discussion. When there are many tables need to be
> > reordered, the optimization latency will increase greatly. But when the
> > table numbers less than the threshold, the latency is the same as the
> > LoptOptimizeJoinRule.
>
>
> This sounds great. If possible, could you share more numbers to us? E.g.,
> what's the latency of optimization when there are 11/12 tables for both
> approach?
>
>  For question #3: The implementation of Calcite MultiJoinOptimizeBushyRule
> > is very simple, and it will not store the intermediate results at all.
> So,
> > the implementation of Calcite cannot get all possible join reorder
> results
> > and it cannot combine with the current cost model to get more reasonable
> > join reorder results.
>
>
> It's ok to do it in Flink as the first step. It would be great to also
> contribute it to Calcite later if possible, it depends on you.
>
> yh z <zh...@gmail.com> 于2023年1月3日周二 15:27写道：
>
> > Hi Benchao,
> >
> > Thanks for your reply.
> >
> > Actually,  I mistakenly wrote the name "bushy join reorder" to "busy join
> > reorder". I'm sorry for the trouble bring to you. "Bushy join reorder"
> > means we can build a bushy join tree based on cost model, but now Flink
> can
> > only build a left-deep tree using Calcite LoptOptimizeJoinRule. I hope my
> > answers can help you solve the following questions:
> >
> > For question #1: The biggest advantage of this "bushy join reorder"
> > strategy over the default Flink left-deep tree strategy is that it can
> > retail all possible join reorder plans, and then select the optimal plan
> > according to the cost model. This means that the busy join reorder
> strategy
> > can be better combined with the current cost model to get more reasonable
> > join reorder results. We verified it on the TPC-DS benchmark, with the
> > spark plan as a reference, the new busy join reorder strategy can make
> more
> > TPC-DS query plans be adjusted to be consistent with the Spark plan, and
> > the execution time is signifcantly reduced.  As for optimization
> > latency, this is the problem to be solved by the parameters to be
> > introduced in this discussion. When there are many tables need to be
> > reordered, the optimization latency will increase greatly. But when the
> > table numbers less than the threshold, the latency is the same as the
> > LoptOptimizeJoinRule.
> >
> > For question #2: According to my research, many compute or database
> systems
> > have the "bushy join reorder" strategies based on dynamic programming.
> For
> > example, Spark and PostgresSql use the same strategy, and the threshold
> be
> > set to 12. Also, some papers, like [1] and [2], have also researched this
> > strategy, and [2] set the threshold to 14.
> >
> > For question #3: The implementation of Calcite MultiJoinOptimizeBushyRule
> > is very simple, and it will not store the intermediate results at all.
> So,
> > the implementation of Calcite cannot get all possible join reorder
> results
> > and it cannot combine with the current cost model to get more reasonable
> > join reorder results.
> >
> >
> > [1]
> >
> >
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> > [2] https://db.in.tum.de/~radke/papers/hugejoins.pdf
> >
> >
> >
> > Benchao Li <li...@apache.org> 于2023年1月3日周二 12:54写道：
> >
> > > Hi Yunhong,
> > >
> > > Thanks for driving this~
> > >
> > > I haven't gone deep into the implementation details yet. Regarding the
> > > general description, I would ask a few questions firstly:
> > >
> > > #1, Is there any benchmark results about the optimization latency
> change
> > > compared to current approach? In OLAP scenario, query optimization
> > latency
> > > is more crucial.
> > >
> > > #2, About the term "busy join reorder", is there any others systems
> which
> > > also use this term? I know Calcite has a rule[1] which uses the term
> > "bushy
> > > join".
> > >
> > > #3, About the implementation, if this does the same work as Calcite
> > > MultiJoinOptimizeBushyRule, is it possible to use the Calcite version
> > > directly or extend it in some way?
> > >
> > > [1]
> > >
> > >
> >
> https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78
> > >
> > > yh z <zh...@gmail.com> 于2022年12月29日周四 14:44写道：
> > >
> > > > Hi, devs,
> > > >
> > > > I'd like to start a discuss about adding an option called
> > > > "table.oprimizer.busy-join-reorder-threshold" for planner rule while
> we
> > > try
> > > > to introduce a new busy join reorder rule[1] into Flink.
> > > >
> > > > This join reorder rule is based on dynamic programing[2], which can
> > store
> > > > all possible intermediate results, and the cost model can be used to
> > > select
> > > > the optimal join reorder result. Compare with the existing Lopt join
> > > > reorder rule, the new rule can give more possible results and the
> > result
> > > > can be more accurate. However, the search space of this rule will
> > become
> > > > very large as the number of tables increases. So we should introduce
> an
> > > > option to limit the expansion of search space, if the number of table
> > can
> > > > be reordered less than the threshold, the new busy join reorder rule
> is
> > > > used. On the contrary, the Lopt rule is used.
> > > >
> > > > The default threshold intended to be set to 12. One reason is that in
> > the
> > > > tpc-ds benchmark test, when the number of tables exceeds 12, the
> > > > optimization time will be very long. The other reason is that it
> refers
> > > to
> > > > relevant engines, like Spark, whose recommended setting is 12.[3]
> > > >
> > > > Looking forward to your feedback.
> > > >
> > > > [1]  https://issues.apache.org/jira/browse/FLINK-30376
> > > > [2]
> > > >
> > > >
> > >
> >
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> > > > [3]
> > > >
> > > >
> > >
> >
> https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
> > > >
> > > > Best regards,
> > > > Yunhong Zheng
> > > >
> > >
> > >
> > > --
> > >
> > > Best,
> > > Benchao Li
> > >
> >
>
>
> --
>
> Best,
> Benchao Li
>

Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Posted by Benchao Li <li...@apache.org>.

Hi Yunhong,

Thanks for the updating. And introducing the new bushy join reorder
algorithm would be great. And I also agree with the newly added config
option "table.optimizer.bushy-join-reorder-threshold" and 12 as the default
value.


> As for optimization
> latency, this is the problem to be solved by the parameters to be
> introduced in this discussion. When there are many tables need to be
> reordered, the optimization latency will increase greatly. But when the
> table numbers less than the threshold, the latency is the same as the
> LoptOptimizeJoinRule.


This sounds great. If possible, could you share more numbers to us? E.g.,
what's the latency of optimization when there are 11/12 tables for both
approach?

 For question #3: The implementation of Calcite MultiJoinOptimizeBushyRule
> is very simple, and it will not store the intermediate results at all. So,
> the implementation of Calcite cannot get all possible join reorder results
> and it cannot combine with the current cost model to get more reasonable
> join reorder results.


It's ok to do it in Flink as the first step. It would be great to also
contribute it to Calcite later if possible, it depends on you.

yh z <zh...@gmail.com> 于2023年1月3日周二 15:27写道：

> Hi Benchao,
>
> Thanks for your reply.
>
> Actually,  I mistakenly wrote the name "bushy join reorder" to "busy join
> reorder". I'm sorry for the trouble bring to you. "Bushy join reorder"
> means we can build a bushy join tree based on cost model, but now Flink can
> only build a left-deep tree using Calcite LoptOptimizeJoinRule. I hope my
> answers can help you solve the following questions:
>
> For question #1: The biggest advantage of this "bushy join reorder"
> strategy over the default Flink left-deep tree strategy is that it can
> retail all possible join reorder plans, and then select the optimal plan
> according to the cost model. This means that the busy join reorder strategy
> can be better combined with the current cost model to get more reasonable
> join reorder results. We verified it on the TPC-DS benchmark, with the
> spark plan as a reference, the new busy join reorder strategy can make more
> TPC-DS query plans be adjusted to be consistent with the Spark plan, and
> the execution time is signifcantly reduced.  As for optimization
> latency, this is the problem to be solved by the parameters to be
> introduced in this discussion. When there are many tables need to be
> reordered, the optimization latency will increase greatly. But when the
> table numbers less than the threshold, the latency is the same as the
> LoptOptimizeJoinRule.
>
> For question #2: According to my research, many compute or database systems
> have the "bushy join reorder" strategies based on dynamic programming. For
> example, Spark and PostgresSql use the same strategy, and the threshold be
> set to 12. Also, some papers, like [1] and [2], have also researched this
> strategy, and [2] set the threshold to 14.
>
> For question #3: The implementation of Calcite MultiJoinOptimizeBushyRule
> is very simple, and it will not store the intermediate results at all. So,
> the implementation of Calcite cannot get all possible join reorder results
> and it cannot combine with the current cost model to get more reasonable
> join reorder results.
>
>
> [1]
>
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> [2] https://db.in.tum.de/~radke/papers/hugejoins.pdf
>
>
>
> Benchao Li <li...@apache.org> 于2023年1月3日周二 12:54写道：
>
> > Hi Yunhong,
> >
> > Thanks for driving this~
> >
> > I haven't gone deep into the implementation details yet. Regarding the
> > general description, I would ask a few questions firstly:
> >
> > #1, Is there any benchmark results about the optimization latency change
> > compared to current approach? In OLAP scenario, query optimization
> latency
> > is more crucial.
> >
> > #2, About the term "busy join reorder", is there any others systems which
> > also use this term? I know Calcite has a rule[1] which uses the term
> "bushy
> > join".
> >
> > #3, About the implementation, if this does the same work as Calcite
> > MultiJoinOptimizeBushyRule, is it possible to use the Calcite version
> > directly or extend it in some way?
> >
> > [1]
> >
> >
> https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78
> >
> > yh z <zh...@gmail.com> 于2022年12月29日周四 14:44写道：
> >
> > > Hi, devs,
> > >
> > > I'd like to start a discuss about adding an option called
> > > "table.oprimizer.busy-join-reorder-threshold" for planner rule while we
> > try
> > > to introduce a new busy join reorder rule[1] into Flink.
> > >
> > > This join reorder rule is based on dynamic programing[2], which can
> store
> > > all possible intermediate results, and the cost model can be used to
> > select
> > > the optimal join reorder result. Compare with the existing Lopt join
> > > reorder rule, the new rule can give more possible results and the
> result
> > > can be more accurate. However, the search space of this rule will
> become
> > > very large as the number of tables increases. So we should introduce an
> > > option to limit the expansion of search space, if the number of table
> can
> > > be reordered less than the threshold, the new busy join reorder rule is
> > > used. On the contrary, the Lopt rule is used.
> > >
> > > The default threshold intended to be set to 12. One reason is that in
> the
> > > tpc-ds benchmark test, when the number of tables exceeds 12, the
> > > optimization time will be very long. The other reason is that it refers
> > to
> > > relevant engines, like Spark, whose recommended setting is 12.[3]
> > >
> > > Looking forward to your feedback.
> > >
> > > [1]  https://issues.apache.org/jira/browse/FLINK-30376
> > > [2]
> > >
> > >
> >
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> > > [3]
> > >
> > >
> >
> https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
> > >
> > > Best regards,
> > > Yunhong Zheng
> > >
> >
> >
> > --
> >
> > Best,
> > Benchao Li
> >
>


-- 

Best,
Benchao Li

Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Posted by yh z <zh...@gmail.com>.

Hi Benchao,

Thanks for your reply.

Actually,  I mistakenly wrote the name "bushy join reorder" to "busy join
reorder". I'm sorry for the trouble bring to you. "Bushy join reorder"
means we can build a bushy join tree based on cost model, but now Flink can
only build a left-deep tree using Calcite LoptOptimizeJoinRule. I hope my
answers can help you solve the following questions:

For question #1: The biggest advantage of this "bushy join reorder"
strategy over the default Flink left-deep tree strategy is that it can
retail all possible join reorder plans, and then select the optimal plan
according to the cost model. This means that the busy join reorder strategy
can be better combined with the current cost model to get more reasonable
join reorder results. We verified it on the TPC-DS benchmark, with the
spark plan as a reference, the new busy join reorder strategy can make more
TPC-DS query plans be adjusted to be consistent with the Spark plan, and
the execution time is signifcantly reduced.  As for optimization
latency, this is the problem to be solved by the parameters to be
introduced in this discussion. When there are many tables need to be
reordered, the optimization latency will increase greatly. But when the
table numbers less than the threshold, the latency is the same as the
LoptOptimizeJoinRule.

For question #2: According to my research, many compute or database systems
have the "bushy join reorder" strategies based on dynamic programming. For
example, Spark and PostgresSql use the same strategy, and the threshold be
set to 12. Also, some papers, like [1] and [2], have also researched this
strategy, and [2] set the threshold to 14.

For question #3: The implementation of Calcite MultiJoinOptimizeBushyRule
is very simple, and it will not store the intermediate results at all. So,
the implementation of Calcite cannot get all possible join reorder results
and it cannot combine with the current cost model to get more reasonable
join reorder results.


[1]
https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
[2] https://db.in.tum.de/~radke/papers/hugejoins.pdf



Benchao Li <li...@apache.org> 于2023年1月3日周二 12:54写道：

> Hi Yunhong,
>
> Thanks for driving this~
>
> I haven't gone deep into the implementation details yet. Regarding the
> general description, I would ask a few questions firstly:
>
> #1, Is there any benchmark results about the optimization latency change
> compared to current approach? In OLAP scenario, query optimization latency
> is more crucial.
>
> #2, About the term "busy join reorder", is there any others systems which
> also use this term? I know Calcite has a rule[1] which uses the term "bushy
> join".
>
> #3, About the implementation, if this does the same work as Calcite
> MultiJoinOptimizeBushyRule, is it possible to use the Calcite version
> directly or extend it in some way?
>
> [1]
>
> https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78
>
> yh z <zh...@gmail.com> 于2022年12月29日周四 14:44写道：
>
> > Hi, devs,
> >
> > I'd like to start a discuss about adding an option called
> > "table.oprimizer.busy-join-reorder-threshold" for planner rule while we
> try
> > to introduce a new busy join reorder rule[1] into Flink.
> >
> > This join reorder rule is based on dynamic programing[2], which can store
> > all possible intermediate results, and the cost model can be used to
> select
> > the optimal join reorder result. Compare with the existing Lopt join
> > reorder rule, the new rule can give more possible results and the result
> > can be more accurate. However, the search space of this rule will become
> > very large as the number of tables increases. So we should introduce an
> > option to limit the expansion of search space, if the number of table can
> > be reordered less than the threshold, the new busy join reorder rule is
> > used. On the contrary, the Lopt rule is used.
> >
> > The default threshold intended to be set to 12. One reason is that in the
> > tpc-ds benchmark test, when the number of tables exceeds 12, the
> > optimization time will be very long. The other reason is that it refers
> to
> > relevant engines, like Spark, whose recommended setting is 12.[3]
> >
> > Looking forward to your feedback.
> >
> > [1]  https://issues.apache.org/jira/browse/FLINK-30376
> > [2]
> >
> >
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> > [3]
> >
> >
> https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
> >
> > Best regards,
> > Yunhong Zheng
> >
>
>
> --
>
> Best,
> Benchao Li
>

Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Posted by Benchao Li <li...@apache.org>.

Hi Yunhong,

Thanks for driving this~

I haven't gone deep into the implementation details yet. Regarding the
general description, I would ask a few questions firstly:

#1, Is there any benchmark results about the optimization latency change
compared to current approach? In OLAP scenario, query optimization latency
is more crucial.

#2, About the term "busy join reorder", is there any others systems which
also use this term? I know Calcite has a rule[1] which uses the term "bushy
join".

#3, About the implementation, if this does the same work as Calcite
MultiJoinOptimizeBushyRule, is it possible to use the Calcite version
directly or extend it in some way?

[1]
https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78

yh z <zh...@gmail.com> 于2022年12月29日周四 14:44写道：

> Hi, devs,
>
> I'd like to start a discuss about adding an option called
> "table.oprimizer.busy-join-reorder-threshold" for planner rule while we try
> to introduce a new busy join reorder rule[1] into Flink.
>
> This join reorder rule is based on dynamic programing[2], which can store
> all possible intermediate results, and the cost model can be used to select
> the optimal join reorder result. Compare with the existing Lopt join
> reorder rule, the new rule can give more possible results and the result
> can be more accurate. However, the search space of this rule will become
> very large as the number of tables increases. So we should introduce an
> option to limit the expansion of search space, if the number of table can
> be reordered less than the threshold, the new busy join reorder rule is
> used. On the contrary, the Lopt rule is used.
>
> The default threshold intended to be set to 12. One reason is that in the
> tpc-ds benchmark test, when the number of tables exceeds 12, the
> optimization time will be very long. The other reason is that it refers to
> relevant engines, like Spark, whose recommended setting is 12.[3]
>
> Looking forward to your feedback.
>
> [1]  https://issues.apache.org/jira/browse/FLINK-30376
> [2]
>
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> [3]
>
> https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
>
> Best regards,
> Yunhong Zheng
>


-- 

Best,
Benchao Li