You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Danny Chan <yu...@gmail.com> on 2020/02/14 06:07:01 UTC
[ DISCUSS ] Revert change: CALCITE-3713 Remove column names
from Project#digest
The change in CALCITE-3713 make a radical change to the plan digest, but it only brings in little promotion(at least from Calcite tests and Flink cases)
During the upgrade of Flink Calcite version to 1.22.0-SNAPSHOT, the 2000+ plan changes to due to this patch, it is hard for me to figure out the reason of the change: because of this patch, or because of other bug-fix/promotion.
Even though I change the plan verification to EXPLAIN_ATTRIBUTE, there are still 500+ plan diffs.
This makes the Flink Calcite upgrade almost impossible to go on and I’m burdened out of it.
So, PLEASE, if possible, can we REVERT this patch, 2000+ plan changes for no promotion(useless) (at lease for Flink).
I did show my options when discussing if the patch should be merged, me and Julian all gave -1 for that but the code still merged, sigh !
[1] https://ponymail-vm.apache.org/_GUI_/thread.html/54bf3ed733eb7e725ce3ea397334aad8f1323ead13e450b1753b1521%40%3Cdev.calcite.apache.org%3E
Best,
Danny Chan
Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names
from Project#digest
Posted by Danny Chan <yu...@gmail.com>.
> if you use explain_digest attributes, it means you expect it might
change due to implementation details.
That is what you just defined that, why a digest should be changed just because of the implementation.
Best,
Danny Chan
在 2020年2月14日 +0800 PM2:54,Vladimir Sitnikov <si...@gmail.com>,写道:
> The subject of the mail does not match the body.
>
> Please double check and ensure you mention a single change.
>
>
> PS if you use explain_digest attributes, it means you expect it might
> change due to implementation details.
>
> Vladimir
Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names from Project#digest
Posted by Danny Chan <yu...@gmail.com>.
Thanks Vladimir, too many config options is hard to maintain, so no need to
add that, i would use the explain_attributes and update the log files.
The explain would not change frequently, right?
Vladimir Sitnikov <si...@gmail.com>于2020年2月18日 周二上午12:45写道:
> > And state that it is going to be removed before say 1.24 (about 6 months
> from now).
>
> I'm ok with keeping the property working as long as it does not impact
> features/bugfixes development provided the property does not make the code
> hard to read.
>
> Vladimir
>
Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names from Project#digest
Posted by Vladimir Sitnikov <si...@gmail.com>.
> And state that it is going to be removed before say 1.24 (about 6 months
from now).
I'm ok with keeping the property working as long as it does not impact
features/bugfixes development provided the property does not make the code
hard to read.
Vladimir
Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names from Project#digest
Posted by Julian Hyde <jh...@gmail.com>.
I can imagine that changing a lot of log files is painful to the Flink project. There’s no reason to force them to do it right away.
I think a config parameter is fine. How about making it deprecated or experimental, to discourage others from using it? And state that it is going to be removed before say 1.24 (about 6 months from now).
Julian
> On Feb 17, 2020, at 12:06 AM, Vladimir Sitnikov <si...@gmail.com> wrote:
>
>
>>
>> adding an option
>
> I'm ok with adding an option provided:
> 1) It is "remove column names" by default
> 2) It is supported on a best effort basis. In other words, it is NOT to be
> used in production systems on a day-to-day basis
> 3) We do not add an extensive test suite for that property (see #1, #2)
> 4) We do not mean the option to be a "public API". In other words, it is
> there, it pretends to be working, however, it should not impact the
> development (== we can drop it if the maintenance takes significant effort)
>
> That is I agree the property might be helpful to compare behaviors, however
> having too many options (e.g. an option for every bugfix) might complicate
> the code with doubtful gains.
>
> Vladimir
Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names from Project#digest
Posted by Vladimir Sitnikov <si...@gmail.com>.
>adding an option
I'm ok with adding an option provided:
1) It is "remove column names" by default
2) It is supported on a best effort basis. In other words, it is NOT to be
used in production systems on a day-to-day basis
3) We do not add an extensive test suite for that property (see #1, #2)
4) We do not mean the option to be a "public API". In other words, it is
there, it pretends to be working, however, it should not impact the
development (== we can drop it if the maintenance takes significant effort)
That is I agree the property might be helpful to compare behaviors, however
having too many options (e.g. an option for every bugfix) might complicate
the code with doubtful gains.
Vladimir
Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names from
Project#digest
Posted by Xiening Dai <xn...@gmail.com>.
I think CALCITE-3713 is overall a positive change. We have seen the number of rule applies are reduced in our test (not significant though). After fixing the performance issue, CALCITE-3713 is fine and IMO it should stay. Having say that, I understand the pain of updating test base line. So adding an option is reasonable too.
> On Feb 16, 2020, at 7:33 PM, Danny Chan <yu...@gmail.com> wrote:
>
> Hi, fellows, can we gave a solution here to solve the problem ? This is a blocking issue from the Apache Flink side, we rely heavily on the digest for plan tests.
>
> I have 2 solutions here:
>
>
> 1. Revert the change of CALCITE-3713
> 2. We add a config option to switch the feature off
>
> Either is okey for me, I just need a conclusion ~
>
>
> Best,
> Danny Chan
> 在 2020年2月14日 +0800 PM2:54,Vladimir Sitnikov <si...@gmail.com>,写道:
>> The subject of the mail does not match the body.
>>
>> Please double check and ensure you mention a single change.
>>
>>
>> PS if you use explain_digest attributes, it means you expect it might
>> change due to implementation details.
>>
>> Vladimir
Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names from Project#digest
Posted by Vladimir Sitnikov <si...@gmail.com>.
>it is hard for me to figure out the reason of the change
It is written in the JIRA description and in the commit message: it
improves planning performance by reducing the planning space.
For Calcite it reduces slow tests from 64 min to 40 min.
Before (3862sec):
https://github.com/apache/calcite/runs/382954125#step:4:794
After (2402sec):
https://github.com/apache/calcite/pull/1740/checks?check_run_id=382955763#step:4:369
I see you might be struggling with plan issues, however:
* The fix brings 30% improvement for Calcite tests
* It improves some of the plans where the old column was emp0 and the new
one is emp (see PR1740)
* You provide no clue on the nature of the failures you are facing
* You suggest making changes to Calcite based on the number of Flink
failures only
^^ That is sad
Vladimir
Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names
from Project#digest
Posted by Danny Chan <yu...@gmail.com>.
Hi, fellows, can we gave a solution here to solve the problem ? This is a blocking issue from the Apache Flink side, we rely heavily on the digest for plan tests.
I have 2 solutions here:
1. Revert the change of CALCITE-3713
2. We add a config option to switch the feature off
Either is okey for me, I just need a conclusion ~
Best,
Danny Chan
在 2020年2月14日 +0800 PM2:54,Vladimir Sitnikov <si...@gmail.com>,写道:
> The subject of the mail does not match the body.
>
> Please double check and ensure you mention a single change.
>
>
> PS if you use explain_digest attributes, it means you expect it might
> change due to implementation details.
>
> Vladimir
Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names from Project#digest
Posted by Vladimir Sitnikov <si...@gmail.com>.
The subject of the mail does not match the body.
Please double check and ensure you mention a single change.
PS if you use explain_digest attributes, it means you expect it might
change due to implementation details.
Vladimir