Posted to issues@flink.apache.org by fhueske <gi...@git.apache.org> on 2016/03/15 23:03:42 UTC

[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

GitHub user fhueske opened a pull request:

    https://github.com/apache/flink/pull/1797

    [FLINK-3609] [tableAPI] Reorganize selection of optimization rules

    - Remove the join reordering rules to keep the user-specified join order (rules can be added once enough statistics are available).
    - Copy and fix a broken Calcite rule (rule will be fixed with Calcite 1.7).
    - Add two test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fhueske/flink tableRules2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1797.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1797
    
----
commit 049df4079a0a44e7300a8630946dd4cca86a7120
Author: Fabian Hueske <fh...@apache.org>
Date:   2016-03-14T13:30:47Z

    [FLINK-3609] [tableAPI] Reorganize selection of optimization rules

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1797#discussion_r56304469
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/rules/FlinkRuleSets.scala ---
    @@ -29,50 +29,45 @@ object FlinkRuleSets {
         */
       val DATASET_OPT_RULES: RuleSet = RuleSets.ofList(
     
    -    // filter rules
    +    // push a filter into a join
         FilterJoinRule.FILTER_ON_JOIN,
    +    // push filter into the children of a join
         FilterJoinRule.JOIN,
    -    FilterMergeRule.INSTANCE,
    -    FilterAggregateTransposeRule.INSTANCE,
    +    // push filter through an aggregation
    +    FlinkFilterAggregateTransposeRule.INSTANCE,
     
    -    // push and merge projection rules
    +    // aggregation and projection rules
         AggregateProjectMergeRule.INSTANCE,
    -    ProjectMergeRule.INSTANCE,
    +    AggregateProjectPullUpConstantsRule.INSTANCE,
    +    // push a projection past a filter or vice versa
         ProjectFilterTransposeRule.INSTANCE,
         FilterProjectTransposeRule.INSTANCE,
    -    AggregateProjectPullUpConstantsRule.INSTANCE,
    -    JoinPushExpressionsRule.INSTANCE,
    +    // push a projection to the children of a join
         ProjectJoinTransposeRule.INSTANCE,
    +    // remove identity project
         ProjectRemoveRule.INSTANCE,
    +    // reorder sort and projection
         SortProjectTransposeRule.INSTANCE,
         ProjectSortTransposeRule.INSTANCE,
     
    -    // merge and push unions rules
    -    // TODO: Add a rule to enforce binary unions
    +    // join rules
    +    JoinPushExpressionsRule.INSTANCE,
    +
    +    // remove union with only a single child
         UnionEliminatorRule.INSTANCE,
    -    FlinkJoinUnionTransposeRule.LEFT_UNION,
    -    FlinkJoinUnionTransposeRule.RIGHT_UNION,
    --- End diff --
    
    Why did you remove these 2?



[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1797#discussion_r56304292
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/rules/FlinkRuleSets.scala ---
    @@ -29,50 +29,45 @@ object FlinkRuleSets {
         */
       val DATASET_OPT_RULES: RuleSet = RuleSets.ofList(
     
    -    // filter rules
    +    // push a filter into a join
         FilterJoinRule.FILTER_ON_JOIN,
    +    // push filter into the children of a join
         FilterJoinRule.JOIN,
    -    FilterMergeRule.INSTANCE,
    -    FilterAggregateTransposeRule.INSTANCE,
    +    // push filter through an aggregation
    +    FlinkFilterAggregateTransposeRule.INSTANCE,
     
    -    // push and merge projection rules
    +    // aggregation and projection rules
         AggregateProjectMergeRule.INSTANCE,
    -    ProjectMergeRule.INSTANCE,
    --- End diff --
    
    And this one? Are we covered by the CalcMergeRule?



[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on the pull request:

    https://github.com/apache/flink/pull/1797#issuecomment-197229421
  
    I have some doubts about the filter/project merge rules and the `FlinkJoinUnionTransposeRule`. Otherwise looks good!



[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1797#discussion_r56330836
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/rules/FlinkRuleSets.scala ---
    @@ -29,50 +29,45 @@ object FlinkRuleSets {
         */
       val DATASET_OPT_RULES: RuleSet = RuleSets.ofList(
     
    -    // filter rules
    +    // push a filter into a join
         FilterJoinRule.FILTER_ON_JOIN,
    +    // push filter into the children of a join
         FilterJoinRule.JOIN,
    -    FilterMergeRule.INSTANCE,
    -    FilterAggregateTransposeRule.INSTANCE,
    +    // push filter through an aggregation
    +    FlinkFilterAggregateTransposeRule.INSTANCE,
     
    -    // push and merge projection rules
    +    // aggregation and projection rules
         AggregateProjectMergeRule.INSTANCE,
    -    ProjectMergeRule.INSTANCE,
    +    AggregateProjectPullUpConstantsRule.INSTANCE,
    +    // push a projection past a filter or vice versa
         ProjectFilterTransposeRule.INSTANCE,
         FilterProjectTransposeRule.INSTANCE,
    -    AggregateProjectPullUpConstantsRule.INSTANCE,
    -    JoinPushExpressionsRule.INSTANCE,
    +    // push a projection to the children of a join
         ProjectJoinTransposeRule.INSTANCE,
    +    // remove identity project
         ProjectRemoveRule.INSTANCE,
    +    // reorder sort and projection
         SortProjectTransposeRule.INSTANCE,
         ProjectSortTransposeRule.INSTANCE,
     
    -    // merge and push unions rules
    -    // TODO: Add a rule to enforce binary unions
    +    // join rules
    +    JoinPushExpressionsRule.INSTANCE,
    +
    +    // remove union with only a single child
         UnionEliminatorRule.INSTANCE,
    -    FlinkJoinUnionTransposeRule.LEFT_UNION,
    -    FlinkJoinUnionTransposeRule.RIGHT_UNION,
    --- End diff --
    
    True, will update the PR.



[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/1797#issuecomment-197340028
  
    Removed `FlinkJoinUnionTransposeRule` and updated the PR.



[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1797#discussion_r56304129
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/rules/FlinkRuleSets.scala ---
    @@ -29,50 +29,45 @@ object FlinkRuleSets {
         */
       val DATASET_OPT_RULES: RuleSet = RuleSets.ofList(
     
    -    // filter rules
    +    // push a filter into a join
         FilterJoinRule.FILTER_ON_JOIN,
    +    // push filter into the children of a join
         FilterJoinRule.JOIN,
    -    FilterMergeRule.INSTANCE,
    --- End diff --
    
    Why don't we need this one?



[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1797#discussion_r56329514
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/rules/FlinkRuleSets.scala ---
    @@ -29,50 +29,45 @@ object FlinkRuleSets {
         */
       val DATASET_OPT_RULES: RuleSet = RuleSets.ofList(
     
    -    // filter rules
    +    // push a filter into a join
         FilterJoinRule.FILTER_ON_JOIN,
    +    // push filter into the children of a join
         FilterJoinRule.JOIN,
    -    FilterMergeRule.INSTANCE,
    -    FilterAggregateTransposeRule.INSTANCE,
    +    // push filter through an aggregation
    +    FlinkFilterAggregateTransposeRule.INSTANCE,
     
    -    // push and merge projection rules
    +    // aggregation and projection rules
         AggregateProjectMergeRule.INSTANCE,
    -    ProjectMergeRule.INSTANCE,
    +    AggregateProjectPullUpConstantsRule.INSTANCE,
    +    // push a projection past a filter or vice versa
         ProjectFilterTransposeRule.INSTANCE,
         FilterProjectTransposeRule.INSTANCE,
    -    AggregateProjectPullUpConstantsRule.INSTANCE,
    -    JoinPushExpressionsRule.INSTANCE,
    +    // push a projection to the children of a join
         ProjectJoinTransposeRule.INSTANCE,
    +    // remove identity project
         ProjectRemoveRule.INSTANCE,
    +    // reorder sort and projection
         SortProjectTransposeRule.INSTANCE,
         ProjectSortTransposeRule.INSTANCE,
     
    -    // merge and push unions rules
    -    // TODO: Add a rule to enforce binary unions
    +    // join rules
    +    JoinPushExpressionsRule.INSTANCE,
    +
    +    // remove union with only a single child
         UnionEliminatorRule.INSTANCE,
    -    FlinkJoinUnionTransposeRule.LEFT_UNION,
    -    FlinkJoinUnionTransposeRule.RIGHT_UNION,
    --- End diff --
    
    These rules would push a join into a union. This means that a join is executed on each input of the union and the join results are unioned afterwards. The additional joins are likely to cause higher resource consumption. For example, if the unioned inputs are fed into the probe side of a join, the build side would need to be replicated for each of them.
    
    Unless we have very good statistics and a more fine-grained cost model, I would not try to optimize and trust the user (at least for the Table API).
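
To make the trade-off above concrete, here is a minimal sketch using plain Scala collections. It is not Flink or Calcite code: `hashJoin`, `unionThenJoin`, and `joinThenUnion` are hypothetical names, and the union is modelled as a bag union (`UNION ALL`). Both plans return the same rows, but the transposed plan builds the hash table for the build side once per union input.

```scala
// Illustrative sketch only: plain Scala collections, not Flink or Calcite APIs.
object JoinUnionTransposeSketch {
  type Row = (Int, String)
  type Joined = (Int, String, String)

  // Toy hash join on the first field: build a hash table from `build`,
  // then probe it with every row of `probe`.
  def hashJoin(probe: Seq[Row], build: Seq[Row]): Seq[Joined] = {
    val table: Map[Int, Seq[Row]] = build.groupBy(_._1)
    probe.flatMap { case (k, v) =>
      table.getOrElse(k, Seq.empty[Row]).map { case (_, w) => (k, v, w) }
    }
  }

  // Plan as written by the user: union first, then a single join.
  // Returns the rows and the number of hash tables built (one).
  def unionThenJoin(a: Seq[Row], b: Seq[Row], c: Seq[Row]): (Seq[Joined], Int) =
    (hashJoin(a ++ b, c), 1)

  // Plan after pushing the join into the union: one join per union input.
  // Returns the rows and the number of hash tables built (two).
  def joinThenUnion(a: Seq[Row], b: Seq[Row], c: Seq[Row]): (Seq[Joined], Int) =
    (hashJoin(a, c) ++ hashJoin(b, c), 2)

  def main(args: Array[String]): Unit = {
    val a = Seq((1, "a1"), (2, "a2"))
    val b = Seq((2, "b2"), (3, "b3"))
    val c = Seq((2, "c2"), (3, "c3"))
    val (rows1, builds1) = unionThenJoin(a, b, c)
    val (rows2, builds2) = joinThenUnion(a, b, c)
    println(s"same rows: ${rows1 == rows2}, hash tables built: $builds1 vs $builds2")
  }
}
```

Whether the extra build-side work pays off depends on input sizes and the distribution strategy, which is the kind of information a cost model would need statistics for.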



[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on the pull request:

    https://github.com/apache/flink/pull/1797#issuecomment-197342267
  
    Thanks, will merge.



[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1797#discussion_r56312442
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/rules/FlinkRuleSets.scala ---
    @@ -29,50 +29,45 @@ object FlinkRuleSets {
         */
       val DATASET_OPT_RULES: RuleSet = RuleSets.ofList(
     
    -    // filter rules
    +    // push a filter into a join
         FilterJoinRule.FILTER_ON_JOIN,
    +    // push filter into the children of a join
         FilterJoinRule.JOIN,
    -    FilterMergeRule.INSTANCE,
    -    FilterAggregateTransposeRule.INSTANCE,
    +    // push filter through an aggregation
    +    FlinkFilterAggregateTransposeRule.INSTANCE,
     
    -    // push and merge projection rules
    +    // aggregation and projection rules
         AggregateProjectMergeRule.INSTANCE,
    -    ProjectMergeRule.INSTANCE,
    --- End diff --
    
    Should be covered by the rules to merge Calcs.
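
As a rough illustration of why merging Calcs can subsume separate project-merge and filter-merge rules, the sketch below models a Calc as a filter condition plus a projection. This is a toy model, not Calcite's `Calc` or its merge rule; `Calc` and `merge` here are hypothetical names chosen for the example.

```scala
// Illustrative sketch only: a toy Calc (filter + projection), not Calcite's Calc.
object CalcMergeSketch {
  // A toy Calc keeps the rows that pass `condition`, then applies `project`.
  final case class Calc[A, B](condition: A => Boolean, project: A => B) {
    def apply(rows: Seq[A]): Seq[B] = rows.filter(condition).map(project)
  }

  // Merge two stacked Calcs (`bottom` runs first) into a single Calc.
  // The conditions are AND-ed and the projections are composed, so merging
  // two filters or two projections are both special cases of this step.
  def merge[A, B, C](bottom: Calc[A, B], top: Calc[B, C]): Calc[A, C] =
    Calc[A, C](
      a => bottom.condition(a) && top.condition(bottom.project(a)),
      a => top.project(bottom.project(a))
    )

  def main(args: Array[String]): Unit = {
    val bottom = Calc[Int, Int](_ > 0, _ * 2)          // keep x > 0, emit x * 2
    val top = Calc[Int, String](_ < 10, _.toString)    // keep y < 10, emit y as text
    val rows = Seq(-1, 1, 4, 7)
    println(top(bottom(rows)) == merge(bottom, top)(rows))
  }
}
```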



[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on the pull request:

    https://github.com/apache/flink/pull/1797#issuecomment-197357617
  
    and merged!



[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1797#discussion_r56329989
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/rules/FlinkRuleSets.scala ---
    @@ -29,50 +29,45 @@ object FlinkRuleSets {
         */
       val DATASET_OPT_RULES: RuleSet = RuleSets.ofList(
     
    -    // filter rules
    +    // push a filter into a join
         FilterJoinRule.FILTER_ON_JOIN,
    +    // push filter into the children of a join
         FilterJoinRule.JOIN,
    -    FilterMergeRule.INSTANCE,
    -    FilterAggregateTransposeRule.INSTANCE,
    +    // push filter through an aggregation
    +    FlinkFilterAggregateTransposeRule.INSTANCE,
     
    -    // push and merge projection rules
    +    // aggregation and projection rules
         AggregateProjectMergeRule.INSTANCE,
    -    ProjectMergeRule.INSTANCE,
    +    AggregateProjectPullUpConstantsRule.INSTANCE,
    +    // push a projection past a filter or vice versa
         ProjectFilterTransposeRule.INSTANCE,
         FilterProjectTransposeRule.INSTANCE,
    -    AggregateProjectPullUpConstantsRule.INSTANCE,
    -    JoinPushExpressionsRule.INSTANCE,
    +    // push a projection to the children of a join
         ProjectJoinTransposeRule.INSTANCE,
    +    // remove identity project
         ProjectRemoveRule.INSTANCE,
    +    // reorder sort and projection
         SortProjectTransposeRule.INSTANCE,
         ProjectSortTransposeRule.INSTANCE,
     
    -    // merge and push unions rules
    -    // TODO: Add a rule to enforce binary unions
    +    // join rules
    +    JoinPushExpressionsRule.INSTANCE,
    +
    +    // remove union with only a single child
         UnionEliminatorRule.INSTANCE,
    -    FlinkJoinUnionTransposeRule.LEFT_UNION,
    -    FlinkJoinUnionTransposeRule.RIGHT_UNION,
    --- End diff --
    
    Alright. Then `FlinkJoinUnionTransposeRule` can be removed since it's not used. I can do that before merging.



[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske closed the pull request at:

    https://github.com/apache/flink/pull/1797



[GitHub] flink pull request: [FLINK-3609] [tableAPI] Reorganize selection o...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1797#discussion_r56312432
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/rules/FlinkRuleSets.scala ---
    @@ -29,50 +29,45 @@ object FlinkRuleSets {
         */
       val DATASET_OPT_RULES: RuleSet = RuleSets.ofList(
     
    -    // filter rules
    +    // push a filter into a join
         FilterJoinRule.FILTER_ON_JOIN,
    +    // push filter into the children of a join
         FilterJoinRule.JOIN,
    -    FilterMergeRule.INSTANCE,
    --- End diff --
    
    Should be covered by the rules to merge Calcs.

