Posted to issues@flink.apache.org by fhueske <gi...@git.apache.org> on 2016/03/15 23:07:57 UTC

[GitHub] flink pull request: [FLINK-3503] [tableAPI] Add cost model for Dat...

GitHub user fhueske opened a pull request:

    https://github.com/apache/flink/pull/1798

    [FLINK-3503] [tableAPI] Add cost model for DataSet RelNodes to improve plan selection

    Calcite's default cost model does not take IO and CPU into account. As a consequence, projections are not pushed down (as described in FLINK-3503).
    
    This PR adds a basic cost model to push filters and projections towards the sources.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fhueske/flink tableRules

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1798.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1798
    
----
commit 257b6b0162c4f426042bbbb6f9bb4812dc8a4e81
Author: Fabian Hueske <fh...@apache.org>
Date:   2016-03-14T18:57:53Z

    [FLINK-3503] [tableAPI] Add cost model for DataSet RelNodes to improve plan selection.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3503] [tableAPI] Add cost model for Dat...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on the pull request:

    https://github.com/apache/flink/pull/1798#issuecomment-197357543
  
    merged :))



[GitHub] flink pull request: [FLINK-3503] [tableAPI] Add cost model for Dat...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske closed the pull request at:

    https://github.com/apache/flink/pull/1798



[GitHub] flink pull request: [FLINK-3503] [tableAPI] Add cost model for Dat...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1798#discussion_r56309062
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/dataset/DataSetCalc.scala ---
    @@ -66,8 +67,28 @@ class DataSetCalc(
         super.explainTerms(pw).item("name", opName)
       }
     
    +  override def computeSelfCost (planner: RelOptPlanner): RelOptCost = {
    +
    +    val child = this.getInput
    +    val rowCnt = RelMetadataQuery.getRowCount(child)
    +    val exprCnt = calcProgram.getExprCount
    +    planner.getCostFactory.makeCost(rowCnt, rowCnt * exprCnt, 0)
    --- End diff --
    
    Why is the IO cost 0?



[GitHub] flink pull request: [FLINK-3503] [tableAPI] Add cost model for Dat...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1798#discussion_r56312918
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/dataset/DataSetCalc.scala ---
    @@ -66,8 +67,28 @@ class DataSetCalc(
         super.explainTerms(pw).item("name", opName)
       }
     
    +  override def computeSelfCost (planner: RelOptPlanner): RelOptCost = {
    +
    +    val child = this.getInput
    +    val rowCnt = RelMetadataQuery.getRowCount(child)
    +    val exprCnt = calcProgram.getExprCount
    +    planner.getCostFactory.makeCost(rowCnt, rowCnt * exprCnt, 0)
    --- End diff --
    
    Calc is implemented as a `MapFunction`, so it never spills to disk or causes network traffic.
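
To make the cost triple in the diff concrete, here is a minimal standalone sketch of the same arithmetic. It does not use Calcite: `Cost`, `CalcCostSketch`, and `calcSelfCost` are hypothetical names introduced only for illustration, standing in for `RelOptCost` and the `makeCost(rowCnt, rowCnt * exprCnt, 0)` call quoted above.

```scala
object CalcCostSketch {
  // Hypothetical stand-in for Calcite's RelOptCost: (row count, CPU, IO).
  final case class Cost(rows: Double, cpu: Double, io: Double)

  // Same formula as the quoted diff: CPU grows with rows * expressions,
  // and IO is fixed at 0 because a map-style operator neither spills
  // to disk nor ships data over the network.
  def calcSelfCost(rowCnt: Double, exprCnt: Int): Cost =
    Cost(rows = rowCnt, cpu = rowCnt * exprCnt, io = 0.0)

  def main(args: Array[String]): Unit = {
    // Illustrative inputs only; the row and expression counts are made up.
    val wide = calcSelfCost(rowCnt = 1000.0, exprCnt = 5)
    val narrow = calcSelfCost(rowCnt = 1000.0, exprCnt = 2)
    println(s"wide:   $wide")
    println(s"narrow: $narrow")
  }
}
```

Under this formula a Calc with fewer expressions, or one fed fewer rows, gets a lower CPU cost while its IO component stays at zero.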



[GitHub] flink pull request: [FLINK-3503] [tableAPI] Add cost model for Dat...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on the pull request:

    https://github.com/apache/flink/pull/1798#issuecomment-197321103
  
    Merging this one too.

