Posted to issues@flink.apache.org by "Markus Holzemer (JIRA)" <ji...@apache.org> on 2014/06/30 11:59:25 UTC

[jira] [Assigned] (FLINK-795) Possibly extend the cost model of the optimizer

     [ https://issues.apache.org/jira/browse/FLINK-795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Holzemer reassigned FLINK-795:
-------------------------------------

    Assignee: Markus Holzemer

> Possibly extend the cost model of the optimizer
> -----------------------------------------------
>
>                 Key: FLINK-795
>                 URL: https://issues.apache.org/jira/browse/FLINK-795
>             Project: Flink
>          Issue Type: Bug
>            Reporter: GitHub Import
>            Assignee: Markus Holzemer
>              Labels: github-import
>             Fix For: pre-apache
>
>
> I have started the task to integrate the AbstractCachedBuildSideMatchDriver into the optimizer. The driver caches one side of the join and can thereby accelerate iterations if there are joins with static (non-changing) datasets inside the iteration.
> The current way of calculating the cost of operators inside iterations is basically to multiply it by the number of iterations. I would like to propose extending this so that the cost has one static part, which is counted only once for all iterations, and one dynamic part, which is multiplied by the number of iterations.
> In my opinion that would be the cleanest way to integrate the cached match: assign it a higher starting cost than the regular match and a cheaper dynamic part.
> Another approach would be to always use the cached match inside iterations. For that I would probably have to add a new RequestedLocalProperty that tells the optimizer whether the operator is used inside an iteration.
> A simple hacked solution could also be to replace all suitable regular matches inside an iteration with the cached alternative.
> What do you think is the best approach?
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/795
> Created by: [markus-h|https://github.com/markus-h]
> Labels: 
> Created at: Mon May 12 18:51:51 CEST 2014
> State: open
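
The static/dynamic cost split proposed in the issue above can be sketched roughly as follows. This is only an illustration, not the optimizer's actual code: the class IterationCostSketch, the nested Cost type, and all cost numbers are hypothetical and chosen only to show how a cached match with a higher one-time cost can become cheaper than a regular match once the iteration count grows.

```java
public class IterationCostSketch {

    /** Hypothetical cost record: a part paid once and a part paid per iteration. */
    static final class Cost {
        final double staticCost;   // counted once for all iterations (assumed name)
        final double dynamicCost;  // multiplied by the number of iterations (assumed name)

        Cost(double staticCost, double dynamicCost) {
            this.staticCost = staticCost;
            this.dynamicCost = dynamicCost;
        }

        /** Total cost under the proposed model: static + dynamic * iterations. */
        double total(int iterations) {
            return staticCost + dynamicCost * iterations;
        }
    }

    public static void main(String[] args) {
        // Made-up numbers: the regular match has no one-time part, which
        // reproduces the current "multiply by the number of iterations" model.
        Cost regularMatch = new Cost(0.0, 100.0);
        // The cached match pays more up front (building the cache) and less per iteration.
        Cost cachedMatch = new Cost(150.0, 40.0);

        for (int n : new int[] {1, 2, 3, 10}) {
            double regular = regularMatch.total(n);
            double cached = cachedMatch.total(n);
            String choice = cached < regular ? "cached match" : "regular match";
            System.out.println(n + " iterations: regular=" + regular
                    + ", cached=" + cached + " -> " + choice);
        }
    }
}
```

With these assumed numbers the regular match wins for one or two iterations and the cached match wins from three iterations on, which is the kind of trade-off the optimizer could decide on its own if costs carried both parts.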



--
This message was sent by Atlassian JIRA
(v6.2#6252)