Posted to issues@flink.apache.org by "Kurt Young (JIRA)" <ji...@apache.org> on 2019/02/28 10:14:00 UTC
[jira] [Resolved] (FLINK-11714) Add cost model for both batch and streaming
[ https://issues.apache.org/jira/browse/FLINK-11714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kurt Young resolved FLINK-11714.
--------------------------------
Resolution: Implemented
Fix Version/s: 1.9.0
> Add cost model for both batch and streaming
> -------------------------------------------
>
> Key: FLINK-11714
> URL: https://issues.apache.org/jira/browse/FLINK-11714
> Project: Flink
> Issue Type: New Feature
> Components: API / Table SQL
> Reporter: godfrey he
> Assignee: godfrey he
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.9.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Calcite's default cost model contains only ROWS, IO and CPU, and it ignores IO and CPU when two costs are compared.
> There are two improvements:
> 1. Add NETWORK and MEMORY to represent distribution cost and memory usage.
> 2. The optimization goal is currently to use minimal resources, so the factors are compared in this order:
> (1). First compare CPU. Every operator uses CPU, so we consider it the most important factor.
> (2). Then compare MEMORY, NETWORK and IO as a single normalized value. Their relative order is hard to decide, so each is converted to CPU cost using its own ratio.
> (3). Finally compare ROWS. ROWS has already been counted when calculating the other factors,
> e.g. CPU of Sort = ROWS * log(ROWS) * number of sort keys, CPU of Filter = ROWS * condition cost on a row.
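
The comparison order described above could be sketched roughly as follows. This is only an illustration, not the class added by this issue: the name CostSketch, the three conversion ratios, and the use of the natural logarithm are all assumptions, since the description does not give the actual values.

```java
import java.util.Comparator;

// Hypothetical sketch of the comparison order in the description; not Flink's actual cost class.
final class CostSketch {
    // Assumed ratios for converting each factor to CPU units (the issue does not state them).
    static final double MEMORY_TO_CPU = 1.0;
    static final double NETWORK_TO_CPU = 4.0;
    static final double IO_TO_CPU = 2.0;

    final double rows, cpu, io, network, memory;

    CostSketch(double rows, double cpu, double io, double network, double memory) {
        this.rows = rows;
        this.cpu = cpu;
        this.io = io;
        this.network = network;
        this.memory = memory;
    }

    // MEMORY, NETWORK and IO folded into a single value expressed in CPU units.
    double normalized() {
        return memory * MEMORY_TO_CPU + network * NETWORK_TO_CPU + io * IO_TO_CPU;
    }

    // (1) CPU first, (2) then the normalized value, (3) finally ROWS.
    static final Comparator<CostSketch> ORDER =
        Comparator.<CostSketch>comparingDouble(c -> c.cpu)
            .thenComparingDouble(CostSketch::normalized)
            .thenComparingDouble(c -> c.rows);

    // CPU of Sort = ROWS * log(ROWS) * number of sort keys (natural log assumed).
    static double sortCpu(double rows, int sortKeys) {
        return rows * Math.log(Math.max(rows, 1.0)) * sortKeys;
    }

    // CPU of Filter = ROWS * cost of evaluating the condition on one row.
    static double filterCpu(double rows, double conditionCostPerRow) {
        return rows * conditionCostPerRow;
    }
}
```

With this ordering, a plan with lower CPU wins even if it produces more rows, and ROWS only breaks ties once CPU and the normalized value are equal.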
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)