Posted to issues@flink.apache.org by "Jark Wu (Jira)" <ji...@apache.org> on 2020/07/30 02:11:00 UTC
[jira] [Commented] (FLINK-18508) Dynamic source supports statistics and parallelism report
[ https://issues.apache.org/jira/browse/FLINK-18508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167594#comment-17167594 ]
Jark Wu commented on FLINK-18508:
---------------------------------
Should we have a public discussion about the public API before opening a pull request?
> Dynamic source supports statistics and parallelism report
> ---------------------------------------------------------
>
> Key: FLINK-18508
> URL: https://issues.apache.org/jira/browse/FLINK-18508
> Project: Flink
> Issue Type: New Feature
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Assignee: Jingsong Lee
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Add SupportsStatisticsReport and SupportsParallelismReport to dynamic sources, so that a source can return information that helps the table optimizer.
> This information can be more accurate when it comes from the source rather than from the catalog:
> * First, the information is computed based on real data. For the iceberg / filesystem connectors, it can be calculated from the real files. Although it is tied to the physical/runtime layer, it is real and exact.
> * Second, for the iceberg / filesystem connectors, for example, the statistics can change greatly after filter and partition pushdown, because many files may have been filtered out.
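
To make the proposal above concrete, here is a minimal, self-contained Java sketch of how a file-based source might report statistics and parallelism after partition pushdown. It does not use any Flink classes. The interface names are taken from the issue description, but the method signatures, the DataFile and FileSource classes, and the "one task per file" rule are all assumptions made for illustration; the API that is eventually agreed on may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;

public class SourceReportSketch {

    /** Hypothetical ability interface: the source reports a row count to the planner. */
    public interface SupportsStatisticsReport {
        long reportRowCount();
    }

    /** Hypothetical ability interface: the source suggests a parallelism, if it has one. */
    public interface SupportsParallelismReport {
        OptionalInt reportParallelism();
    }

    /** Toy stand-in for a data file with a known partition and row count. */
    public static final class DataFile {
        final String partition;
        final long rowCount;

        public DataFile(String partition, long rowCount) {
            this.partition = partition;
            this.rowCount = rowCount;
        }
    }

    /** Toy file-based source that derives both reports from the files it will actually read. */
    public static final class FileSource
            implements SupportsStatisticsReport, SupportsParallelismReport {
        private final List<DataFile> files;

        public FileSource(List<DataFile> files) {
            this.files = new ArrayList<>(files);
        }

        /** Simulates partition pushdown: returns a source limited to one partition. */
        public FileSource withPartition(String partition) {
            List<DataFile> kept = new ArrayList<>();
            for (DataFile f : files) {
                if (f.partition.equals(partition)) {
                    kept.add(f);
                }
            }
            return new FileSource(kept);
        }

        @Override
        public long reportRowCount() {
            // Sum the row counts of the files that survived pushdown.
            long total = 0;
            for (DataFile f : files) {
                total += f.rowCount;
            }
            return total;
        }

        @Override
        public OptionalInt reportParallelism() {
            // Assumption for this sketch: one task per remaining file, no suggestion if empty.
            return files.isEmpty() ? OptionalInt.empty() : OptionalInt.of(files.size());
        }
    }

    public static void main(String[] args) {
        FileSource source = new FileSource(Arrays.asList(
                new DataFile("dt=2020-07-29", 100),
                new DataFile("dt=2020-07-29", 200),
                new DataFile("dt=2020-07-30", 300)));
        FileSource pruned = source.withPartition("dt=2020-07-30");

        // Before pushdown: prints "600 rows, parallelism 3"
        System.out.println(source.reportRowCount() + " rows, parallelism "
                + source.reportParallelism().getAsInt());
        // After pushdown: prints "300 rows, parallelism 1"
        System.out.println(pruned.reportRowCount() + " rows, parallelism "
                + pruned.reportParallelism().getAsInt());
    }
}
```

The point of the sketch is the second bullet: the pruned source reports statistics for the files it will really read, which a catalog-level estimate for the whole table would not reflect.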
--
This message was sent by Atlassian Jira
(v8.3.4#803005)