Posted to issues@flink.apache.org by "陈梓立 (JIRA)" <ji...@apache.org> on 2018/08/03 05:13:00 UTC

[jira] [Commented] (FLINK-10038) Parallel the creation of InputSplit if necessary

    [ https://issues.apache.org/jira/browse/FLINK-10038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16567780#comment-16567780 ] 

陈梓立 commented on FLINK-10038:
-----------------------------

After looking at InputSplit and InputFormat, I found that the interface for creating input splits is InputSplitSource#createInputSplits, whose implementations range from FileInputFormat to JDBCInputFormat and so on.

Since how input splits are created depends on the specific input source, the parallelization logic differs from one implementation to another, so I think we should implement it case by case, where possible and necessary.
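For comparison, here is a rough sketch of a more generic alternative: leave each implementation untouched and run independent createInputSplits calls concurrently from the caller. This is only an illustration under assumptions. SplitSource, namedSource and createAllSplits are hypothetical names and not Flink APIs; SplitSource is a simplified stand-in for InputSplitSource, and the splits are plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelSplitCreation {

    /** Hypothetical, simplified stand-in for InputSplitSource (not the real Flink interface). */
    interface SplitSource {
        List<String> createInputSplits(int minNumSplits);
    }

    /** Toy source for illustration: its "splits" are just labelled strings. */
    static SplitSource namedSource(String name) {
        return minNumSplits -> {
            List<String> splits = new ArrayList<>();
            for (int i = 0; i < minNumSplits; i++) {
                splits.add(name + "-" + i);
            }
            return splits;
        };
    }

    /**
     * Calls createInputSplits on every source concurrently and returns the
     * results in the same order as the input list.
     */
    static List<List<String>> createAllSplits(
            List<SplitSource> sources, int minNumSplits, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per source; each source is still enumerated sequentially inside.
            List<CompletableFuture<List<String>>> futures = new ArrayList<>();
            for (SplitSource source : sources) {
                futures.add(CompletableFuture.supplyAsync(
                        () -> source.createInputSplits(minNumSplits), pool));
            }
            // join() blocks until each task finishes and rethrows failures unchecked.
            List<List<String>> result = new ArrayList<>();
            for (CompletableFuture<List<String>> future : futures) {
                result.add(future.join());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<SplitSource> sources = Arrays.asList(namedSource("files"), namedSource("jdbc"));
        System.out.println(createAllSplits(sources, 2, 4));
    }
}
```

This only helps when a job has several sources. Speeding up a single slow source (for example, a file format enumerating many files) would still need logic inside that implementation, which is the case-by-case approach described above.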

What are your opinions? Are there other interfaces involved in the creation of input splits? What do you think is the most elegant and effective way to parallelize this and benefit from it?

Looking forward to your comments.

> Parallel the creation of InputSplit if necessary
> ------------------------------------------------
>
>                 Key: FLINK-10038
>                 URL: https://issues.apache.org/jira/browse/FLINK-10038
>             Project: Flink
>          Issue Type: Improvement
>    Affects Versions: 1.5.0
>            Reporter: 陈梓立
>            Priority: Major
>
> This is a continuation of the discussion in the PR about parallelizing the creation of ExecutionJobVertex [here|https://github.com/apache/flink/pull/6353].
> [~StephanEwen] suggested that we could parallelize the creation of InputSplit, from which we could gain performance improvements.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)