Posted to dev@flink.apache.org by "XuPingyong (JIRA)" <ji...@apache.org> on 2019/06/11 09:31:00 UTC

[jira] [Created] (FLINK-12801) Set parallelism for batch SQL

XuPingyong created FLINK-12801:
----------------------------------

             Summary: Set parallelism for batch SQL
                 Key: FLINK-12801
                 URL: https://issues.apache.org/jira/browse/FLINK-12801
             Project: Flink
          Issue Type: Task
          Components: Table SQL / Planner
            Reporter: XuPingyong


       DataStream users can set parallelism with SingleOutputStreamOperator#setParallelism and DataStreamSink#setParallelism. SQL users, however, cannot set parallelism on individual operators, and the JobGraphs compiled from SQL are usually complex.

       As a first step, we make parallelism for batch SQL settable through configuration. We introduce two resourceSetting modes:

       InferMode.NONE: Users set the parallelism of the source, the sink, and the other nodes separately.

       InferMode.ONLY_SOURCE: Same as InferMode.NONE, except that source parallelism can be inferred from the source row count.
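
       The difference between the two modes can be sketched as below. This is only an illustration, not the planner's code: the class, the method, and the parameters rowsPerTask and maxParallelism are hypothetical, and the ceiling-division formula is an assumption about how a row count might map to a parallelism.

```java
/** Illustrative only: every name here is hypothetical, not part of Flink's API. */
final class SourceParallelismSketch {

    enum InferMode { NONE, ONLY_SOURCE }

    /**
     * Picks a source parallelism.
     * NONE: always use the configured value.
     * ONLY_SOURCE: derive it from the estimated row count (assumed formula:
     * ceil(rowCount / rowsPerTask), clamped to [1, maxParallelism]).
     */
    static int sourceParallelism(InferMode mode, int configuredParallelism,
                                 long estimatedRowCount, long rowsPerTask,
                                 int maxParallelism) {
        if (rowsPerTask <= 0 || maxParallelism <= 0) {
            throw new IllegalArgumentException("rowsPerTask and maxParallelism must be positive");
        }
        // Fall back to the configured value when inference is off or no estimate exists.
        if (mode == InferMode.NONE || estimatedRowCount <= 0) {
            return configuredParallelism;
        }
        long needed = (estimatedRowCount + rowsPerTask - 1) / rowsPerTask; // ceiling division
        return (int) Math.max(1L, Math.min(needed, (long) maxParallelism));
    }

    public static void main(String[] args) {
        // Configured value wins in NONE mode.
        System.out.println(sourceParallelism(InferMode.NONE, 8, 5_000_000L, 1_000_000L, 100)); // 8
        // Five million rows at one million rows per task.
        System.out.println(sourceParallelism(InferMode.ONLY_SOURCE, 8, 5_000_000L, 1_000_000L, 100)); // 5
        // Very large inputs are capped at maxParallelism.
        System.out.println(sourceParallelism(InferMode.ONLY_SOURCE, 8, 1_000_000_000L, 1_000_000L, 100)); // 100
    }
}
```

       In both modes the sink and the remaining nodes would keep their separately configured values; only the source value changes under ONLY_SOURCE.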

       We also introduce ShuffleStage, which groups adjacent operators that have no data shuffle between them so that they share the same parallelism.
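
       One way to picture the ShuffleStage idea is as a grouping problem: operators connected by non-shuffle edges fall into the same stage, and each stage gets a single parallelism. The sketch below is a self-contained illustration using union-find. The class, its methods, and the policy of taking the largest requested parallelism within a stage are all assumptions, not the actual planner implementation.

```java
import java.util.Arrays;

/** Illustrative only: a hypothetical model of stage grouping, not Flink's ShuffleStage class. */
final class ShuffleStageSketch {

    // Union-find lookup with path halving.
    private static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    /**
     * Returns a stage id for each operator. forwardEdges holds {from, to} pairs
     * for edges WITHOUT a data shuffle; shuffle edges are simply left out.
     * The stage id is the smallest operator index in the stage.
     */
    static int[] stageOf(int operatorCount, int[][] forwardEdges) {
        int[] parent = new int[operatorCount];
        for (int i = 0; i < operatorCount; i++) {
            parent[i] = i;
        }
        for (int[] edge : forwardEdges) {
            int a = find(parent, edge[0]);
            int b = find(parent, edge[1]);
            if (a != b) {
                parent[Math.max(a, b)] = Math.min(a, b); // keep the smaller index as root
            }
        }
        int[] stage = new int[operatorCount];
        for (int i = 0; i < operatorCount; i++) {
            stage[i] = find(parent, i);
        }
        return stage;
    }

    /** Gives every operator the largest parallelism requested within its stage (assumed policy). */
    static int[] unifyParallelism(int[] stage, int[] requested) {
        int[] stageMax = new int[stage.length];
        for (int i = 0; i < stage.length; i++) {
            stageMax[stage[i]] = Math.max(stageMax[stage[i]], requested[i]);
        }
        int[] result = new int[stage.length];
        for (int i = 0; i < stage.length; i++) {
            result[i] = stageMax[stage[i]];
        }
        return result;
    }

    public static void main(String[] args) {
        // source(0) -> calc(1) ==shuffle==> agg(2) -> sink(3)
        int[] stage = stageOf(4, new int[][] {{0, 1}, {2, 3}});
        int[] parallelism = unifyParallelism(stage, new int[] {4, 1, 2, 1});
        System.out.println(Arrays.toString(stage));       // [0, 0, 2, 2]
        System.out.println(Arrays.toString(parallelism)); // [4, 4, 2, 2]
    }
}
```

       In this example the shuffle edge between calc and agg is the only place where parallelism is allowed to change, which matches the intent that operators inside one stage exchange data without a shuffle.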



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)