Posted to issues@flink.apache.org by "Jingsong Lee (Jira)" <ji...@apache.org> on 2020/05/21 09:40:00 UTC

[jira] [Commented] (FLINK-17863) flink streaming sql read hive with lots small files need to control parallelism

    [ https://issues.apache.org/jira/browse/FLINK-17863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113012#comment-17113012 ] 

Jingsong Lee commented on FLINK-17863:
--------------------------------------

Hi [~yantianyu], there is an option, "table.exec.hive.infer-source-parallelism.max", that controls the maximum inferred source parallelism.
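
To illustrate the effect on the case reported below, here is a minimal sketch. It assumes the Hive source infers its parallelism from the number of input splits (roughly one per small file) and caps it at the configured maximum. The class and method names are hypothetical and this is a simplified model, not the connector's actual code.

```java
// Simplified model of capped parallelism inference (an assumption, not Flink's implementation).
final class HiveSourceParallelismSketch {

    // Hypothetical helper: splitCount stands in for the number of input splits,
    // maxParallelism for the value of table.exec.hive.infer-source-parallelism.max.
    static int inferParallelism(int splitCount, int maxParallelism) {
        // Never exceed the cap, and never go below 1.
        return Math.max(1, Math.min(splitCount, maxParallelism));
    }

    public static void main(String[] args) {
        int files = 19; // the reported table has 19 small files
        int cap = 2;    // matches the max parallelism in the reported error
        System.out.println(inferParallelism(files, cap));
    }
}
```

Under this model, setting the option to 2 would keep the source within the max parallelism of 2 from the error message instead of requesting 19 slots.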

> flink streaming sql  read hive with lots small files  need to control  parallelism
> ----------------------------------------------------------------------------------
>
>                 Key: FLINK-17863
>                 URL: https://issues.apache.org/jira/browse/FLINK-17863
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Hive
>    Affects Versions: 1.10.1
>            Reporter: richt richt
>            Priority: Major
>
> The table wy.cartest has 19 rows stored in 19 files,
> so when I query it in *streaming* mode it requires 19 slots. My cluster cannot allocate that many resources to the task.
> ----
> Caused by: org.apache.flink.runtime.JobException: Vertex Source: HiveTableSource(carid, time, num, var) TablePath: wy.cartest, PartitionPruned: false, PartitionNums: null -> SinkConversionToTuple2's parallelism (19) is higher than the max parallelism (2). Please lower the parallelism or increase the max parallelism.


