Posted to issues@flink.apache.org by "Jingsong Lee (Jira)" <ji...@apache.org> on 2020/05/21 09:40:00 UTC
[jira] [Commented] (FLINK-17863) flink streaming sql read hive with lots small files need to control parallelism
[ https://issues.apache.org/jira/browse/FLINK-17863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113012#comment-17113012 ]
Jingsong Lee commented on FLINK-17863:
--------------------------------------
Hi [~yantianyu], there is an option, "table.exec.hive.infer-source-parallelism.max", that caps the parallelism inferred for the Hive source.
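As a rough illustration of what such a cap does, here is a minimal sketch in Python. It is a simplified model, not Flink's actual code: it assumes the inferred parallelism starts from the number of input splits (taken here as one split per file, matching the 19 files in the report) and is then limited by the configured maximum. The function name and parameters are hypothetical.

```python
def inferred_source_parallelism(num_splits: int, infer_max: int) -> int:
    """Simplified model of a capped, inferred source parallelism.

    num_splits: number of input splits (assumed: one per small file).
    infer_max:  value of the cap, e.g. the setting named in the comment above.
    Both names are hypothetical and exist only for this sketch.
    """
    if infer_max < 1:
        raise ValueError("infer_max must be at least 1")
    # Cap the split count at infer_max, and never go below 1.
    return max(1, min(num_splits, infer_max))


if __name__ == "__main__":
    # 19 small files with a cap of 2 gives a source parallelism of 2.
    print(inferred_source_parallelism(19, 2))
```

Under this model, setting the cap to the job's max parallelism (2 in the reported error) would keep the source from asking for 19 slots.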
> flink streaming sql read hive with lots small files need to control parallelism
> ----------------------------------------------------------------------------------
>
> Key: FLINK-17863
> URL: https://issues.apache.org/jira/browse/FLINK-17863
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Hive
> Affects Versions: 1.10.1
> Reporter: richt richt
> Priority: Major
>
> The table wy.cartest has 19 rows in 19 files,
> so when I query the table in *streaming* mode it requires 19 slots, and my cluster cannot allocate that many resources to the task.
> ----
> Caused by: org.apache.flink.runtime.JobException: Vertex Source: HiveTableSource(carid, time, num, var) TablePath: wy.cartest, PartitionPruned: false, PartitionNums: null -> SinkConversionToTuple2's parallelism (19) is higher than the max parallelism (2). Please lower the parallelism or increase the max parallelism.