Posted to issues@spark.apache.org by "Jungtaek Lim (Jira)" <ji...@apache.org> on 2019/09/25 20:21:00 UTC

[jira] [Commented] (SPARK-29248) Pass in number of partitions to BuildWriter

    [ https://issues.apache.org/jira/browse/SPARK-29248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938041#comment-16938041 ] 

Jungtaek Lim commented on SPARK-29248:
--------------------------------------

SPARK-23889 would be the correct approach to address this: we need to provide information on "how" to repartition as well, not only the number of partitions. So maybe this can be closed as a duplicate, unless you think otherwise?

Please vote for SPARK-23889 and leave a comment there to show your interest. Thanks!

> Pass in number of partitions to BuildWriter
> -------------------------------------------
>
>                 Key: SPARK-29248
>                 URL: https://issues.apache.org/jira/browse/SPARK-29248
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Ximo Guanter
>            Priority: Major
>
> When implementing a ScanBuilder, we require the implementor to provide the schema of the data and the number of partitions.
> However, when someone is implementing WriteBuilder, we only pass them the schema, not the number of partitions. This is an asymmetrical developer experience. Passing the number of partitions to the WriteBuilder would enable data sources to provision their write targets before starting to write. For example, it could be used to provision a Kafka topic with a specific number of partitions.
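
To make the request in the issue description concrete, here is a minimal Java sketch of what a write builder that receives the partition count might look like. Everything in it is an assumption for illustration: SketchWriteBuilder, withSchema, withNumPartitions, buildProvisioningPlan and TopicWriteBuilder are hypothetical names, not Spark's actual interfaces, and the "provisioning" is only a string describing what a sink could do. It also does not cover the "how to repartition" information that SPARK-23889 is about.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WritePartitionsSketch {

    // Hypothetical stand-in for a write builder. This is NOT Spark's real interface.
    interface SketchWriteBuilder {
        SketchWriteBuilder withSchema(List<String> columnNames);

        // Hypothetical method: the extra piece of information the issue asks for.
        SketchWriteBuilder withNumPartitions(int numPartitions);

        // Describes the provisioning step a sink could run before the write starts.
        String buildProvisioningPlan();
    }

    // Hypothetical sink that sizes a topic to match the number of incoming partitions.
    static final class TopicWriteBuilder implements SketchWriteBuilder {
        private final String topic;
        private List<String> columns = new ArrayList<>();
        private int numPartitions = -1; // -1 means "not provided"

        TopicWriteBuilder(String topic) {
            this.topic = topic;
        }

        @Override
        public SketchWriteBuilder withSchema(List<String> columnNames) {
            this.columns = new ArrayList<>(columnNames);
            return this;
        }

        @Override
        public SketchWriteBuilder withNumPartitions(int numPartitions) {
            if (numPartitions <= 0) {
                throw new IllegalArgumentException("numPartitions must be positive");
            }
            this.numPartitions = numPartitions;
            return this;
        }

        @Override
        public String buildProvisioningPlan() {
            if (numPartitions < 0) {
                // Without the count, the sink cannot size its target up front.
                return "write to existing topic " + topic + " (partition count unknown)";
            }
            return "create topic " + topic + " with " + numPartitions
                    + " partitions, then write columns " + columns;
        }
    }

    public static void main(String[] args) {
        SketchWriteBuilder builder = new TopicWriteBuilder("events");
        String plan = builder
                .withSchema(Arrays.asList("key", "value"))
                .withNumPartitions(8)
                .buildProvisioningPlan();
        System.out.println(plan);
    }
}
```

In this sketch the sink can only size its target when the count is supplied; when it is not, it has to fall back to whatever already exists, which is the gap the issue describes.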



