Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/09/14 09:45:21 UTC

[jira] [Commented] (SPARK-17533) I think it's necessary to have an overloaded method of union in SparkContext

    [ https://issues.apache.org/jira/browse/SPARK-17533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15489971#comment-15489971 ] 

Sean Owen commented on SPARK-17533:
-----------------------------------

You can always repartition() the result. I'm not sure this API makes sense, because the point of a union is to make one RDD whose partitions are the partitions of all the underlying RDDs. Of course, you can subsequently do anything you want with the result including repartition it, but this API doesn't have an intrinsic need to expose that.
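To illustrate the suggestion above, here is a minimal sketch (assuming Spark 2.x running in local mode; the RDD contents and partition counts are made up for the example) of getting the proposed behavior with the existing API: union first, then control partitioning on the result with repartition() or, for key-value RDDs, partitionBy().

```scala
import org.apache.spark.HashPartitioner
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("union-then-repartition")
  .getOrCreate()
val sc = spark.sparkContext

// Two key-value RDDs with 2 and 3 partitions respectively.
val a = sc.parallelize(Seq(1 -> "a", 2 -> "b"), numSlices = 2)
val b = sc.parallelize(Seq(3 -> "c", 4 -> "d"), numSlices = 3)

// sc.union concatenates partitions: the result has 2 + 3 = 5 partitions.
val unioned = sc.union(Seq(a, b))

// Control the partition count after the fact...
val byCount = unioned.repartition(4)

// ...or apply a custom Partitioner to a key-value RDD with partitionBy.
val byPartitioner = unioned.partitionBy(new HashPartitioner(4))
```

This keeps union itself simple (its partitions are just the partitions of the inputs) while still letting the caller pick any partition count or Partitioner afterwards.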

> I think it's necessary to have an overloaded method of union in SparkContext
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-17533
>                 URL: https://issues.apache.org/jira/browse/SPARK-17533
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 2.0.0
>            Reporter: WangJianfei
>            Priority: Minor
>              Labels: features
>
> I think it's necessary to have an overloaded method of union in SparkContext
> so that the user can designate the number of partitions and the Partitioner.
> A function like this:
> ```
> def union[T: ClassTag](rdds: Seq[RDD[T]], numPartitions: Int, partitioner: Partitioner): RDD[T] = withScope {
>   // implementation to be discussed
> }
> ```
> We can discuss it here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org