Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/09/13 09:31:21 UTC

[jira] [Commented] (SPARK-17521) Error when I use sparkContext.makeRDD(Seq())

    [ https://issues.apache.org/jira/browse/SPARK-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15486773#comment-15486773 ] 

Sean Owen commented on SPARK-17521:
-----------------------------------

Wow, I admit I didn't even know this overload existed:

{code}
  def makeRDD[T: ClassTag](seq: Seq[(T, Seq[String])]): RDD[T] = withScope {
    assertNotStopped()
    val indexToPrefs = seq.zipWithIndex.map(t => (t._2, t._1._2)).toMap
    new ParallelCollectionRDD[T](this, seq.map(_._1), seq.size, indexToPrefs)
  }
{code}

This is the only place the 'location prefs' are used. It looks like this is a very old API: https://github.com/apache/spark/commit/c36ca10241991d46f2f1513b2c0c5e369d8b34f9

Anyway, your fix is correct and easy, sure. But I also wonder if we should deprecate makeRDD. The other version just calls parallelize anyway.
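
For reference, here is a rough, Spark-free sketch of what the proposed check on seq.size could look like. It is not the actual patch: the object and helper names (MakeRddSketch, requirePositiveSlices, currentSlices, guardedSlices) are made up for illustration, and clamping the slice count to at least 1 is only one assumed way to do the check.

```scala
object MakeRddSketch {
  // Stand-in (hypothetical) for the check that raises the reported error
  // when the requested slice count is below 1.
  def requirePositiveSlices(numSlices: Int): Int = {
    if (numSlices < 1) {
      throw new IllegalArgumentException("Positive number of slices required")
    }
    numSlices
  }

  // Behaviour in the overload above: one slice per element,
  // so an empty Seq asks for 0 slices.
  def currentSlices[T](seq: Seq[(T, Seq[String])]): Int = seq.size

  // Assumed guard: never ask for fewer than one slice.
  def guardedSlices[T](seq: Seq[(T, Seq[String])]): Int = math.max(seq.size, 1)

  // Same index -> location preferences map that the overload builds.
  def indexToPrefs[T](seq: Seq[(T, Seq[String])]): Map[Int, Seq[String]] =
    seq.zipWithIndex.map { case ((_, prefs), i) => (i, prefs) }.toMap

  def main(args: Array[String]): Unit = {
    val empty = Seq.empty[(Int, Seq[String])]
    val failed = scala.util.Try(requirePositiveSlices(currentSlices(empty))).isFailure
    println(s"unguarded empty Seq fails: $failed")
    println(s"guarded slice count: ${requirePositiveSlices(guardedSlices(empty))}")
  }
}
```

With a guard like this, an empty Seq would end up as a single empty partition instead of throwing, and non-empty input keeps one slice per element as before.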

> Error when I use sparkContext.makeRDD(Seq())
> --------------------------------------------
>
>                 Key: SPARK-17521
>                 URL: https://issues.apache.org/jira/browse/SPARK-17521
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.0.0
>            Reporter: WangJianfei
>            Priority: Minor
>              Labels: easyfix
>
> When I use sc.makeRDD as below:
> ```
>     val data3 = sc.makeRDD(Seq())
>     println(data3.partitions.length)
> ```
> I got an error:
> Exception in thread "main" java.lang.IllegalArgumentException: Positive number of slices required
> We can fix this bug by modifying the last line to add a check of seq.size:
> ```
>   def makeRDD[T: ClassTag](seq: Seq[(T, Seq[String])]): RDD[T] = withScope {
>     assertNotStopped()
>     val indexToPrefs = seq.zipWithIndex.map(t => (t._2, t._1._2)).toMap
>     new ParallelCollectionRDD[T](this, seq.map(_._1), seq.size, indexToPrefs)
>   }
> ```


