Posted to issues@spark.apache.org by "Pat McDonough (JIRA)" <ji...@apache.org> on 2014/09/18 02:45:34 UTC

[jira] [Created] (SPARK-3580) Add Consistent Method For Number of RDD Partitions Across Different Languages

Pat McDonough created SPARK-3580:
------------------------------------

             Summary: Add Consistent Method For Number of RDD Partitions Across Different Languages
                 Key: SPARK-3580
                 URL: https://issues.apache.org/jira/browse/SPARK-3580
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, Spark Core
    Affects Versions: 1.1.0
            Reporter: Pat McDonough


Programmatically retrieving the number of partitions is not consistent between Python and Scala. A single, consistently named method should be defined and made public in both languages.

RDD.partitions.size is also used quite frequently throughout the internal code, so that might be worth refactoring as well once the new method is available.

What we have today is below.

In Scala:
{code}
scala> someRDD.partitions.size
res0: Int = 30
{code}

In Python:
{code}
In [2]: someRDD.getNumPartitions()
Out[2]: 30
{code}
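
As a rough illustration of the request, the sketch below shows callers going through one named accessor instead of reaching into the partitions array. It is not Spark code: StubRDD is a hypothetical stand-in so the example runs without a cluster, and num_partitions is an assumed helper name. Only getNumPartitions() and the partitions attribute mirror the calls shown above.

```python
class StubRDD:
    """Hypothetical stand-in for an RDD; not part of Spark."""

    def __init__(self, partitions):
        # Mirrors the partitions array that Scala callers read today.
        self.partitions = list(partitions)

    def getNumPartitions(self):
        # The single public accessor this issue asks for in both languages.
        return len(self.partitions)


def num_partitions(rdd):
    """Return the partition count through the one shared accessor."""
    return rdd.getNumPartitions()


some_rdd = StubRDD(range(30))
print(num_partitions(some_rdd))
```

If both language APIs exposed the same method name, internal call sites that currently use RDD.partitions.size could be migrated to it in the same way.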



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
