Posted to dev@spark.apache.org by Shixiong Zhu <zs...@gmail.com> on 2014/11/06 12:12:37 UTC
About implicit rddToPairRDDFunctions
I saw many people ask how to convert an RDD to a PairRDDFunctions. I would
like to ask a question about it: why not put the following implicit into
"package object rdd" or "object rdd"?
implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
    (implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null) = {
  new PairRDDFunctions(rdd)
}
If so, the conversion will happen automatically, with no need to
import org.apache.spark.SparkContext._
I tried to search for earlier discussion of this but found nothing.
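To illustrate the pattern outside of Spark, here is a minimal self-contained sketch using a made-up `Box` class in place of RDD/PairRDDFunctions (the names are illustrative only): an implicit conversion placed in the type's companion object is part of its implicit scope, so it is found at the call site without any import.

```scala
// Toy stand-in for RDD/PairRDDFunctions; all names here are made up
// for illustration and are not actual Spark code.
class Box[T](val items: Seq[T])

object Box {
  // An implicit conversion in a type's companion object is part of that
  // type's implicit scope, so callers need no import to trigger it.
  implicit def boxToPairBoxFunctions[K, V](b: Box[(K, V)]): PairBoxFunctions[K, V] =
    new PairBoxFunctions(b)
}

class PairBoxFunctions[K, V](b: Box[(K, V)]) {
  def keysList: Seq[K] = b.items.map(_._1)
}

object Demo extends App {
  val b = new Box(Seq(("a", 1), ("b", 2)))
  // keysList is not a member of Box; the compiler finds the conversion
  // in object Box automatically.
  println(b.keysList) // prints List(a, b)
}
```

The same resolution rule applies to a package object of the package that declares the type, which is the other placement proposed above.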
Best Regards,
Shixiong Zhu
Re: About implicit rddToPairRDDFunctions
Posted by Shixiong Zhu <zs...@gmail.com>.
OK. I'll take it.
Best Regards,
Shixiong Zhu
2014-11-14 12:34 GMT+08:00 Reynold Xin <rx...@databricks.com>:
> That seems like a great idea. Can you submit a pull request?
>
>
> On Thu, Nov 13, 2014 at 7:13 PM, Shixiong Zhu <zs...@gmail.com> wrote:
>
>> If we put the `implicit` into "package object rdd" or "object rdd", when
>> we write `rdd.groupByKey()`, because rdd is an instance of RDD, the Scala
>> compiler will search `object rdd` (the companion object) and `package
>> object rdd` (the package object) by default. We don't need to import them
>> explicitly. Here is a post about the implicit search logic:
>> http://eed3si9n.com/revisiting-implicits-without-import-tax
>>
>> To maintain compatibility, we can keep `rddToPairRDDFunctions` in
>> SparkContext but remove the `implicit`. The disadvantage is that there
>> would be two copies of the same code.
>>
>> Best Regards,
>> Shixiong Zhu
>>
>> 2014-11-14 3:57 GMT+08:00 Reynold Xin <rx...@databricks.com>:
>>
>>> Do people usually import o.a.spark.rdd._ ?
>>>
>>> Also, in order to maintain source and binary compatibility, we would
>>> need to keep both, right?
>>>
>>>
>>> On Thu, Nov 6, 2014 at 3:12 AM, Shixiong Zhu <zs...@gmail.com> wrote:
>>>
>>>> I saw many people ask how to convert an RDD to a PairRDDFunctions. I
>>>> would like to ask a question about it: why not put the following
>>>> implicit into "package object rdd" or "object rdd"?
>>>>
>>>> implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
>>>>     (implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null) = {
>>>>   new PairRDDFunctions(rdd)
>>>> }
>>>>
>>>> If so, the conversion will happen automatically, with no need to
>>>> import org.apache.spark.SparkContext._
>>>>
>>>> I tried to search for earlier discussion of this but found nothing.
>>>>
>>>> Best Regards,
>>>> Shixiong Zhu
>>>>
>>>
>>>
>>
>
Re: About implicit rddToPairRDDFunctions
Posted by Reynold Xin <rx...@databricks.com>.
That seems like a great idea. Can you submit a pull request?
On Thu, Nov 13, 2014 at 7:13 PM, Shixiong Zhu <zs...@gmail.com> wrote:
> If we put the `implicit` into "package object rdd" or "object rdd", when
> we write `rdd.groupByKey()`, because rdd is an instance of RDD, the Scala
> compiler will search `object rdd` (the companion object) and `package
> object rdd` (the package object) by default. We don't need to import them
> explicitly. Here is a post about the implicit search logic:
> http://eed3si9n.com/revisiting-implicits-without-import-tax
>
> To maintain compatibility, we can keep `rddToPairRDDFunctions` in
> SparkContext but remove the `implicit`. The disadvantage is that there
> would be two copies of the same code.
>
> Best Regards,
> Shixiong Zhu
>
> 2014-11-14 3:57 GMT+08:00 Reynold Xin <rx...@databricks.com>:
>
>> Do people usually import o.a.spark.rdd._ ?
>>
>> Also, in order to maintain source and binary compatibility, we would
>> need to keep both, right?
>>
>>
>> On Thu, Nov 6, 2014 at 3:12 AM, Shixiong Zhu <zs...@gmail.com> wrote:
>>
>>> I saw many people ask how to convert an RDD to a PairRDDFunctions. I
>>> would like to ask a question about it: why not put the following
>>> implicit into "package object rdd" or "object rdd"?
>>>
>>> implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
>>>     (implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null) = {
>>>   new PairRDDFunctions(rdd)
>>> }
>>>
>>> If so, the conversion will happen automatically, with no need to
>>> import org.apache.spark.SparkContext._
>>>
>>> I tried to search for earlier discussion of this but found nothing.
>>>
>>> Best Regards,
>>> Shixiong Zhu
>>>
>>
>>
>
Re: About implicit rddToPairRDDFunctions
Posted by Shixiong Zhu <zs...@gmail.com>.
If we put the `implicit` into "package object rdd" or "object rdd", then when
we write `rdd.groupByKey()`, because rdd is an instance of RDD, the Scala
compiler will search `object rdd` (the companion object) and `package object
rdd` (the package object) by default. We don't need to import them
explicitly. Here is a post about the implicit search logic:
http://eed3si9n.com/revisiting-implicits-without-import-tax
To maintain compatibility, we can keep `rddToPairRDDFunctions` in
SparkContext but remove the `implicit`. The disadvantage is that there would
be two copies of the same code.
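As a self-contained sketch of that compatibility approach (the stand-in names BoxRDD, PairFns, and SparkContextLike are made up, not actual Spark source): the old conversion stays available as a plain method that existing code can call by name, while the implicit copy lives in the companion object where the compiler finds it without an import.

```scala
// Stand-in names are illustrative only, not actual Spark classes.
class BoxRDD[T](val items: Seq[T])

object BoxRDD {
  // The implicit copy: found via BoxRDD's implicit scope, no import needed.
  implicit def rddToPairFns[K, V](rdd: BoxRDD[(K, V)]): PairFns[K, V] =
    new PairFns(rdd)
}

object SparkContextLike {
  // Kept without `implicit` so existing code that calls it by name,
  // e.g. SparkContextLike.rddToPairFns(rdd), still compiles.
  def rddToPairFns[K, V](rdd: BoxRDD[(K, V)]): PairFns[K, V] =
    new PairFns(rdd)
}

class PairFns[K, V](rdd: BoxRDD[(K, V)]) {
  // A local, non-distributed groupByKey for demonstration purposes.
  def groupByKeyLocal: Map[K, Seq[V]] =
    rdd.items.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2)) }
}

object CompatDemo extends App {
  val rdd = new BoxRDD(Seq(("a", 1), ("a", 2), ("b", 3)))
  println(rdd.groupByKeyLocal)                                // via the implicit
  println(SparkContextLike.rddToPairFns(rdd).groupByKeyLocal) // explicit call
}
```

Both call sites produce the same grouping; the duplication between the two definitions is exactly the disadvantage mentioned above.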
Best Regards,
Shixiong Zhu
2014-11-14 3:57 GMT+08:00 Reynold Xin <rx...@databricks.com>:
> Do people usually import o.a.spark.rdd._ ?
>
> Also, in order to maintain source and binary compatibility, we would
> need to keep both, right?
>
>
> On Thu, Nov 6, 2014 at 3:12 AM, Shixiong Zhu <zs...@gmail.com> wrote:
>
>> I saw many people ask how to convert an RDD to a PairRDDFunctions. I
>> would like to ask a question about it: why not put the following implicit
>> into "package object rdd" or "object rdd"?
>>
>> implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
>>     (implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null) = {
>>   new PairRDDFunctions(rdd)
>> }
>>
>> If so, the conversion will happen automatically, with no need to
>> import org.apache.spark.SparkContext._
>>
>> I tried to search for earlier discussion of this but found nothing.
>>
>> Best Regards,
>> Shixiong Zhu
>>
>
>
Re: About implicit rddToPairRDDFunctions
Posted by Reynold Xin <rx...@databricks.com>.
Do people usually import o.a.spark.rdd._ ?
Also, in order to maintain source and binary compatibility, we would need to
keep both, right?
On Thu, Nov 6, 2014 at 3:12 AM, Shixiong Zhu <zs...@gmail.com> wrote:
> I saw many people ask how to convert an RDD to a PairRDDFunctions. I would
> like to ask a question about it: why not put the following implicit into
> "package object rdd" or "object rdd"?
>
> implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
>     (implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null) = {
>   new PairRDDFunctions(rdd)
> }
>
> If so, the conversion will happen automatically, with no need to
> import org.apache.spark.SparkContext._
>
> I tried to search for earlier discussion of this but found nothing.
>
> Best Regards,
> Shixiong Zhu
>