You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spark.apache.org by Hyukjin Kwon <gu...@gmail.com> on 2016/04/17 08:06:09 UTC

Question about Scala style, explicit typing within transformation functions and anonymous val.

Hi all,

First of all, I am sorry that this is relatively trivial and too minor but
I just want to be clear on this and careful for the more PRs in the future.

Recently, I have submitted a PR (https://github.com/apache/spark/pull/12413)
about Scala style and this was merged. In this PR, I changed

1.

from

.map(item => {
  ...
})

to

.map { item =>
  ...
}



2.
from

words.foreachRDD { (rdd: RDD[String], time: Time) => ...

to

words.foreachRDD { (rdd, time) => ...



3.

from

.map { x =>
  function(x)
}

to

.map(function(_))


My question is, I think it looks 2. and 3. are arguable (please see the
discussion in the PR).
I agree that I might not have to change those in the future but I just
wonder if I should revert 2. and 3..

FYI,
- The usage of 2. is pretty rare.
- 3. is pretty a lot. but the PR corrects ones like above only when the val
within closure looks obviously meaningless (such as x or a) and with only
single line.

I would appreciate that if you add some comments and opinions on this.

Thanks!

Re: Question about Scala style, explicit typing within transformation functions and anonymous val.

Posted by Koert Kuipers <ko...@tresata.com>.
i find version 3 without the _ also more readable

On Sun, Apr 17, 2016 at 3:02 AM, Mark Hamstra <ma...@clearstorydata.com>
wrote:

> I actually find my version of 3 more readable than the one with the `_`,
> which looks too much like a partially applied function.  It's a minor
> issue, though.
>
> On Sat, Apr 16, 2016 at 11:56 PM, Hyukjin Kwon <gu...@gmail.com>
> wrote:
>
>> Hi Mark,
>>
>> I know but that could harm readability. AFAIK, for this reason, that is
>> not (or rarely) used in Spark.
>>
>> 2016-04-17 15:54 GMT+09:00 Mark Hamstra <ma...@clearstorydata.com>:
>>
>>> FWIW, 3 should work as just `.map(function)`.
>>>
>>> On Sat, Apr 16, 2016 at 11:48 PM, Reynold Xin <rx...@databricks.com>
>>> wrote:
>>>
>>>> Hi Hyukjin,
>>>>
>>>> Thanks for asking.
>>>>
>>>> For 1 the change is almost always better.
>>>>
>>>> For 2 it depends on the context. In general if the type is not obvious,
>>>> it helps readability to explicitly declare them.
>>>>
>>>> For 3 again it depends on context.
>>>>
>>>>
>>>> So while it is a good idea to change 1 to reflect a more consistent
>>>> code base (and maybe we should codify it), it is almost always a bad idea
>>>> to change 2 and 3 just for the sake of changing them.
>>>>
>>>>
>>>>
>>>> On Sat, Apr 16, 2016 at 11:06 PM, Hyukjin Kwon <gu...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> First of all, I am sorry that this is relatively trivial and too minor
>>>>> but I just want to be clear on this and careful for the more PRs in the
>>>>> future.
>>>>>
>>>>> Recently, I have submitted a PR (
>>>>> https://github.com/apache/spark/pull/12413) about Scala style and
>>>>> this was merged. In this PR, I changed
>>>>>
>>>>> 1.
>>>>>
>>>>> from
>>>>>
>>>>> .map(item => {
>>>>>   ...
>>>>> })
>>>>>
>>>>> to
>>>>>
>>>>> .map { item =>
>>>>>   ...
>>>>> }
>>>>>
>>>>>
>>>>>
>>>>> 2.
>>>>> from
>>>>>
>>>>> words.foreachRDD { (rdd: RDD[String], time: Time) => ...
>>>>>
>>>>> to
>>>>>
>>>>> words.foreachRDD { (rdd, time) => ...
>>>>>
>>>>>
>>>>>
>>>>> 3.
>>>>>
>>>>> from
>>>>>
>>>>> .map { x =>
>>>>>   function(x)
>>>>> }
>>>>>
>>>>> to
>>>>>
>>>>> .map(function(_))
>>>>>
>>>>>
>>>>> My question is, I think it looks 2. and 3. are arguable (please see
>>>>> the discussion in the PR).
>>>>> I agree that I might not have to change those in the future but I just
>>>>> wonder if I should revert 2. and 3..
>>>>>
>>>>> FYI,
>>>>> - The usage of 2. is pretty rare.
>>>>> - 3. is pretty a lot. but the PR corrects ones like above only when
>>>>> the val within closure looks obviously meaningless (such as x or a) and
>>>>> with only single line.
>>>>>
>>>>> I would appreciate that if you add some comments and opinions on this.
>>>>>
>>>>> Thanks!
>>>>>
>>>>
>>>>
>>>
>>
>

Re: Question about Scala style, explicit typing within transformation functions and anonymous val.

Posted by Mark Hamstra <ma...@clearstorydata.com>.
I actually find my version of 3 more readable than the one with the `_`,
which looks too much like a partially applied function.  It's a minor
issue, though.

On Sat, Apr 16, 2016 at 11:56 PM, Hyukjin Kwon <gu...@gmail.com> wrote:

> Hi Mark,
>
> I know but that could harm readability. AFAIK, for this reason, that is
> not (or rarely) used in Spark.
>
> 2016-04-17 15:54 GMT+09:00 Mark Hamstra <ma...@clearstorydata.com>:
>
>> FWIW, 3 should work as just `.map(function)`.
>>
>> On Sat, Apr 16, 2016 at 11:48 PM, Reynold Xin <rx...@databricks.com>
>> wrote:
>>
>>> Hi Hyukjin,
>>>
>>> Thanks for asking.
>>>
>>> For 1 the change is almost always better.
>>>
>>> For 2 it depends on the context. In general if the type is not obvious,
>>> it helps readability to explicitly declare them.
>>>
>>> For 3 again it depends on context.
>>>
>>>
>>> So while it is a good idea to change 1 to reflect a more consistent code
>>> base (and maybe we should codify it), it is almost always a bad idea to
>>> change 2 and 3 just for the sake of changing them.
>>>
>>>
>>>
>>> On Sat, Apr 16, 2016 at 11:06 PM, Hyukjin Kwon <gu...@gmail.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> First of all, I am sorry that this is relatively trivial and too minor
>>>> but I just want to be clear on this and careful for the more PRs in the
>>>> future.
>>>>
>>>> Recently, I have submitted a PR (
>>>> https://github.com/apache/spark/pull/12413) about Scala style and this
>>>> was merged. In this PR, I changed
>>>>
>>>> 1.
>>>>
>>>> from
>>>>
>>>> .map(item => {
>>>>   ...
>>>> })
>>>>
>>>> to
>>>>
>>>> .map { item =>
>>>>   ...
>>>> }
>>>>
>>>>
>>>>
>>>> 2.
>>>> from
>>>>
>>>> words.foreachRDD { (rdd: RDD[String], time: Time) => ...
>>>>
>>>> to
>>>>
>>>> words.foreachRDD { (rdd, time) => ...
>>>>
>>>>
>>>>
>>>> 3.
>>>>
>>>> from
>>>>
>>>> .map { x =>
>>>>   function(x)
>>>> }
>>>>
>>>> to
>>>>
>>>> .map(function(_))
>>>>
>>>>
>>>> My question is, I think it looks 2. and 3. are arguable (please see the
>>>> discussion in the PR).
>>>> I agree that I might not have to change those in the future but I just
>>>> wonder if I should revert 2. and 3..
>>>>
>>>> FYI,
>>>> - The usage of 2. is pretty rare.
>>>> - 3. is pretty a lot. but the PR corrects ones like above only when the
>>>> val within closure looks obviously meaningless (such as x or a) and with
>>>> only single line.
>>>>
>>>> I would appreciate that if you add some comments and opinions on this.
>>>>
>>>> Thanks!
>>>>
>>>
>>>
>>
>

Re: Question about Scala style, explicit typing within transformation functions and anonymous val.

Posted by Hyukjin Kwon <gu...@gmail.com>.
Hi Mark,

I know but that could harm readability. AFAIK, for this reason, that is not
(or rarely) used in Spark.

2016-04-17 15:54 GMT+09:00 Mark Hamstra <ma...@clearstorydata.com>:

> FWIW, 3 should work as just `.map(function)`.
>
> On Sat, Apr 16, 2016 at 11:48 PM, Reynold Xin <rx...@databricks.com> wrote:
>
>> Hi Hyukjin,
>>
>> Thanks for asking.
>>
>> For 1 the change is almost always better.
>>
>> For 2 it depends on the context. In general if the type is not obvious,
>> it helps readability to explicitly declare them.
>>
>> For 3 again it depends on context.
>>
>>
>> So while it is a good idea to change 1 to reflect a more consistent code
>> base (and maybe we should codify it), it is almost always a bad idea to
>> change 2 and 3 just for the sake of changing them.
>>
>>
>>
>> On Sat, Apr 16, 2016 at 11:06 PM, Hyukjin Kwon <gu...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> First of all, I am sorry that this is relatively trivial and too minor
>>> but I just want to be clear on this and careful for the more PRs in the
>>> future.
>>>
>>> Recently, I have submitted a PR (
>>> https://github.com/apache/spark/pull/12413) about Scala style and this
>>> was merged. In this PR, I changed
>>>
>>> 1.
>>>
>>> from
>>>
>>> .map(item => {
>>>   ...
>>> })
>>>
>>> to
>>>
>>> .map { item =>
>>>   ...
>>> }
>>>
>>>
>>>
>>> 2.
>>> from
>>>
>>> words.foreachRDD { (rdd: RDD[String], time: Time) => ...
>>>
>>> to
>>>
>>> words.foreachRDD { (rdd, time) => ...
>>>
>>>
>>>
>>> 3.
>>>
>>> from
>>>
>>> .map { x =>
>>>   function(x)
>>> }
>>>
>>> to
>>>
>>> .map(function(_))
>>>
>>>
>>> My question is, I think it looks 2. and 3. are arguable (please see the
>>> discussion in the PR).
>>> I agree that I might not have to change those in the future but I just
>>> wonder if I should revert 2. and 3..
>>>
>>> FYI,
>>> - The usage of 2. is pretty rare.
>>> - 3. is pretty a lot. but the PR corrects ones like above only when the
>>> val within closure looks obviously meaningless (such as x or a) and with
>>> only single line.
>>>
>>> I would appreciate that if you add some comments and opinions on this.
>>>
>>> Thanks!
>>>
>>
>>
>

Re: Question about Scala style, explicit typing within transformation functions and anonymous val.

Posted by Mark Hamstra <ma...@clearstorydata.com>.
FWIW, 3 should work as just `.map(function)`.

On Sat, Apr 16, 2016 at 11:48 PM, Reynold Xin <rx...@databricks.com> wrote:

> Hi Hyukjin,
>
> Thanks for asking.
>
> For 1 the change is almost always better.
>
> For 2 it depends on the context. In general if the type is not obvious, it
> helps readability to explicitly declare them.
>
> For 3 again it depends on context.
>
>
> So while it is a good idea to change 1 to reflect a more consistent code
> base (and maybe we should codify it), it is almost always a bad idea to
> change 2 and 3 just for the sake of changing them.
>
>
>
> On Sat, Apr 16, 2016 at 11:06 PM, Hyukjin Kwon <gu...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> First of all, I am sorry that this is relatively trivial and too minor
>> but I just want to be clear on this and careful for the more PRs in the
>> future.
>>
>> Recently, I have submitted a PR (
>> https://github.com/apache/spark/pull/12413) about Scala style and this
>> was merged. In this PR, I changed
>>
>> 1.
>>
>> from
>>
>> .map(item => {
>>   ...
>> })
>>
>> to
>>
>> .map { item =>
>>   ...
>> }
>>
>>
>>
>> 2.
>> from
>>
>> words.foreachRDD { (rdd: RDD[String], time: Time) => ...
>>
>> to
>>
>> words.foreachRDD { (rdd, time) => ...
>>
>>
>>
>> 3.
>>
>> from
>>
>> .map { x =>
>>   function(x)
>> }
>>
>> to
>>
>> .map(function(_))
>>
>>
>> My question is, I think it looks 2. and 3. are arguable (please see the
>> discussion in the PR).
>> I agree that I might not have to change those in the future but I just
>> wonder if I should revert 2. and 3..
>>
>> FYI,
>> - The usage of 2. is pretty rare.
>> - 3. is pretty a lot. but the PR corrects ones like above only when the
>> val within closure looks obviously meaningless (such as x or a) and with
>> only single line.
>>
>> I would appreciate that if you add some comments and opinions on this.
>>
>> Thanks!
>>
>
>

Re: Question about Scala style, explicit typing within transformation functions and anonymous val.

Posted by Reynold Xin <rx...@databricks.com>.
Hi Hyukjin,

Thanks for asking.

For 1 the change is almost always better.

For 2 it depends on the context. In general if the type is not obvious, it
helps readability to explicitly declare them.

For 3 again it depends on context.


So while it is a good idea to change 1 to reflect a more consistent code
base (and maybe we should codify it), it is almost always a bad idea to
change 2 and 3 just for the sake of changing them.



On Sat, Apr 16, 2016 at 11:06 PM, Hyukjin Kwon <gu...@gmail.com> wrote:

> Hi all,
>
> First of all, I am sorry that this is relatively trivial and too minor but
> I just want to be clear on this and careful for the more PRs in the future.
>
> Recently, I have submitted a PR (
> https://github.com/apache/spark/pull/12413) about Scala style and this
> was merged. In this PR, I changed
>
> 1.
>
> from
>
> .map(item => {
>   ...
> })
>
> to
>
> .map { item =>
>   ...
> }
>
>
>
> 2.
> from
>
> words.foreachRDD { (rdd: RDD[String], time: Time) => ...
>
> to
>
> words.foreachRDD { (rdd, time) => ...
>
>
>
> 3.
>
> from
>
> .map { x =>
>   function(x)
> }
>
> to
>
> .map(function(_))
>
>
> My question is, I think it looks 2. and 3. are arguable (please see the
> discussion in the PR).
> I agree that I might not have to change those in the future but I just
> wonder if I should revert 2. and 3..
>
> FYI,
> - The usage of 2. is pretty rare.
> - 3. is pretty a lot. but the PR corrects ones like above only when the
> val within closure looks obviously meaningless (such as x or a) and with
> only single line.
>
> I would appreciate that if you add some comments and opinions on this.
>
> Thanks!
>