Posted to user@spark.apache.org by charles li <ch...@gmail.com> on 2016/01/19 04:34:06 UTC

rdd.foreach return value

Code snippet: <Screen Shot 2016-01-19 11.32.05 AM.png>

The 'print' actually prints its output on the worker node, but I am confused
about where the 'return' value goes, because I get nothing on the driver node.
-- 
*--------------------------------------*
a spark lover, a quant, a developer and a good man.

http://github.com/litaotao

Re: rdd.foreach return value

Posted by charles li <ch...@gmail.com>.
Thanks, David and Ted. I know that the content of an RDD can be returned to
the driver using `collect`

On Tue, Jan 19, 2016 at 11:44 AM, Ted Yu <yu...@gmail.com> wrote:

> Here is the signature for foreach:
>  def foreach(f: T => Unit): Unit = withScope {
>
> I don't think you can return an element in the way shown in the snippet.

Re: rdd.foreach return value

Posted by Ted Yu <yu...@gmail.com>.
For #2: RDDs are immutable.

> On Jan 18, 2016, at 8:10 PM, charles li <ch...@gmail.com> wrote:
>
> Hi, many thanks to David and Ted. I know that the content of an RDD can be returned to the driver using the 'collect' method.
>
> But my questions are:
>
> 1. Since we can write any code we like in the function passed to 'foreach', what happens when we actually write a 'return' statement in that function?
> 2. As the photo below shows, the content of the RDD doesn't change after the foreach call. Why?
> 3. I am a little confused about the 'foreach' method. It should be an 'action', right, since it returns nothing? Is there any best practice for the 'foreach' function? Or could someone share a code snippet that uses 'foreach' in a real application? That would be awesome.
>
> Many thanks again.

Re: rdd.foreach return value

Posted by charles li <ch...@gmail.com>.
Got it. Many thanks, Vishal, Ted, and David.

On Tue, Jan 19, 2016 at 1:10 PM, Vishal Maru <vz...@gmail.com> wrote:

> 1. foreach doesn't expect any value from the function being passed
> (func_foreach in your case), so nothing happens: the return values are
> simply lost. It's like calling a function without saving its return value
> to a variable. foreach also doesn't return anything, so you don't get a
> modified RDD (as you would with map*).
> 2. RDDs are immutable. All transformations (map*, groupBy*, reduceBy, etc.)
> return a new RDD.
> 3. Yes. It just iterates through the elements and calls the function being
> passed. That's it. It doesn't collect the values and doesn't return any new
> modified RDD.

Re: rdd.foreach return value

Posted by Vishal Maru <vz...@gmail.com>.
1. foreach doesn't expect any value from the function being passed
(func_foreach in your case), so nothing happens: the return values are
simply lost. It's like calling a function without saving its return value
to a variable. foreach also doesn't return anything, so you don't get a
modified RDD (as you would with map*).
2. RDDs are immutable. All transformations (map*, groupBy*, reduceBy, etc.)
return a new RDD.
3. Yes. It just iterates through the elements and calls the function being
passed. That's it. It doesn't collect the values and doesn't return any new
modified RDD.
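
The three points above can be sketched in plain Python. This is only a model of the semantics, not Spark code: `foreach_model` and `map_model` are hypothetical helpers made up for this illustration, and the body of `func_foreach` is an assumption, since the function from the original screenshot is not available in this thread.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def foreach_model(items: Iterable[T], f: Callable[[T], object]) -> None:
    """Hypothetical stand-in for an action like foreach: call f, keep nothing."""
    for item in items:
        f(item)  # whatever f returns is discarded right here


def map_model(items: Iterable[T], f: Callable[[T], U]) -> List[U]:
    """Hypothetical stand-in for a transformation like map: build a new collection."""
    return [f(item) for item in items]


def func_foreach(x: int) -> int:
    # Assumed body; the function in the original screenshot is not available.
    return x * 2


data: Tuple[int, ...] = (1, 2, 3)  # immutable input, playing the role of the RDD

foreach_result = foreach_model(data, func_foreach)
mapped = map_model(data, func_foreach)

print(foreach_result)  # None: the returned values were dropped
print(mapped)          # [2, 4, 6]: a new collection
print(data)            # (1, 2, 3): the input is unchanged
```

In this model, the only way to see the doubled values is to use the map-style helper and keep what it returns, which mirrors using map followed by 'collect' on a real RDD.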


On Mon, Jan 18, 2016 at 11:10 PM, charles li <ch...@gmail.com>
wrote:

>
> Hi, many thanks to David and Ted. I know that the content of an RDD can be
> returned to the driver using the 'collect' method.
>
> But my questions are:
>
> 1. Since we can write any code we like in the function passed to 'foreach',
> what happens when we actually write a 'return' statement in that function?
> 2. As the photo below shows, the content of the RDD doesn't change after the
> foreach call. Why?
> 3. I am a little confused about the 'foreach' method. It should be an
> 'action', right, since it returns nothing? Is there any best practice for
> the 'foreach' function? Or could someone share a code snippet that uses
> 'foreach' in a real application? That would be awesome.
>
> Many thanks again.

Re: rdd.foreach return value

Posted by charles li <ch...@gmail.com>.
Hi, many thanks to David and Ted. I know that the content of an RDD can be
returned to the driver using the 'collect' method.

But my questions are:

1. Since we can write any code we like in the function passed to 'foreach',
what happens when we actually write a 'return' statement in that function?
2. As the photo below shows, the content of the RDD doesn't change after the
foreach call. Why?
3. I am a little confused about the 'foreach' method. It should be an
'action', right, since it returns nothing? Is there any best practice for
the 'foreach' function? Or could someone share a code snippet that uses
'foreach' in a real application? That would be awesome.

Many thanks again.

On Tue, Jan 19, 2016 at 11:44 AM, Ted Yu <yu...@gmail.com> wrote:

> Here is the signature for foreach:
>  def foreach(f: T => Unit): Unit = withScope {
>
> I don't think you can return an element in the way shown in the snippet.

Re: rdd.foreach return value

Posted by Ted Yu <yu...@gmail.com>.
Here is the signature for foreach:
 def foreach(f: T => Unit): Unit = withScope {

I don't think you can return an element in the way shown in the snippet.
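
Since the function passed to foreach has type T => Unit, it is only useful for its side effects. Here is a rough Python analogue of that signature, as a sketch only: `foreach_model` is a hypothetical local helper and `FakeSink` is a hypothetical stand-in for an external store, and neither is part of Spark. Note that in a real Spark job the function runs on the executors, so appending to an ordinary driver-side list like this would not be visible on the driver. There, side effects normally go to external storage or an accumulator.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


class FakeSink:
    """Hypothetical stand-in for an external system (database, queue, ...)."""

    def __init__(self) -> None:
        self.rows: List[str] = []

    def write(self, row: str) -> None:
        self.rows.append(row)


def foreach_model(items: Iterable[T], f: Callable[[T], None]) -> None:
    """Local analogue of `def foreach(f: T => Unit): Unit`: nothing comes back."""
    for item in items:
        f(item)


sink = FakeSink()


def save(x: int) -> None:
    # The useful work is the side effect; there is no meaningful return value.
    sink.write(f"value={x}")


result = foreach_model([1, 2, 3], save)

print(result)     # None
print(sink.rows)  # ['value=1', 'value=2', 'value=3']
```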

On Mon, Jan 18, 2016 at 7:34 PM, charles li <ch...@gmail.com> wrote:

> Code snippet: <Screen Shot 2016-01-19 11.32.05 AM.png>
>
> The 'print' actually prints its output on the worker node, but I am confused
> about where the 'return' value goes, because I get nothing on the driver node.

Re: rdd.foreach return value

Posted by David Russell <th...@protonmail.com>.
The foreach operation on an RDD has a void (Unit) return type. See attached. So no value is returned to the driver.

David

"All that is gold does not glitter, Not all those who wander are lost."



-------- Original Message --------
Subject: rdd.foreach return value
Local Time: January 18 2016 10:34 pm
UTC Time: January 19 2016 3:34 am
From: charles.upboy@gmail.com
To: user@spark.apache.org


Code snippet: <Screen Shot 2016-01-19 11.32.05 AM.png>

The 'print' actually prints its output on the worker node, but I am confused about where the 'return' value goes, because I get nothing on the driver node.