Posted to user@spark.apache.org by Ashic Mahtab <as...@live.com> on 2014/12/19 10:53:09 UTC

How to run an action and get output?

Hi,

Say we have an operation that writes something to an external resource and gets some output. For example:

def doSomething(entry: SomeEntry, session: Session): SomeOutput = {
    val result = session.SomeOp(entry)
    SomeOutput(entry.Key, result.SomeProp)
}

I could use a transformation for rdd.map, but in case of failures, the map would run on another executor for the same rdd. I could do rdd.foreach, but that returns unit. Is there something like a foreach that can return values?
Thanks,
Ashic.

Re: How to run an action and get output?

Posted by Tobias Pfeiffer <tg...@preferred.jp>.
Hi,

On Fri, Dec 19, 2014 at 6:53 PM, Ashic Mahtab <as...@live.com> wrote:
>
> def doSomething(entry:SomeEntry, session:Session) : SomeOutput = {
>     val result = session.SomeOp(entry)
>     SomeOutput(entry.Key, result.SomeProp)
> }
>
> I could use a transformation for rdd.map, but in case of failures, the map
> would run on another executor for the same rdd. I could do rdd.foreach, but
> that returns unit. Is there something like a foreach that can return values?
>

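I think `map()` is pretty much a "`foreach()` that can return values". If you
want to keep a thrown exception from failing the task (and triggering a
retry), wrap the body in a scala.util.Try block or something similar, so the
exception becomes a Failure value instead of propagating.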

import scala.util.{Try, Success, Failure}

rdd.map(item => {
  Try{ ... }
}).flatMap {
  // keep successful results, drop failed ones
  case Success(something) => Some(something)
  case Failure(e) => None
}

or so.

Tobias
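
For reference, here is a minimal runnable sketch of the Try-then-flatMap pattern from the reply above. It uses a plain Seq in place of an RDD so that it runs without Spark; RDD.map and RDD.flatMap take functions of the same shape. The SomeEntry and SomeOutput fields, the failure condition, and the TryMapSketch and runAll names are all hypothetical, made up for illustration. Note that Try only turns non-fatal exceptions thrown inside the function into values. It does not stop Spark from re-running a task when an executor is lost or when the RDD is recomputed.

```scala
import scala.util.{Try, Success, Failure}

// Hypothetical stand-ins for the types in the question.
case class SomeEntry(key: String, value: Int)
case class SomeOutput(key: String, prop: Int)

object TryMapSketch {
  // Hypothetical operation: throws for negative values to simulate a failed write.
  def doSomething(entry: SomeEntry): SomeOutput = {
    if (entry.value < 0) throw new IllegalStateException(s"write failed for ${entry.key}")
    SomeOutput(entry.key, entry.value * 2)
  }

  // Run the operation on every entry and keep only the successful outputs.
  def runAll(entries: Seq[SomeEntry]): Seq[SomeOutput] =
    entries
      .map(entry => Try(doSomething(entry)))
      .flatMap {
        case Success(output) => Some(output) // keep the result
        case Failure(_)      => None         // drop the failed entry
      }

  def main(args: Array[String]): Unit = {
    val entries = Seq(SomeEntry("a", 1), SomeEntry("b", -1), SomeEntry("c", 3))
    runAll(entries).foreach(println)
  }
}
```

If the failures matter, one option is to keep the Try values themselves (or map each Failure to an error record) instead of dropping them in the flatMap step.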