Posted to user@spark.apache.org by ansriniv <an...@gmail.com> on 2014/05/30 05:47:14 UTC
getPreferredLocations
I am building my own custom RDD class.
1) Is there a guarantee that a partition will only be processed on a node
which is in the "getPreferredLocations" set of nodes returned by the RDD ?
2) I am implementing this custom RDD in Java and plan to extend JavaRDD.
However, I don't see a "getPreferredLocations" method in
http://spark.apache.org/docs/latest/api/core/index.html#org.apache.spark.api.java.JavaRDD
Will I be able to implement my own custom RDD in Java and be able to
override the "getPreferredLocations" method ?
Thanks
Anand
Re: getPreferredLocations
Posted by Patrick Wendell <pw...@gmail.com>.
> 1) Is there a guarantee that a partition will only be processed on a node
> which is in the "getPreferredLocations" set of nodes returned by the RDD ?
No, there isn't; by default, Spark may schedule a task in a "non-preferred"
location after `spark.locality.wait` has expired.
http://spark.apache.org/docs/latest/configuration.html#scheduling
If you want the preferred locations to be treated as a constraint,
you can set spark.locality.wait to a very high value. Keep in mind,
though, that this will starve your job if all of the preferred
locations are on nodes that are not alive.
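As a rough sketch (the exact value is just an illustration, not a recommendation), the wait can be raised via SparkConf; in this Spark version the value is interpreted in milliseconds:

```scala
import org.apache.spark.SparkConf

// A very large locality wait makes the scheduler hold tasks for
// their preferred locations almost indefinitely -- effectively a
// constraint, with the starvation risk described above.
val conf = new SparkConf()
  .setAppName("locality-example")
  .set("spark.locality.wait", "3600000") // 1 hour, in milliseconds
```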
>
> 2) I am implementing this custom RDD in Java and plan to extend JavaRDD.
> However, I don't see a "getPreferredLocations" method in
There is currently no support for writing custom RDD classes in pure
Java. Or at least, you'd need to write some of the internals in Scala
and then write a Java wrapper around them.
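For reference, a minimal Scala sketch of the internals looks like this; MyPartition and the host lists are hypothetical stand-ins for whatever your storage system reports, and compute is left as a stub:

```scala
import org.apache.spark.{Partition, SparkContext, TaskContext}
import org.apache.spark.rdd.RDD

// Hypothetical partition type that records which hosts hold its data.
class MyPartition(val index: Int, val hosts: Seq[String]) extends Partition

// Minimal custom RDD sketch: one partition per entry in hostsPerPartition.
class MyCustomRDD(sc: SparkContext, hostsPerPartition: Seq[Seq[String]])
  extends RDD[String](sc, Nil) {

  override protected def getPartitions: Array[Partition] =
    hostsPerPartition.zipWithIndex.map { case (hosts, i) =>
      new MyPartition(i, hosts): Partition
    }.toArray

  // A locality *hint* only -- as noted above, the scheduler may still
  // run the task elsewhere once spark.locality.wait expires.
  override protected def getPreferredLocations(split: Partition): Seq[String] =
    split.asInstanceOf[MyPartition].hosts

  override def compute(split: Partition, context: TaskContext): Iterator[String] =
    Iterator.empty // read the partition's data from your storage system here
}
```

A Java-facing API would then wrap an instance of this class in a JavaRDD.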
Can I ask, though: what is the reason you are trying to write a custom
RDD? This is usually a fairly advanced use case, e.g. writing a Spark
integration with a new storage system or something like that.