Posted to user@spark.apache.org by jbeynon <jb...@gmail.com> on 2014/09/09 23:54:28 UTC

Yarn Driver OOME (Java heap space) when executors request map output locations

I'm running on YARN with relatively small instances (4 GB of memory). I'm not
caching any data, but when the map stage ends and shuffling begins, all of the
executors request the map output locations at the same time, which seems to
kill the driver when the number of executors is turned up.

For example, the "size of output statuses" is about 10 MB, and with 500
executors the driver appears to make 500 copies of this data (roughly 5 GB)
to send out, running out of memory in the process. When I turn down the
number of executors, everything runs fine.

Has anyone else run into this? Maybe I'm misunderstanding the underlying
cause. I don't have a copy of the stack trace handy but can recreate it if
necessary. It was somewhere in the <init> for HeapByteBuffer. Any advice
would be helpful.
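For anyone sanity-checking the numbers, here is a back-of-the-envelope sketch of the suspected copy-per-request behavior. The 10 MB and 500-executor figures are from the report above; the copy-per-request model is a reading of the symptom, not of the actual Spark source:

```java
// Back-of-the-envelope for the reported failure. The figures come from
// the post; the copy-per-request model itself is an assumption.
public class MapStatusMath {
    public static void main(String[] args) {
        long statusBytes = 10L * 1024 * 1024; // ~10 MB of output statuses
        int executors = 500;

        // If the driver materializes one copy per requesting executor:
        long peakBytes = statusBytes * executors;
        System.out.println("copy-per-request peak: " + peakBytes + " bytes");
        // ~5.24e9 bytes (~4.9 GiB) -- comfortably past a 4 GB driver heap.

        // Serialized once and shared, the extra footprint stays constant:
        System.out.println("shared-payload peak:   " + statusBytes + " bytes");
    }
}
```

If the statuses were serialized once and the same buffer reused for every response, the peak extra memory would stay near 10 MB regardless of executor count.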



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Yarn-Driver-OOME-Java-heap-space-when-executors-request-map-output-locations-tp13827.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Yarn Driver OOME (Java heap space) when executors request map output locations

Posted by Kostas Sakellis <ko...@cloudera.com>.
Hey,

If you are interested in more details, there is also a thread about this
issue here:
http://apache-spark-developers-list.1001551.n3.nabble.com/Eliminate-copy-while-sending-data-any-Akka-experts-here-td7127.html

Kostas

On Tue, Sep 9, 2014 at 3:01 PM, jbeynon <jb...@gmail.com> wrote:

> Thanks Marcelo, that looks like the same thing. I'll follow the Jira ticket
> for updates.

Re: Yarn Driver OOME (Java heap space) when executors request map output locations

Posted by jbeynon <jb...@gmail.com>.
Thanks Marcelo, that looks like the same thing. I'll follow the Jira ticket
for updates.






Re: Yarn Driver OOME (Java heap space) when executors request map output locations

Posted by Marcelo Vanzin <va...@cloudera.com>.
Hi,

Yes, this is a problem, and I'm not aware of any simple workarounds
(or complex ones, for that matter). There are people working to fix
this; you can follow progress here:
https://issues.apache.org/jira/browse/SPARK-1239
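For readers hitting the same trace: HeapByteBuffer.&lt;init&gt; is what shows up when ByteBuffer.allocate() materializes a fresh heap copy per send. A copy-free sketch in plain java.nio (an illustration of the allocation pattern, not the actual Spark/Akka code path that SPARK-1239 addresses) would wrap one shared byte array instead:

```java
import java.nio.ByteBuffer;

public class SharedStatuses {
    public static void main(String[] args) {
        // Stand-in for the serialized map output statuses (~10 MB).
        byte[] statuses = new byte[10 * 1024 * 1024];
        int executors = 500;

        // Anti-pattern (commented out): allocate() + put() per requester
        // creates a fresh HeapByteBuffer each time -- the <init> frames
        // in the reported stack trace:
        //   ByteBuffer copy = ByteBuffer.allocate(statuses.length);
        //   copy.put(statuses); // 500 x 10 MB ~= 5 GB on the heap

        // Copy-free alternative: wrap() creates only a small view object;
        // every view shares the single 10 MB backing array.
        for (int i = 0; i < executors; i++) {
            ByteBuffer view = ByteBuffer.wrap(statuses).asReadOnlyBuffer();
            // hand `view` to the transport layer here
        }
        System.out.println(executors + " read-only views over one "
                + statuses.length + "-byte array");
    }
}
```

Whether this translates cleanly to the actual send path is exactly what the linked dev-list thread debates, so treat it as a sketch of the pattern rather than a patch.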

On Tue, Sep 9, 2014 at 2:54 PM, jbeynon <jb...@gmail.com> wrote:
> I'm running on Yarn with relatively small instances with 4gb memory. I'm not
> caching any data but when the map stage ends and shuffling begins all of the
> executors request the map output locations at the same time which seems to
> kill the driver when the number of executors is turned up.



-- 
Marcelo
