Posted to dev@spark.apache.org by Artur Sukhenko <ar...@gmail.com> on 2016/11/15 20:45:43 UTC
NodeManager heap size with ExternalShuffleService
Hello guys,
When you enable the ExternalShuffleService (spark_shuffle) in the NodeManager,
there is no suggestion in the Spark docs, or anywhere else, to increase the
NM heap size. Shouldn't we include this in Spark's documentation?
I have seen the NM take a lot of memory (5+ GB, with a default of 1 GB), and
when it hits GC pauses, Spark can become very slow while tasks are shuffling.
I don't think users are aware that the NM can become a bottleneck.
Sincerely,
Artur Sukhenko
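For reference, the setup being discussed looks roughly like this; the property names below are the ones Spark's YARN docs use for the aux shuffle service, and the 4096 MB heap is an illustrative value, not a tuned recommendation:

```shell
# yarn-site.xml: register Spark's shuffle service as a NodeManager aux service
#   yarn.nodemanager.aux-services                     -> spark_shuffle
#   yarn.nodemanager.aux-services.spark_shuffle.class -> org.apache.spark.network.yarn.YarnShuffleService

# etc/hadoop/yarn-env.sh: raise the NodeManager heap above the 1000 MB default
# (4096 MB is only an example value)
export YARN_HEAPSIZE=4096
```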
Re: NodeManager heap size with ExternalShuffleService
Posted by Artur Sukhenko <ar...@gmail.com>.
Sure, Reynold. Here is the pull request: [YARN][DOC] Increasing NodeManager's
heap size with External Shuffle Service
<https://github.com/apache/spark/pull/15906>
On Wed, Nov 16, 2016, 04:07 Reynold Xin <rx...@databricks.com> wrote:
Can you submit a pull request to add that to the documentation?
Re: NodeManager heap size with ExternalShuffleService
Posted by Reynold Xin <rx...@databricks.com>.
Can you submit a pull request to add that to the documentation?