Posted to user@spark.apache.org by Jy Chen <ch...@gmail.com> on 2016/10/13 15:32:38 UTC

Spark 2.0.0 TreeAggregate with larger depth will be OOM?

Hi all,
I'm using Spark 2.0.0 to train a model with 10 million+ parameters on about
500 GB of data. I use treeAggregate to aggregate the gradient. With
depth = 2 or 3 it works, and depth = 3 is faster.
So I set depth = 4 hoping for even better performance, but now some executors
go OOM in the shuffle phase. Why would this happen? With a larger depth,
each executor should aggregate fewer records and use less memory, so I
don't understand why the OOM occurs. Can someone help?
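For context on what a larger depth actually changes: in Spark 2.0's RDD.treeAggregate, the depth parameter is turned into a per-level shrink factor, scale = max(ceil(numPartitions ** (1/depth)), 2), and the partition count is divided by that factor once per intermediate shuffle level. Below is a rough pure-Python sketch modeled on my reading of the Scala source; the function name and the exact stopping condition are my approximation, not Spark's API. It shows that a deeper tree means *more* shuffle stages, each of which must materialize the large gradient vectors, which may be where the extra memory pressure comes from.

```python
import math

def tree_aggregate_levels(num_partitions, depth):
    """Approximate the intermediate partition counts that Spark 2.0's
    RDD.treeAggregate would shuffle through for a given depth.
    Sketch only -- mirrors scale = max(ceil(n ** (1/depth)), 2) and the
    integer division per level from the Scala implementation."""
    scale = max(math.ceil(num_partitions ** (1.0 / depth)), 2)
    levels = []
    cur = num_partitions
    # Keep shrinking while another shuffle level is still "worth it"
    # (approximation of Spark's loop condition).
    while cur > scale + math.ceil(cur / scale):
        cur = cur // scale
        levels.append(cur)
    return levels

# With 1000 input partitions:
print(tree_aggregate_levels(1000, 2))  # one intermediate shuffle level
print(tree_aggregate_levels(1000, 4))  # three intermediate shuffle levels
```

With depth = 2 there is a single intermediate shuffle before the final reduce to the driver, while depth = 4 introduces three shuffle stages. Each stage moves partially aggregated gradient vectors (a dense vector of 10M doubles is ~80 MB), so more levels means more large shuffle buffers in flight, not necessarily less executor memory.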