Posted to dev@spark.apache.org by Bijay Pathak <bi...@cloudwick.com> on 2015/03/23 22:25:11 UTC

Shuffle Spill Memory and Shuffle Spill Disk

Hello,

I am running TeraSort <https://github.com/ehiggs/spark-terasort> on 100 GB
of data. The final metrics I am getting on Shuffle Spill are:

Shuffle Spill(Memory): 122.5 GB
Shuffle Spill(Disk): 3.4 GB

What's the difference and relation between these two metrics? Do they
mean that 122.5 GB was spilled from memory during the shuffle?

thank you,
bijay

Re: Shuffle Spill Memory and Shuffle Spill Disk

Posted by Bijay Pathak <bi...@cloudwick.com>.
It looks like this is not the right place for this question, so I have sent
the question to the user group.

thank you,
bijay
