Posted to user@spark.apache.org by András Kolbert <ko...@gmail.com> on 2020/05/17 17:35:54 UTC

Spark Streaming Memory

Hi,

I have a streaming job (Spark 2.4.4) in which the memory usage keeps
increasing over time.

Every 20-25 minutes the executors fall over
(org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output
location for shuffle 6987) because they run out of memory. In the UI, I can
see the memory increasing batch by batch, although I do not keep more data
in memory (I do keep unpersisting, checkpointing and caching the new data
frames), and the Storage tab shows only the expected 4 objects over time.
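
For context, each batch follows roughly this cycle (a simplified sketch;
update_state and the column names are illustrative stand-ins, not the exact
code from the attachment):

     # Simplified sketch of the per-batch unpersist/checkpoint/cache cycle.
     from pyspark.sql import functions as F

     def update_state(batch_df, state_df):
         # Hypothetical aggregation: fold the new batch into the running state.
         return (state_df.union(batch_df)
                         .groupBy('key')
                         .agg(F.sum('value').alias('value')))

     def process_batch(batch_df, state_df):
         new_state = update_state(batch_df, state_df).cache()
         new_state = new_state.checkpoint()  # eager by default: materializes
                                             # the plan and truncates lineage
         state_df.unpersist()                # release the previous state's blocks
         return new_state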

I hope I have simply missed a parameter in the Spark configuration (garbage
collection, reference tracking, etc.) that would solve my issue. I have seen
a few JIRA tickets around memory leaks (SPARK-19644
<https://issues.apache.org/jira/browse/SPARK-19644>, SPARK-29055
<https://issues.apache.org/jira/browse/SPARK-29055>, SPARK-29321
<https://issues.apache.org/jira/browse/SPARK-29321>); could mine be the same
issue?

     ("spark.cleaner.referenceTracking.cleanCheckpoints", "true"),
     ('spark.cleaner.periodicGC.interval', '1min'),
     ('spark.cleaner.referenceTracking','true'),
     ('spark.cleaner.referenceTracking.blocking.shuffle','true'),
     ('spark.sql.streaming.minBatchesToRetain', '2'),
     ('spark.sql.streaming.maxBatchesToRetainInMemory', '5'),
     ('spark.ui.retainedJobs','50' ),
     ('spark.ui.retainedStages','50'),
     ('spark.ui.retainedTasks','500'),
     ('spark.worker.ui.retainedExecutors','50'),
     ('spark.worker.ui.retainedDrivers','50'),
     ('spark.sql.ui.retainedExecutions','50'),
     ('spark.streaming.ui.retainedBatches','1440'),
     ('spark.executor.JavaOptions','-XX:+UseG1GC -verbose:gc
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps')
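
I apply that list when building the session, roughly like this (a sketch,
not the exact code from the attachment):

     # Sketch: applying the settings list above via SparkConf.
     from pyspark import SparkConf
     from pyspark.sql import SparkSession

     settings = [
         ('spark.cleaner.referenceTracking.cleanCheckpoints', 'true'),
         ('spark.cleaner.periodicGC.interval', '1min'),
         # ... the remaining pairs from the list above ...
     ]

     conf = SparkConf().setAll(settings)
     spark = SparkSession.builder.config(conf=conf).getOrCreate()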

I've also tried lowering spark.streaming.ui.retainedBatches to 8; it did not
help.

The application otherwise works fine, apart from the fact that processing
some batches takes longer (when the executors fall over).

Any ideas?

I've attached my code.


Thanks,
Andras

Re: Spark Streaming Memory

Posted by Ali Gouta <al...@gmail.com>.
The Spark UI is misleading in Spark 2.4.4; moving to Spark 2.4.5 fixed that
for me. So your problem should be somewhere else, probably still related to
memory consumption, just not the consumption you see in the UI.
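
To see what the executors actually hold, independent of the Storage tab, you
could poll Spark's monitoring REST API (a rough sketch; the driver host/port
are placeholders to adapt to your cluster):

     # Rough sketch: read executor memory from Spark's monitoring REST API.
     # 'driver-host:4040' is a placeholder for your driver UI address.
     import requests

     base = 'http://driver-host:4040/api/v1'
     app_id = requests.get(base + '/applications').json()[0]['id']
     executors = requests.get(base + '/applications/' + app_id + '/executors').json()
     for e in executors:
         print(e['id'], e['memoryUsed'], e['maxMemory'], e['totalGCTime'])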

Best regards,
Ali Gouta.
