Posted to yarn-issues@hadoop.apache.org by Nehal Syed <ne...@gmail.com> on 2017/03/09 05:45:19 UTC

Spark Streaming Logs Rotation

Hey Everyone,
I am stuck on a log rotation issue with a Spark Streaming job that runs
on YARN. YARN continuously writes the container stderr and stdout logs to
the containers/ folder, which fills up disk space and crashes the cluster.
I want to continuously move the logs to HDFS or S3 and then truncate the
source files.
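One idea I found but have not verified: Hadoop 2.6+ has rolling log
aggregation for long-running applications, which periodically ships
container logs to HDFS while the application is still running. If I
understand it correctly, it would be enabled in yarn-site.xml roughly
like this (property names taken from the Hadoop docs, interval value is
just a guess on my part):

```
<!-- Untested sketch of rolling log aggregation in yarn-site.xml -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <!-- Aggregate logs of still-running apps roughly every hour
       instead of only after the application finishes -->
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>3600</value>
</property>
```

I am not sure whether this actually truncates the local files on the
NodeManager after aggregation, or whether it works for the plain
stdout/stderr files that the container keeps open. Can anyone confirm?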

How can I split and truncate those open container log files?
Can I use any RollingFileAppender to keep the file size small?
Is there any workaround to handle this growing file?
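Regarding the RollingFileAppender question: the Spark "Running on YARN"
documentation suggests shipping a custom log4j.properties (via --files)
that rolls the driver/executor logs inside the container log directory,
combined with spark.yarn.rolledLog.includePattern so that YARN's log
aggregation still picks up the rolled files. Something like the sketch
below is what I was planning to try (sizes and file names are my own
guesses, not tested):

```
# Untested sketch of a rolling log4j.properties for Spark on YARN
log4j.rootCategory=INFO, rolling
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.maxFileSize=50MB
log4j.appender.rolling.maxBackupIndex=5
# Write into the YARN container log dir so yarn logs can still find it
log4j.appender.rolling.file=${spark.yarn.app.container.log.dir}/spark.log
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

with spark.yarn.rolledLog.includePattern=spark* set in spark-defaults.
My worry is that this only covers log4j output, not whatever the
application writes directly to stdout/stderr, which YARN redirects to
the container files itself. Is that correct?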

I am using AWS EMR 5.3.0, which is packaged with Spark 2.1.0 on Hadoop
2.7.3 (YARN), along with Ganglia 3.7.2 and Zeppelin 0.6.2.

I have already tried running the 'truncate' command, and logrotate with
truncation as well; neither shrank the growing files to zero. AWS support
has confirmed this behavior, and also said they were not aware of this
issue before I opened a ticket with them.

Please help me if you have any knowledge.

Thanks
Nehal