Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:20:15 UTC
[jira] [Updated] (SPARK-10774) Put different event log to different
directory according to different conditions
[ https://issues.apache.org/jira/browse/SPARK-10774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-10774:
---------------------------------
Labels: bulk-closed (was: )
> Put different event log to different directory according to different conditions
> --------------------------------------------------------------------------------
>
> Key: SPARK-10774
> URL: https://issues.apache.org/jira/browse/SPARK-10774
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 1.4.1
> Reporter: wyp
> Priority: Minor
> Labels: bulk-closed
>
> Right now, Spark writes all event logs (in progress or finished) into the same directory (configured by the **spark.eventLog.dir** parameter), as follows:
> {noformat}
> [yangping.wu@l-sparkcluster.data.cn5 /]$ sudo hadoop fs -ls /spark-jobs/eventLog
> Found 58 items
> -rwxrwxrwx 3 spark aaa 8438 2015-09-17 15:14 /spark-jobs/eventLog/application_1440152921247_0047_1.lz4
> -rwxrwxrwx 3 spark aaa 44002 2015-09-17 15:15 /spark-jobs/eventLog/application_1440152921247_0190_1
> -rwxrwxrwx 3 spark aaa 44696 2015-09-17 15:15 /spark-jobs/eventLog/application_1440152921247_0190_2
> -rwxrwxrwx 3 spark aaa 40813 2015-09-17 15:25 /spark-jobs/eventLog/application_1440152921247_0191_1
> -rwxrwxrwx 3 spark aaa 44680 2015-09-17 15:25 /spark-jobs/eventLog/application_1440152921247_0191_2
> -rwxrwxrwx 3 spark aaa 42572 2015-09-17 15:36 /spark-jobs/eventLog/application_1440152921247_0192_1
> -rwxrwxrwx 3 spark aaa 44680 2015-09-17 15:36 /spark-jobs/eventLog/application_1440152921247_0192_2
> -rwxrwxrwx 3 spark aaa 45052 2015-09-17 16:09 /spark-jobs/eventLog/application_1440152921247_0193_1
> -rwxrwxrwx 3 spark aaa 44688 2015-09-17 16:09 /spark-jobs/eventLog/application_1440152921247_0193_2
> -rwxrwxrwx 3 spark aaa 41686 2015-09-17 16:11 /spark-jobs/eventLog/application_1440152921247_0194_1
> -rwxrwxrwx 3 spark aaa 44522 2015-09-17 16:11 /spark-jobs/eventLog/application_1440152921247_0194_2
> -rwxrwxrwx 3 spark aaa 32261 2015-09-17 16:13 /spark-jobs/eventLog/application_1440152921247_0195_1
> -rwxrwxrwx 3 spark aaa 31178 2015-09-17 16:13 /spark-jobs/eventLog/application_1440152921247_0195_2
> -rwxrwxrwx 3 spark aaa 39124467712 2015-09-18 11:58 /spark-jobs/eventLog/application_1440152921247_0205_1.inprogress
> -rwxrwxrwx 3 spark aaa 790045092 2015-09-18 20:40 /spark-jobs/eventLog/application_1440152921247_0206
> ........
> {noformat}
> As time goes by, a large number of event logs accumulate in the **spark.eventLog.dir** directory, which makes them hard to manage. Hadoop uses two kinds of directories for the different types of event logs: done-dir and intermediate-done-dir, configured by **mapreduce.jobhistory.done-dir** and **mapreduce.jobhistory.intermediate-done-dir** respectively. Within the done-dir, event logs are saved into different subdirectories according to the date the job ran, as follows:
> {noformat}
> [yangping.wu@l-sparkcluster.data.cn5 /]$ sudo hadoop fs -ls /hadoop-jobs/done/2015/09/
> Found 23 items
> drwxrwxrwx - hadoop supergroup 0 2015-09-04 16:59 /hadoop-jobs/done/2015/09/01
> drwxrwxrwx - hadoop supergroup 0 2015-09-05 16:59 /hadoop-jobs/done/2015/09/02
> drwxrwxrwx - hadoop supergroup 0 2015-09-06 16:59 /hadoop-jobs/done/2015/09/03
> drwxrwxrwx - hadoop supergroup 0 2015-09-07 16:59 /hadoop-jobs/done/2015/09/04
> drwxrwxrwx - hadoop supergroup 0 2015-09-08 16:59 /hadoop-jobs/done/2015/09/05
> drwxrwxrwx - hadoop supergroup 0 2015-09-09 16:59 /hadoop-jobs/done/2015/09/06
> drwxrwxrwx - hadoop supergroup 0 2015-09-10 16:59 /hadoop-jobs/done/2015/09/07
> drwxrwxrwx - hadoop supergroup 0 2015-09-11 16:59 /hadoop-jobs/done/2015/09/08
> drwxrwxrwx - hadoop supergroup 0 2015-09-12 16:59 /hadoop-jobs/done/2015/09/09
> drwxrwxrwx - hadoop supergroup 0 2015-09-13 16:59 /hadoop-jobs/done/2015/09/10
> drwxrwx--- - hadoop supergroup 0 2015-09-14 16:59 /hadoop-jobs/done/2015/09/11
> drwxrwx--- - hadoop supergroup 0 2015-09-15 16:59 /hadoop-jobs/done/2015/09/12
> drwxrwxrwx - hadoop supergroup 0 2015-09-16 16:59 /hadoop-jobs/done/2015/09/13
> drwxrwxrwx - hadoop supergroup 0 2015-09-17 16:59 /hadoop-jobs/done/2015/09/14
> drwxrwxrwx - hadoop supergroup 0 2015-09-18 16:59 /hadoop-jobs/done/2015/09/15
> drwxrwxrwx - hadoop supergroup 0 2015-09-19 16:59 /hadoop-jobs/done/2015/09/16
> drwxrwxrwx - hadoop supergroup 0 2015-09-20 16:59 /hadoop-jobs/done/2015/09/17
> drwxrwx--- - hadoop supergroup 0 2015-09-21 16:59 /hadoop-jobs/done/2015/09/18
> drwxrwx--- - hadoop supergroup 0 2015-09-22 16:59 /hadoop-jobs/done/2015/09/19
> drwxrwx--- - hadoop supergroup 0 2015-09-23 16:59 /hadoop-jobs/done/2015/09/20
> drwxrwx--- - hadoop supergroup 0 2015-09-23 16:59 /hadoop-jobs/done/2015/09/21
> drwxrwx--- - hadoop supergroup 0 2015-09-22 23:43 /hadoop-jobs/done/2015/09/22
> drwxrwx--- - hadoop supergroup 0 2015-09-23 18:55 /hadoop-jobs/done/2015/09/23
> {noformat}
> In Spark, I think we can do the same thing.
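
The layout proposed above could be sketched roughly as follows. This is only an illustration of the idea, not an existing Spark feature or API: the helper name `event_log_target` and the `intermediate` and `done` subdirectory names are assumptions modeled on the Hadoop layout shown in the description, and the finish time is assumed to be supplied by the caller.

```python
from datetime import datetime
from pathlib import PurePosixPath
from typing import Optional


def event_log_target(base_dir: str, log_name: str,
                     finished_at: Optional[datetime]) -> str:
    """Hypothetical helper: choose a directory for an event log.

    Logs that are still being written stay in an "intermediate"
    subdirectory; finished logs go under "done/YYYY/MM/DD", keyed by
    the date the application finished. Both subdirectory names are
    assumptions, not Spark configuration.
    """
    base = PurePosixPath(base_dir)
    if finished_at is None or log_name.endswith(".inprogress"):
        return str(base / "intermediate" / log_name)
    # strftime yields e.g. "2015/09/17", which joins as three path segments.
    return str(base / "done" / finished_at.strftime("%Y/%m/%d") / log_name)


if __name__ == "__main__":
    print(event_log_target("/spark-jobs/eventLog",
                           "application_1440152921247_0205_1.inprogress",
                           None))
    print(event_log_target("/spark-jobs/eventLog",
                           "application_1440152921247_0047_1.lz4",
                           datetime(2015, 9, 17, 15, 14)))
```

With a layout like this, cleaning up or archiving old logs becomes a matter of handling whole dated directories rather than filtering one flat listing.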
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)