Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:20:15 UTC

[jira] [Updated] (SPARK-10774) Put different event log to different directory according to different conditions

     [ https://issues.apache.org/jira/browse/SPARK-10774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-10774:
---------------------------------
    Labels: bulk-closed  (was: )

> Put different event log to different directory according to different conditions
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-10774
>                 URL: https://issues.apache.org/jira/browse/SPARK-10774
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.4.1
>            Reporter: wyp
>            Priority: Minor
>              Labels: bulk-closed
>
> Right now, Spark writes all event logs (in-progress or finished) into the same directory (configured by the **spark.eventLog.dir** parameter), as follows:
> {noformat}
> [yangping.wu@l-sparkcluster.data.cn5 /]$ sudo hadoop fs -ls /spark-jobs/eventLog
> Found 58 items
> -rwxrwxrwx   3 spark aaa        8438 2015-09-17 15:14 /spark-jobs/eventLog/application_1440152921247_0047_1.lz4
> -rwxrwxrwx   3 spark aaa       44002 2015-09-17 15:15 /spark-jobs/eventLog/application_1440152921247_0190_1
> -rwxrwxrwx   3 spark aaa       44696 2015-09-17 15:15 /spark-jobs/eventLog/application_1440152921247_0190_2
> -rwxrwxrwx   3 spark aaa       40813 2015-09-17 15:25 /spark-jobs/eventLog/application_1440152921247_0191_1
> -rwxrwxrwx   3 spark aaa       44680 2015-09-17 15:25 /spark-jobs/eventLog/application_1440152921247_0191_2
> -rwxrwxrwx   3 spark aaa       42572 2015-09-17 15:36 /spark-jobs/eventLog/application_1440152921247_0192_1
> -rwxrwxrwx   3 spark aaa       44680 2015-09-17 15:36 /spark-jobs/eventLog/application_1440152921247_0192_2
> -rwxrwxrwx   3 spark aaa       45052 2015-09-17 16:09 /spark-jobs/eventLog/application_1440152921247_0193_1
> -rwxrwxrwx   3 spark aaa       44688 2015-09-17 16:09 /spark-jobs/eventLog/application_1440152921247_0193_2
> -rwxrwxrwx   3 spark aaa       41686 2015-09-17 16:11 /spark-jobs/eventLog/application_1440152921247_0194_1
> -rwxrwxrwx   3 spark aaa       44522 2015-09-17 16:11 /spark-jobs/eventLog/application_1440152921247_0194_2
> -rwxrwxrwx   3 spark aaa       32261 2015-09-17 16:13 /spark-jobs/eventLog/application_1440152921247_0195_1
> -rwxrwxrwx   3 spark aaa       31178 2015-09-17 16:13 /spark-jobs/eventLog/application_1440152921247_0195_2
> -rwxrwxrwx   3 spark aaa 39124467712 2015-09-18 11:58 /spark-jobs/eventLog/application_1440152921247_0205_1.inprogress
> -rwxrwxrwx   3 spark aaa   790045092 2015-09-18 20:40 /spark-jobs/eventLog/application_1440152921247_0206
> ........
> {noformat}
> As time goes by, a lot of event logs accumulate in the **spark.eventLog.dir** directory, and it becomes hard to manage. Hadoop uses two types of directory to save different types of event logs: done-dir and intermediate-done-dir, configured by **mapreduce.jobhistory.done-dir** and **mapreduce.jobhistory.intermediate-done-dir** respectively. Within the done-dir, event logs are saved to different subdirectories according to the running time of the job, as follows:
> {noformat}
> [yangping.wu@l-sparkcluster.data.cn5 /]$sudo hadoop fs -ls  /hadoop-jobs/done/2015/09/
> Found 23 items
> drwxrwxrwx   - hadoop supergroup    0 2015-09-04 16:59 /hadoop-jobs/done/2015/09/01
> drwxrwxrwx   - hadoop supergroup    0 2015-09-05 16:59 /hadoop-jobs/done/2015/09/02
> drwxrwxrwx   - hadoop supergroup    0 2015-09-06 16:59 /hadoop-jobs/done/2015/09/03
> drwxrwxrwx   - hadoop supergroup    0 2015-09-07 16:59 /hadoop-jobs/done/2015/09/04
> drwxrwxrwx   - hadoop supergroup    0 2015-09-08 16:59 /hadoop-jobs/done/2015/09/05
> drwxrwxrwx   - hadoop supergroup    0 2015-09-09 16:59 /hadoop-jobs/done/2015/09/06
> drwxrwxrwx   - hadoop supergroup    0 2015-09-10 16:59 /hadoop-jobs/done/2015/09/07
> drwxrwxrwx   - hadoop supergroup    0 2015-09-11 16:59 /hadoop-jobs/done/2015/09/08
> drwxrwxrwx   - hadoop supergroup    0 2015-09-12 16:59 /hadoop-jobs/done/2015/09/09
> drwxrwxrwx   - hadoop supergroup    0 2015-09-13 16:59 /hadoop-jobs/done/2015/09/10
> drwxrwx---   - hadoop supergroup    0 2015-09-14 16:59 /hadoop-jobs/done/2015/09/11
> drwxrwx---   - hadoop supergroup    0 2015-09-15 16:59 /hadoop-jobs/done/2015/09/12
> drwxrwxrwx   - hadoop supergroup    0 2015-09-16 16:59 /hadoop-jobs/done/2015/09/13
> drwxrwxrwx   - hadoop supergroup    0 2015-09-17 16:59 /hadoop-jobs/done/2015/09/14
> drwxrwxrwx   - hadoop supergroup    0 2015-09-18 16:59 /hadoop-jobs/done/2015/09/15
> drwxrwxrwx   - hadoop supergroup    0 2015-09-19 16:59 /hadoop-jobs/done/2015/09/16
> drwxrwxrwx   - hadoop supergroup    0 2015-09-20 16:59 /hadoop-jobs/done/2015/09/17
> drwxrwx---   - hadoop supergroup    0 2015-09-21 16:59 /hadoop-jobs/done/2015/09/18
> drwxrwx---   - hadoop supergroup    0 2015-09-22 16:59 /hadoop-jobs/done/2015/09/19
> drwxrwx---   - hadoop supergroup    0 2015-09-23 16:59 /hadoop-jobs/done/2015/09/20
> drwxrwx---   - hadoop supergroup    0 2015-09-23 16:59 /hadoop-jobs/done/2015/09/21
> drwxrwx---   - hadoop supergroup    0 2015-09-22 23:43 /hadoop-jobs/done/2015/09/22
> drwxrwx---   - hadoop supergroup    0 2015-09-23 18:55 /hadoop-jobs/done/2015/09/23
> {noformat}
> In Spark, I think we can do the same thing.
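
To make the proposal concrete, here is a minimal sketch of how a target path could be chosen for each event log. It is only an illustration, not how Spark behaves: the function name target_path and the done_dir / intermediate_dir arguments are hypothetical and do not correspond to existing Spark options. The only things taken from the description above are the ".inprogress" suffix and the year/month/day layout of the Hadoop done-dir.

```python
from datetime import datetime
from posixpath import basename, join

# Suffix carried by logs of running applications in the listing above.
IN_PROGRESS_SUFFIX = ".inprogress"


def target_path(log_path, done_dir, intermediate_dir, finished_at=None):
    """Return where an event log would live under the proposed layout.

    done_dir and intermediate_dir are hypothetical settings (assumptions
    for this sketch); finished_at is the time used to pick the date bucket.
    """
    name = basename(log_path)
    if name.endswith(IN_PROGRESS_SUFFIX):
        # Still running: keep it in the flat intermediate directory.
        return join(intermediate_dir, name)
    if finished_at is None:
        raise ValueError("finished_at is required for a finished log")
    # Finished: bucket by date as <done_dir>/YYYY/MM/DD/<name>.
    return join(done_dir, finished_at.strftime("%Y/%m/%d"), name)


if __name__ == "__main__":
    # The directory names below are made up for the example.
    print(target_path(
        "/spark-jobs/eventLog/application_1440152921247_0206",
        "/spark-jobs/done",
        "/spark-jobs/intermediate",
        datetime(2015, 9, 18, 20, 40),
    ))
```

With the made-up directories above, the finished log application_1440152921247_0206 maps to /spark-jobs/done/2015/09/18/application_1440152921247_0206, while application_1440152921247_0205_1.inprogress would stay directly under /spark-jobs/intermediate.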


