
Posted to reviews@spark.apache.org by jerryshao <gi...@git.apache.org> on 2016/06/16 17:57:07 UTC

[GitHub] spark pull request #13712: [SPARK-15990][YARN] Add rolling log aggregation s...

GitHub user jerryshao opened a pull request:

    https://github.com/apache/spark/pull/13712

    [SPARK-15990][YARN] Add rolling log aggregation support for Spark on yarn

    ## What changes were proposed in this pull request?
    
    YARN has supported rolling log aggregation since 2.6. Previously, logs were only aggregated to HDFS after the application finished, which is quite painful for long-running applications such as Spark Streaming and the Thrift server. An out-of-disk problem can also occur when a log file grows too large. This PR proposes adding rolling log aggregation support for Spark on YARN.
    
    One limitation is that log4j must be configured to use a file appender. Spark itself uses the console appender by default, in which case the log file is not created again once it is removed after aggregation. But I think many production users have already changed their log4j configuration from the default one, so this is not a big problem.
    
    ## How was this patch tested?
    
    Manually verified with Hadoop 2.7.1.
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jerryshao/apache-spark SPARK-15990

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13712.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13712
    

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    **[Test build #61437 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61437/consoleFull)** for PR 13712 at commit [`f5094da`](https://github.com/apache/spark/commit/f5094dac65e94954b224e5d5bf1f18b2c4333a3a).



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61437/
    Test PASSed.



[GitHub] spark pull request #13712: [SPARK-15990][YARN] Add rolling log aggregation s...

Posted by jerryshao <gi...@git.apache.org>.

Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13712#discussion_r99450752
  
    --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -271,6 +271,33 @@ private[spark] class Client(
             appContext.setResource(capability)
         }
     
    +    sparkConf.get(ROLLED_LOG_INCLUDE_PATTERN).foreach { includePattern =>
    --- End diff --
    
    Yes, I think your understanding is also right. My original assumption was that the exclude pattern would only take effect when an include pattern is set, which might be OK. But I think your change should also be fine.



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by jerryshao <gi...@git.apache.org>.

Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    Thanks a lot @tgravescs for your review. I tested locally against a lower version of Hadoop, both compiling and running, and it works fine except for this feature.



[GitHub] spark pull request #13712: [SPARK-15990][YARN] Add rolling log aggregation s...

Posted by jerryshao <gi...@git.apache.org>.

Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13712#discussion_r68681679
  
    --- Diff: docs/running-on-yarn.md ---
    @@ -472,6 +472,29 @@ To use a custom metrics.properties for the application master and executors, upd
       Currently supported services are: <code>hive</code>, <code>hbase</code>
       </td>
     </tr>
    +<tr>
    +  <td><code>spark.yarn.rolledLog.includePattern</code></td>
    +  <td>(none)</td>
    +  <td>
    +  Java Regex to filter the log files which match the defined include pattern
    +  and those log files will be aggregated in a rolling fashion.
    +  This will be used with YARN's rolling log aggregation, to enable this feature in YARN side
    +  <code>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</code> should be
    +  configured in yarn-site.xml.
    +  Besides this feature can only be used with Hadoop 2.6.1+. And the log4j appender should be changed to
    +  File appender. Based on the file name configured in log4j configuration (like spark.log),
    --- End diff --
    
    Hi @tgravescs, the problem here is that by default Spark's log4j uses `ConsoleAppender`, and the output is redirected to the `stdout` and `stderr` files by the YARN application start command. YARN rolling log aggregation collects the logs and then deletes them. Once the `stdout` and `stderr` files are collected and deleted, new `stdout` and `stderr` files are not generated again, so logs will be missing. With FileAppender, new files are created after deletion, which is why only FileAppender works.



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by tgravescs <gi...@git.apache.org>.

Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    +1



[GitHub] spark pull request #13712: [SPARK-15990][YARN] Add rolling log aggregation s...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13712#discussion_r99353603
  
    --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -271,6 +271,33 @@ private[spark] class Client(
             appContext.setResource(capability)
         }
     
    +    sparkConf.get(ROLLED_LOG_INCLUDE_PATTERN).foreach { includePattern =>
    --- End diff --
    
    @jerryshao on this old commit: am I right that this outer foreach is not supposed to include all of the rest of the body here? It looks like, for each include pattern, it will process all exclude patterns and set the context again. Are these supposed to be two separate loops, or did I miss the point?
    
    No worries if so, I may be able to change this if needed while making changes for 2.6.
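
The difference between the nested form and two separate steps can be sketched in Java as follows. This is only an illustration under assumptions: it assumes the include and exclude settings are each an optional single value, and `LogContext`, `nested`, and `independent` are hypothetical names, not code from the patch or from YARN.

```java
import java.util.Optional;

final class RolledLogContextSketch {
    // Hypothetical stand-in for the context object the client fills in; not a YARN class.
    static final class LogContext {
        String includePattern;
        String excludePattern;
    }

    // Nested form: the exclude pattern is only looked at when an include pattern is set.
    static Optional<LogContext> nested(Optional<String> include, Optional<String> exclude) {
        return include.map(inc -> {
            LogContext ctx = new LogContext();
            ctx.includePattern = inc;
            exclude.ifPresent(exc -> ctx.excludePattern = exc);
            return ctx;
        });
    }

    // Independent form: either pattern alone is enough to produce a context.
    static Optional<LogContext> independent(Optional<String> include, Optional<String> exclude) {
        if (!include.isPresent() && !exclude.isPresent()) {
            return Optional.empty();
        }
        LogContext ctx = new LogContext();
        include.ifPresent(inc -> ctx.includePattern = inc);
        exclude.ifPresent(exc -> ctx.excludePattern = exc);
        return Optional.of(ctx);
    }

    public static void main(String[] args) {
        Optional<String> noInclude = Optional.empty();
        Optional<String> exclude = Optional.of("stdout");
        System.out.println("nested sets a context: " + nested(noInclude, exclude).isPresent());
        System.out.println("independent sets a context: " + independent(noInclude, exclude).isPresent());
    }
}
```

With only an exclude pattern configured, the nested form produces no context at all, while the independent form still carries the exclude pattern through.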



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by tgravescs <gi...@git.apache.org>.

Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    @jerryshao what exactly did you do to test this?



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    **[Test build #60653 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60653/consoleFull)** for PR 13712 at commit [`20140b1`](https://github.com/apache/spark/commit/20140b1ba47a17e8821b5bbf6b05bb14ad728822).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    **[Test build #60653 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60653/consoleFull)** for PR 13712 at commit [`20140b1`](https://github.com/apache/spark/commit/20140b1ba47a17e8821b5bbf6b05bb14ad728822).



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60653/
    Test PASSed.



[GitHub] spark pull request #13712: [SPARK-15990][YARN] Add rolling log aggregation s...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13712



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    **[Test build #61437 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61437/consoleFull)** for PR 13712 at commit [`f5094da`](https://github.com/apache/spark/commit/f5094dac65e94954b224e5d5bf1f18b2c4333a3a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request #13712: [SPARK-15990][YARN] Add rolling log aggregation s...

Posted by tgravescs <gi...@git.apache.org>.

Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13712#discussion_r68241098
  
    --- Diff: docs/running-on-yarn.md ---
    @@ -472,6 +472,28 @@ To use a custom metrics.properties for the application master and executors, upd
       Currently supported services are: <code>hive</code>, <code>hbase</code>
       </td>
     </tr>
    +<tr>
    +  <td><code>spark.yarn.rolledLog.includePattern</code></td>
    +  <td>(none)</td>
    +  <td>
    +  Java Regex to filter the log files which match the defined include pattern
    +  and those log files will be aggregated in a rolling fashion.
    +  This will be used with YARN's rolling log aggregation, to enable this feature in YARN side
    +  <code>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</code> should be
    +  configured in yarn-site.xml.
    +  Besides this feature can only be used with Hadoop 2.6.1+. And the log4j appender should be changed to
    --- End diff --
    
    It might be nice to say what the user should set this to by default, i.e. if I'm running stock, out-of-the-box Spark, I need to include stdout and stderr, and then you may need to include others based on your logging setup.



[GitHub] spark pull request #13712: [SPARK-15990][YARN] Add rolling log aggregation s...

Posted by jerryshao <gi...@git.apache.org>.

Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13712#discussion_r68876375
  
    --- Diff: docs/running-on-yarn.md ---
    @@ -472,6 +472,29 @@ To use a custom metrics.properties for the application master and executors, upd
       Currently supported services are: <code>hive</code>, <code>hbase</code>
       </td>
     </tr>
    +<tr>
    +  <td><code>spark.yarn.rolledLog.includePattern</code></td>
    +  <td>(none)</td>
    +  <td>
    +  Java Regex to filter the log files which match the defined include pattern
    +  and those log files will be aggregated in a rolling fashion.
    +  This will be used with YARN's rolling log aggregation, to enable this feature in YARN side
    +  <code>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</code> should be
    +  configured in yarn-site.xml.
    +  Besides this feature can only be used with Hadoop 2.6.1+. And the log4j appender should be changed to
    +  File appender. Based on the file name configured in log4j configuration (like spark.log),
    --- End diff --
    
    Thanks a lot @tgravescs for your suggestion, I will change it.



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by jerryshao <gi...@git.apache.org>.

Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    I just enabled rolling log aggregation on the YARN side and ran the application with this patch applied to see whether logs are actually aggregated periodically.



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    **[Test build #61293 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61293/consoleFull)** for PR 13712 at commit [`140f2ee`](https://github.com/apache/spark/commit/140f2eefab651f476d3157892cd0e8e03b846564).



[GitHub] spark pull request #13712: [SPARK-15990][YARN] Add rolling log aggregation s...

Posted by tgravescs <gi...@git.apache.org>.

Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13712#discussion_r68765722
  
    --- Diff: docs/running-on-yarn.md ---
    @@ -472,6 +472,29 @@ To use a custom metrics.properties for the application master and executors, upd
       Currently supported services are: <code>hive</code>, <code>hbase</code>
       </td>
     </tr>
    +<tr>
    +  <td><code>spark.yarn.rolledLog.includePattern</code></td>
    +  <td>(none)</td>
    +  <td>
    +  Java Regex to filter the log files which match the defined include pattern
    +  and those log files will be aggregated in a rolling fashion.
    +  This will be used with YARN's rolling log aggregation, to enable this feature in YARN side
    +  <code>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</code> should be
    +  configured in yarn-site.xml.
    +  Besides this feature can only be used with Hadoop 2.6.1+. And the log4j appender should be changed to
    +  File appender. Based on the file name configured in log4j configuration (like spark.log),
    --- End diff --
    
    Ah, ok that makes sense. 
    
    Can we change the wording slightly just to clarify, maybe something like:
    
    This feature can only be used with Hadoop 2.6.1+. The Spark log4j appender needs to be changed to use FileAppender or another appender that can handle the files being removed while it is running. Based on the file name configured in the log4j configuration (like spark.log), the user should set the regex (spark*) to include all the log files that need to be aggregated.
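
Since the include pattern is described as a Java regex rather than a shell glob, `spark*` reads as "spar" followed by zero or more "k" characters. The sketch below compares it with `spark.*` on a few sample file names. The class and method names are hypothetical, and whether YARN applies the pattern as a full match or as a substring search is not stated in this thread, so both are shown.

```java
import java.util.regex.Pattern;

final class RolledLogPatternDemo {
    // True only if the whole file name matches the regex.
    static boolean fullMatch(String regex, String fileName) {
        return Pattern.compile(regex).matcher(fileName).matches();
    }

    // True if any substring of the file name matches the regex.
    static boolean partialMatch(String regex, String fileName) {
        return Pattern.compile(regex).matcher(fileName).find();
    }

    public static void main(String[] args) {
        String[] regexes = {"spark*", "spark.*"};
        String[] names = {"spark.log", "spark.log.1", "stderr"};
        for (String regex : regexes) {
            for (String name : names) {
                System.out.printf("%-8s %-12s full=%-5b partial=%b%n",
                        regex, name, fullMatch(regex, name), partialMatch(regex, name));
            }
        }
    }
}
```

Under full-match semantics `spark*` does not match `spark.log`, while `spark.*` does; under substring search both match. Writing the pattern as `spark.*` behaves the same way in either case.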




[GitHub] spark pull request #13712: [SPARK-15990][YARN] Add rolling log aggregation s...

Posted by tgravescs <gi...@git.apache.org>.

Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13712#discussion_r68579015
  
    --- Diff: docs/running-on-yarn.md ---
    @@ -472,6 +472,29 @@ To use a custom metrics.properties for the application master and executors, upd
       Currently supported services are: <code>hive</code>, <code>hbase</code>
       </td>
     </tr>
    +<tr>
    +  <td><code>spark.yarn.rolledLog.includePattern</code></td>
    +  <td>(none)</td>
    +  <td>
    +  Java Regex to filter the log files which match the defined include pattern
    +  and those log files will be aggregated in a rolling fashion.
    +  This will be used with YARN's rolling log aggregation, to enable this feature in YARN side
    +  <code>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</code> should be
    +  configured in yarn-site.xml.
    +  Besides this feature can only be used with Hadoop 2.6.1+. And the log4j appender should be changed to
    +  File appender. Based on the file name configured in log4j configuration (like spark.log),
    --- End diff --
    
    Thanks for updating. Now that I re-read this, I'm wondering why I have to use FileAppender. Can't I just tell it to handle the current stderr and stdout files?
    




[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    **[Test build #61293 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61293/consoleFull)** for PR 13712 at commit [`140f2ee`](https://github.com/apache/spark/commit/140f2eefab651f476d3157892cd0e8e03b846564).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61293/
    Test PASSed.



[GitHub] spark issue #13712: [SPARK-15990][YARN] Add rolling log aggregation support ...

Posted by tgravescs <gi...@git.apache.org>.

Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/13712
  
    Other than the minor doc changes, it looks good. It might be nice to test compiling against an older version of Hadoop, just as a test.

