Posted to reviews@spark.apache.org by tsudukim <gi...@git.apache.org> on 2014/07/24 03:14:01 UTC

[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

GitHub user tsudukim opened a pull request:

    https://github.com/apache/spark/pull/1558

    [SPARK-2458] Make failed application log visible on History Server

    Modified to show uncompleted applications in the History Server UI.
    Changed the app sort order to be based on startTime (originally endTime) because uncompleted apps don't have a proper endTime.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tsudukim/spark feature/SPARK-2458

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1558.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1558
    
----
commit 503d8abb9ae24eb6b211481a60f4b348d125a69a
Author: Masayoshi TSUZUKI <ts...@oss.nttdata.co.jp>
Date:   2014-07-16T00:12:42Z

    [SPARK-2458] Make failed application log visible on History Server
    
    Modified to show completed applications in History Server ui.
    Changed the app sort order to be based on startTime (originally endTime) because failed apps don't have a proper endTime.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-55334845
  
    Also, looks like this has merge conflicts. It would be great if you could rebase to master. Thanks!


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-55333925
  
    retest this please


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by tsudukim <gi...@git.apache.org>.
Github user tsudukim commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-50060159
  
    Thank you for following this PR.
    Let me explain a little.
    I'm sorry that my use of the word "uncompleted" was misleading. The purpose of this PR is to show "failed" apps in the HS, not running apps. However, it is true that with this approach we can't tell from the log whether an app has already failed or is still running, so as a result both show up in the HS.
    
    On the first point, the purpose is to show apps that failed in the past, so this PR still fits the concept of the HS.
    On the second point, the target of this PR is apps that never go into the "finished" state.
    On the third point, the sort order is the same in both modes. But your suggestion makes sense. A separate table or tab might be better.


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by tsudukim <gi...@git.apache.org>.
Github user tsudukim commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-57430441
  
    Thank you @andrewor14
    I've been investigating this problem in our environment recently, and it turned out to be a very rare case, as @vanzin suggested at first
    (for example, the JVM is lost and fails to call SparkContext::stop(), or writing to HDFS fails for some reason).
    My PR is not a smart way to solve such a rare case,
    so I am dropping this PR.
    Thank you again for your comments.


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-50060945
  
    Hmm. A properly-written app that fails should still show up as finished:
    
        val sc = new SparkContext(blah)
        try {
          doStuff()
        } finally {
          sc.stop()
        }
    
    Of course that's not guaranteed to work 100% of the time (for that we'd need an external entity monitoring the app, since we can't trust the app itself to do the right thing), but should cover most cases.
    
    re: sorting, I see what you mean. Still, I think sorting by end time is more natural for someone checking app history. Perhaps at some point we should let the user pick how to sort / filter the list, but that's a separate discussion.
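
The try/finally pattern in the comment above can be exercised without a Spark installation. The following is only a sketch: `FakeContext` and `runApp` are hypothetical stand-ins introduced for illustration (they are not Spark APIs), and the sketch assumes that calling `stop()` is what allows an application to be recorded as finished.

```scala
object StopOnFailure {
  // Hypothetical stand-in for SparkContext, so the sketch runs without Spark.
  final class FakeContext {
    private var stopped = false
    def stop(): Unit = { stopped = true }
    def isStopped: Boolean = stopped
  }

  // Runs `body` against a fresh context and always stops the context,
  // whether `body` returns normally or throws.
  def runApp(body: FakeContext => Unit): FakeContext = {
    val sc = new FakeContext
    try {
      try body(sc)
      finally sc.stop() // executed on both the success and the failure path
    } catch {
      // Swallowed here only so that the caller can inspect the context.
      case _: Exception => ()
    }
    sc
  }
}
```

A failing body such as `StopOnFailure.runApp(_ => throw new RuntimeException("boom"))` still returns a context whose `isStopped` is true. As the comment notes, this cannot help when the JVM itself is lost before the `finally` block runs.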


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-55334841
  
    QA results for PR 1558:
    - This patch FAILED unit tests.
    
    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20169/consoleFull


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-55164137
  
    QA tests have started for PR 1558. This patch DID NOT merge cleanly!
    View progress: https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/14/consoleFull


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-49956952
  
    Can one of the admins verify this patch?


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-55334577
  
    Hi @tsudukim, how does the user see the incomplete applications? As @vanzin suggested, the semantics of a history server is that it displays completed applications only. That said, since we can't distinguish running and failed applications, we might want a way to expose the potentially failed applications.
    
    I think the UI should have a subtle "Show incomplete applications" link that only expands if the user clicks on it. These should be in a separate table by themselves so we don't mix them with the ones we know are complete. As for sorting, I agree with @vanzin that end time is more natural than start time. For incomplete applications, actually, won't the end time always be infinity or some special value? Maybe we can use that to detect whether an application has finished.
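
The separate-table idea above could look roughly like the sketch below. Everything in it is an assumption made for illustration: `AppInfo`, the `NoEndTime` sentinel value of -1, and `tables` are hypothetical names, not the History Server's actual data model.

```scala
object AppListing {
  // Hypothetical sentinel for an app whose log never recorded an end time.
  val NoEndTime: Long = -1L

  // Hypothetical summary of one application's event log.
  final case class AppInfo(id: String, startTime: Long, endTime: Long) {
    def isComplete: Boolean = endTime != NoEndTime
  }

  // Splits apps into (complete, incomplete) tables.
  // Complete apps are sorted by end time, newest first. Incomplete apps have
  // no usable end time, so they are sorted by start time, newest first.
  def tables(apps: Seq[AppInfo]): (Seq[AppInfo], Seq[AppInfo]) = {
    val (complete, incomplete) = apps.partition(_.isComplete)
    (complete.sortBy(a => -a.endTime), incomplete.sortBy(a => -a.startTime))
  }
}
```

Keeping the two lists apart means each table has a single, consistent sort key, which avoids the mixed ordering that was raised as confusing earlier in the thread.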


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-57500764
  
    @tsudukim Actually the high-level fix here is not a bad idea. Right now if the logs don't show up here the user has to manually figure out whether the APPLICATION_COMPLETE file is present. It would be good to show some feedback to the user so they don't have to guess if their paths are set properly or their application terminated properly etc.
    
    Let me know if you're interested in submitting a new PR that addresses the comments raised in this one.
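
The manual check described above (whether the APPLICATION_COMPLETE file is present) could be automated along these lines. This is a sketch under stated assumptions: it looks at a directory on the local filesystem through `java.nio.file`, whereas real event logs may live on HDFS, and `LogDirStatus` with its three states is a hypothetical helper, not part of Spark.

```scala
import java.nio.file.{Files, Path}

object LogDirStatus {
  // Marker file name as mentioned in the comment above.
  val CompleteMarker = "APPLICATION_COMPLETE"

  sealed trait Status
  case object Missing extends Status    // directory not found: the path may be wrong
  case object Incomplete extends Status // directory exists but has no completion marker
  case object Complete extends Status   // completion marker is present

  // Classifies one application's log directory (local filesystem only).
  def check(appLogDir: Path): Status =
    if (!Files.isDirectory(appLogDir)) Missing
    else if (Files.exists(appLogDir.resolve(CompleteMarker))) Complete
    else Incomplete
}
```

Distinguishing `Missing` from `Incomplete` is the point of the sketch: it separates "the paths are not set properly" from "the application did not terminate properly", which are the two guesses the comment says users currently have to make.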


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-55164303
  
    QA results for PR 1558:
    - This patch FAILED unit tests.
    
    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/14/consoleFull


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by tsudukim <gi...@git.apache.org>.
Github user tsudukim closed the pull request at:

    https://github.com/apache/spark/pull/1558


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by tsudukim <gi...@git.apache.org>.
Github user tsudukim commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-64510185
  
    @andrewor14 I created a new PR (#3467) following your comment. Please check it.


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-50052724
  
    Disclaimer: haven't looked at the code yet.
    
    I'm a little conflicted about exposing running apps in the history server, especially this way. First, "history" sort of implies things that happened in the past.
    
    Second, misbehaving apps can cause log files to never go into a "finished" state (e.g. by failing to call `SparkContext::stop()`) - although you can make the argument that anyone can write anything to the root log dir anyway.
    
    Third, the user experience from your screenshots is very weird. When just looking at finished apps, things are sorted one way, but when including unfinished ones, they're sorted another way. That's super confusing, especially when you have paging.
    
    If listing running apps in the HS is really wanted, I'd suggest an approach where running apps are shown separately from finished ones. Either in a separate table, or a separate tab in the UI.


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-55334702
  
    QA tests have started for PR 1558. This patch DID NOT merge cleanly!
    View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20169/consoleFull


[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by tsudukim <gi...@git.apache.org>.
Github user tsudukim commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-49956854
  
    By default, we get the same UI as now.
    ![spark-2458-notinclude](https://cloud.githubusercontent.com/assets/8070366/3682544/dca4bb96-12cf-11e4-9965-0efa231babd9.png)
    
    When the link above the table is clicked, we also get a list that includes the apps that didn't finish successfully.
    ![spark-2458-include](https://cloud.githubusercontent.com/assets/8070366/3682546/e191fc54-12cf-11e4-9c93-a4a3115f82f2.png)



[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1558#issuecomment-54694566
  
    Can one of the admins verify this patch?

