You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by carsonwang <gi...@git.apache.org> on 2015/12/01 07:49:50 UTC

[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

GitHub user carsonwang opened a pull request:

    https://github.com/apache/spark/pull/10061

    [SPARK-11206] Support SQL UI on the history server (resubmit)

    On the live web UI, there is a SQL tab which provides valuable information for the SQL query. But once the workload is finished, we won't see the SQL tab on the history server. It will be helpful if we support SQL UI on the history server so we can analyze it even after its execution.
    
    To support SQL UI on the history server:
    1. I added an onOtherEvent method to the SparkListener trait and post all SQL related events to the same event bus.
    2. Two SQL events SparkListenerSQLExecutionStart and SparkListenerSQLExecutionEnd are defined in the sql module.
    3. The new SQL events are written to event log using Jackson.
    4. A new trait SparkHistoryListenerFactory is added to allow the history server to feed events to the SQL history listener. The SQL implementation is loaded at runtime using java.util.ServiceLoader.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/carsonwang/spark SqlHistoryUI

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10061.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10061
    
----
commit fdf9d28362fe991a1df0d5392c9021db78fa7541
Author: Carson Wang <ca...@intel.com>
Date:   2015-10-16T07:00:11Z

    Update SparkListener to handle other events

commit ff4075d89b480d99d0320390d00a5b10aadc9a93
Author: Carson Wang <ca...@intel.com>
Date:   2015-10-20T06:06:07Z

    Write sql events to event log

commit b9870e6579a59628c82470e55f2cb6c4ec8fa2a7
Author: Carson Wang <ca...@intel.com>
Date:   2015-10-21T06:29:37Z

    Move sql UI classes to core

commit 3833055b7c94247b700e6fd0565629e71611d307
Author: Carson Wang <ca...@intel.com>
Date:   2015-10-21T06:32:28Z

    rename SqlMetricInfo class name

commit c0abfc6b7432750812a47790fe51b35dacba7429
Author: Carson Wang <ca...@intel.com>
Date:   2015-10-22T02:22:51Z

    Update sql metric param

commit a5b1cf42a4847b2c9c9674ce0d6aa4d332498ca2
Author: Carson Wang <ca...@intel.com>
Date:   2015-10-26T02:09:32Z

    handle accumulator updates in sql history UI

commit 7b30bc736f09600b25772e30636f8a5c19c6db5e
Author: Carson Wang <ca...@intel.com>
Date:   2015-10-26T06:51:25Z

    Use a single SQL Tab for all SparkContext

commit d52288bb2e18a5f0e898110893a091919e13ea84
Author: Carson Wang <ca...@intel.com>
Date:   2015-10-27T08:49:30Z

    update style

commit 7a2acedfc524e5c5887bd783e6fbbe289313306a
Author: Carson Wang <ca...@intel.com>
Date:   2015-10-28T05:08:23Z

    Throws exception for unknown events

commit caab0bab0299d4eb985b2e5e68cc5813faac6dfb
Author: Carson Wang <ca...@intel.com>
Date:   2015-10-28T05:39:07Z

    Fix build error

commit 0af5afeafe894614e7c1cb83f343db0a0869ad77
Author: Carson Wang <ca...@intel.com>
Date:   2015-10-28T06:07:28Z

    Fix style

commit 8d565f24ab365482333d4ff2331c72b01d02358a
Author: Carson Wang <ca...@intel.com>
Date:   2015-11-06T02:46:35Z

    Avoid moving the sql classes

commit 1954d71d674dc35659cba6f9e3b56d42a756778e
Author: Carson Wang <ca...@intel.com>
Date:   2015-11-06T06:15:59Z

    code clean

commit 927bae84244f900724c3865aac969e4639594760
Author: Carson Wang <ca...@intel.com>
Date:   2015-11-06T07:17:45Z

    code clean

commit b03d98bcb6854a01b65fd5e43368580dec192482
Author: Carson Wang <ca...@intel.com>
Date:   2015-11-09T05:24:04Z

    minor fix

commit 51f913bed91dd95029b404092e8cd7c25b02cad6
Author: Carson Wang <ca...@intel.com>
Date:   2015-11-10T02:04:09Z

    Fix unit tests

commit bca3f5ffca9a9e6f5c56633812432892af66b4e7
Author: Carson Wang <ca...@intel.com>
Date:   2015-11-12T06:07:36Z

    Address comments

commit 60033f8da8bd564c01811c25dc9dbfdf897896b7
Author: Carson Wang <ca...@intel.com>
Date:   2015-11-12T06:28:31Z

    Fix RAT test

commit fe5c16529d821098658cb61ca0f0a21d2a568270
Author: Carson Wang <ca...@intel.com>
Date:   2015-11-12T06:36:20Z

    Fix style

commit 8d94707d4e44741d5e3c24a146d2bf526cf884ce
Author: Carson Wang <ca...@intel.com>
Date:   2015-11-19T03:24:50Z

    Address vanzin's comments

commit 56f24bafb3f5b9306bf393220d018e0620f1296d
Author: Carson Wang <ca...@intel.com>
Date:   2015-11-19T07:37:23Z

    Address hao's comments

commit 690277e37c76b98ba57bc4bd5b0ae2c87669578d
Author: Carson Wang <ca...@intel.com>
Date:   2015-11-20T02:39:40Z

    Remove one empty line

commit 5270209504a337b50cdbafe8cc3e8f83c6ea9745
Author: Carson Wang <ca...@intel.com>
Date:   2015-12-01T06:04:40Z

    clear sqlListener

commit 8222a0c61a1b05acde18c9d1825528805dedd16f
Author: Carson Wang <ca...@intel.com>
Date:   2015-12-01T06:39:02Z

    Merge branch 'master' into SqlHistoryUI

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-161765808
  
    **[Test build #47161 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47161/consoleFull)** for PR 10061 at commit [`8222a0c`](https://github.com/apache/spark/commit/8222a0c61a1b05acde18c9d1825528805dedd16f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-161061904
  
    Looks ok to me; can't really comment on the multiple UIs issue (aside from what Carson said being correct, if that functionality is desired).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10061#discussion_r63822163
  
    --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
    @@ -96,6 +100,7 @@ private[spark] object JsonProtocol {
             executorMetricsUpdateToJson(metricsUpdate)
           case blockUpdated: SparkListenerBlockUpdated =>
             throw new MatchError(blockUpdated)  // TODO(ekl) implement this
    +      case _ => parse(mapper.writeValueAsString(event))
    --- End diff --
    
    > Events are a public API, and they should be carefully crafted, since changing them affects user applications (including event logs). If there is unnecessary information in the event, then it's a bug in the event definition, not here.
    
    Yea. I totally agree. However, my concern is that having this line at here will make the developer harder to spot issues during the development. Since the serialization works automatically, we are not making a self-review on what will be serialized and what methods will be called during serialization a mandatory step, which makes the auditing work much harder. Although it introduces more work to the developer to make every event explicitly handled, when we review the pull request, we can clearly know what will be serialized and how a event is serialized when a pull request is submitted. What do you think?
    
    btw, if I am missing any context, please let me know :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-160878059
  
    One question that I had from the other PR: why is it okay to merge the event log streams from different SQLContexts into the same UI tab? Are relevant identifiers used by those SQL contexts (such as execution IDs) guaranteed to be unique as long as the SQLContexts belong to the same SparkContext?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by carsonwang <gi...@git.apache.org>.
Github user carsonwang commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-160889037
  
    Hi @JoshRosen, the execution IDs are from the static `SQLExecution` object. So I think they are always unique. Yes, previously each `SQLContext` has its own `SQLListener`. Since the SQL events are now sent to the same event bus, all `SQLListeners` will receives SQL events executed from other `SQLContexts`. If we still want to keep one UI Tab for each `SQLContext`, I think we need keep something like a SQLContext ID in the SQL event so that the `SQLListener` knows if it need handle that event. Is there any strong requirement to keep one UI Tab for each `SQLContext`? I remember some users don't want so many UI tabs because they create many `SQLContext`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-160913661
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10061#discussion_r46245537
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
    @@ -1263,6 +1264,8 @@ object SQLContext {
        */
       @transient private val instantiatedContext = new AtomicReference[SQLContext]()
     
    +  @transient private val sqlListener = new AtomicReference[SQLListener]()
    --- End diff --
    
    Just to make sure that it's not overlooked, please see my comment on the other PRs regarding whether this needs to actually hold a `SQLListener` instance or whether it can simply be an `AtomicBoolean`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10061#discussion_r63913361
  
    --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
    @@ -96,6 +100,7 @@ private[spark] object JsonProtocol {
             executorMetricsUpdateToJson(metricsUpdate)
           case blockUpdated: SparkListenerBlockUpdated =>
             throw new MatchError(blockUpdated)  // TODO(ekl) implement this
    +      case _ => parse(mapper.writeValueAsString(event))
    --- End diff --
    
    I'm perfectly ok with making auditing of these events harder if it means you're not writing manual serialization and de-serialization code like JsonProtocol.scala. The drawbacks of the latter are much worse for code readability and maintainability.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/10061


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10061#discussion_r63812300
  
    --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
    @@ -96,6 +100,7 @@ private[spark] object JsonProtocol {
             executorMetricsUpdateToJson(metricsUpdate)
           case blockUpdated: SparkListenerBlockUpdated =>
             throw new MatchError(blockUpdated)  // TODO(ekl) implement this
    +      case _ => parse(mapper.writeValueAsString(event))
    --- End diff --
    
    As I've said in similar conversations in other contexts, I'm strongly opposed to what you're suggesting. In fact I'm an advocate for exactly the opposite, and that's why I filed SPARK-12141.
    
    BTW just removing that line would break the feature this patch is implementing, unless you write a whole lot of code to manually serialize all the SQL-related events.
    
    Events are a public API, and they should be carefully crafted, since changing them affects user applications (including event logs). If there is unnecessary information in the event, then it's a bug in the event definition, not here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-161800279
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-160878538
  
    **[Test build #46949 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46949/consoleFull)** for PR 10061 at commit [`8222a0c`](https://github.com/apache/spark/commit/8222a0c61a1b05acde18c9d1825528805dedd16f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10061#discussion_r63808009
  
    --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
    @@ -96,6 +100,7 @@ private[spark] object JsonProtocol {
             executorMetricsUpdateToJson(metricsUpdate)
           case blockUpdated: SparkListenerBlockUpdated =>
             throw new MatchError(blockUpdated)  // TODO(ekl) implement this
    +      case _ => parse(mapper.writeValueAsString(event))
    --- End diff --
    
    It is very possible that we silently pull in unnecessary information. If we have new event types, we should handle those explicitly instead of relying on this line. I am proposing to revert this line.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-161096888
  
    Single tab is fine, just wanted to understand and make sure that it would be safe.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10061#discussion_r63921800
  
    --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
    @@ -96,6 +100,7 @@ private[spark] object JsonProtocol {
             executorMetricsUpdateToJson(metricsUpdate)
           case blockUpdated: SparkListenerBlockUpdated =>
             throw new MatchError(blockUpdated)  // TODO(ekl) implement this
    +      case _ => parse(mapper.writeValueAsString(event))
    --- End diff --
    
    Could you guys please comment on the bug I opened or the mailing list? Commenting on a long closed github PR is not really the best forum.
    
    I'd really like to understand why you think automatic serialization is a bad idea, since we use it in so many places. I think exactly the opposite - manual serialization is unmaintainable, error-prone, and a waste of developer time.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-161837903
  
    Merging to master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-161800116
  
    **[Test build #47161 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47161/consoleFull)** for PR 10061 at commit [`8222a0c`](https://github.com/apache/spark/commit/8222a0c61a1b05acde18c9d1825528805dedd16f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `class SparkPlanInfo(`\n  * `class SQLMetricInfo(`\n  * `case class SparkListenerSQLExecutionStart(`\n  * `case class SparkListenerSQLExecutionEnd(executionId: Long, time: Long)`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by carsonwang <gi...@git.apache.org>.
Github user carsonwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10061#discussion_r46245965
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
    @@ -1263,6 +1264,8 @@ object SQLContext {
        */
       @transient private val instantiatedContext = new AtomicReference[SQLContext]()
     
    +  @transient private val sqlListener = new AtomicReference[SQLListener]()
    --- End diff --
    
    Many unit tests use `sqlContext.listener`. Can you please suggest how to update the unit tests if we changed to use an AtomicBoolean?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10061#discussion_r46248402
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
    @@ -1263,6 +1264,8 @@ object SQLContext {
        */
       @transient private val instantiatedContext = new AtomicReference[SQLContext]()
     
    +  @transient private val sqlListener = new AtomicReference[SQLListener]()
    --- End diff --
    
    Ah, I see that we need this in order to be able to return a value from `createListenerAndUI`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-161764510
  
    alright, if no one else has comments I'll merge this after: retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10061#discussion_r63921076
  
    --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
    @@ -96,6 +100,7 @@ private[spark] object JsonProtocol {
             executorMetricsUpdateToJson(metricsUpdate)
           case blockUpdated: SparkListenerBlockUpdated =>
             throw new MatchError(blockUpdated)  // TODO(ekl) implement this
    +      case _ => parse(mapper.writeValueAsString(event))
    --- End diff --
    
    Yes I think this is a terrible idea. Actually back in the days when we introduced magic serialization, I was against it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-160913046
  
    **[Test build #46949 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46949/consoleFull)** for PR 10061 at commit [`8222a0c`](https://github.com/apache/spark/commit/8222a0c61a1b05acde18c9d1825528805dedd16f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `  case class SubExprEliminationState(isNull: String, value: String)`\n  * `class SparkPlanInfo(`\n  * `class SQLMetricInfo(`\n  * `case class SparkListenerSQLExecutionStart(`\n  * `case class SparkListenerSQLExecutionEnd(executionId: Long, time: Long)`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-161800283
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47161/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10061#discussion_r63914121
  
    --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
    @@ -96,6 +100,7 @@ private[spark] object JsonProtocol {
             executorMetricsUpdateToJson(metricsUpdate)
           case blockUpdated: SparkListenerBlockUpdated =>
             throw new MatchError(blockUpdated)  // TODO(ekl) implement this
    +      case _ => parse(mapper.writeValueAsString(event))
    --- End diff --
    
    BTW this is really not the right forum to discuss this. If you want to discuss big changes like you're proposing, please discuss on the bug I opened (referenced above) or start a thread on the mailing list.
    
    Your suggestion of removing that line will just break the feature and, to restore it, would require an insane amount of code motion and new code to be written. To start with, the SQL events are not even available in "core", so you can't reference the classes here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10061#issuecomment-160913665
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46949/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10061#discussion_r63804220
  
    --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
    @@ -96,6 +100,7 @@ private[spark] object JsonProtocol {
             executorMetricsUpdateToJson(metricsUpdate)
           case blockUpdated: SparkListenerBlockUpdated =>
             throw new MatchError(blockUpdated)  // TODO(ekl) implement this
    +      case _ => parse(mapper.writeValueAsString(event))
    --- End diff --
    
    What is the reason to add this line? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org