Posted to reviews@spark.apache.org by rxin <gi...@git.apache.org> on 2016/05/19 20:28:14 UTC

[GitHub] spark pull request: [SPARK-15075][SQL] Cleanup dependencies betwee...

GitHub user rxin opened a pull request:

    https://github.com/apache/spark/pull/13200

    [SPARK-15075][SQL] Cleanup dependencies between SQLContext and SparkS…

    ## What changes were proposed in this pull request?
    Currently SparkSession.Builder uses SQLContext.getOrCreate. It should probably be the other way around, i.e. all the core logic goes in SparkSession, and SQLContext just calls that. This patch does that.
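    As a rough illustration of the intended shape (the names and settings below are illustrative, not part of this patch), obtaining a SQLContext should boil down to asking the SparkSession builder for a session and taking its wrapper:

        import org.apache.spark.sql.{SQLContext, SparkSession}

        object DelegationSketch {
          // Hypothetical helper: all creation logic lives in SparkSession.Builder,
          // and the SQLContext is just a thin wrapper around the resulting session.
          def sqlContextViaSession(): SQLContext = {
            val spark = SparkSession.builder()
              .master("local[*]")              // illustrative settings only
              .appName("delegation-sketch")
              .getOrCreate()                   // core getOrCreate logic lives here
            spark.sqlContext                   // SQLContext wraps the session
          }
        }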
    
    ## How was this patch tested?
    Updated tests to reflect the change.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark SPARK-15075

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13200.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13200
    
----
commit b3510243d9c557983f0107c6bd5bd273091295d1
Author: Reynold Xin <rx...@databricks.com>
Date:   2016-05-19T20:27:14Z

    [SPARK-15075][SQL] Cleanup dependencies between SQLContext and SparkSession

----




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220499663
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220482444
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13200#discussion_r63993278
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala ---
    @@ -735,29 +731,130 @@ object SparkSession {
         }
     
         /**
    -     * Gets an existing [[SparkSession]] or, if there is no existing one, creates a new one
    -     * based on the options set in this builder.
    +     * Gets an existing [[SparkSession]] or, if there is no existing one, creates a new
    +     * one based on the options set in this builder.
    +     *
    +     * This method first checks whether there is a valid thread-local SparkSession,
    +     * and if yes, return that one. It then checks whether there is a valid global
    +     * default SparkSession, and if yes, return that one. If no valid global default
    +     * SparkSession exists, the method creates a new SparkSession and assigns the
    +     * newly created SparkSession as the global default.
    +     *
    +     * In case an existing SparkSession is returned, the config options specified in
    +     * this builder will be applied to the existing SparkSession.
          *
          * @since 2.0.0
          */
         def getOrCreate(): SparkSession = synchronized {
    -      // Step 1. Create a SparkConf
    -      // Step 2. Get a SparkContext
    -      // Step 3. Get a SparkSession
    -      val sparkConf = new SparkConf()
    -      options.foreach { case (k, v) => sparkConf.set(k, v) }
    -      val sparkContext = SparkContext.getOrCreate(sparkConf)
    -
    -      SQLContext.getOrCreate(sparkContext).sparkSession
    +      // Get the session from current thread's active session.
    +      var session = activeThreadSession.get()
    +      if ((session ne null) && !session.sparkContext.isStopped) {
    +        options.foreach { case (k, v) => session.conf.set(k, v) }
    +        return session
    +      }
    +
    +      // Global synchronization so we will only set the default session once.
    +      SparkSession.synchronized {
    +        // If the current thread does not have an active session, get it from the global session.
    +        session = defaultSession.get()
    +        if ((session ne null) && !session.sparkContext.isStopped) {
    +          options.foreach { case (k, v) => session.conf.set(k, v) }
    +          return session
    +        }
    +
    +        // No active nor global default session. Create a new one.
    +        val sparkContext = userSuppliedContext.getOrElse {
    +          // set app name if not given
    +          if (!options.contains("spark.app.name")) {
    +            options += "spark.app.name" -> java.util.UUID.randomUUID().toString
    +          }
    +
    +          val sparkConf = new SparkConf()
    +          options.foreach { case (k, v) => sparkConf.set(k, v) }
    +          SparkContext.getOrCreate(sparkConf)
    +        }
    +        session = new SparkSession(sparkContext)
    +        options.foreach { case (k, v) => session.conf.set(k, v) }
    +        defaultSession.set(session)
    --- End diff --
    
    @rxin Ok. Got it. Thank you.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220516955
  
    Thanks - merging in master/2.0.





[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220508596
  
    **[Test build #58931 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58931/consoleFull)** for PR 13200 at commit [`4173a72`](https://github.com/apache/spark/commit/4173a72d9564caba19300f0ddb67751186f0bdf8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15075][SQL] Clean up dependencies betwe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220467897
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-15075][SQL] Clean up SparkSession build...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220480244
  
    cc @andrewor14 too, who wrote some of this




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220526609
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220482447
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58911/
    Test FAILed.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220490045
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220485808
  
    **[Test build #58919 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58919/consoleFull)** for PR 13200 at commit [`918b47b`](https://github.com/apache/spark/commit/918b47b3f55776c6f68d2f282d0a38927cd0e4ca).




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220496707
  
    **[Test build #58918 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58918/consoleFull)** for PR 13200 at commit [`af481d6`](https://github.com/apache/spark/commit/af481d678fe1e9bd998a74f89dd7e7484a24a627).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220482350
  
    **[Test build #58911 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58911/consoleFull)** for PR 13200 at commit [`c665771`](https://github.com/apache/spark/commit/c6657713b411e42b97b0f1fc05acff4163c04be6).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15075][SQL] Cleanup dependencies betwee...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220446766
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-15075][SQL] Cleanup dependencies betwee...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220443688
  
    **[Test build #58899 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58899/consoleFull)** for PR 13200 at commit [`b351024`](https://github.com/apache/spark/commit/b3510243d9c557983f0107c6bd5bd273091295d1).




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220490047
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58913/
    Test FAILed.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220499533
  
    **[Test build #58921 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58921/consoleFull)** for PR 13200 at commit [`c475453`](https://github.com/apache/spark/commit/c475453c59e6f795ea9283271fe83b35ef71bee6).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15075][SQL] Cleanup dependencies betwee...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220442699
  
    Note that I still need to add some tests to cover the behavior of the new builders.





[GitHub] spark pull request: [SPARK-15075][SQL] Cleanup dependencies betwee...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220446770
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58899/
    Test FAILed.




[GitHub] spark pull request: [SPARK-15075][SQL] Clean up dependencies betwe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220463818
  
    **[Test build #58907 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58907/consoleFull)** for PR 13200 at commit [`55ef850`](https://github.com/apache/spark/commit/55ef8505219aab9cc0075879e1a86193bbd20a92).




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220516531
  
    @marmbrus I know you were looking at this. Did you end up going through it?





[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220526448
  
    **[Test build #58939 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58939/consoleFull)** for PR 13200 at commit [`e4a4bc1`](https://github.com/apache/spark/commit/e4a4bc1f590770ff95f3fb0277b3e0e8050cec72).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class ConsoleSink(options: Map[String, String]) extends Sink with Logging `
      * `class ConsoleSinkProvider extends StreamSinkProvider with DataSourceRegister `




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220493033
  
    We should fix `session.py` to use the new Scala code path as well. Also, tests are failing because Python is still trying to call `wrapped`.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220497120
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220487194
  
    OK, I've pushed a commit to handle the SQL listener properly.





[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220508703
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220496856
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58918/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15075][SQL] Clean up dependencies betwe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220470605
  
    **[Test build #58911 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58911/consoleFull)** for PR 13200 at commit [`c665771`](https://github.com/apache/spark/commit/c6657713b411e42b97b0f1fc05acff4163c04be6).




[GitHub] spark pull request: [SPARK-15075][SQL] Clean up dependencies betwe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220467898
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58907/
    Test FAILed.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220499665
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58921/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220485088
  
    **[Test build #58918 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58918/consoleFull)** for PR 13200 at commit [`af481d6`](https://github.com/apache/spark/commit/af481d678fe1e9bd998a74f89dd7e7484a24a627).




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220487878
  
    **[Test build #58921 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58921/consoleFull)** for PR 13200 at commit [`c475453`](https://github.com/apache/spark/commit/c475453c59e6f795ea9283271fe83b35ef71bee6).




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220494012
  
    but not the builder, which still uses `SQLContext.getOrCreate`: https://github.com/rxin/spark/blob/c475453c59e6f795ea9283271fe83b35ef71bee6/python/pyspark/sql/session.py#L147




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13200




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220526612
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58939/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13200#discussion_r63992730
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala ---
    @@ -735,29 +731,130 @@ object SparkSession {
         }
     
         /**
    -     * Gets an existing [[SparkSession]] or, if there is no existing one, creates a new one
    -     * based on the options set in this builder.
    +     * Gets an existing [[SparkSession]] or, if there is no existing one, creates a new
    +     * one based on the options set in this builder.
    +     *
    +     * This method first checks whether there is a valid thread-local SparkSession,
    +     * and if yes, return that one. It then checks whether there is a valid global
    +     * default SparkSession, and if yes, return that one. If no valid global default
    +     * SparkSession exists, the method creates a new SparkSession and assigns the
    +     * newly created SparkSession as the global default.
    +     *
    +     * In case an existing SparkSession is returned, the config options specified in
    +     * this builder will be applied to the existing SparkSession.
          *
          * @since 2.0.0
          */
         def getOrCreate(): SparkSession = synchronized {
    -      // Step 1. Create a SparkConf
    -      // Step 2. Get a SparkContext
    -      // Step 3. Get a SparkSession
    -      val sparkConf = new SparkConf()
    -      options.foreach { case (k, v) => sparkConf.set(k, v) }
    -      val sparkContext = SparkContext.getOrCreate(sparkConf)
    -
    -      SQLContext.getOrCreate(sparkContext).sparkSession
    +      // Get the session from current thread's active session.
    +      var session = activeThreadSession.get()
    +      if ((session ne null) && !session.sparkContext.isStopped) {
    +        options.foreach { case (k, v) => session.conf.set(k, v) }
    +        return session
    +      }
    +
    +      // Global synchronization so we will only set the default session once.
    +      SparkSession.synchronized {
    +        // If the current thread does not have an active session, get it from the global session.
    +        session = defaultSession.get()
    +        if ((session ne null) && !session.sparkContext.isStopped) {
    +          options.foreach { case (k, v) => session.conf.set(k, v) }
    +          return session
    +        }
    +
    +        // No active nor global default session. Create a new one.
    +        val sparkContext = userSuppliedContext.getOrElse {
    +          // set app name if not given
    +          if (!options.contains("spark.app.name")) {
    +            options += "spark.app.name" -> java.util.UUID.randomUUID().toString
    +          }
    +
    +          val sparkConf = new SparkConf()
    +          options.foreach { case (k, v) => sparkConf.set(k, v) }
    +          SparkContext.getOrCreate(sparkConf)
    +        }
    +        session = new SparkSession(sparkContext)
    +        options.foreach { case (k, v) => session.conf.set(k, v) }
    +        defaultSession.set(session)
    --- End diff --
    
    @rxin Hi Reynold, I had a minor question just for my understanding. When users do a
    new SQLContext(), we create an implicit SparkSession. Should this session be made
    the defaultSession? If we call 1) new SQLContext and then 2) builder.getOrCreate(), what's the expected behaviour?
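    For concreteness, the sequence being asked about looks roughly like this (the local-mode setup is illustrative only):

        import org.apache.spark.{SparkConf, SparkContext}
        import org.apache.spark.sql.{SQLContext, SparkSession}

        object QuestionScenario {
          def main(args: Array[String]): Unit = {
            val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("question"))

            // 1) Constructing a SQLContext directly also creates an implicit SparkSession.
            val sqlContext = new SQLContext(sc)

            // 2) Does the builder now return that implicit session, or create a fresh one
            //    and install it as the global default?
            val session = SparkSession.builder().getOrCreate()
          }
        }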




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220496978
  
    **[Test build #58919 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58919/consoleFull)** for PR 13200 at commit [`918b47b`](https://github.com/apache/spark/commit/918b47b3f55776c6f68d2f282d0a38927cd0e4ca).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220493901
  
    Ah, yes.




[GitHub] spark pull request: [SPARK-15075][SQL] Cleanup dependencies betwee...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220446712
  
    **[Test build #58899 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58899/consoleFull)** for PR 13200 at commit [`b351024`](https://github.com/apache/spark/commit/b3510243d9c557983f0107c6bd5bd273091295d1).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220497122
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58919/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220516804
  
    LGTM




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220498825
  
    **[Test build #58931 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58931/consoleFull)** for PR 13200 at commit [`4173a72`](https://github.com/apache/spark/commit/4173a72d9564caba19300f0ddb67751186f0bdf8).




[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13200#discussion_r63973275
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala ---
    @@ -738,26 +734,127 @@ object SparkSession {
          * Gets an existing [[SparkSession]] or, if there is no existing one, creates a new one
          * based on the options set in this builder.
          *
    +     * This method first checks whether there is a valid thread-local SparkSession, and if yes,
    +     * return that one. It then checks whether there is a valid global default SparkSession,
    +     * and if yes, return that one. If no valid global default SparkSession exists, the method
    +     * creates a new SparkSession and assigns the newly created SparkSession as the global default.
    +     *
    +     * In case an existing SparkSession is returned, the config options specified in this builder
    +     * will be applied to the existing SparkSession.
    +     *
          * @since 2.0.0
          */
         def getOrCreate(): SparkSession = synchronized {
    -      // Step 1. Create a SparkConf
    -      // Step 2. Get a SparkContext
    -      // Step 3. Get a SparkSession
    -      val sparkConf = new SparkConf()
    -      options.foreach { case (k, v) => sparkConf.set(k, v) }
    -      val sparkContext = SparkContext.getOrCreate(sparkConf)
    -
    -      SQLContext.getOrCreate(sparkContext).sparkSession
    +      // Get the session from current thread's active session.
    +      var session = activeThreadSession.get()
    +      if ((session ne null) && !session.sparkContext.isStopped) {
    +        options.foreach { case (k, v) => session.conf.set(k, v) }
    +        return session
    +      }
    +
    +      // Global synchronization so we will only set the default session once.
    +      SparkSession.synchronized {
    +        // If the current thread does not have an active session, get it from the global session.
    +        session = defaultSession.get()
    +        if ((session ne null) && !session.sparkContext.isStopped) {
    +          options.foreach { case (k, v) => session.conf.set(k, v) }
    +          return session
    +        }
    +
    +        // No active nor global default session. Create a new one.
    +        val sparkContext = userSuppliedContext.getOrElse {
    +          // set app name if not given
    +          if (!options.contains("spark.app.name")) {
    +            options += "spark.app.name" -> java.util.UUID.randomUUID().toString
    +          }
    +
    +          val sparkConf = new SparkConf()
    +          options.foreach { case (k, v) => sparkConf.set(k, v) }
    +          SparkContext.getOrCreate(sparkConf)
    +        }
    +        session = new SparkSession(sparkContext)
    +        options.foreach { case (k, v) => session.conf.set(k, v) }
    +        defaultSession.set(session)
    +
    +        // Register a successfully instantiated context to the singleton. This should be at the
    +        // end of the class definition so that the singleton is updated only if there is no
    +        // exception in the construction of the instance.
    +        sparkContext.addSparkListener(new SparkListener {
    +          override def onApplicationEnd(applicationEnd: SparkListenerApplicationEnd): Unit = {
    +            defaultSession.set(null)
    +            // TODO(rxin): Do we need to also clear SQL listener?
    --- End diff --
    
    We need to clear it. Otherwise, after stopping the SparkContext, we leak it in object SQLContext.
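    A minimal sketch of what clearing could look like, assuming the listener is kept in an AtomicReference-style holder (the `sqlListener` field below is an assumption of this sketch, not confirmed by this thread):

        import java.util.concurrent.atomic.AtomicReference

        import org.apache.spark.SparkContext
        import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd}
        import org.apache.spark.sql.SparkSession

        object CleanupSketch {
          // Stand-ins for the shared state kept in the SparkSession / SQLContext objects.
          val defaultSession = new AtomicReference[SparkSession](null)
          val sqlListener = new AtomicReference[AnyRef](null)  // assumed holder for the SQL listener

          def registerCleanup(sparkContext: SparkContext): Unit = {
            sparkContext.addSparkListener(new SparkListener {
              override def onApplicationEnd(end: SparkListenerApplicationEnd): Unit = {
                defaultSession.set(null)  // drop the global default session
                sqlListener.set(null)     // drop the listener too, so it is not leaked after the context stops
              }
            })
          }
        }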




[GitHub] spark pull request: [SPARK-15075][SQL] Clean up SparkSession build...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220480176
  
    **[Test build #58913 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58913/consoleFull)** for PR 13200 at commit [`526896f`](https://github.com/apache/spark/commit/526896fb710b8bbce33a3c32daa59559a413bbf5).




[GitHub] spark pull request: [SPARK-15075][SQL] Clean up SparkSession build...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13200#discussion_r63970853
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala ---
    @@ -738,26 +734,127 @@ object SparkSession {
          * Gets an existing [[SparkSession]] or, if there is no existing one, creates a new one
          * based on the options set in this builder.
          *
    +     * This method first checks whether there is a valid thread-local SparkSession, and if yes,
    +     * return that one. It then checks whether there is a valid global default SparkSession,
    +     * and if yes, return that one. If no valid global default SparkSession exists, the method
    +     * creates a new SparkSession and assigns the newly created SparkSession as the global default.
    +     *
    +     * In case an existing SparkSession is returned, the config options specified in this builder
    +     * will be applied to the existing SparkSession.
    +     *
          * @since 2.0.0
          */
         def getOrCreate(): SparkSession = synchronized {
    -      // Step 1. Create a SparkConf
    -      // Step 2. Get a SparkContext
    -      // Step 3. Get a SparkSession
    -      val sparkConf = new SparkConf()
    -      options.foreach { case (k, v) => sparkConf.set(k, v) }
    -      val sparkContext = SparkContext.getOrCreate(sparkConf)
    -
    -      SQLContext.getOrCreate(sparkContext).sparkSession
    +      // Get the session from current thread's active session.
    +      var session = activeThreadSession.get()
    +      if ((session ne null) && !session.sparkContext.isStopped) {
    +        options.foreach { case (k, v) => session.conf.set(k, v) }
    +        return session
    +      }
    +
    +      // Global synchronization so we will only set the default session once.
    +      SparkSession.synchronized {
    +        // If the current thread does not have an active session, get it from the global session.
    +        session = defaultSession.get()
    +        if ((session ne null) && !session.sparkContext.isStopped) {
    +          options.foreach { case (k, v) => session.conf.set(k, v) }
    +          return session
    +        }
    +
    +        // No active nor global default session. Create a new one.
    +        val sparkContext = userSuppliedContext.getOrElse {
    +          // set app name if not given
    +          if (!options.contains("spark.app.name")) {
    +            options += "spark.app.name" -> java.util.UUID.randomUUID().toString
    +          }
    +
    +          val sparkConf = new SparkConf()
    +          options.foreach { case (k, v) => sparkConf.set(k, v) }
    +          SparkContext.getOrCreate(sparkConf)
    +        }
    +        session = new SparkSession(sparkContext)
    +        options.foreach { case (k, v) => session.conf.set(k, v) }
    +        defaultSession.set(session)
    +
    +        // Register a successfully instantiated context to the singleton. This should be at the
    +        // end of the class definition so that the singleton is updated only if there is no
    +        // exception in the construction of the instance.
    +        sparkContext.addSparkListener(new SparkListener {
    +          override def onApplicationEnd(applicationEnd: SparkListenerApplicationEnd): Unit = {
    +            defaultSession.set(null)
    +            // TODO(rxin): Do we need to also clear SQL listener?
    --- End diff --
    
    cc @zsxwing, any idea?





[GitHub] spark pull request: [SPARK-15075][SQL] Clean up dependencies betwe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220467867
  
    **[Test build #58907 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58907/consoleFull)** for PR 13200 at commit [`55ef850`](https://github.com/apache/spark/commit/55ef8505219aab9cc0075879e1a86193bbd20a92).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220515969
  
    **[Test build #58939 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58939/consoleFull)** for PR 13200 at commit [`e4a4bc1`](https://github.com/apache/spark/commit/e4a4bc1f590770ff95f3fb0277b3e0e8050cec72).


[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220496850
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220499040
  
    I updated the Python docs. The Python change seems slightly larger and, since it is not user-facing, I'm going to defer it to a separate PR.



[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13200#discussion_r63993145
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala ---
    @@ -735,29 +731,130 @@ object SparkSession {
         }
     
         /**
    -     * Gets an existing [[SparkSession]] or, if there is no existing one, creates a new one
    -     * based on the options set in this builder.
    +     * Gets an existing [[SparkSession]] or, if there is no existing one, creates a new
    +     * one based on the options set in this builder.
    +     *
    +     * This method first checks whether there is a valid thread-local SparkSession,
    +     * and if yes, returns that one. It then checks whether there is a valid global
    +     * default SparkSession, and if yes, returns that one. If no valid global default
    +     * SparkSession exists, the method creates a new SparkSession and assigns the
    +     * newly created SparkSession as the global default.
    +     *
    +     * In case an existing SparkSession is returned, the config options specified in
    +     * this builder will be applied to the existing SparkSession.
          *
          * @since 2.0.0
          */
         def getOrCreate(): SparkSession = synchronized {
    -      // Step 1. Create a SparkConf
    -      // Step 2. Get a SparkContext
    -      // Step 3. Get a SparkSession
    -      val sparkConf = new SparkConf()
    -      options.foreach { case (k, v) => sparkConf.set(k, v) }
    -      val sparkContext = SparkContext.getOrCreate(sparkConf)
    -
    -      SQLContext.getOrCreate(sparkContext).sparkSession
    +      // Get the session from current thread's active session.
    +      var session = activeThreadSession.get()
    +      if ((session ne null) && !session.sparkContext.isStopped) {
    +        options.foreach { case (k, v) => session.conf.set(k, v) }
    +        return session
    +      }
    +
    +      // Global synchronization so we will only set the default session once.
    +      SparkSession.synchronized {
    +        // If the current thread does not have an active session, get it from the global session.
    +        session = defaultSession.get()
    +        if ((session ne null) && !session.sparkContext.isStopped) {
    +          options.foreach { case (k, v) => session.conf.set(k, v) }
    +          return session
    +        }
    +
    +        // No active nor global default session. Create a new one.
    +        val sparkContext = userSuppliedContext.getOrElse {
    +          // set app name if not given
    +          if (!options.contains("spark.app.name")) {
    +            options += "spark.app.name" -> java.util.UUID.randomUUID().toString
    +          }
    +
    +          val sparkConf = new SparkConf()
    +          options.foreach { case (k, v) => sparkConf.set(k, v) }
    +          SparkContext.getOrCreate(sparkConf)
    +        }
    +        session = new SparkSession(sparkContext)
    +        options.foreach { case (k, v) => session.conf.set(k, v) }
    +        defaultSession.set(session)
    --- End diff --
    
    We would create a new one in that case ...
    
    I'm not too worried about the legacy corner cases here though.
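
    A small sketch of the corner case being discussed, assuming the
    (session ne null) && !session.sparkContext.isStopped guard quoted above: a default session
    whose context has been stopped is skipped, and the builder falls through to creating a new
    session. The master/appName values are placeholders.

        import org.apache.spark.sql.SparkSession

        object StoppedContextExample {
          def main(args: Array[String]): Unit = {
            // The cached default session's SparkContext gets stopped, so the guard fails
            // on the next getOrCreate() and a brand-new session is built instead.
            val stale = SparkSession.builder().master("local[2]").appName("demo").getOrCreate()
            stale.sparkContext.stop()

            val fresh = SparkSession.builder().master("local[2]").appName("demo").getOrCreate()
            assert(fresh ne stale) // a new session, not the stopped one
            fresh.stop()
          }
        }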


[GitHub] spark pull request: [SPARK-15075][SQL] Clean up SparkSession build...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13200#discussion_r63970867
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala ---
    @@ -738,26 +734,127 @@ object SparkSession {
          * Gets an existing [[SparkSession]] or, if there is no existing one, creates a new one
          * based on the options set in this builder.
          *
    +     * This method first checks whether there is a valid thread-local SparkSession, and if yes,
    +     * returns that one. It then checks whether there is a valid global default SparkSession,
    +     * and if yes, returns that one. If no valid global default SparkSession exists, the method
    +     * creates a new SparkSession and assigns the newly created SparkSession as the global default.
    +     *
    +     * In case an existing SparkSession is returned, the config options specified in this builder
    +     * will be applied to the existing SparkSession.
    +     *
          * @since 2.0.0
          */
         def getOrCreate(): SparkSession = synchronized {
    -      // Step 1. Create a SparkConf
    -      // Step 2. Get a SparkContext
    -      // Step 3. Get a SparkSession
    -      val sparkConf = new SparkConf()
    -      options.foreach { case (k, v) => sparkConf.set(k, v) }
    -      val sparkContext = SparkContext.getOrCreate(sparkConf)
    -
    -      SQLContext.getOrCreate(sparkContext).sparkSession
    +      // Get the session from current thread's active session.
    +      var session = activeThreadSession.get()
    +      if ((session ne null) && !session.sparkContext.isStopped) {
    +        options.foreach { case (k, v) => session.conf.set(k, v) }
    +        return session
    +      }
    +
    +      // Global synchronization so we will only set the default session once.
    +      SparkSession.synchronized {
    +        // If the current thread does not have an active session, get it from the global session.
    +        session = defaultSession.get()
    +        if ((session ne null) && !session.sparkContext.isStopped) {
    +          options.foreach { case (k, v) => session.conf.set(k, v) }
    +          return session
    +        }
    +
    +        // No active nor global default session. Create a new one.
    +        val sparkContext = userSuppliedContext.getOrElse {
    +          // set app name if not given
    +          if (!options.contains("spark.app.name")) {
    +            options += "spark.app.name" -> java.util.UUID.randomUUID().toString
    +          }
    +
    +          val sparkConf = new SparkConf()
    +          options.foreach { case (k, v) => sparkConf.set(k, v) }
    +          SparkContext.getOrCreate(sparkConf)
    +        }
    +        session = new SparkSession(sparkContext)
    +        options.foreach { case (k, v) => session.conf.set(k, v) }
    +        defaultSession.set(session)
    +
    +        // Register a successfully instantiated context to the singleton. This should be at the
    +        // end of the class definition so that the singleton is updated only if there is no
    +        // exception in the construction of the instance.
    +        sparkContext.addSparkListener(new SparkListener {
    +          override def onApplicationEnd(applicationEnd: SparkListenerApplicationEnd): Unit = {
    +            defaultSession.set(null)
    +            // TODO(rxin): Do we need to also clear SQL listener?
    --- End diff --
    
    Actually, @zsxwing, it might be good for you to review the entire PR.
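
    For reference, a self-contained sketch of the cleanup hook under discussion. The holder
    object below is only a stand-in for the companion object's private state, not the PR's
    actual field layout, and it deliberately leaves the SQL listener untouched since whether to
    clear it here is exactly the open question.

        import java.util.concurrent.atomic.AtomicReference

        import org.apache.spark.SparkContext
        import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd}
        import org.apache.spark.sql.SparkSession

        // Stand-in for the companion object's private default-session holder.
        object DefaultSessionHolder {
          val defaultSession = new AtomicReference[SparkSession](null)

          // Registers the application-end hook: once the SparkContext shuts down, the
          // cached default session is no longer usable and is dropped. Clearing the SQL
          // listener as well is the TODO raised in the review comment above.
          def registerCleanup(sc: SparkContext): Unit = {
            sc.addSparkListener(new SparkListener {
              override def onApplicationEnd(end: SparkListenerApplicationEnd): Unit = {
                defaultSession.set(null)
              }
            })
          }
        }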



[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220493679
  
    That's been updated?



[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220508705
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58931/
    Test PASSed.


[GitHub] spark pull request: [SPARK-15075][SPARK-15345][SQL] Clean up Spark...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13200#issuecomment-220489946
  
    **[Test build #58913 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58913/consoleFull)** for PR 13200 at commit [`526896f`](https://github.com/apache/spark/commit/526896fb710b8bbce33a3c32daa59559a413bbf5).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.

