Posted to commits@spark.apache.org by we...@apache.org on 2017/07/05 02:29:42 UTC
spark git commit: [SPARK-20256][SQL][BRANCH-2.1] SessionState should be created more lazily
Repository: spark
Updated Branches:
refs/heads/branch-2.1 3ecef2491 -> 8f1ca6957
[SPARK-20256][SQL][BRANCH-2.1] SessionState should be created more lazily
## What changes were proposed in this pull request?
`SessionState` is designed to be created lazily. In reality, however, it is created immediately in `SparkSession.Builder.getOrCreate` ([here](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L943)).
This PR aims to restore the lazy behavior by buffering the options in `initialSessionOptions` and applying them only when `sessionState` is actually created. As a result, users can start `spark-shell` and use RDD operations without any problems.
**BEFORE**
```scala
$ bin/spark-shell
java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder'
...
Caused by: org.apache.spark.sql.AnalysisException:
org.apache.hadoop.hive.ql.metadata.HiveException:
MetaException(message:java.security.AccessControlException:
Permission denied: user=spark, access=READ,
inode="/apps/hive/warehouse":hive:hdfs:drwx------
```
As reported in SPARK-20256, this happens when the user does not have permission to read the warehouse directory.
**AFTER**
```scala
$ bin/spark-shell
...
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.1.2-SNAPSHOT
/_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_131)
Type in expressions to have them evaluated.
Type :help for more information.
scala> sc.range(0, 10, 1).count()
res0: Long = 10
```
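The pattern the patch introduces, buffering builder options and applying them only when the lazy `sessionState` is first materialized, can be sketched in isolation. This is a minimal standalone sketch; all class and method names below are hypothetical stand-ins, not Spark's actual API:

```scala
import scala.collection.mutable

// Stand-in for the expensive, failure-prone state (e.g. HiveSessionStateBuilder).
class ExpensiveState {
  private val conf = mutable.HashMap[String, String]()
  def setConfString(k: String, v: String): Unit = conf.put(k, v)
  def getConfString(k: String): Option[String] = conf.get(k)
}

class Session {
  // Options supplied at build time are buffered here instead of
  // forcing state creation (mirrors `initialSessionOptions`).
  private val initialSessionOptions = mutable.HashMap[String, String]()
  var stateCreated = false // for illustration only

  def putOption(k: String, v: String): Unit = initialSessionOptions.put(k, v)

  // The state is built on first access; buffered options are applied then.
  lazy val sessionState: ExpensiveState = {
    stateCreated = true
    val state = new ExpensiveState
    initialSessionOptions.foreach { case (k, v) => state.setConfString(k, v) }
    state
  }
}

val s = new Session
s.putOption("spark.sql.shuffle.partitions", "4")
assert(!s.stateCreated) // options were only buffered; nothing was instantiated
assert(s.sessionState.getConfString("spark.sql.shuffle.partitions") == Some("4"))
assert(s.stateCreated)  // first access triggered creation and applied the options
```

With this shape, a build-time failure inside state creation (such as the `AccessControlException` above) is deferred until the session state is actually needed, so RDD-only workloads never hit it.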
## How was this patch tested?
Manual.
Author: Dongjoon Hyun <do...@apache.org>
Closes #18530 from dongjoon-hyun/SPARK-20256-BRANCH-2.1.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8f1ca695
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8f1ca695
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8f1ca695
Branch: refs/heads/branch-2.1
Commit: 8f1ca695753ca4692c091791d4cc66b06e176d52
Parents: 3ecef24
Author: Dongjoon Hyun <do...@apache.org>
Authored: Wed Jul 5 10:29:37 2017 +0800
Committer: Wenchen Fan <we...@databricks.com>
Committed: Wed Jul 5 10:29:37 2017 +0800
----------------------------------------------------------------------
.../main/scala/org/apache/spark/sql/SparkSession.scala | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/spark/blob/8f1ca695/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
index f3dde48..f886069 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
@@ -102,14 +102,22 @@ class SparkSession private(
}
/**
+ * Initial options for the session. These options are applied once when sessionState is created.
+ */
+ @transient
+ private[sql] val initialSessionOptions = new scala.collection.mutable.HashMap[String, String]
+
+ /**
* State isolated across sessions, including SQL configurations, temporary tables, registered
* functions, and everything else that accepts a [[org.apache.spark.sql.internal.SQLConf]].
*/
@transient
private[sql] lazy val sessionState: SessionState = {
- SparkSession.reflect[SessionState, SparkSession](
+ val state = SparkSession.reflect[SessionState, SparkSession](
SparkSession.sessionStateClassName(sparkContext.conf),
self)
+ initialSessionOptions.foreach { case (k, v) => state.conf.setConfString(k, v) }
+ state
}
/**
@@ -875,7 +883,7 @@ object SparkSession {
sc
}
session = new SparkSession(sparkContext)
- options.foreach { case (k, v) => session.sessionState.conf.setConfString(k, v) }
+ options.foreach { case (k, v) => session.initialSessionOptions.put(k, v) }
defaultSession.set(session)
// Register a successfully instantiated context to the singleton. This should be at the