Posted to reviews@spark.apache.org by BruceXu1991 <gi...@git.apache.org> on 2017/12/20 13:15:26 UTC
[GitHub] spark pull request #20034: [SPARK-22846][SQL] Fix table owner is null when c...
GitHub user BruceXu1991 opened a pull request:
https://github.com/apache/spark/pull/20034
[SPARK-22846][SQL] Fix table owner is null when creating table through spark sql or thriftserver
## What changes were proposed in this pull request?
Fix the table owner being null when a new table is created through Spark SQL.
## How was this patch tested?
Manual test:
1. Create a table.
2. Query the table properties in the MySQL database that backs the Hive metastore.
Please review http://spark.apache.org/contributing.html before opening a pull request.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/BruceXu1991/spark SPARK-22846
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20034.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20034
----
commit e8c3035028e6242005806476f5ce7cbdad5af889
Author: xu.wenchun <xu...@...>
Date: 2017-12-20T13:05:13Z
fix SPARK-22846
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20034: [SPARK-22846][SQL] Fix table owner is null when c...
Posted by BruceXu1991 <gi...@git.apache.org>.
Github user BruceXu1991 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20034#discussion_r158472814
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -186,7 +186,7 @@ private[hive] class HiveClientImpl(
/** Returns the configuration for the current session. */
def conf: HiveConf = state.getConf
- private val userName = state.getAuthenticator.getUserName
+ private val userName = conf.getUser
--- End diff --
Yes, I ran into this problem when using MySQL as the Hive metastore.
What's more, when I execute DESCRIBE FORMATTED spark_22846, a NullPointerException occurs.
```
> DESCRIBE FORMATTED offline.spark_22846;
Error: java.lang.NullPointerException (state=,code=0)
```
The detailed stack trace:
```
17/12/22 18:18:10 ERROR SparkExecuteStatementOperation: Error executing query, currentState RUNNING,
java.lang.NullPointerException
at scala.collection.immutable.StringOps$.length$extension(StringOps.scala:47)
at scala.collection.immutable.StringOps.length(StringOps.scala:47)
at scala.collection.IndexedSeqOptimized$class.isEmpty(IndexedSeqOptimized.scala:27)
at scala.collection.immutable.StringOps.isEmpty(StringOps.scala:29)
at scala.collection.TraversableOnce$class.nonEmpty(TraversableOnce.scala:111)
at scala.collection.immutable.StringOps.nonEmpty(StringOps.scala:29)
at org.apache.spark.sql.catalyst.catalog.CatalogTable.toLinkedHashMap(interface.scala:301)
at org.apache.spark.sql.execution.command.DescribeTableCommand.describeFormattedTableInfo(tables.scala:559)
at org.apache.spark.sql.execution.command.DescribeTableCommand.run(tables.scala:537)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:68)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:767)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:691)
```
The NPE occurs because owner is null. The relevant source code is below:
```
def toLinkedHashMap: mutable.LinkedHashMap[String, String] = {
  // ...
  if (owner.nonEmpty) map.put("Owner", owner) // interface.scala, line 301: NPE when owner is null
  // ...
}
```
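A minimal sketch of a null-safe form of the check on line 301, written in Java to match the Hive snippets in this thread rather than Spark's Scala. The class and method names (`OwnerGuard`, `putOwner`) are hypothetical, and this is not the change proposed in the PR, which instead changes where the username comes from.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not Spark code.
final class OwnerGuard {
    // Adds the "Owner" entry only when owner is neither null nor empty,
    // so a null owner is skipped instead of triggering a NullPointerException.
    static void putOwner(Map<String, String> map, String owner) {
        if (owner != null && !owner.isEmpty()) {
            map.put("Owner", owner);
        }
    }

    public static void main(String[] args) {
        Map<String, String> map = new LinkedHashMap<>();
        putOwner(map, null);        // skipped: owner is null
        putOwner(map, "some_user"); // added: owner is present
        System.out.println(map);
    }
}
```

A guard like this would only hide the symptom in DESCRIBE FORMATTED; the owner stored in the metastore would still be null.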
---
[GitHub] spark pull request #20034: [SPARK-22846][SQL] Fix table owner is null when c...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/20034#discussion_r158501959
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -186,7 +186,7 @@ private[hive] class HiveClientImpl(
/** Returns the configuration for the current session. */
def conf: HiveConf = state.getConf
- private val userName = state.getAuthenticator.getUserName
+ private val userName = conf.getUser
--- End diff --
Do you know how Hive gets the username internally?
---
[GitHub] spark pull request #20034: [SPARK-22846][SQL] Fix table owner is null when c...
Posted by BruceXu1991 <gi...@git.apache.org>.
Github user BruceXu1991 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20034#discussion_r158577749
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -186,7 +186,7 @@ private[hive] class HiveClientImpl(
/** Returns the configuration for the current session. */
def conf: HiveConf = state.getConf
- private val userName = state.getAuthenticator.getUserName
+ private val userName = conf.getUser
--- End diff --
Well, with Spark 2.2.1's current implementation
```
private val userName = state.getAuthenticator.getUserName
```
when state.getAuthenticator is implemented by **HadoopDefaultAuthenticator**, which is the default in the Hive configuration, the username is resolved.
However, when state.getAuthenticator is implemented by **SessionStateUserAuthenticator**, as in my case, the username is null.
The simplified code below explains why:
1) HadoopDefaultAuthenticator
```
public class HadoopDefaultAuthenticator implements HiveAuthenticationProvider {
  @Override
  public String getUserName() {
    return userName;
  }

  @Override
  public void setConf(Configuration conf) {
    this.conf = conf;
    UserGroupInformation ugi = null;
    try {
      ugi = Utils.getUGI();
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
    this.userName = ugi.getShortUserName();
    if (ugi.getGroupNames() != null) {
      this.groupNames = Arrays.asList(ugi.getGroupNames());
    }
  }
}

public class Utils {
  public static UserGroupInformation getUGI() throws LoginException, IOException {
    String doAs = System.getenv("HADOOP_USER_NAME");
    if (doAs != null && doAs.length() > 0) {
      return UserGroupInformation.createProxyUser(doAs, UserGroupInformation.getLoginUser());
    }
    return UserGroupInformation.getCurrentUser();
  }
}
```
This shows that HadoopDefaultAuthenticator gets the username through Utils.getUGI(), so the username is the value of HADOOP_USER_NAME when that environment variable is set, and the current user otherwise.
2) SessionStateUserAuthenticator
```
public class SessionStateUserAuthenticator implements HiveAuthenticationProvider {
  @Override
  public void setConf(Configuration arg0) {
  }

  @Override
  public String getUserName() {
    return sessionState.getUserName();
  }
}
```
This shows that SessionStateUserAuthenticator gets the username through sessionState.getUserName(), which is null here. Here is the [instantiation of SessionState in HiveClientImpl](https://github.com/apache/spark/blob/1cf3e3a26961d306eb17b7629d8742a4df45f339/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L187)
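The difference between the two authenticators described above can be sketched without any Hive or Hadoop dependency. Everything in this block is a hypothetical stand-in: `AuthenticatorSketch`, `UserNameProvider`, `environmentBacked`, `sessionBacked`, and `resolveOwner` are illustrative names, and the `user.name` system property merely stands in for the Hadoop current user. The fallback in `resolveOwner` is one possible way to tolerate a null username; it is not what the PR does, which reads the user from the Hive configuration directly.

```java
// Hypothetical stand-ins for the types discussed above; not Hive or Spark code.
final class AuthenticatorSketch {
    interface UserNameProvider {
        String getUserName();
    }

    // Mimics HadoopDefaultAuthenticator: the name is resolved from the environment,
    // preferring HADOOP_USER_NAME and otherwise using the JVM's user.name property.
    static UserNameProvider environmentBacked() {
        String fromEnv = System.getenv("HADOOP_USER_NAME");
        String name = (fromEnv != null && !fromEnv.isEmpty())
                ? fromEnv
                : System.getProperty("user.name");
        return () -> name;
    }

    // Mimics SessionStateUserAuthenticator: the name is whatever the session was
    // created with, which may be null.
    static UserNameProvider sessionBacked(String sessionUserName) {
        return () -> sessionUserName;
    }

    // Prefers the authenticator's answer and falls back to a configured user when it is null.
    static String resolveOwner(UserNameProvider authenticator, String configuredUser) {
        String name = authenticator.getUserName();
        return name != null ? name : configuredUser;
    }

    public static void main(String[] args) {
        String configuredUser = System.getProperty("user.name");
        System.out.println(resolveOwner(environmentBacked(), configuredUser));
        // A session created without a user name yields null, so the fallback is used.
        System.out.println(resolveOwner(sessionBacked(null), configuredUser));
    }
}
```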
---
[GitHub] spark issue #20034: [SPARK-22846][SQL] Fix table owner is null when creating...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20034
**[Test build #85270 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85270/testReport)** for PR 20034 at commit [`e8c3035`](https://github.com/apache/spark/commit/e8c3035028e6242005806476f5ce7cbdad5af889).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark pull request #20034: [SPARK-22846][SQL] Fix table owner is null when c...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/20034
---
[GitHub] spark pull request #20034: [SPARK-22846][SQL] Fix table owner is null when c...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/20034#discussion_r158333472
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -186,7 +186,7 @@ private[hive] class HiveClientImpl(
/** Returns the configuration for the current session. */
def conf: HiveConf = state.getConf
- private val userName = state.getAuthenticator.getUserName
+ private val userName = conf.getUser
--- End diff --
So, does this happen when MySQL is used as the Hive metastore?
---
[GitHub] spark pull request #20034: [SPARK-22846][SQL] Fix table owner is null when c...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/20034#discussion_r158319646
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -186,7 +186,7 @@ private[hive] class HiveClientImpl(
/** Returns the configuration for the current session. */
def conf: HiveConf = state.getConf
- private val userName = state.getAuthenticator.getUserName
--- End diff --
Why does this return null?
---
[GitHub] spark issue #20034: [SPARK-22846][SQL] Fix table owner is null when creating...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20034
can you add a test?
---
[GitHub] spark issue #20034: [SPARK-22846][SQL] Fix table owner is null when creating...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20034
ok to test
---
[GitHub] spark issue #20034: [SPARK-22846][SQL] Fix table owner is null when creating...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20034
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85270/
---
[GitHub] spark issue #20034: [SPARK-22846][SQL] Fix table owner is null when creating...
Posted by BruceXu1991 <gi...@git.apache.org>.
Github user BruceXu1991 commented on the issue:
https://github.com/apache/spark/pull/20034
@cloud-fan @gatorsmile could you review this issue?
---
[GitHub] spark issue #20034: [SPARK-22846][SQL] Fix table owner is null when creating...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20034
Can one of the admins verify this patch?
---
[GitHub] spark issue #20034: [SPARK-22846][SQL] Fix table owner is null when creating...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20034
Merged build finished. Test PASSed.
---
[GitHub] spark pull request #20034: [SPARK-22846][SQL] Fix table owner is null when c...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/20034#discussion_r158332740
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -186,7 +186,7 @@ private[hive] class HiveClientImpl(
/** Returns the configuration for the current session. */
def conf: HiveConf = state.getConf
- private val userName = state.getAuthenticator.getUserName
+ private val userName = conf.getUser
--- End diff --
@BruceXu1991, I want to reproduce your problem here. Could you describe your environment more specifically? For me, 2.2.1 works as follows.
```scala
scala> spark.version
res0: String = 2.2.1
scala> sql("CREATE TABLE spark_22846(a INT)")
scala> sql("DESCRIBE FORMATTED spark_22846").show
+--------------------+--------------------+-------+
| col_name| data_type|comment|
+--------------------+--------------------+-------+
| a| int| null|
| | | |
|# Detailed Table ...| | |
| Database| default| |
| Table| spark_22846| |
| Owner| dongjoon| |
```
---
[GitHub] spark issue #20034: [SPARK-22846][SQL] Fix table owner is null when creating...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20034
**[Test build #85270 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85270/testReport)** for PR 20034 at commit [`e8c3035`](https://github.com/apache/spark/commit/e8c3035028e6242005806476f5ce7cbdad5af889).
---
[GitHub] spark issue #20034: [SPARK-22846][SQL] Fix table owner is null when creating...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20034
thanks, merging to master!
---