Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/09/07 03:36:33 UTC

[GitHub] [spark] AngersZhuuuu opened a new pull request #29656: [SPARK-32807][SQL] ThriftServer open session slow when high concurrent when init current DB

AngersZhuuuu opened a new pull request #29656:
URL: https://github.com/apache/spark/pull/29656


   ### What changes were proposed in this pull request?
   When initializing the current database for a new session, we can call the catalog API directly instead of running a SQL statement.
   
   ### Why are the changes needed?
   No
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Not needed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29656: [SPARK-32807][SQL] ThriftServer open session use direct API to init current DB

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29656:
URL: https://github.com/apache/spark/pull/29656#issuecomment-688016714


   **[Test build #128333 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128333/testReport)** for PR 29656 at commit [`1f2fa6c`](https://github.com/apache/spark/commit/1f2fa6cfc6dd841bda5c82fa8f428a54e1dfa8a5).


[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29656: [SPARK-32807][SQL] ThriftServer open session use direct API to init current DB

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #29656:
URL: https://github.com/apache/spark/pull/29656#discussion_r484171144



##########
File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLSessionManager.scala
##########
@@ -69,7 +69,7 @@ private[hive] class SparkSQLSessionManager(hiveServer: HiveServer2, sqlContext:
     setConfMap(ctx, hiveSessionState.getOverriddenConfigurations)
     setConfMap(ctx, hiveSessionState.getHiveVariables)
     if (sessionConf != null && sessionConf.containsKey("use:database")) {
-      ctx.sql(s"use ${sessionConf.get("use:database")}")
+      ctx.sparkSession.sessionState.catalog.setCurrentDatabase(sessionConf.get("use:database"))

Review comment:
       > `ctx.sessionState.catalog.setCurrentDatabase`?
   
   Updated.






[GitHub] [spark] AngersZhuuuu commented on pull request #29656: [SPARK-32807][SQL] ThriftServer open session slow when high concurrent when init current DB

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on pull request #29656:
URL: https://github.com/apache/spark/pull/29656#issuecomment-688008917


   cc @wangyum @juliuszsompolski 




[GitHub] [spark] wangyum commented on a change in pull request #29656: [SPARK-32807][SQL] ThriftServer open session use direct API to init current DB

Posted by GitBox <gi...@apache.org>.
wangyum commented on a change in pull request #29656:
URL: https://github.com/apache/spark/pull/29656#discussion_r484170616



##########
File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLSessionManager.scala
##########
@@ -69,7 +69,7 @@ private[hive] class SparkSQLSessionManager(hiveServer: HiveServer2, sqlContext:
     setConfMap(ctx, hiveSessionState.getOverriddenConfigurations)
     setConfMap(ctx, hiveSessionState.getHiveVariables)
     if (sessionConf != null && sessionConf.containsKey("use:database")) {
-      ctx.sql(s"use ${sessionConf.get("use:database")}")
+      ctx.sparkSession.sessionState.catalog.setCurrentDatabase(sessionConf.get("use:database"))

Review comment:
       `ctx.sessionState.catalog.setCurrentDatabase`?






[GitHub] [spark] SparkQA commented on pull request #29656: [SPARK-32807][SQL] ThriftServer open session slow when high concurrent when init current DB

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29656:
URL: https://github.com/apache/spark/pull/29656#issuecomment-688009132


   **[Test build #128331 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128331/testReport)** for PR 29656 at commit [`a1cf29a`](https://github.com/apache/spark/commit/a1cf29a0eb00ffc45296874ff445a75c5d7f42e7).




[GitHub] [spark] juliuszsompolski commented on pull request #29656: [SPARK-32807][SQL] ThriftServer open session use direct API to init current DB

Posted by GitBox <gi...@apache.org>.
juliuszsompolski commented on pull request #29656:
URL: https://github.com/apache/spark/pull/29656#issuecomment-688211778


   This may change how this skipped processing handles errors. I would add some tests that verify such behaviour.
   cc @bogdanghit @CJStuart 




[GitHub] [spark] SparkQA commented on pull request #29656: [SPARK-32807][SQL] ThriftServer open session use direct API to init current DB

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29656:
URL: https://github.com/apache/spark/pull/29656#issuecomment-688027227


   **[Test build #128333 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128333/testReport)** for PR 29656 at commit [`1f2fa6c`](https://github.com/apache/spark/commit/1f2fa6cfc6dd841bda5c82fa8f428a54e1dfa8a5).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #29656: [SPARK-32807][SQL] ThriftServer open session use direct API to init current DB

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29656:
URL: https://github.com/apache/spark/pull/29656#issuecomment-688019603


   **[Test build #128331 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128331/testReport)** for PR 29656 at commit [`a1cf29a`](https://github.com/apache/spark/commit/a1cf29a0eb00ffc45296874ff445a75c5d7f42e7).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AngersZhuuuu closed pull request #29656: [SPARK-32807][SQL] ThriftServer open session use direct API to init current DB

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu closed pull request #29656:
URL: https://github.com/apache/spark/pull/29656


   




[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29656: [SPARK-32807][SQL] ThriftServer open session use direct API to init current DB

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #29656:
URL: https://github.com/apache/spark/pull/29656#discussion_r484209443



##########
File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLSessionManager.scala
##########
@@ -69,7 +69,7 @@ private[hive] class SparkSQLSessionManager(hiveServer: HiveServer2, sqlContext:
     setConfMap(ctx, hiveSessionState.getOverriddenConfigurations)
     setConfMap(ctx, hiveSessionState.getHiveVariables)
     if (sessionConf != null && sessionConf.containsKey("use:database")) {
-      ctx.sql(s"use ${sessionConf.get("use:database")}")
+      ctx.sessionState.catalog.setCurrentDatabase(sessionConf.get("use:database"))

Review comment:
       > just to be clear, is this just a small refactoring?
   
   This way we can skip a lot of processing, but it seems I missed an important problem: a user may pass a malformed database name such as `aa.cc`.






[GitHub] [spark] HyukjinKwon commented on a change in pull request #29656: [SPARK-32807][SQL] ThriftServer open session use direct API to init current DB

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #29656:
URL: https://github.com/apache/spark/pull/29656#discussion_r484203367



##########
File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLSessionManager.scala
##########
@@ -69,7 +69,7 @@ private[hive] class SparkSQLSessionManager(hiveServer: HiveServer2, sqlContext:
     setConfMap(ctx, hiveSessionState.getOverriddenConfigurations)
     setConfMap(ctx, hiveSessionState.getHiveVariables)
     if (sessionConf != null && sessionConf.containsKey("use:database")) {
-      ctx.sql(s"use ${sessionConf.get("use:database")}")
+      ctx.sessionState.catalog.setCurrentDatabase(sessionConf.get("use:database"))

Review comment:
       just to be clear, is this just a small refactoring?






[GitHub] [spark] bogdanghit commented on a change in pull request #29656: [SPARK-32807][SQL] ThriftServer open session use direct API to init current DB

Posted by GitBox <gi...@apache.org>.
bogdanghit commented on a change in pull request #29656:
URL: https://github.com/apache/spark/pull/29656#discussion_r484963868



##########
File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLSessionManager.scala
##########
@@ -69,7 +69,7 @@ private[hive] class SparkSQLSessionManager(hiveServer: HiveServer2, sqlContext:
     setConfMap(ctx, hiveSessionState.getOverriddenConfigurations)
     setConfMap(ctx, hiveSessionState.getHiveVariables)
     if (sessionConf != null && sessionConf.containsKey("use:database")) {
-      ctx.sql(s"use ${sessionConf.get("use:database")}")
+      ctx.sessionState.catalog.setCurrentDatabase(sessionConf.get("use:database"))

Review comment:
       Can we add extra tests to verify the behaviour in case of failures? Is it the same error thrown? Are we skipping some checks along the way by calling the catalog functions?  
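
One way to start answering these questions is a small probe that runs the same database name through both code paths and records what each one throws. The sketch below is only an illustration, not part of the patch: it assumes Spark SQL is on the classpath, uses a plain local `SparkSession` with the default in-memory catalog rather than a Thrift server session, and the object name `UseDatabaseErrorProbe` and the database name `db_that_does_not_exist` are made up for the example.

```scala
import org.apache.spark.sql.SparkSession
import scala.util.{Failure, Success, Try}

// Hypothetical probe, not part of the PR: compares how the SQL path and the
// direct catalog call react to the same database name.
object UseDatabaseErrorProbe {

  // "ok" on success, otherwise the simple class name of the thrown exception.
  private def outcome(attempt: Try[Unit]): String = attempt match {
    case Success(_) => "ok"
    case Failure(e) => e.getClass.getSimpleName
  }

  // Returns (result of the SQL path, result of the direct catalog call).
  def probe(spark: SparkSession, db: String): (String, String) = {
    val viaSql = outcome(Try { spark.sql(s"use $db"); () })
    val viaCatalog = outcome(Try(spark.sessionState.catalog.setCurrentDatabase(db)))
    (viaSql, viaCatalog)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("use-db-probe")
      .getOrCreate()
    try {
      // An existing database, a missing one, and the dotted name from the review.
      Seq("default", "db_that_does_not_exist", "aa.cc").foreach { db =>
        val (viaSql, viaCatalog) = probe(spark, db)
        println(s"$db: sql=$viaSql, catalog=$viaCatalog")
      }
    } finally {
      spark.stop()
    }
  }
}
```

The probe deliberately prints exception class names instead of asserting specific ones, because the exact exception raised on each path can differ between Spark versions. A real test in the Thrift server suite would open a JDBC session with the database in the connection string and check the error the client actually sees.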






[GitHub] [spark] AngersZhuuuu commented on pull request #29656: [SPARK-32807][SQL] ThriftServer open session use direct API to init current DB

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on pull request #29656:
URL: https://github.com/apache/spark/pull/29656#issuecomment-688234411


   > This may change how this skipped processing handles errors. I would add some tests that verify such behaviour.
   > cc @bogdanghit @CJStuart
   
   What I most want to do is make HiveClientImpl's client a thread-local variable and remove the lock, to speed up queries.
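
The general idea of swapping a shared, locked client for a per-thread one can be sketched in plain Scala. This is a toy model under stated assumptions, not the `HiveClientImpl` code: `FakeClient`, `ThreadLocalClientSketch`, and their methods are hypothetical names, and the "client" only holds one piece of mutable state.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for a client that is not thread-safe.
// This is NOT HiveClientImpl; it only tracks a current database name.
final class FakeClient(val id: Int) {
  private var currentDb: String = "default"
  def setCurrentDatabase(db: String): Unit = { currentDb = db }
  def getCurrentDatabase: String = currentDb
}

object ThreadLocalClientSketch {
  private val created = new AtomicInteger(0)

  // Variant 1: a single shared client; every caller serializes on one lock.
  private val shared = new FakeClient(created.incrementAndGet())
  def withSharedClient[A](f: FakeClient => A): A = shared.synchronized(f(shared))

  // Variant 2: one client per thread, created on first use. No lock is
  // needed because no client instance is visible to more than one thread.
  private val perThread: ThreadLocal[FakeClient] = new ThreadLocal[FakeClient] {
    override def initialValue(): FakeClient = new FakeClient(created.incrementAndGet())
  }
  def withThreadLocalClient[A](f: FakeClient => A): A = f(perThread.get())

  // Each thread writes its own database name and reads it back through its
  // own client. Returns a map from the name written to the name read.
  def demo(threadCount: Int): Map[String, String] = {
    val seen = new ConcurrentHashMap[String, String]()
    val names = (1 to threadCount).map(i => s"db_$i")
    val threads = names.map { db =>
      new Thread(new Runnable {
        override def run(): Unit = withThreadLocalClient { client =>
          client.setCurrentDatabase(db)
          seen.put(db, client.getCurrentDatabase)
          ()
        }
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    names.map(db => db -> seen.get(db)).toMap
  }
}
```

The trade-off in this model is that state set through one thread's client is invisible to the others, so anything session-scoped would have to be re-applied on whichever thread serves the next request. Whether that holds for the real Hive client is not something this sketch can show.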

