You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@kyuubi.apache.org by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/02/15 12:09:35 UTC

[GitHub] [kyuubi] pan3793 opened a new pull request, #4336: Avoid listing all schemas for Spark session catalog on schema pruning

pan3793 opened a new pull request, #4336:
URL: https://github.com/apache/kyuubi/pull/4336

   <!--
   Thanks for sending a pull request!
   
   Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://kyuubi.readthedocs.io/en/latest/community/CONTRIBUTING.html
     2. If the PR is related to an issue in https://github.com/apache/kyuubi/issues, add '[KYUUBI #XXXX]' in your PR title, e.g., '[KYUUBI #XXXX] Your PR title ...'.
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][KYUUBI #XXXX] Your PR title ...'.
   -->
   
   ### _Why are the changes needed?_
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you add a feature, you can talk about the use case of it.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   Some DBMS tools like DBeaver and HUE will call thrift meta api for listing catalogs, databases, and tables. The current implementation of `CatalogShim_v3_0#getSchemas` will call `listAllNamespaces` first and do schema pruning on the Spark driver, which may cause "permission denied" exception when HMS has permission control, like the ranger plugin.
   
   This PR proposes to call HMS API(through v1 session catalog) directly for `spark_catalog`, to suppress the above issue.
   
   ```
   2023-02-15 20:02:13.048 ERROR org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Error getting schemas:
   org.apache.kyuubi.KyuubiSQLException: Error operating GetSchemas: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Permission denied: user [user1] does not have [SELECT] privilege on [userdb1])
           at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:134)
           at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:249)
           at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.databaseExists(ExternalCatalogWithListener.scala:69)
           at org.apache.spark.sql.catalyst.catalog.SessionCatalog.databaseExists(SessionCatalog.scala:294)
           at org.apache.spark.sql.execution.datasources.v2.V2SessionCatalog.listNamespaces(V2SessionCatalog.scala:212)
           at org.apache.kyuubi.engine.spark.shim.CatalogShim_v3_0.$anonfun$listAllNamespaces$1(CatalogShim_v3_0.scala:74)
           at org.apache.kyuubi.engine.spark.shim.CatalogShim_v3_0.$anonfun$listAllNamespaces$1$adapted(CatalogShim_v3_0.scala:73)
           at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
           at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
           at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
           at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
           at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
           at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
           at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:198)
           at org.apache.kyuubi.engine.spark.shim.CatalogShim_v3_0.listAllNamespaces(CatalogShim_v3_0.scala:73)
           at org.apache.kyuubi.engine.spark.shim.CatalogShim_v3_0.listAllNamespaces(CatalogShim_v3_0.scala:90)
           at org.apache.kyuubi.engine.spark.shim.CatalogShim_v3_0.getSchemasWithPattern(CatalogShim_v3_0.scala:118)
           at org.apache.kyuubi.engine.spark.shim.CatalogShim_v3_0.getSchemas(CatalogShim_v3_0.scala:133)
           at org.apache.kyuubi.engine.spark.operation.GetSchemas.runInternal(GetSchemas.scala:43)
           at org.apache.kyuubi.operation.AbstractOperation.run(AbstractOperation.scala:164)
           at org.apache.kyuubi.session.AbstractSession.runOperation(AbstractSession.scala:99)
           at org.apache.kyuubi.engine.spark.session.SparkSessionImpl.runOperation(SparkSessionImpl.scala:78)
           at org.apache.kyuubi.session.AbstractSession.getSchemas(AbstractSession.scala:150)
           at org.apache.kyuubi.service.AbstractBackendService.getSchemas(AbstractBackendService.scala:83)
           at org.apache.kyuubi.service.TFrontendService.GetSchemas(TFrontendService.scala:294)
           at org.apache.kyuubi.shade.org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1617)
           at org.apache.kyuubi.shade.org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1602)
           at org.apache.kyuubi.shade.org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
           at org.apache.kyuubi.shade.org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
           at org.apache.kyuubi.service.authentication.TSetIpAddressProcessor.process(TSetIpAddressProcessor.scala:36)
           at org.apache.kyuubi.shade.org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:750)
   ```
   
   ### _How was this patch tested?_
   - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
   
   - [ ] Add screenshots for manual tests if appropriate
   
   - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [kyuubi] echollee commented on pull request #4336: Avoid listing all schemas for Spark session catalog on schema pruning

Posted by "echollee (via GitHub)" <gi...@apache.org>.
echollee commented on PR #4336:
URL: https://github.com/apache/kyuubi/pull/4336#issuecomment-1432623718

   Problem Solved.
   Excellent!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [kyuubi] pan3793 commented on pull request #4336: Avoid listing all schemas for Spark session catalog on schema pruning

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on PR #4336:
URL: https://github.com/apache/kyuubi/pull/4336#issuecomment-1432625076

   Thanks @echollee for verificating, merged to master/1.7/1.6


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [kyuubi] codecov-commenter commented on pull request #4336: Avoid listing all schemas for Spark session catalog on schema pruning

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #4336:
URL: https://github.com/apache/kyuubi/pull/4336#issuecomment-1432550556

   # [Codecov](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#4336](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9ece864) into [master](https://codecov.io/gh/apache/kyuubi/commit/02deaf4756fef953d706f171d87c5d0dd760f264?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (02deaf4) will **decrease** coverage by `0.08%`.
   > The diff coverage is `100.00%`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #4336      +/-   ##
   ============================================
   - Coverage     53.66%   53.58%   -0.08%     
     Complexity       13       13              
   ============================================
     Files           562      562              
     Lines         30738    30709      -29     
     Branches       4161     4142      -19     
   ============================================
   - Hits          16496    16456      -40     
   - Misses        12682    12706      +24     
   + Partials       1560     1547      -13     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...he/kyuubi/engine/spark/shim/CatalogShim\_v2\_4.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZXh0ZXJuYWxzL2t5dXViaS1zcGFyay1zcWwtZW5naW5lL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUva3l1dWJpL2VuZ2luZS9zcGFyay9zaGltL0NhdGFsb2dTaGltX3YyXzQuc2NhbGE=) | `67.18% <100.00%> (+3.12%)` | :arrow_up: |
   | [...he/kyuubi/engine/spark/shim/CatalogShim\_v3\_0.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZXh0ZXJuYWxzL2t5dXViaS1zcGFyay1zcWwtZW5naW5lL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUva3l1dWJpL2VuZ2luZS9zcGFyay9zaGltL0NhdGFsb2dTaGltX3YzXzAuc2NhbGE=) | `80.00% <100.00%> (-0.44%)` | :arrow_down: |
   | [...ain/scala/org/apache/kyuubi/util/RowSetUtils.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLWNvbW1vbi9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS91dGlsL1Jvd1NldFV0aWxzLnNjYWxh) | `26.08% <0.00%> (-57.25%)` | :arrow_down: |
   | [...kyuubi/server/trino/api/v1/StatementResource.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLXNlcnZlci9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9zZXJ2ZXIvdHJpbm8vYXBpL3YxL1N0YXRlbWVudFJlc291cmNlLnNjYWxh) | `50.00% <0.00%> (-6.76%)` | :arrow_down: |
   | [...apache/kyuubi/engine/JpsApplicationOperation.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLXNlcnZlci9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9lbmdpbmUvSnBzQXBwbGljYXRpb25PcGVyYXRpb24uc2NhbGE=) | `77.41% <0.00%> (-3.23%)` | :arrow_down: |
   | [...g/apache/kyuubi/operation/BatchJobSubmission.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLXNlcnZlci9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9vcGVyYXRpb24vQmF0Y2hKb2JTdWJtaXNzaW9uLnNjYWxh) | `75.27% <0.00%> (-2.20%)` | :arrow_down: |
   | [...rg/apache/kyuubi/ctl/cmd/log/LogBatchCommand.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLWN0bC9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9jdGwvY21kL2xvZy9Mb2dCYXRjaENvbW1hbmQuc2NhbGE=) | `59.09% <0.00%> (-1.52%)` | :arrow_down: |
   | [...mon/src/main/scala/org/apache/kyuubi/Logging.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLWNvbW1vbi9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9Mb2dnaW5nLnNjYWxh) | `41.25% <0.00%> (-1.25%)` | :arrow_down: |
   | [...n/scala/org/apache/kyuubi/engine/ProcBuilder.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLXNlcnZlci9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9lbmdpbmUvUHJvY0J1aWxkZXIuc2NhbGE=) | `78.39% <0.00%> (-0.62%)` | :arrow_down: |
   | [.../kyuubi/engine/spark/operation/ExecutePython.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZXh0ZXJuYWxzL2t5dXViaS1zcGFyay1zcWwtZW5naW5lL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUva3l1dWJpL2VuZ2luZS9zcGFyay9vcGVyYXRpb24vRXhlY3V0ZVB5dGhvbi5zY2FsYQ==) | `81.46% <0.00%> (-0.44%)` | :arrow_down: |
   | ... and [13 more](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [kyuubi] pan3793 closed pull request #4336: Avoid listing all schemas for Spark session catalog on schema pruning

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 closed pull request #4336: Avoid listing all schemas for Spark session catalog on schema pruning
URL: https://github.com/apache/kyuubi/pull/4336


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org