You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@kyuubi.apache.org by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/02/15 12:09:35 UTC
[GitHub] [kyuubi] pan3793 opened a new pull request, #4336: Avoid listing all schemas for Spark session catalog on schema pruning
pan3793 opened a new pull request, #4336:
URL: https://github.com/apache/kyuubi/pull/4336
<!--
Thanks for sending a pull request!
Here are some tips for you:
1. If this is your first time, please read our contributor guidelines: https://kyuubi.readthedocs.io/en/latest/community/CONTRIBUTING.html
2. If the PR is related to an issue in https://github.com/apache/kyuubi/issues, add '[KYUUBI #XXXX]' in your PR title, e.g., '[KYUUBI #XXXX] Your PR title ...'.
3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][KYUUBI #XXXX] Your PR title ...'.
-->
### _Why are the changes needed?_
<!--
Please clarify why the changes are needed. For instance,
1. If you add a feature, you can talk about the use case of it.
2. If you fix a bug, you can clarify why it is a bug.
-->
Some DBMS tools like DBeaver and HUE will call thrift meta api for listing catalogs, databases, and tables. The current implementation of `CatalogShim_v3_0#getSchemas` will call `listAllNamespaces` first and do schema pruning on the Spark driver, which may cause "permission denied" exception when HMS has permission control, like the ranger plugin.
This PR proposes to call HMS API(through v1 session catalog) directly for `spark_catalog`, to suppress the above issue.
```
2023-02-15 20:02:13.048 ERROR org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Error getting schemas:
org.apache.kyuubi.KyuubiSQLException: Error operating GetSchemas: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Permission denied: user [user1] does not have [SELECT] privilege on [userdb1])
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:134)
at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:249)
at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.databaseExists(ExternalCatalogWithListener.scala:69)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.databaseExists(SessionCatalog.scala:294)
at org.apache.spark.sql.execution.datasources.v2.V2SessionCatalog.listNamespaces(V2SessionCatalog.scala:212)
at org.apache.kyuubi.engine.spark.shim.CatalogShim_v3_0.$anonfun$listAllNamespaces$1(CatalogShim_v3_0.scala:74)
at org.apache.kyuubi.engine.spark.shim.CatalogShim_v3_0.$anonfun$listAllNamespaces$1$adapted(CatalogShim_v3_0.scala:73)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:198)
at org.apache.kyuubi.engine.spark.shim.CatalogShim_v3_0.listAllNamespaces(CatalogShim_v3_0.scala:73)
at org.apache.kyuubi.engine.spark.shim.CatalogShim_v3_0.listAllNamespaces(CatalogShim_v3_0.scala:90)
at org.apache.kyuubi.engine.spark.shim.CatalogShim_v3_0.getSchemasWithPattern(CatalogShim_v3_0.scala:118)
at org.apache.kyuubi.engine.spark.shim.CatalogShim_v3_0.getSchemas(CatalogShim_v3_0.scala:133)
at org.apache.kyuubi.engine.spark.operation.GetSchemas.runInternal(GetSchemas.scala:43)
at org.apache.kyuubi.operation.AbstractOperation.run(AbstractOperation.scala:164)
at org.apache.kyuubi.session.AbstractSession.runOperation(AbstractSession.scala:99)
at org.apache.kyuubi.engine.spark.session.SparkSessionImpl.runOperation(SparkSessionImpl.scala:78)
at org.apache.kyuubi.session.AbstractSession.getSchemas(AbstractSession.scala:150)
at org.apache.kyuubi.service.AbstractBackendService.getSchemas(AbstractBackendService.scala:83)
at org.apache.kyuubi.service.TFrontendService.GetSchemas(TFrontendService.scala:294)
at org.apache.kyuubi.shade.org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1617)
at org.apache.kyuubi.shade.org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1602)
at org.apache.kyuubi.shade.org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.kyuubi.shade.org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.kyuubi.service.authentication.TSetIpAddressProcessor.process(TSetIpAddressProcessor.scala:36)
at org.apache.kyuubi.shade.org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
```
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org
[GitHub] [kyuubi] echollee commented on pull request #4336: Avoid listing all schemas for Spark session catalog on schema pruning
Posted by "echollee (via GitHub)" <gi...@apache.org>.
echollee commented on PR #4336:
URL: https://github.com/apache/kyuubi/pull/4336#issuecomment-1432623718
Problem Solved.
Excellent!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org
[GitHub] [kyuubi] pan3793 commented on pull request #4336: Avoid listing all schemas for Spark session catalog on schema pruning
Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on PR #4336:
URL: https://github.com/apache/kyuubi/pull/4336#issuecomment-1432625076
Thanks @echollee for verificating, merged to master/1.7/1.6
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org
[GitHub] [kyuubi] codecov-commenter commented on pull request #4336: Avoid listing all schemas for Spark session catalog on schema pruning
Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #4336:
URL: https://github.com/apache/kyuubi/pull/4336#issuecomment-1432550556
# [Codecov](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#4336](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9ece864) into [master](https://codecov.io/gh/apache/kyuubi/commit/02deaf4756fef953d706f171d87c5d0dd760f264?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (02deaf4) will **decrease** coverage by `0.08%`.
> The diff coverage is `100.00%`.
```diff
@@ Coverage Diff @@
## master #4336 +/- ##
============================================
- Coverage 53.66% 53.58% -0.08%
Complexity 13 13
============================================
Files 562 562
Lines 30738 30709 -29
Branches 4161 4142 -19
============================================
- Hits 16496 16456 -40
- Misses 12682 12706 +24
+ Partials 1560 1547 -13
```
| [Impacted Files](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...he/kyuubi/engine/spark/shim/CatalogShim\_v2\_4.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZXh0ZXJuYWxzL2t5dXViaS1zcGFyay1zcWwtZW5naW5lL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUva3l1dWJpL2VuZ2luZS9zcGFyay9zaGltL0NhdGFsb2dTaGltX3YyXzQuc2NhbGE=) | `67.18% <100.00%> (+3.12%)` | :arrow_up: |
| [...he/kyuubi/engine/spark/shim/CatalogShim\_v3\_0.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZXh0ZXJuYWxzL2t5dXViaS1zcGFyay1zcWwtZW5naW5lL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUva3l1dWJpL2VuZ2luZS9zcGFyay9zaGltL0NhdGFsb2dTaGltX3YzXzAuc2NhbGE=) | `80.00% <100.00%> (-0.44%)` | :arrow_down: |
| [...ain/scala/org/apache/kyuubi/util/RowSetUtils.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLWNvbW1vbi9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS91dGlsL1Jvd1NldFV0aWxzLnNjYWxh) | `26.08% <0.00%> (-57.25%)` | :arrow_down: |
| [...kyuubi/server/trino/api/v1/StatementResource.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLXNlcnZlci9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9zZXJ2ZXIvdHJpbm8vYXBpL3YxL1N0YXRlbWVudFJlc291cmNlLnNjYWxh) | `50.00% <0.00%> (-6.76%)` | :arrow_down: |
| [...apache/kyuubi/engine/JpsApplicationOperation.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLXNlcnZlci9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9lbmdpbmUvSnBzQXBwbGljYXRpb25PcGVyYXRpb24uc2NhbGE=) | `77.41% <0.00%> (-3.23%)` | :arrow_down: |
| [...g/apache/kyuubi/operation/BatchJobSubmission.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLXNlcnZlci9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9vcGVyYXRpb24vQmF0Y2hKb2JTdWJtaXNzaW9uLnNjYWxh) | `75.27% <0.00%> (-2.20%)` | :arrow_down: |
| [...rg/apache/kyuubi/ctl/cmd/log/LogBatchCommand.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLWN0bC9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9jdGwvY21kL2xvZy9Mb2dCYXRjaENvbW1hbmQuc2NhbGE=) | `59.09% <0.00%> (-1.52%)` | :arrow_down: |
| [...mon/src/main/scala/org/apache/kyuubi/Logging.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLWNvbW1vbi9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9Mb2dnaW5nLnNjYWxh) | `41.25% <0.00%> (-1.25%)` | :arrow_down: |
| [...n/scala/org/apache/kyuubi/engine/ProcBuilder.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLXNlcnZlci9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9lbmdpbmUvUHJvY0J1aWxkZXIuc2NhbGE=) | `78.39% <0.00%> (-0.62%)` | :arrow_down: |
| [.../kyuubi/engine/spark/operation/ExecutePython.scala](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZXh0ZXJuYWxzL2t5dXViaS1zcGFyay1zcWwtZW5naW5lL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUva3l1dWJpL2VuZ2luZS9zcGFyay9vcGVyYXRpb24vRXhlY3V0ZVB5dGhvbi5zY2FsYQ==) | `81.46% <0.00%> (-0.44%)` | :arrow_down: |
| ... and [13 more](https://codecov.io/gh/apache/kyuubi/pull/4336?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
:mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org
[GitHub] [kyuubi] pan3793 closed pull request #4336: Avoid listing all schemas for Spark session catalog on schema pruning
Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 closed pull request #4336: Avoid listing all schemas for Spark session catalog on schema pruning
URL: https://github.com/apache/kyuubi/pull/4336
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org