Posted to dev@kyuubi.apache.org by GitBox <gi...@apache.org> on 2021/11/11 10:21:14 UTC
[GitHub] [incubator-kyuubi] ethan7811 opened a new issue #1362: [Bug] pyhive can't get result from kyuubi
ethan7811 opened a new issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
### Search before asking
- [X] I have searched in the [issues](https://github.com/apache/incubator-kyuubi/issues?q=is%3Aissue) and found no similar issues.
### Describe the bug
We use Kyuubi as the Hive backend for Redash and connect to it through pyhive to fetch results. Simple SQL works fine, but some complex SQL always fails to return a result, even though the Kyuubi log suggests the SQL job has finished. During the computation, the engine log always shows the exception below:
```
21/11/11 18:05:29 ERROR scheduler.AsyncEventQueue: Listener SparkSQLEngineListener threw an exception
java.lang.NullPointerException
at org.apache.kyuubi.engine.spark.monitor.KyuubiStatementMonitor$.$anonfun$insertJobEndTimeAndResult$2(KyuubiStatementMonitor.scala:133)
at org.apache.kyuubi.Logging.warn(Logging.scala:60)
at org.apache.kyuubi.Logging.warn$(Logging.scala:58)
at org.apache.kyuubi.engine.spark.monitor.KyuubiStatementMonitor$.warn(KyuubiStatementMonitor.scala:28)
at org.apache.kyuubi.engine.spark.monitor.KyuubiStatementMonitor$.insertJobEndTimeAndResult(KyuubiStatementMonitor.scala:133)
at org.apache.spark.kyuubi.SparkSQLEngineListener.onJobEnd(SparkSQLEngineListener.scala:79)
at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:39)
at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28)
at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:117)
at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:101)
at org.apache.spark.scheduler.AsyncEventQueue.super$postToAll(AsyncEventQueue.scala:105)
at org.apache.spark.scheduler.AsyncEventQueue.$anonfun$dispatch$1(AsyncEventQueue.scala:105)
at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:100)
at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.$anonfun$run$1(AsyncEventQueue.scala:96)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1381)
at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.run(AsyncEventQueue.scala:96)
```
The exception above doesn't interrupt job execution, and we eventually get log output like:
```
21/11/11 18:05:52 INFO operation.ExecuteStatement: Processing xxx's query[f512caa2-0d60-48d1-91f8-f4b3a06c5ee6]: RUNNING_STATE -> FINISHED_STATE, statement --
xxxxxx, time taken: 44.383 seconds
21/11/11 18:06:08 INFO service.ThriftFrontendService: Received request of closing SessionHandle [2ad135fe-a481-45e7-a626-76f0fbbea931]
21/11/11 18:06:08 INFO session.SparkSQLSessionManager: SessionHandle [2ad135fe-a481-45e7-a626-76f0fbbea931] is closed, current opening sessions 0
21/11/11 18:06:08 INFO service.ThriftFrontendService: Finished closing SessionHandle [2ad135fe-a481-45e7-a626-76f0fbbea931]
21/11/11 18:06:08 ERROR server.TThreadPoolServer: Thrift error occurred during processing of message.
org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:43)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
at org.apache.kyuubi.service.authentication.TSetIpAddressProcessor.process(TSetIpAddressProcessor.scala:36)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
```
Redash then shows: "Error running query: failed communicating with server. Please check your Internet connection and try again."
### Affects Version(s)
1.3.0
### Kyuubi Server Log Output
_No response_
### Kyuubi Engine Log Output
_No response_
### Kyuubi Server Configurations
_No response_
### Kyuubi Engine Configurations
```yaml
kyuubi.authentication=KERBEROS
kyuubi.frontend.bind.host=xxxx
kyuubi.frontend.bind.port=10003
kyuubi.ha.enabled=true
kyuubi.ha.zookeeper.acl.enabled=false
kyuubi.ha.zookeeper.client.port=2181
kyuubi.ha.zookeeper.namespace=kyuubi-ha
kyuubi.ha.zookeeper.quorum=xxxx
kyuubi.kinit.keytab=/etc/keytabs/hive.keytab
kyuubi.kinit.principal=xxxxx
kyuubi.session.engine.login.timeout=PT30M
kyuubi.session.idle.timeout=PT30M
kyuubi.operation.idle.timeout=PT1H
## Spark
spark.driver.maxResultSize=1g
spark.driver.memory=2g
spark.dynamicAllocation.maxExecutors=10
spark.executor.cores=3
spark.executor.memory=12G
spark.submit.deployMode=client
```
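The timeout values above use ISO-8601 duration notation (`PT30M` is 30 minutes, `PT1H` is one hour). As a rough illustration of how such values read, here is a small sketch that converts them to a `timedelta`. The helper name `parse_pt_duration` is hypothetical, and the parser deliberately covers only the hour/minute/second subset seen in this configuration, not the full ISO-8601 grammar or Kyuubi's own parsing.

```python
import re
from datetime import timedelta

# Matches only the time part of an ISO-8601 duration: PT[nH][nM][nS].
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_pt_duration(text):
    """Convert a value such as 'PT30M' or 'PT1H' into a timedelta."""
    match = _DURATION.match(text)
    # Reject strings that do not match, and a bare 'PT' with no components.
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


print(parse_pt_duration("PT30M"))  # kyuubi.session.idle.timeout
print(parse_pt_duration("PT1H"))   # kyuubi.operation.idle.timeout
```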
### Additional context
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@kyuubi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-kyuubi] ethan7811 edited a comment on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
ethan7811 edited a comment on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966310235
> This seems to be fixed by [5421b56](https://github.com/apache/incubator-kyuubi/commit/5421b56440cdd0d07de0a98d5a3dc0cfb9ef312e)
Why does increasing "kyuubi.session.engine.login.timeout" not seem to work, and does the NPE thrown by SparkSQLEngineListener matter? The NPE can be reproduced.
[GitHub] [incubator-kyuubi] ethan7811 commented on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
ethan7811 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966845996
> BTW, does Redash support async mode? If yes, turn it on and see what happens
We tried pyhive async mode in Redash and it doesn't work; we get the same errors as above.
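To exercise pyhive's async mode directly, without Redash in between, something like the following could be used. It is a minimal sketch based on the polling pattern shown in the PyHive README; the helper names `wait_until_done` and `run_async` are hypothetical, and the cursor is assumed to come from `pyhive.hive.connect(...).cursor()`.

```python
import time


def wait_until_done(cursor, pending_states, interval=1.0, sleep=time.sleep):
    """Poll the cursor until its operation state leaves pending_states.

    Returns the first state that is not in pending_states.
    """
    state = cursor.poll().operationState
    while state in pending_states:
        sleep(interval)
        state = cursor.poll().operationState
    return state


def run_async(cursor, sql):
    """Submit sql asynchronously on a PyHive cursor and fetch rows when done."""
    # Imported lazily so wait_until_done can be used without PyHive installed.
    from TCLIService.ttypes import TOperationState  # bundled with PyHive

    cursor.execute(sql, async_=True)
    pending = (TOperationState.INITIALIZED_STATE, TOperationState.RUNNING_STATE)
    final = wait_until_done(cursor, pending)
    if final != TOperationState.FINISHED_STATE:
        raise RuntimeError(f"query ended in state {final}")
    return cursor.fetchall()
```

If the same statement succeeds this way but fails through Redash, that points at the Redash side rather than at the server.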
[GitHub] [incubator-kyuubi] ethan7811 commented on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
ethan7811 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966849610
> What if you increase the sleep time?
We changed it to 100000L and it works too.
[GitHub] [incubator-kyuubi] pan3793 commented on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
pan3793 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966849196
What if you increase the sleep time?
[GitHub] [incubator-kyuubi] pan3793 commented on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
pan3793 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966799671
> Why does increasing "kyuubi.session.engine.login.timeout" not seem to work
If this is the issue addressed by 5421b56, increasing "kyuubi.session.engine.login.timeout" should work.
> Simple SQL works fine, but some complex SQL always fails to return a result
@ethan7811 to identify whether it's affected by query time, would you please try `select java_method('java.lang.Thread', 'sleep', 10000L)` to simulate a **complex** query?
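For anyone repeating this check outside Redash, a minimal pyhive sketch follows. The host placeholder, the `kerberos_service_name` value, and all helper names are assumptions for illustration, not details of the reporter's setup; only the port, the auth mode, and the query itself come from this thread.

```python
import time

# Statement suggested above: sleeps on the engine side to mimic a slow query.
SLEEP_QUERY = "select java_method('java.lang.Thread', 'sleep', {millis}L)"


def build_sleep_query(millis):
    """Return the sleep statement for the given duration in milliseconds."""
    if millis < 0:
        raise ValueError("millis must be non-negative")
    return SLEEP_QUERY.format(millis=int(millis))


def time_query(cursor, sql):
    """Run sql on a DB-API cursor and return (rows, elapsed_seconds)."""
    start = time.monotonic()
    cursor.execute(sql)
    rows = cursor.fetchall()
    return rows, time.monotonic() - start


def main():
    # Imported here so the helpers above work without PyHive installed.
    from pyhive import hive

    conn = hive.connect(
        host="kyuubi.example.com",     # hypothetical host
        port=10003,                    # kyuubi.frontend.bind.port from the report
        auth="KERBEROS",
        kerberos_service_name="hive",  # assumption; must match the server principal
    )
    try:
        rows, elapsed = time_query(conn.cursor(), build_sleep_query(10000))
        print(rows, f"{elapsed:.1f}s")
    finally:
        conn.close()
```

Calling `main()` with increasing sleep values (10000, 100000, ...) shows whether failures track query duration or something else, such as result size.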
[GitHub] [incubator-kyuubi] ethan7811 edited a comment on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
ethan7811 edited a comment on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966310235
> This seems to be fixed by [5421b56](https://github.com/apache/incubator-kyuubi/commit/5421b56440cdd0d07de0a98d5a3dc0cfb9ef312e)
Why does increasing "kyuubi.session.engine.login.timeout" not seem to work, and does the NPE thrown by SparkSQLEngineListener matter? The NPE can be reproduced.
[GitHub] [incubator-kyuubi] ethan7811 closed issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
ethan7811 closed issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362
[GitHub] [incubator-kyuubi] pan3793 commented on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
pan3793 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966800118
BTW, does Redash support async mode? If yes, turn it on and see what happens.
[GitHub] [incubator-kyuubi] yaooqinn commented on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
yaooqinn commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-967036501
Can you share a use case at #925 if appropriate?
[GitHub] [incubator-kyuubi] ethan7811 edited a comment on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
ethan7811 edited a comment on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966987879
> Maybe [fd17dd0](https://github.com/apache/incubator-kyuubi/commit/fd17dd0ae429cc62b45ae705b362cb80d6015407) is also related. If so `spark.driver.memory=2g` is too small for your complex queries.
>
> Can you try the master branch?
We finally found that the cause was a dead Redash worker. After we limited queries to smaller results, it seems to work. This should not be Kyuubi's problem. Sorry for the disturbance; this issue can be closed.
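The thread does not say how the result size was limited. One client-side way to keep a worker from pulling an unbounded result is to cap the rows fetched from the cursor. The sketch below is an illustration only: `fetch_capped` is a hypothetical helper, and `sqlite3` stands in for the real connection because it exposes the same DB-API `fetchmany`/`fetchone` calls that a pyhive cursor does.

```python
import sqlite3


def fetch_capped(cursor, max_rows, batch_size=1000):
    """Fetch at most max_rows rows from a DB-API cursor, batch by batch.

    Returns (rows, truncated); truncated is True if more rows were available.
    """
    rows = []
    while len(rows) < max_rows:
        batch = cursor.fetchmany(min(batch_size, max_rows - len(rows)))
        if not batch:
            # Result set exhausted before reaching the cap.
            return rows, False
        rows.extend(batch)
    # Cap reached: probe one more row to see whether the result was cut off.
    truncated = cursor.fetchone() is not None
    return rows, truncated


# Stand-in demo: ten rows in an in-memory SQLite table, capped at four.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (n integer)")
conn.executemany("insert into t values (?)", [(i,) for i in range(10)])
cur = conn.execute("select n from t order by n")
rows, truncated = fetch_capped(cur, max_rows=4, batch_size=3)
print(len(rows), truncated)  # prints "4 True"
```

Adding a `LIMIT` to the SQL itself avoids computing the extra rows at all; the cap above only bounds what the client holds in memory.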
[GitHub] [incubator-kyuubi] ethan7811 commented on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
ethan7811 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966310235
> This seems to be fixed by [5421b56](https://github.com/apache/incubator-kyuubi/commit/5421b56440cdd0d07de0a98d5a3dc0cfb9ef312e)
Why does increasing "kyuubi.session.engine.login.timeout" not seem to work, and does the NPE thrown by SparkSQLEngineListener matter?
[GitHub] [incubator-kyuubi] ethan7811 commented on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
ethan7811 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966987879
> Maybe [fd17dd0](https://github.com/apache/incubator-kyuubi/commit/fd17dd0ae429cc62b45ae705b362cb80d6015407) is also related. If so `spark.driver.memory=2g` is too small for your complex queries.
>
> Can you try the master branch?
We finally found that the cause was a dead Redash worker. After we limited queries to smaller results, it seems to work. This should not be Kyuubi's problem. Sorry for the disturbance.
[GitHub] [incubator-kyuubi] yaooqinn commented on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
yaooqinn commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966962485
Maybe https://github.com/apache/incubator-kyuubi/commit/fd17dd0ae429cc62b45ae705b362cb80d6015407 is also related. If so `spark.driver.memory=2g` is too small for your complex queries.
Can you try the master branch?
[GitHub] [incubator-kyuubi] yaooqinn commented on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
yaooqinn commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966796137
> Why does increasing "kyuubi.session.engine.login.timeout" not seem to work
pyhive uses a very low TProtocolVersion of hive-rpc, and we didn't handle that correctly when running queries synchronously.
> Does the NPE thrown by SparkSQLEngineListener matter?
IIUC, it doesn't hurt the main execution logic.
KyuubiStatementMonitor has been removed completely in the master branch. cc @zhang1002 as the original author of this feature too.
[GitHub] [incubator-kyuubi] yaooqinn commented on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
yaooqinn commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966289623
This seems to be fixed by https://github.com/apache/incubator-kyuubi/commit/5421b56440cdd0d07de0a98d5a3dc0cfb9ef312e
[GitHub] [incubator-kyuubi] pan3793 edited a comment on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
pan3793 edited a comment on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966799671
> Why does increasing "kyuubi.session.engine.login.timeout" not seem to work
If this is the issue addressed by 5421b56, increasing "kyuubi.session.engine.login.timeout" should work.
> Simple SQL works fine, but some complex SQL always fails to return a result
@ethan7811 to identify whether it's affected by query time, would you please try `select java_method('java.lang.Thread', 'sleep', 10000L)` to simulate a **complex** query?
[GitHub] [incubator-kyuubi] ethan7811 commented on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
ethan7811 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966848505
> select java_method('java.lang.Thread', 'sleep', 10000L)
```select java_method('java.lang.Thread', 'sleep', 10000L)``` this query seems to work fine
[GitHub] [incubator-kyuubi] ethan7811 edited a comment on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
ethan7811 edited a comment on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966310235
> This seems to be fixed by [5421b56](https://github.com/apache/incubator-kyuubi/commit/5421b56440cdd0d07de0a98d5a3dc0cfb9ef312e)
Why does increasing "kyuubi.session.engine.login.timeout" not seem to work, and does the NPE thrown by SparkSQLEngineListener matter? The NPE can be reproduced.
[GitHub] [incubator-kyuubi] ethan7811 edited a comment on issue #1362: [Bug] pyhive can't get result from kyuubi
Posted by GitBox <gi...@apache.org>.
ethan7811 edited a comment on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966310235
> This seems to be fixed by [5421b56](https://github.com/apache/incubator-kyuubi/commit/5421b56440cdd0d07de0a98d5a3dc0cfb9ef312e)
Why does increasing "kyuubi.session.engine.login.timeout" not seem to work, and does the NPE thrown by SparkSQLEngineListener matter? The NPE can be reproduced.