Posted to dev@kyuubi.apache.org by GitBox <gi...@apache.org> on 2021/11/11 10:21:14 UTC

[GitHub] [incubator-kyuubi] ethan7811 opened a new issue #1362: [Bug] pyhive can't get result from kyuubi

ethan7811 opened a new issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362


   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   
   
   ### Search before asking
   
   - [X] I have searched in the [issues](https://github.com/apache/incubator-kyuubi/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Describe the bug
   
   We use Kyuubi as the Hive backend for Redash, connecting through PyHive to fetch results. Simple SQL works fine, but some complex SQL always fails to return a result, even though the Kyuubi log suggests that the SQL job has finished. During the computation, the engine log always shows the exception below:
   ```
   21/11/11 18:05:29 ERROR scheduler.AsyncEventQueue: Listener SparkSQLEngineListener threw an exception
   java.lang.NullPointerException
           at org.apache.kyuubi.engine.spark.monitor.KyuubiStatementMonitor$.$anonfun$insertJobEndTimeAndResult$2(KyuubiStatementMonitor.scala:133)
           at org.apache.kyuubi.Logging.warn(Logging.scala:60)
           at org.apache.kyuubi.Logging.warn$(Logging.scala:58)
           at org.apache.kyuubi.engine.spark.monitor.KyuubiStatementMonitor$.warn(KyuubiStatementMonitor.scala:28)
           at org.apache.kyuubi.engine.spark.monitor.KyuubiStatementMonitor$.insertJobEndTimeAndResult(KyuubiStatementMonitor.scala:133)
           at org.apache.spark.kyuubi.SparkSQLEngineListener.onJobEnd(SparkSQLEngineListener.scala:79)
           at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:39)
           at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28)
           at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
           at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
           at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:117)
           at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:101)
           at org.apache.spark.scheduler.AsyncEventQueue.super$postToAll(AsyncEventQueue.scala:105)
           at org.apache.spark.scheduler.AsyncEventQueue.$anonfun$dispatch$1(AsyncEventQueue.scala:105)
           at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
           at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
           at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:100)
           at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.$anonfun$run$1(AsyncEventQueue.scala:96)
           at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1381)
           at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.run(AsyncEventQueue.scala:96)
   ```
   
   The exception above doesn't interrupt job execution, and we eventually get log output like:
   ```
   21/11/11 18:05:52 INFO operation.ExecuteStatement: Processing xxx's query[f512caa2-0d60-48d1-91f8-f4b3a06c5ee6]: RUNNING_STATE -> FINISHED_STATE, statement --
   xxxxxx, time taken: 44.383 seconds
   21/11/11 18:06:08 INFO service.ThriftFrontendService: Received request of closing SessionHandle [2ad135fe-a481-45e7-a626-76f0fbbea931]
   21/11/11 18:06:08 INFO session.SparkSQLSessionManager: SessionHandle [2ad135fe-a481-45e7-a626-76f0fbbea931] is closed, current opening sessions 0
   21/11/11 18:06:08 INFO service.ThriftFrontendService: Finished closing SessionHandle [2ad135fe-a481-45e7-a626-76f0fbbea931]
   21/11/11 18:06:08 ERROR server.TThreadPoolServer: Thrift error occurred during processing of message.
   org.apache.thrift.transport.TTransportException
           at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
           at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
           at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
           at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
           at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
           at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:43)
           at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
           at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
           at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
           at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
           at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
           at org.apache.kyuubi.service.authentication.TSetIpAddressProcessor.process(TSetIpAddressProcessor.scala:36)
           at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   ```
   and Redash shows: "Error running query: failed communicating with server. Please check your Internet connection and try again."
   
   ### Affects Version(s)
   
   1.3.0
   
   ### Kyuubi Server Log Output
   
   _No response_
   
   ### Kyuubi Engine Log Output
   
   _No response_
   
   ### Kyuubi Server Configurations
   
   _No response_
   
   ### Kyuubi Engine Configurations
   
   ```yaml
   kyuubi.authentication=KERBEROS
   kyuubi.frontend.bind.host=xxxx
   kyuubi.frontend.bind.port=10003
   kyuubi.ha.enabled=true
   kyuubi.ha.zookeeper.acl.enabled=false
   kyuubi.ha.zookeeper.client.port=2181
   kyuubi.ha.zookeeper.namespace=kyuubi-ha
   kyuubi.ha.zookeeper.quorum=xxxx
   kyuubi.kinit.keytab=/etc/keytabs/hive.keytab
   kyuubi.kinit.principal=xxxxx
   kyuubi.session.engine.login.timeout=PT30M
   kyuubi.session.idle.timeout=PT30M
   kyuubi.operation.idle.timeout=PT1H
   
   ## Spark
   spark.driver.maxResultSize=1g
   spark.driver.memory=2g
   spark.dynamicAllocation.maxExecutors=10
   spark.executor.cores=3
   spark.executor.memory=12G
   spark.submit.deployMode=client
   ```
   
   
   ### Additional context
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-kyuubi] ethan7811 edited a comment on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
ethan7811 edited a comment on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966310235


   > This seems to be fixed by [5421b56](https://github.com/apache/incubator-kyuubi/commit/5421b56440cdd0d07de0a98d5a3dc0cfb9ef312e)
   
   Why does increasing "kyuubi.session.engine.login.timeout" not seem to work? And does the NPE thrown by SparkSQLEngineListener matter? The NPE can be reproduced.





[GitHub] [incubator-kyuubi] ethan7811 commented on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
ethan7811 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966845996


   > BTW, does Redash support async mode? If yes, turn it on and see what happens.
   
   We tried PyHive async mode in Redash and it doesn't work; we get the same errors as above.
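
For readers unfamiliar with the async mode mentioned here, the following is a minimal sketch of how a PyHive client might submit a statement asynchronously and poll for completion. It is not Redash's implementation. `wait_for_query` and `run_async` are hypothetical helper names, the connection arguments are placeholders, and `async_=True` is the keyword used by recent PyHive releases (older ones spell it `async`).

```python
import time


def wait_for_query(cursor, running_states, poll_interval=1.0, sleep=time.sleep):
    """Poll the cursor until the operation leaves `running_states`.

    Returns the final operation state reported by the server.
    """
    state = cursor.poll().operationState
    while state in running_states:
        sleep(poll_interval)
        state = cursor.poll().operationState
    return state


def run_async(sql, **connect_kwargs):
    """Submit `sql` asynchronously through PyHive and fetch rows once it finishes."""
    # Imported lazily so wait_for_query can be tried without PyHive installed.
    from pyhive import hive
    from TCLIService.ttypes import TOperationState

    cursor = hive.connect(**connect_kwargs).cursor()
    cursor.execute(sql, async_=True)  # older PyHive releases spell this `async`
    final = wait_for_query(
        cursor,
        running_states=(
            TOperationState.INITIALIZED_STATE,
            TOperationState.PENDING_STATE,
            TOperationState.RUNNING_STATE,
        ),
    )
    if final != TOperationState.FINISHED_STATE:
        raise RuntimeError("query ended in state %s" % final)
    return cursor.fetchall()
```

With the configuration from this report, a call might look like `run_async("select 1", host="<kyuubi-host>", port=10003, auth="KERBEROS", kerberos_service_name="hive")`; the service name here is an assumption based on the `hive.keytab` in the config, not something confirmed in the thread.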





[GitHub] [incubator-kyuubi] ethan7811 commented on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
ethan7811 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966849610


   > What if you increase the sleep time?
   
   We changed it to 100000L and it works too.





[GitHub] [incubator-kyuubi] pan3793 commented on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
pan3793 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966849196


   What if you increase the sleep time?





[GitHub] [incubator-kyuubi] pan3793 commented on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
pan3793 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966799671


   > Why does increasing "kyuubi.session.engine.login.timeout" not seem to work?
   
   If this is the issue related to 5421b56, increasing "kyuubi.session.engine.login.timeout" should work.
   
   > Simple SQL works fine, but some complex SQL always fails to return a result
   
   @ethan7811 to identify whether it's affected by query time, would you please try `select java_method('java.lang.Thread', 'sleep', 10000L)` to simulate a **complex** query?
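
As a rough illustration of the probe suggested here, the sketch below builds the sleep statement for several durations and records whether each one returns through a DB-API style cursor (such as one from PyHive). `sleep_query` and `probe` are hypothetical helpers written for this example, not part of Kyuubi or PyHive.

```python
import time


def sleep_query(millis):
    """Build the probe statement from this thread for a sleep of `millis` ms."""
    if not isinstance(millis, int) or millis < 0:
        raise ValueError("millis must be a non-negative integer")
    return "select java_method('java.lang.Thread', 'sleep', %dL)" % millis


def probe(cursor, durations_ms, clock=time.monotonic):
    """Run each probe and return a list of (millis, ok, elapsed_seconds)."""
    results = []
    for millis in durations_ms:
        started = clock()
        try:
            cursor.execute(sleep_query(millis))
            cursor.fetchall()
            ok = True
        except Exception:
            # Any client-side failure counts as "did not return a result".
            ok = False
        results.append((millis, ok, clock() - started))
    return results
```

If every duration succeeds, query time alone is unlikely to be the cause, and other factors such as result size become more plausible.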





[GitHub] [incubator-kyuubi] ethan7811 edited a comment on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
ethan7811 edited a comment on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966310235


   > This seems to be fixed by [5421b56](https://github.com/apache/incubator-kyuubi/commit/5421b56440cdd0d07de0a98d5a3dc0cfb9ef312e)
   
   Why does increasing "kyuubi.session.engine.login.timeout" not seem to work? And does the NPE thrown by SparkSQLEngineListener matter? The NPE can be reproduced.





[GitHub] [incubator-kyuubi] ethan7811 closed issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
ethan7811 closed issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362


   





[GitHub] [incubator-kyuubi] pan3793 commented on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
pan3793 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966800118


   BTW, does Redash support async mode? If yes, turn it on and see what happens.





[GitHub] [incubator-kyuubi] yaooqinn commented on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
yaooqinn commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-967036501


    Can you share a use case at #925 if appropriate?





[GitHub] [incubator-kyuubi] ethan7811 edited a comment on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
ethan7811 edited a comment on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966987879


   > Maybe [fd17dd0](https://github.com/apache/incubator-kyuubi/commit/fd17dd0ae429cc62b45ae705b362cb80d6015407) is also related. If so `spark.driver.memory=2g` is too small for your complex queries.
   > 
   > Can you try the master branch?
   
   We finally found that the cause was the Redash worker dying. After we limited the query to a smaller result, it seems to work. It should not be Kyuubi's problem. Sorry for the disturbance; this issue can be closed.
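
As an illustration of limiting the result on the client side, the sketch below caps how many rows are pulled from a DB-API cursor by reading in batches with `fetchmany`. This is an assumption about one possible approach, not a description of what Redash does or of the exact change made in this case. `fetch_limited` is a hypothetical helper.

```python
def fetch_limited(cursor, max_rows, batch_size=1000):
    """Fetch at most `max_rows` rows in batches.

    Returns (rows, truncated), where `truncated` is True if the server
    still had rows left after the cap was reached.
    """
    rows = []
    while len(rows) < max_rows:
        batch = cursor.fetchmany(min(batch_size, max_rows - len(rows)))
        if not batch:
            break  # result set exhausted before reaching the cap
        rows.extend(batch)
    # Peek one more row only when the cap was hit, to report truncation.
    truncated = len(rows) == max_rows and bool(cursor.fetchmany(1))
    return rows, truncated
```

Adding a `LIMIT` clause to the SQL itself is usually preferable when possible, since it also avoids collecting the extra rows on the Spark driver.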





[GitHub] [incubator-kyuubi] ethan7811 commented on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
ethan7811 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966310235


   > This seems to be fixed by [5421b56](https://github.com/apache/incubator-kyuubi/commit/5421b56440cdd0d07de0a98d5a3dc0cfb9ef312e)
   
   Why does increasing "kyuubi.session.engine.login.timeout" not seem to work? And does the NPE thrown by SparkSQLEngineListener matter?





[GitHub] [incubator-kyuubi] ethan7811 commented on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
ethan7811 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966987879


   > Maybe [fd17dd0](https://github.com/apache/incubator-kyuubi/commit/fd17dd0ae429cc62b45ae705b362cb80d6015407) is also related. If so `spark.driver.memory=2g` is too small for your complex queries.
   > 
   > Can you try the master branch?
   
   We finally found that the cause was the Redash worker dying. After we limited the query to a smaller result, it seems to work. It should not be Kyuubi's problem. Sorry for the disturbance.





[GitHub] [incubator-kyuubi] yaooqinn commented on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
yaooqinn commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966962485


   Maybe https://github.com/apache/incubator-kyuubi/commit/fd17dd0ae429cc62b45ae705b362cb80d6015407 is also related. If so `spark.driver.memory=2g` is too small for your complex queries. 
   
   Can you try the master branch? 





[GitHub] [incubator-kyuubi] yaooqinn commented on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
yaooqinn commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966796137


   > Why does increasing "kyuubi.session.engine.login.timeout" not seem to work?
   
   PyHive uses a very low TProtocolVersion of hive-rpc, and we didn't handle it correctly when running queries synchronously.
   
   > Does the NPE thrown by SparkSQLEngineListener matter?
   
   IIUC, it doesn't hurt the main execution logic.
   KyuubiStatementMonitor has been removed completely in the master branch. cc @zhang1002 as the original author of this feature too.





[GitHub] [incubator-kyuubi] yaooqinn commented on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
yaooqinn commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966289623


   This seems to be fixed by https://github.com/apache/incubator-kyuubi/commit/5421b56440cdd0d07de0a98d5a3dc0cfb9ef312e





[GitHub] [incubator-kyuubi] pan3793 edited a comment on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
pan3793 edited a comment on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966799671


   > Why does increasing "kyuubi.session.engine.login.timeout" not seem to work?
   
   If this is the issue related to 5421b56, increasing "kyuubi.session.engine.login.timeout" should work.
   
   > Simple SQL works fine, but some complex SQL always fails to return a result
   
   @ethan7811 to identify whether it's affected by query time, would you please try `select java_method('java.lang.Thread', 'sleep', 10000L)` to simulate a **complex** query?





[GitHub] [incubator-kyuubi] ethan7811 commented on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
ethan7811 commented on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966848505


   > select java_method('java.lang.Thread', 'sleep', 10000L)
   
   ```select java_method('java.lang.Thread', 'sleep', 10000L)``` this query seems to work fine





[GitHub] [incubator-kyuubi] ethan7811 edited a comment on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
ethan7811 edited a comment on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966310235


   > This seems to be fixed by [5421b56](https://github.com/apache/incubator-kyuubi/commit/5421b56440cdd0d07de0a98d5a3dc0cfb9ef312e)
   
   Why does increasing "kyuubi.session.engine.login.timeout" not seem to work? And does the NPE thrown by SparkSQLEngineListener matter? The NPE can be reproduced.





[GitHub] [incubator-kyuubi] ethan7811 edited a comment on issue #1362: [Bug] pyhive can't get result from kyuubi

Posted by GitBox <gi...@apache.org>.
ethan7811 edited a comment on issue #1362:
URL: https://github.com/apache/incubator-kyuubi/issues/1362#issuecomment-966310235


   > This seems to be fixed by [5421b56](https://github.com/apache/incubator-kyuubi/commit/5421b56440cdd0d07de0a98d5a3dc0cfb9ef312e)
   
   Why does increasing "kyuubi.session.engine.login.timeout" not seem to work? And does the NPE thrown by SparkSQLEngineListener matter? The NPE can be reproduced.

