Posted to dev@hive.apache.org by "Matt McCline (Jira)" <ji...@apache.org> on 2021/06/02 18:57:00 UTC

[jira] [Created] (HIVE-25191) Modernize Hive Thrift CLI Service Protocol

Matt McCline created HIVE-25191:
-----------------------------------

             Summary: Modernize Hive Thrift CLI Service Protocol
                 Key: HIVE-25191
                 URL: https://issues.apache.org/jira/browse/HIVE-25191
             Project: Hive
          Issue Type: Bug
            Reporter: Matt McCline
            Assignee: Matt McCline


With the growing use of proxies, such as gateways, between the Hive client and Hive Server 2, queries are failing unnecessarily because of arbitrary proxy timeouts. Currently, the Hive client sends the query compile as a single request, so a long compile time leaves that request exposed to whatever timeout the proxy enforces. This proposal avoids those timeouts by changing the protocol so that the client polls regularly for status instead.
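
As a rough illustration of the client-side polling idea, here is a minimal Java sketch. It does not use the actual Thrift CLI service API: the State enum, the pollUntilDone method, and the Supplier standing in for a status request are all hypothetical names chosen for this example. The point is only that each round trip is short, so no single request has to outlive a proxy timeout.

```java
import java.util.function.Supplier;

public class PollingClientSketch {
    /** Hypothetical operation states; not the actual Thrift enum. */
    enum State { PENDING, RUNNING, FINISHED, ERROR }

    /**
     * Repeatedly issues a short status request until the operation reaches
     * a terminal state or maxWaitMs has elapsed. statusRequest is an
     * assumed stand-in for one quick round trip to the server.
     */
    static State pollUntilDone(Supplier<State> statusRequest,
                               long pollIntervalMs,
                               long maxWaitMs) throws InterruptedException {
        long deadline = System.nanoTime() + maxWaitMs * 1_000_000L;
        while (true) {
            State s = statusRequest.get();
            if (s == State.FINISHED || s == State.ERROR) {
                return s; // terminal state reached
            }
            if (System.nanoTime() - deadline >= 0) {
                return s; // gave up waiting; caller decides what to do next
            }
            Thread.sleep(pollIntervalMs);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated server: reports RUNNING three times, then FINISHED.
        int[] calls = {0};
        State result = pollUntilDone(
                () -> ++calls[0] < 4 ? State.RUNNING : State.FINISHED, 10, 5_000);
        System.out.println(result + " after " + calls[0] + " polls");
    }
}
```

The poll interval would need to be comfortably shorter than the smallest proxy timeout in the path; the values above are placeholders only.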

Another issue is that Hive Server 2 sometimes does not notice that the client has failed or has lost interest in a potentially long-running query. As a result, Hive locks and Big Data query resources are held unnecessarily. The current assumption is that the client issues a cancel query request when it gets an error, but this assumption does not always hold. If the proxy itself returned the error, that proxy may reject the subsequent cancel request, too. And if the client is killed or the network is down, the client cannot complete a cancel request at all. The proposed solution is for Hive Server 2 to watch that the client keeps sending regular polling requests for status. If a client stops sending those requests, Hive Server 2 will cancel the query.
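
The server-side watch described above could be sketched roughly as follows. This is an assumption-laden illustration, not Hive Server 2 code: PollWatchdogSketch, recordPoll, complete, and sweep are hypothetical names, and query ids are plain strings. The sketch records the time of each status poll per query, and a periodic sweep returns the queries whose last poll is older than the timeout so that the caller can cancel them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PollWatchdogSketch {
    private final long timeoutMs;
    // Hypothetical bookkeeping: query id -> time of the last status poll.
    private final Map<String, Long> lastPollMs = new ConcurrentHashMap<>();

    PollWatchdogSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Record that the client polled for the status of this query. */
    void recordPoll(String queryId, long nowMs) {
        lastPollMs.put(queryId, nowMs);
    }

    /** Stop tracking a query that finished or was closed normally. */
    void complete(String queryId) {
        lastPollMs.remove(queryId);
    }

    /**
     * Returns the ids of queries whose last poll is older than the timeout
     * and stops tracking them. The caller would cancel each returned query.
     */
    List<String> sweep(long nowMs) {
        List<String> abandoned = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastPollMs.entrySet()) {
            boolean stale = nowMs - e.getValue() > timeoutMs;
            // remove(key, value) only succeeds if no newer poll arrived meanwhile.
            if (stale && lastPollMs.remove(e.getKey(), e.getValue())) {
                abandoned.add(e.getKey());
            }
        }
        return abandoned;
    }

    public static void main(String[] args) {
        PollWatchdogSketch watchdog = new PollWatchdogSketch(30_000);
        watchdog.recordPoll("q1", 0);      // client for q1 goes silent after this
        watchdog.recordPoll("q2", 0);
        watchdog.recordPoll("q2", 25_000); // client for q2 keeps polling
        System.out.println("cancel: " + watchdog.sweep(40_000));
    }
}
```

In this sketch the timeout would have to be several poll intervals long, so that one delayed or dropped poll does not cancel a query the client still wants.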

Hive owns the JDBC path (i.e., HiveDriver). The ODBC path may be more challenging because vendors provide the ODBC drivers and Hive does not own the ODBC protocol.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)