Posted to commits@airflow.apache.org by "Aizhamal Nurmamat kyzy (JIRA)" <ji...@apache.org> on 2019/05/17 21:16:00 UTC

[jira] [Updated] (AIRFLOW-2514) HiveServer2Hook doesn't work on Python2 due to thrift version conflict

     [ https://issues.apache.org/jira/browse/AIRFLOW-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aizhamal Nurmamat kyzy updated AIRFLOW-2514:
--------------------------------------------
    Labels: hive hive-hooks  (was: )

Adding the 'hooks' component and tagging with the 'hive' label.

> HiveServer2Hook doesn't work on Python2 due to thrift version conflict
> ----------------------------------------------------------------------
>
>                 Key: AIRFLOW-2514
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2514
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: hive_hooks, hooks
>            Reporter: Kengo Seki
>            Priority: Major
>              Labels: hive, hive-hooks
>
> impyla, on which HiveServer2Hook depends, doesn't work with Thrift 0.10.0+ on Python 2. Example:
> {code}
> $ pip show thrift
> Name: thrift
> Version: 0.11.0
> (snip)
> $ ipython
> (snip)
> In [1]: from airflow.hooks.hive_hooks import HiveServer2Hook
> In [2]: HiveServer2Hook().get_conn().cursor()
> [2018-05-23 10:21:02,117] {base_hook.py:83} INFO - Using connection to: localhost
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> <ipython-input-2-f76a25f124cf> in <module>()
> ----> 1 HiveServer2Hook().get_conn().cursor()
> (snip)
> /home/sekikn/.virtualenvs/a/local/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.pyc in write(self, oprot)
>    1067   def write(self, oprot):
>    1068     if oprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated and self.thrift_spec is not None and fastbinary is not None:
> -> 1069       oprot.trans.write(fastbinary.encode_binary(self, (self.__class__, self.thrift_spec)))
>    1070       return
>    1071     oprot.writeStructBegin('OpenSession_args')
> TypeError: expecting list of size 2 for struct args
> {code}
> [This problem has already been reported|https://github.com/cloudera/impyla/issues/286], and therefore [impyla pins the Thrift version to 0.9.3|https://github.com/cloudera/impyla/commit/94a8eff9cda0cdb16b180c7079961449c8385997].
> On the other hand, hmsclient (introduced by AIRFLOW-2336) needs Thrift 0.11.0+.
> With a lower version, importing hmsclient fails as follows:
> {code}
> $ pip show thrift
> Name: thrift
> Version: 0.10.0
> (snip)
> $ python -m airflow.hooks.hive_hooks
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "/home/sekikn/dev/incubator-airflow/airflow/hooks/hive_hooks.py", line 33, in <module>
>     import hmsclient
>   File "/home/sekikn/.virtualenvs/a/local/lib/python2.7/site-packages/hmsclient/__init__.py", line 2, in <module>
>     from .hmsclient import HMSClient
>   File "/home/sekikn/.virtualenvs/a/local/lib/python2.7/site-packages/hmsclient/hmsclient.py", line 23, in <module>
>     from .genthrift.hive_metastore import ThriftHiveMetastore
>   File "/home/sekikn/.virtualenvs/a/local/lib/python2.7/site-packages/hmsclient/genthrift/hive_metastore/ThriftHiveMetastore.py", line 11, in <module>
>     from thrift.TRecursive import fix_spec
> ImportError: No module named TRecursive
> {code}
> As a result, no single Thrift version satisfies both dependencies, so HiveServer2Hook is currently unusable on Python 2.
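
The two constraints described in the issue can be sketched as a small check. This is only an illustration, not part of Airflow, impyla, or hmsclient: the helper names `parse_version` and `thrift_conflicts` are hypothetical, and the version bounds are taken solely from the report above (impyla breaks with Thrift 0.10.0+ on Python 2; hmsclient needs Thrift 0.11.0+). It assumes plain dotted numeric version strings.

```python
def parse_version(text):
    """Turn a dotted version string such as '0.11.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def thrift_conflicts(thrift_version, python_major):
    """List the dependencies that the issue reports as broken for this setup.

    Hypothetical helper: it encodes only the two constraints stated in
    AIRFLOW-2514 and says nothing about other incompatibilities.
    """
    version = parse_version(thrift_version)
    problems = []
    # Per the issue: impyla fails with Thrift 0.10.0+ on Python 2.
    if python_major == 2 and version >= (0, 10, 0):
        problems.append("impyla")
    # Per the issue: hmsclient needs Thrift 0.11.0+ (it imports thrift.TRecursive).
    if version < (0, 11, 0):
        problems.append("hmsclient")
    return problems


if __name__ == "__main__":
    # The three Thrift versions mentioned in the issue, all on Python 2.
    for candidate in ("0.9.3", "0.10.0", "0.11.0"):
        print(candidate, thrift_conflicts(candidate, 2))
```

Under these assumptions, every Thrift version on Python 2 yields at least one broken dependency, which matches the conclusion of the report.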



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)