Posted to commits@airflow.apache.org by "Kengo Seki (JIRA)" <ji...@apache.org> on 2018/05/03 03:11:00 UTC
[jira] [Created] (AIRFLOW-2412) Fix HiveCliHook.load_file to address HIVE-10541
Kengo Seki created AIRFLOW-2412:
-----------------------------------
Summary: Fix HiveCliHook.load_file to address HIVE-10541
Key: AIRFLOW-2412
URL: https://issues.apache.org/jira/browse/AIRFLOW-2412
Project: Apache Airflow
Issue Type: Improvement
Components: hive_hooks, hooks
Reporter: Kengo Seki
Assignee: Kengo Seki
HiveCliHook.load_file generates a query file and executes it using the {{-f}} option, but that file does not end with a newline. In that case, the beeline bundled with Hive versions before 1.3 does not execute the last query in the file, due to [a bug|https://issues.apache.org/jira/browse/HIVE-10541]. Example:
Register a connection and prepare a file to be loaded:
{code}
$ airflow connections -a --conn_id hive_cli --conn_type hive_cli --conn_host localhost --conn_port 10000 --conn_schema default --conn_extra '{"use_beeline": true, "auth": "none"}'
[2018-05-02 18:38:48,208] {__init__.py:48} INFO - Using executor SequentialExecutor
Successfully added `conn_id`=hive_cli : hive_cli://:@localhost:10000/default
$ cat /tmp/t
0
1
2
3
4
5
6
7
8
9
{code}
Execute load_file via IPython:
{code}
In [1]: from airflow.hooks.hive_hooks import HiveCliHook
In [2]: hook = HiveCliHook("hive_cli")
[2018-05-02 18:50:42,161] {base_hook.py:85} INFO - Using connection to: localhost
In [3]: hook.load_file(field_dict={"c": "int"}, filepath="/tmp/t", table="foo")
(snip)
[2018-05-02 18:51:06,043] {hive_hooks.py:216} INFO - beeline -u jdbc:hive2://localhost:10000/default;auth=none -f /tmp/airflow_hiveop_75jxXU/tmpmvhi0M
[2018-05-02 18:51:07,397] {hive_hooks.py:231} INFO - Connecting to jdbc:hive2://localhost:10000/default;auth=none
[2018-05-02 18:51:07,598] {hive_hooks.py:231} INFO - Connected to: Apache Hive (version 1.2.1)
[2018-05-02 18:51:07,600] {hive_hooks.py:231} INFO - Driver: Hive JDBC (version 1.2.1)
[2018-05-02 18:51:07,600] {hive_hooks.py:231} INFO - Transaction isolation: TRANSACTION_REPEATABLE_READ
[2018-05-02 18:51:07,644] {hive_hooks.py:231} INFO - 0: jdbc:hive2://localhost:10000/default> USE default;
[2018-05-02 18:51:07,749] {hive_hooks.py:231} INFO - No rows affected (0.104 seconds)
[2018-05-02 18:51:07,773] {hive_hooks.py:231} INFO - 0: jdbc:hive2://localhost:10000/defTABLE fooD DATA LOCAL INPATH '/tmp/t' OVERWRITE INTO
[2018-05-02 18:51:07,773] {hive_hooks.py:231} INFO - Closing: 0: jdbc:hive2://localhost:10000/default;auth=none
{code}
The Hive table is created, but no data is loaded:
{code}
0: jdbc:hive2://localhost:10000/default> SHOW TABLES;
+-----------+--+
| tab_name  |
+-----------+--+
| foo       |
+-----------+--+
1 row selected (0.037 seconds)
0: jdbc:hive2://localhost:10000/default> SELECT * FROM foo;
+--------+--+
| foo.c  |
+--------+--+
+--------+--+
No rows selected (0.1 seconds)
{code}
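A possible workaround, as a minimal sketch rather than the actual patch: make sure the generated query text ends with a newline before it is written to the file passed to {{-f}}. The helper names below ({{ensure_trailing_newline}}, {{write_query_file}}) are hypothetical and are not part of HiveCliHook; they only illustrate the idea.

```python
import os
import tempfile


def ensure_trailing_newline(hql):
    """Return hql unchanged if it already ends with a newline, else append one."""
    return hql if hql.endswith("\n") else hql + "\n"


def write_query_file(hql, directory=None):
    """Write the query text to a temporary file and return the file's path.

    Hypothetical helper: the trailing newline is added so that a beeline
    affected by HIVE-10541 still executes the last statement in the file.
    """
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".hql", dir=directory, delete=False
    ) as f:
        f.write(ensure_trailing_newline(hql))
    return f.name


if __name__ == "__main__":
    # Statements modeled on the ones in the log above.
    query = (
        "USE default;\n"
        "LOAD DATA LOCAL INPATH '/tmp/t' OVERWRITE INTO TABLE foo;"
    )
    path = write_query_file(query)
    with open(path) as fh:
        # repr() makes the trailing newline visible.
        print(repr(fh.read()[-5:]))
    os.remove(path)
```

Appending the newline unconditionally on the Airflow side should be harmless for Hive versions where the bug is already fixed, since a trailing newline does not change the statements in the file.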