Posted to commits@airflow.apache.org by "Alexey Sanko (JIRA)" <ji...@apache.org> on 2016/06/30 07:10:10 UTC

[jira] [Updated] (AIRFLOW-295) Beeline called into HiveCliHook.run() read unclosed file and skip last statement

     [ https://issues.apache.org/jira/browse/AIRFLOW-295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Sanko updated AIRFLOW-295:
---------------------------------
    Description: 
If the HQL passed to a HiveOperator that uses a beeline connection contains multiple statements and does not end with an extra blank line, beeline skips the last statement.
As I understand it, beeline reads directly from a file that is still open (it cannot be closed because of the NamedTemporaryFile usage) and hits an unexpected EOF.
{code}
hive = HiveOperator(
    dag=hive_dag,
    start_date=datetime(2016, 1, 1),
    task_id='asanko_cli_remote_test',
    hql="""
use asanko;
drop table if exists test_airflow_dual;
create table asanko.test_airflow_dual as select * from asanko.dual where x <> '{{ ds }}';
desc asanko.test_airflow_dual;
""",
    hive_cli_conn_id='asanko_hive_cli_beeline',
    schema='asanko',
    default_args=args,
    run_as_owner=True)
{code}
Log:
{code}
[2016-06-29 03:01:51,346] {models.py:1041} INFO - Executing <Task(HiveOperator): asanko_cli_remote_test> on 2016-01-01 00:00:00
[2016-06-29 03:01:51,354] {hive_operator.py:63} INFO - Executing: 
use asanko;
drop table if exists test_airflow_dual;
create table asanko.test_airflow_dual as select * from asanko.dual where x <> '2016-01-01';
desc asanko.test_airflow_dual;
[2016-06-29 03:01:51,357] {base_hook.py:53} INFO - Using connection to: asanko_hive
[2016-06-29 03:01:51,358] {hive_hooks.py:105} INFO - beeline -f /tmp/airflow_hiveop_EDQ7kE/tmpuSq2NR -u jdbc:hive2://asanko_hive:10000/default;auth=none -n asanko -p pwd
[2016-06-29 03:01:52,119] {hive_hooks.py:116} INFO - scan complete in 3ms
[2016-06-29 03:01:52,120] {hive_hooks.py:116} INFO - Connecting to jdbc:hive2://asanko_hive:10000/default;auth=none
[2016-06-29 03:01:52,375] {hive_hooks.py:116} INFO - Connected to: Apache Hive (version 0.12.0-cdh5.1.3)
[2016-06-29 03:01:52,376] {hive_hooks.py:116} INFO - Driver: Hive JDBC (version 0.12.0-cdh5.1.3)
[2016-06-29 03:01:52,376] {hive_hooks.py:116} INFO - Transaction isolation: TRANSACTION_REPEATABLE_READ
[2016-06-29 03:01:52,385] {hive_hooks.py:116} INFO - Beeline version 0.12.0-cdh5.1.3 by Apache Hive
[2016-06-29 03:01:52,386] {hive_hooks.py:116} INFO - 0: jdbc:hive2://asanko_hive> USE asanko;
[2016-06-29 03:01:52,428] {hive_hooks.py:116} INFO - No rows affected (0.041 seconds)
[2016-06-29 03:01:52,441] {hive_hooks.py:116} INFO - 0: jdbc:hive2://asanko_hive>
[2016-06-29 03:01:52,441] {hive_hooks.py:116} INFO - 0: jdbc:hive2://asanko_hive> use asanko;
[2016-06-29 03:01:52,451] {hive_hooks.py:116} INFO - No rows affected (0.01 seconds)
[2016-06-29 03:01:52,452] {hive_hooks.py:116} INFO - 0: jdbc:hive2://asanko_hive> drop table if exists test_airflow_dual;
[2016-06-29 03:01:52,463] {hive_hooks.py:116} INFO - No rows affected (0.009 seconds)
[2016-06-29 03:01:52,465] {hive_hooks.py:116} INFO - 0: jdbc:hive2://asanko_hive> create table asanko.test_airflow_dual as select * from asanko.dual where x <> '2016-01-01';
[2016-06-29 03:01:55,006] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:02:00,006] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:02:05,006] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:02:10,006] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:02:15,010] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:02:20,003] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:02:25,011] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:02:30,010] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:02:35,007] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:02:40,007] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:02:45,007] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:02:46,575] {hive_hooks.py:116} INFO - No rows affected (54.109 seconds)
[2016-06-29 03:02:46,578] {hive_hooks.py:116} INFO - 0: jdbc:hive2://asanko_hive> desc asanko.test_airflow_dual;Closing: org.apache.hive.jdbc.HiveConnection
{code}

But if we manually add an empty line at the end, the last statement runs successfully:
{code}
hive = HiveOperator(
    dag=hive_dag,
    start_date=datetime(2016, 1, 1),
    task_id='asanko_cli_remote_test',
    hql="""
use asanko;
drop table if exists test_airflow_dual;
create table asanko.test_airflow_dual as select * from asanko.dual where x <> '{{ ds }}';
desc asanko.test_airflow_dual;

""",
    hive_cli_conn_id='asanko_hive_cli_beeline',
    schema='asanko',
    default_args=args,
    run_as_owner=True)
{code}
Log:
{code}
[2016-06-29 03:04:01,378] {models.py:1041} INFO - Executing <Task(HiveOperator): asanko_cli_remote_test> on 2016-01-01 00:00:00
[2016-06-29 03:04:01,386] {hive_operator.py:63} INFO - Executing: 
use asanko;
drop table if exists test_airflow_dual;
create table asanko.test_airflow_dual as select * from asanko.dual where x <> '2016-01-01';
desc asanko.test_airflow_dual;

[2016-06-29 03:04:01,388] {base_hook.py:53} INFO - Using connection to: asanko_hive
[2016-06-29 03:04:01,390] {hive_hooks.py:105} INFO - beeline -f /tmp/airflow_hiveop_vmWhkH/tmpDq9Lyp -u jdbc:hive2://asanko_hive:10000/default;auth=none -n asanko -p pwd
[2016-06-29 03:04:02,216] {hive_hooks.py:116} INFO - scan complete in 2ms
[2016-06-29 03:04:02,217] {hive_hooks.py:116} INFO - Connecting to jdbc:hive2://asanko_hive:10000/default;auth=none
[2016-06-29 03:04:02,708] {hive_hooks.py:116} INFO - Connected to: Apache Hive (version 0.12.0-cdh5.1.3)
[2016-06-29 03:04:02,708] {hive_hooks.py:116} INFO - Driver: Hive JDBC (version 0.12.0-cdh5.1.3)
[2016-06-29 03:04:02,709] {hive_hooks.py:116} INFO - Transaction isolation: TRANSACTION_REPEATABLE_READ
[2016-06-29 03:04:02,735] {hive_hooks.py:116} INFO - Beeline version 0.12.0-cdh5.1.3 by Apache Hive
[2016-06-29 03:04:02,735] {hive_hooks.py:116} INFO - 0: jdbc:hive2://asanko_hive> USE asanko;
[2016-06-29 03:04:02,786] {hive_hooks.py:116} INFO - No rows affected (0.05 seconds)
[2016-06-29 03:04:02,800] {hive_hooks.py:116} INFO - 0: jdbc:hive2://asanko_hive>
[2016-06-29 03:04:02,800] {hive_hooks.py:116} INFO - 0: jdbc:hive2://asanko_hive> use asanko;
[2016-06-29 03:04:02,810] {hive_hooks.py:116} INFO - No rows affected (0.008 seconds)
[2016-06-29 03:04:02,812] {hive_hooks.py:116} INFO - 0: jdbc:hive2://asanko_hive> drop table if exists test_airflow_dual;
[2016-06-29 03:04:02,899] {hive_hooks.py:116} INFO - No rows affected (0.087 seconds)
[2016-06-29 03:04:02,902] {hive_hooks.py:116} INFO - 0: jdbc:hive2://asanko_hive> create table asanko.test_airflow_dual as select * from asanko.dual where x <> '2016-01-01';
[2016-06-29 03:04:05,006] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:04:10,007] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:04:15,006] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:04:20,008] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:04:25,010] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:04:30,006] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:04:35,006] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:04:40,005] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:04:45,006] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:04:50,006] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:04:55,007] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:05:00,004] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:05:05,007] {jobs.py:142} DEBUG - [heart] Boom.
[2016-06-29 03:05:06,221] {hive_hooks.py:116} INFO - No rows affected (63.319 seconds)
[2016-06-29 03:05:06,225] {hive_hooks.py:116} INFO - 0: jdbc:hive2://asanko_hive> desc asanko.test_airflow_dual;
[2016-06-29 03:05:06,390] {hive_hooks.py:116} INFO - +-----------------------+-----------------------+-----------------------+
[2016-06-29 03:05:06,390] {hive_hooks.py:116} INFO - |       col_name        |       data_type       |        comment        |
[2016-06-29 03:05:06,390] {hive_hooks.py:116} INFO - +-----------------------+-----------------------+-----------------------+
[2016-06-29 03:05:06,391] {hive_hooks.py:116} INFO - | x                     | string                | None                  |
[2016-06-29 03:05:06,391] {hive_hooks.py:116} INFO - +-----------------------+-----------------------+-----------------------+
[2016-06-29 03:05:06,391] {hive_hooks.py:116} INFO - 1 row selected (0.166 seconds)
[2016-06-29 03:05:06,394] {hive_hooks.py:116} INFO - 0: jdbc:hive2://asanko_hive> Closing: org.apache.hive.jdbc.HiveConnection
{code}
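A possible workaround on the hook side, assuming the EOF explanation above is correct, is to make sure the temporary file ends with a newline and is flushed to disk before beeline is spawned. The sketch below is hypothetical (`write_hql_for_beeline` is not the actual HiveCliHook code), but it illustrates both fixes:

```python
from tempfile import NamedTemporaryFile

def write_hql_for_beeline(hql):
    """Write hql to a temp file that beeline can read in full.

    Hypothetical helper (not the actual HiveCliHook code): it appends a
    trailing newline so beeline does not hit EOF in the middle of the
    last statement, and flushes so the subprocess sees the complete
    contents of the still-open file.
    """
    f = NamedTemporaryFile("w", suffix=".hql", delete=False)
    f.write(hql)
    if not hql.endswith("\n"):
        f.write("\n")  # guard against a missing final newline
    f.flush()          # beeline reads the file while Python still holds it open
    return f

# Usage sketch: pass f.name to `beeline -f <file> ...` via subprocess,
# then close the handle and remove f.name after beeline exits.
```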


> Beeline called into HiveCliHook.run() read unclosed file and skip last statement
> --------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-295
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-295
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: Alexey Sanko
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)