Posted to commits@airflow.apache.org by "Cristian Figueroa Rodriguez (JIRA)" <ji...@apache.org> on 2017/09/26 00:29:00 UTC

[jira] [Created] (AIRFLOW-1644) HiveOperator dry_run can execute arbitrary HQL

Cristian Figueroa Rodriguez created AIRFLOW-1644:
----------------------------------------------------

             Summary: HiveOperator dry_run can execute arbitrary HQL
                 Key: AIRFLOW-1644
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1644
             Project: Apache Airflow
          Issue Type: Bug
          Components: hive_hooks, tests
    Affects Versions: Airflow 1.8
            Reporter: Cristian Figueroa Rodriguez
            Assignee: Cristian Figueroa Rodriguez
            Priority: Minor


When running:

{code:java}
airflow test <dag_id> <task_id> <ds> --dry_run 
{code}


For a HiveOperator, the dry run runs EXPLAIN plans for the statements and, where necessary, executes some setup statements first. However, if a comment precedes a statement, Airflow treats that statement as setup regardless of the command that follows the comment, so the command is executed instead of being explained.

Example:

{code:java}

test_setup = HiveOperator(
    task_id='test_setup',
    hql="""
    CREATE TABLE tmp.test_dry_run_airflow (
      dummy STRING,
      state STRING,
      gender STRING
    );""",
    dag=dag,
)

test_comments = HiveOperator(
    task_id='dry_run_hql_with_comments',
    hql="""
    -- A line commenting before the drop.
    DROP TABLE tmp.test_dry_run_airflow;
    CREATE TABLE tmp.test_dry_run_airflow_comments (
      state STRING
    );
    INSERT OVERWRITE TABLE tmp.test_dry_run_airflow_comments
    SELECT state FROM tmp.test_dry_run_airflow;
    """,
    dag=dag,
)
{code}

Output:

{code}
cristian_figueroa@ airflow : ~ $ airflow test dag_start_dates_testing dry_run_hql_with_comments 2017-07-03 --dry_run
[2017-09-25 23:59:29,061] {hive_hooks.py:254} INFO - Testing HQL [CREATE TABLE tmp.test_dry_run_airflow_comments ( s (...)]
[2017-09-25 23:59:36,235] {hive_hooks.py:272} INFO - SUCCESS
[2017-09-25 23:59:36,236] {hive_hooks.py:254} INFO - Testing HQL [INSERT OVERWRITE TABLE tmp.test_dry_run_airflow_co (...)]
[2017-09-25 23:59:43,179] {hive_hooks.py:263} INFO - FAILED: SemanticException [Error 10001]: Line 3:27 Table not found 'test_dry_run_airflow_comments'
[2017-09-25 23:59:43,180] {hive_hooks.py:270} INFO - Context :
     INSERT OVERWRITE TABLE tmp.test_dry_run_airflow_comments
    SELECT state FROM tmp.test_dry_run_airflow

{code}
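One possible mitigation is to strip leading comment lines from each statement before deciding how the dry run should treat it. The following is only a minimal sketch of that idea, not the actual hive_hooks.py code: the names SETUP_PREFIXES, strip_leading_comments and classify are hypothetical, and the list of setup prefixes is an assumption made for illustration.

```python
# Hypothetical sketch: classify HQL statements for a dry run after
# removing any leading "--" comment lines. Not the Airflow implementation.

# Assumed prefixes of statements that are safe to execute as setup.
SETUP_PREFIXES = ("set ", "add jar ", "create temporary function")


def strip_leading_comments(statement):
    """Drop blank lines and `--` comment lines before the first real line."""
    lines = statement.splitlines()
    while lines and (not lines[0].strip() or lines[0].strip().startswith("--")):
        lines.pop(0)
    return "\n".join(lines).strip()


def classify(statement):
    """Return 'create', 'insert', 'setup', 'skip' or 'empty' for one statement."""
    body = strip_leading_comments(statement).lower()
    if not body:
        return "empty"
    if body.startswith("create table"):
        return "create"   # would be checked with EXPLAIN
    if body.startswith("insert"):
        return "insert"   # would be checked with EXPLAIN
    if body.startswith(SETUP_PREFIXES):
        return "setup"    # would be executed before the EXPLAIN
    return "skip"         # anything else is neither executed nor explained


hql = """
    -- A line commenting before the drop.
    DROP TABLE tmp.test_dry_run_airflow;
    CREATE TABLE tmp.test_dry_run_airflow_comments (
      state STRING
    );
"""
# Naive split on ';', which is enough for this example.
kinds = [classify(s) for s in hql.split(";")]
```

With this classification, the commented DROP TABLE statement is judged by the command itself rather than by the comment in front of it, so it ends up as 'skip' and would not be run during a dry run.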




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)