Posted to commits@airflow.apache.org by "Cristian Figueroa Rodriguez (JIRA)" <ji...@apache.org> on 2017/09/26 00:29:00 UTC
[jira] [Created] (AIRFLOW-1644) HiveOperator dry_run can execute arbitrary HQL
Cristian Figueroa Rodriguez created AIRFLOW-1644:
----------------------------------------------------
Summary: HiveOperator dry_run can execute arbitrary HQL
Key: AIRFLOW-1644
URL: https://issues.apache.org/jira/browse/AIRFLOW-1644
Project: Apache Airflow
Issue Type: Bug
Components: hive_hooks, tests
Affects Versions: Airflow 1.8
Reporter: Cristian Figueroa Rodriguez
Assignee: Cristian Figueroa Rodriguez
Priority: Minor
When running:
{code:java}
airflow test <dag_id> <task_id> <ds> --dry_run
{code}
For a HiveOperator task, the dry run is expected to run only EXPLAIN plans, plus whatever setup statements those plans need. However, if a comment precedes a statement, Airflow treats that statement as setup and executes it, regardless of the command that follows the comment.
Example:
{code:java}
# Creates the table that the second task drops.
test_setup = HiveOperator(
    task_id='test_setup',
    hql="""
CREATE TABLE tmp.test_dry_run_airflow (
dummy STRING,
state STRING,
gender STRING
);""",
    dag=dag,
)
# The comment before DROP TABLE causes the DROP to run during the dry run.
test_comments = HiveOperator(
    task_id='dry_run_hql_with_comments',
    hql="""
-- A line commenting before the drop.
DROP TABLE tmp.test_dry_run_airflow;
CREATE TABLE tmp.test_dry_run_airflow_comments (
state STRING
);
INSERT OVERWRITE TABLE tmp.test_dry_run_airflow_comments
SELECT state FROM tmp.test_dry_run_airflow;
""",
    dag=dag,
)
{code}
Output of the dry run for the second task:
{code}
cristian_figueroa@ airflow : ~ $ airflow test dag_start_dates_testing dry_run_hql_with_comments 2017-07-03 --dry_run
[2017-09-25 23:59:29,061] {hive_hooks.py:254} INFO - Testing HQL [CREATE TABLE tmp.test_dry_run_airflow_comments ( s (...)]
[2017-09-25 23:59:36,235] {hive_hooks.py:272} INFO - SUCCESS
[2017-09-25 23:59:36,236] {hive_hooks.py:254} INFO - Testing HQL [INSERT OVERWRITE TABLE tmp.test_dry_run_airflow_co (...)]
[2017-09-25 23:59:43,179] {hive_hooks.py:263} INFO - FAILED: SemanticException [Error 10001]: Line 3:27 Table not found 'test_dry_run_airflow_comments'
[2017-09-25 23:59:43,180] {hive_hooks.py:270} INFO - Context :
INSERT OVERWRITE TABLE tmp.test_dry_run_airflow_comments
SELECT state FROM tmp.test_dry_run_airflow
{code}
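One possible direction for a fix is to strip leading comments from each statement before deciding how the dry run should treat it, so that the decision depends on the statement itself. The sketch below only illustrates that idea. It is not the code in hive_hooks.py: the names `strip_leading_comments`, `classify`, and `SETUP_PREFIXES`, and the list of setup prefixes, are assumptions made for this example.

```python
# Hypothetical sketch; none of these names come from Airflow itself.
# Assumed set of statement prefixes that are safe to run as setup.
SETUP_PREFIXES = ("set ", "add jar ", "create temporary function")


def strip_leading_comments(statement):
    """Drop blank lines and `--` comment lines that precede the statement."""
    lines = statement.strip().splitlines()
    while lines and (not lines[0].strip() or lines[0].strip().startswith("--")):
        lines.pop(0)
    return "\n".join(lines).strip()


def classify(hql):
    """Split HQL on ';' (naively) and bucket each statement by how it starts."""
    buckets = {"create": [], "insert": [], "setup": [], "skipped": []}
    for raw in hql.split(";"):
        statement = strip_leading_comments(raw)
        if not statement:
            continue
        lowered = statement.lower()
        if lowered.startswith("create table"):
            buckets["create"].append(statement)  # candidate for EXPLAIN
        elif lowered.startswith("insert"):
            buckets["insert"].append(statement)  # candidate for EXPLAIN
        elif lowered.startswith(SETUP_PREFIXES):
            buckets["setup"].append(statement)   # would run as setup
        else:
            # DROP, ALTER, etc. are neither explained nor executed.
            buckets["skipped"].append(statement)
    return buckets
```

With the HQL from `dry_run_hql_with_comments`, the `DROP TABLE` statement lands in `skipped` rather than `setup`, because the comment is removed before the prefix check. Splitting on `;` is deliberately naive and would break on semicolons inside string literals, and comments in the middle of a statement are left untouched.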