Posted to commits@airflow.apache.org by "Tomasz Urbaszek (Jira)" <ji...@apache.org> on 2019/09/03 15:18:00 UTC

[jira] [Commented] (AIRFLOW-3804) MySqlToGoogleCloudStorageOperator success when it should fail

    [ https://issues.apache.org/jira/browse/AIRFLOW-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921504#comment-16921504 ] 

Tomasz Urbaszek commented on AIRFLOW-3804:
------------------------------------------

I could not replicate the problem, either in a unit test or in an example DAG (using the query from the original issue description):
{code:python}
with models.DAG(
    "example_mysql_to_gcs",
    default_args=default_args,
    schedule_interval=None,  # Override to match your needs
) as example_dag:
    test_task = MySqlToGoogleCloudStorageOperator(
        task_id="test_task",
        bucket="gs://test-bucket",
        filename="test_dump",
        sql="SELECT * FROM airflow where modifiedTS>2000-01-01 00:00:00 and modifiedTS<= 2019-02-04 13:55:21"
    )
{code}
The task fails with the error we would expect:
{code:sh}
[2019-09-03 15:13:32,072] {taskinstance.py:1072} INFO - Marking task as FAILED.
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/opt/airflow/airflow/bin/airflow", line 36, in <module>
    args.func(args)
  File "/opt/airflow/airflow/utils/cli.py", line 73, in wrapper
    return f(*args, **kwargs)
  File "/opt/airflow/airflow/bin/cli.py", line 709, in test
    ti.run(ignore_task_deps=True, ignore_ti_state=True, test_mode=True)
  File "/opt/airflow/airflow/utils/db.py", line 69, in wrapper
    return func(*args, **kwargs)
  File "/opt/airflow/airflow/models/taskinstance.py", line 1003, in run
    session=session)
  File "/opt/airflow/airflow/utils/db.py", line 65, in wrapper
    return func(*args, **kwargs)
  File "/opt/airflow/airflow/models/taskinstance.py", line 916, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/opt/airflow/airflow/operators/sql_to_gcs.py", line 120, in execute
    cursor = self.query()
  File "/opt/airflow/airflow/operators/mysql_to_gcs.py", line 88, in query
    cursor.execute(self.sql)
  File "/usr/local/lib/python3.5/site-packages/MySQLdb/cursors.py", line 255, in execute
    self.errorhandler(self, exc, value)
  File "/usr/local/lib/python3.5/site-packages/MySQLdb/connections.py", line 50, in defaulterrorhandler
    raise errorvalue
  File "/usr/local/lib/python3.5/site-packages/MySQLdb/cursors.py", line 252, in execute
    res = self._query(query)
  File "/usr/local/lib/python3.5/site-packages/MySQLdb/cursors.py", line 378, in _query
    db.query(q)
  File "/usr/local/lib/python3.5/site-packages/MySQLdb/connections.py", line 280, in query
    _mysql.connection.query(self, query)
_mysql_exceptions.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '00:00:00 and modifiedTS<= 2019-02-04 13:55:21' at line 1")
{code}
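
The root cause of the 1064 error is that the datetime literals in the query are not quoted. As a rough illustration only, the sketch below reproduces the same class of failure with Python's built-in sqlite3 module standing in for MySQL (an assumption made so the example runs without a database server; the error type and message differ from MySQLdb's ProgrammingError). The helper name run_query and the sample row are hypothetical and are not part of the operator.

```python
import sqlite3

# SQLite is only a stand-in for the MySQL connection from the issue; it is
# used here because it ships with Python. The table name mirrors the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airflow (id INTEGER, modifiedTS TEXT)")
conn.execute("INSERT INTO airflow VALUES (1, '2019-01-15 08:30:00')")

# Query exactly as written in the issue: datetime literals are not quoted.
bad_sql = (
    "SELECT * FROM airflow where modifiedTS>2000-01-01 00:00:00 "
    "and modifiedTS<= 2019-02-04 13:55:21"
)
# Same query with the datetime literals quoted as strings.
good_sql = (
    "SELECT * FROM airflow where modifiedTS>'2000-01-01 00:00:00' "
    "and modifiedTS<= '2019-02-04 13:55:21'"
)


def run_query(sql):
    """Hypothetical helper: execute sql and return all rows.

    Driver errors are not caught here, so they propagate to the caller,
    which is the behaviour seen in the traceback above.
    """
    return conn.execute(sql).fetchall()


try:
    run_query(bad_sql)
    bad_query_failed = False
except sqlite3.OperationalError as exc:
    # The driver rejects the statement instead of returning zero rows.
    bad_query_failed = True
    print("unquoted literals rejected:", exc)

rows = run_query(good_sql)
print("quoted literals returned", len(rows), "row(s)")
```

In this sketch the unquoted query raises instead of returning an empty result, which matches what I see with the operator: the exception from cursor.execute is not swallowed, so the task is marked as FAILED.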

> MySqlToGoogleCloudStorageOperator success when it should fail
> -------------------------------------------------------------
>
>                 Key: AIRFLOW-3804
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3804
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: gcp, operators
>            Reporter: jack
>            Priority: Major
>             Fix For: 2.0.0
>
>
> Testing the following query on MySqlToGoogleCloudStorageOperator.
>  
> {code:java}
> SELECT * FROM table where modifiedTS>2000-01-01 00:00:00 and modifiedTS<= 2019-02-04 13:55:21{code}
>  
> The operator runs smoothly and reports success, so Airflow continues to execute the tasks downstream of the operator.
> However this query is invalid.
> Running it on MySQL will give:
>  
> {code:java}
> Error Code: 1064. You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '00:00:00  and modifiedTS<= 2019-02-04 13:55:21' at line 11{code}
>  
> The operator should have *FAILED* when running this query, since it has a syntax error.
> There is probably a problem with how this operator treats the result of this query: it seems to confuse the error with a valid result of no rows returned.
>  
> Not sure if it's related, but I'm running the query from a SQL file, using the filename option of the operator.


