You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "Michal Dziemianko (JIRA)" <ji...@apache.org> on 2017/07/14 09:48:00 UTC

[jira] [Updated] (AIRFLOW-1413) FTPSensor fails when 550 error message text differs from the expected

     [ https://issues.apache.org/jira/browse/AIRFLOW-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michal Dziemianko updated AIRFLOW-1413:
---------------------------------------
    Description: 
The FTPSensor relies on an error message returned by the FTPHook. The expected message is  "Can't check for file existence" and is used to check for errors in following way:

{code:none}
try:
    hook.get_mod_time(self.path)
except ftplib.error_perm as e:
    error = str(e).split(None, 1)
    if error[1] != "Can't check for file existence":
        raise e
{code}

However on my system (Linux/Antregos) this message text is different, leading to inevitable dag termination.  Moreover the actual format of the exception on my system is  different and the split returns '-' instead of the actual message.

Testing for the error message text is not a good idea - this text is not consistent across platforms and locales (I tried changing locale and I can get it to return me localised messages).

Instead, as a quick fix, I suggest testing for the error code in the following way:

{code:none}
error = str(e).split(None, 1)
if error[0] != "550":
    raise e
{code}

This is more reliable, although still not perfect as per FTP specification the codes are just indicatory rather then mandatory. Moreover certain codes (4xx series) are transient and arguably should not cause an exception if recovery is possible (as an example host unavailable can be a temporary network issue) within time limit.

I am going to provide PR with series of improvements.


  was:
The FTPSensor relies on an error message returned by the FTPHook. The expected message is  "Can't check for file existence" and is used to check for errors in following way:

{code:python}
try:
    hook.get_mod_time(self.path)
except ftplib.error_perm as e:
    error = str(e).split(None, 1)
    if error[1] != "Can't check for file existence":
        raise e
{code}

However on my system (Linux/Antregos) this message text is different, leading to inevitable dag termination.  Moreover the actual format of the exception on my system is  different and the split returns '-' instead of the actual message.

Testing for the error message text is not a good idea - this text is not consistent across platforms and locales (I tried changing locale and I can get it to return me localised messages).

Instead, as a quick fix, I suggest testing for the error code in the following way:

{code:python}
error = str(e).split(None, 1)
if error[0] != "550":
    raise e
{code}

This is more reliable, although still not perfect as per FTP specification the codes are just indicatory rather then mandatory. Moreover certain codes (4xx series) are transient and arguably should not cause an exception if recovery is possible (as an example host unavailable can be a temporary network issue) within time limit.

I am going to provide PR with series of improvements.



> FTPSensor fails when 550 error message text differs from the expected
> ---------------------------------------------------------------------
>
>                 Key: AIRFLOW-1413
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1413
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: Michal Dziemianko
>            Assignee: Michal Dziemianko
>            Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The FTPSensor relies on an error message returned by the FTPHook. The expected message is  "Can't check for file existence" and is used to check for errors in following way:
> {code:none}
> try:
>     hook.get_mod_time(self.path)
> except ftplib.error_perm as e:
>     error = str(e).split(None, 1)
>     if error[1] != "Can't check for file existence":
>         raise e
> {code}
> However on my system (Linux/Antregos) this message text is different, leading to inevitable dag termination.  Moreover the actual format of the exception on my system is  different and the split returns '-' instead of the actual message.
> Testing for the error message text is not a good idea - this text is not consistent across platforms and locales (I tried changing locale and I can get it to return me localised messages).
> Instead, as a quick fix, I suggest testing for the error code in the following way:
> {code:none}
> error = str(e).split(None, 1)
> if error[0] != "550":
>     raise e
> {code}
> This is more reliable, although still not perfect as per FTP specification the codes are just indicatory rather then mandatory. Moreover certain codes (4xx series) are transient and arguably should not cause an exception if recovery is possible (as an example host unavailable can be a temporary network issue) within time limit.
> I am going to provide PR with series of improvements.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)