Posted to commits@airflow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/03/29 15:56:00 UTC

[jira] [Commented] (AIRFLOW-4085) Support wildcards in FileSensor

    [ https://issues.apache.org/jira/browse/AIRFLOW-4085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310743#comment-17310743 ] 

ASF GitHub Bot commented on AIRFLOW-4085:
-----------------------------------------

NBardelot commented on pull request #5358:
URL: https://github.com/apache/airflow/pull/5358#issuecomment-809495082


   FYI, this behaviour is inconsistent.
   
   For anyone who, like me, comes to this PR after looking at the glob() behaviour: this cannot work for hooks that use a remote FS (SFTPHook, for example). The Python documentation states that glob() is implemented with a mix of os.scandir() and fnmatch.fnmatch(), so the code proposed here only ever inspects the local FS.
   
   Thus, using a path with a glob pattern together with a hook to a remote FS has an unpredictable effect: if you're lucky, glob() finds no matching path locally and reports that the path does not exist (the sensor never triggers); in the worse case, the sensor triggers for a file that exists locally but not on the remote FS.
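
   To illustrate the point, here is a minimal sketch (not the code from this PR). It assumes the remote file names have already been fetched by some hook; `remote_listing` and `match_remote_names` are hypothetical names used only for this example. It contrasts glob.glob(), which only looks at the local FS, with applying the fnmatch pattern to a listing obtained from the remote side.

```python
import fnmatch
import glob
import os
import tempfile


def match_remote_names(remote_names, pattern):
    """Filter names taken from a remote listing; touches no local files."""
    # fnmatchcase() is used so the result does not depend on the local OS.
    return sorted(n for n in remote_names if fnmatch.fnmatchcase(n, pattern))


# Hypothetical result of listing a remote directory (e.g. over SFTP).
remote_listing = ["my_data_2019-03-01T13_45_03.csv", "other.txt"]
pattern = "my_data_2019-03-01*.csv"

with tempfile.TemporaryDirectory() as local_dir:
    # glob.glob() scans the local directory only. The file exists "remotely"
    # but not in this empty local directory, so nothing is found.
    local_matches = glob.glob(os.path.join(glob.escape(local_dir), pattern))

# Matching against the remote listing finds the file.
remote_matches = match_remote_names(remote_listing, pattern)

print(local_matches)
print(remote_matches)
```

   In other words, a sensor for a remote FS would probably need to list the remote directory through the hook and match the pattern itself, rather than call glob() on the path.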


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Support wildcards in FileSensor
> -------------------------------
>
>                 Key: AIRFLOW-4085
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4085
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: contrib
>            Reporter: Matthew Livesey
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> When using FileSensor, it is necessary to specify the exact path of the file or folder to be sensed.
> There are use cases where the exact file name may not be known. For example if a file is delivered daily, but includes the full timestamp of file creation in the name, such as
> {code:java}
> my_data_2019-03-01T13_45_03.csv
> {code}
> I would like the FileSensor to sense with a file path such as
> {code:java}
> my_data_2019-03-01*.csv
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)