Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/13 15:57:03 UTC

[GitHub] [airflow] ajinkya-xyz opened a new pull request, #25705: Fix : SFTP Sensor not able to locate file when file_pattern is provided

ajinkya-xyz opened a new pull request, #25705:
URL: https://github.com/apache/airflow/pull/25705

   Background: Pull request [#24084](https://github.com/apache/airflow/pull/24084) added the ability to pass an `fnmatch` file pattern to the SFTP sensor. However, the sensor fails to locate the file when a file pattern is provided.
    
   Issue: The code fails to get the modification time once a file matching the given pattern is found. [Line 77](https://github.com/apache/airflow/blob/main/airflow/providers/sftp/sensors/sftp.py#L77) of sftp.py fails with a "no such file" error.
   
   `mod_time = self.hook.get_mod_time(actual_file_to_check)`
   
   Root cause: The code assumes that the `get_file_by_pattern` method returns the complete path of the file matching the given `fnmatch` expression, but it returns only the file name. `get_file_by_pattern` internally relies on the [Paramiko SFTP client's listdir()](https://docs.paramiko.org/en/stable/api/sftp.html#paramiko.sftp_client.SFTPClient.listdir) method to retrieve the entries in the given folder, and `listdir()` returns file names only, not complete file paths.
   
   Related code, sftp.py [line 68](https://github.com/apache/airflow/blob/main/airflow/providers/sftp/sensors/sftp.py#L68):
   `file_from_pattern = self.hook.get_file_by_pattern(self.path, self.file_pattern)`
   
   Fix: Prepending the path when assigning the `actual_file_to_check` variable fixes the issue.
   
   `actual_file_to_check = os.path.join(self.path, file_from_pattern)`
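The failure and the fix can be reproduced without a server. The following is a minimal sketch, not the provider's code: `FakeSFTPHook`, its in-memory file table, the directory `/upload/incoming`, and the timestamp value are all hypothetical stand-ins. It assumes only what is described above, namely that pattern matching runs over bare names from a directory listing and that the modification-time lookup needs a full remote path. It joins with `posixpath.join`, as the review in this thread suggests, and uses `fnmatch.fnmatchcase` so the match does not depend on the local platform.

```python
import fnmatch
import posixpath


class FakeSFTPHook:
    """Hypothetical in-memory stand-in for the SFTP hook; not the real class."""

    def __init__(self, files):
        # Maps a full remote path to a modification-time string.
        self.files = files

    def list_directory(self, path):
        # Mimics listdir(): returns bare names, without the directory prefix.
        prefix = path.rstrip("/") + "/"
        return [p[len(prefix):] for p in self.files if p.startswith(prefix)]

    def get_file_by_pattern(self, path, pattern):
        # Returns the first matching bare name, or "" when nothing matches.
        for name in self.list_directory(path):
            if fnmatch.fnmatchcase(name, pattern):
                return name
        return ""

    def get_mod_time(self, path):
        # Only full remote paths are known to the "server".
        if path not in self.files:
            raise FileNotFoundError(path)
        return self.files[path]


hook = FakeSFTPHook({"/upload/incoming/report_2022.csv": "20220813155703"})
path, pattern = "/upload/incoming", "report_*.csv"
name = hook.get_file_by_pattern(path, pattern)

# Before the fix: the bare name is passed straight to get_mod_time.
try:
    hook.get_mod_time(name)
    bare_name_works = True
except FileNotFoundError:
    bare_name_works = False

# After the fix: the directory is prepended first.
mod_time = hook.get_mod_time(posixpath.join(path, name))
```

In this sketch the lookup with the bare name raises `FileNotFoundError`, while the lookup with the joined path succeeds.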
   
   Related : [24084](https://github.com/apache/airflow/pull/24084)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #25705: Fix : SFTP Sensor fails to locate file when file_pattern is provided

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #25705:
URL: https://github.com/apache/airflow/pull/25705#discussion_r945570886


##########
airflow/providers/sftp/sensors/sftp.py:
##########
@@ -67,7 +68,7 @@ def poke(self, context: 'Context') -> bool:
         if self.file_pattern:
             file_from_pattern = self.hook.get_file_by_pattern(self.path, self.file_pattern)
             if file_from_pattern:
-                actual_file_to_check = file_from_pattern
+                actual_file_to_check = os.path.join(self.path, file_from_pattern)

Review Comment:
   Hmm, is it correct to use `os.path.join` here? SFTP only supports `/` (I think) and this is platform dependent.
   
   It’s likely more correct to use `posixpath.join` instead.
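A small sketch of the difference, using an illustrative remote directory and file name that are not from the PR. On Windows, `os.path` is the `ntpath` module, so calling `ntpath.join` directly shows what `os.path.join` would produce there, regardless of the platform this snippet runs on.

```python
import ntpath
import posixpath

remote_dir = "/upload/incoming"   # hypothetical remote directory
file_name = "report_2022.csv"     # hypothetical file name

# What os.path.join yields on Windows: a backslash separator is inserted.
windows_style = ntpath.join(remote_dir, file_name)

# posixpath.join always uses "/", independent of the local platform.
posix_style = posixpath.join(remote_dir, file_name)

print(windows_style)  # /upload/incoming\report_2022.csv
print(posix_style)    # /upload/incoming/report_2022.csv
```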





[GitHub] [airflow] uranusjr commented on pull request #25705: Fix : SFTP Sensor fails to locate file when file_pattern is provided

Posted by GitBox <gi...@apache.org>.
uranusjr commented on PR #25705:
URL: https://github.com/apache/airflow/pull/25705#issuecomment-1216149807

   It seems like the tests need some work to fix temp file usages.




[GitHub] [airflow] ajinkya-xyz commented on pull request #25705: Fix : SFTP Sensor fails to locate file when file_pattern is provided

Posted by GitBox <gi...@apache.org>.
ajinkya-xyz commented on PR #25705:
URL: https://github.com/apache/airflow/pull/25705#issuecomment-1224201356

   @uranusjr @potiuk I am having trouble with the local setup on a Windows machine. I will try to spend some time on it over the weekend.




[GitHub] [airflow] github-actions[bot] commented on pull request #25705: Fix : SFTP Sensor fails to locate file when file_pattern is provided

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #25705:
URL: https://github.com/apache/airflow/pull/25705#issuecomment-1272173865

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.




[GitHub] [airflow] github-actions[bot] closed pull request #25705: Fix : SFTP Sensor fails to locate file when file_pattern is provided

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #25705: Fix : SFTP Sensor fails to locate file when file_pattern is provided
URL: https://github.com/apache/airflow/pull/25705




[GitHub] [airflow] potiuk commented on pull request #25705: Fix : SFTP Sensor fails to locate file when file_pattern is provided

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25705:
URL: https://github.com/apache/airflow/pull/25705#issuecomment-1224194374

   Yep. Fix would be nice here.




[GitHub] [airflow] ajinkya-xyz commented on a diff in pull request #25705: Fix : SFTP Sensor fails to locate file when file_pattern is provided

Posted by GitBox <gi...@apache.org>.
ajinkya-xyz commented on code in PR #25705:
URL: https://github.com/apache/airflow/pull/25705#discussion_r945745538


##########
airflow/providers/sftp/sensors/sftp.py:
##########
@@ -67,7 +68,7 @@ def poke(self, context: 'Context') -> bool:
         if self.file_pattern:
             file_from_pattern = self.hook.get_file_by_pattern(self.path, self.file_pattern)
             if file_from_pattern:
-                actual_file_to_check = file_from_pattern
+                actual_file_to_check = os.path.join(self.path, file_from_pattern)

Review Comment:
   @uranusjr Makes sense! Updated code to use posixpath instead. Thank you!
   
   Reference : https://stackoverflow.com/questions/36592213/using-os-path-for-posix-path-operations-on-windows





[GitHub] [airflow] boring-cyborg[bot] commented on pull request #25705: Fix : SFTP Sensor not able to locate file when file_pattern is provided

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on PR #25705:
URL: https://github.com/apache/airflow/pull/25705#issuecomment-1214180976

   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst).
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature, add useful documentation (in docstrings or in the `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst). Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better 🚀.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   

