You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/06/14 19:42:27 UTC

[GitHub] [airflow] remem9527 opened a new pull request, #24451: Update `actual_file_to_check` with rendered `path`

remem9527 opened a new pull request, #24451:
URL: https://github.com/apache/airflow/pull/24451

   The new added attribute `self.actual_file_to_check` is not one of template_fields and is initiated in `self.__init__()`. So, it need to be updated after `self.path` is rendered. Otherwise, the template in `self.path` will not work.
   
   @alexkruc
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] josh-fell commented on a diff in pull request #24451: Update `actual_file_to_check` with rendered `path`

Posted by GitBox <gi...@apache.org>.
josh-fell commented on code in PR #24451:
URL: https://github.com/apache/airflow/pull/24451#discussion_r898330005


##########
airflow/providers/sftp/sensors/sftp.py:
##########
@@ -71,6 +71,8 @@ def poke(self, context: 'Context') -> bool:
                 self.actual_file_to_check = file_from_pattern
             else:
                 return False
+        else:
+            self.actual_file_to_check = self.path

Review Comment:
   ❀️ the discussion here! I think they are all great suggestions, but they don't necessarily need to be part of this PR. Definitely feel free to fire over PRs and we can discuss more.
   
   >Perhaps the best approach is to initiate the variable inside the poke() method itself.
   +1 



##########
airflow/providers/sftp/sensors/sftp.py:
##########
@@ -71,6 +71,8 @@ def poke(self, context: 'Context') -> bool:
                 self.actual_file_to_check = file_from_pattern
             else:
                 return False
+        else:
+            self.actual_file_to_check = self.path

Review Comment:
   ❀️ the discussion here! I think they are all great suggestions, but they don't necessarily need to be part of this PR. Definitely feel free to fire over PRs and we can discuss more.
   
   >Perhaps the best approach is to initiate the variable inside the poke() method itself.
   
   +1 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] remem9527 commented on a diff in pull request #24451: Update `actual_file_to_check` with rendered `path`

Posted by GitBox <gi...@apache.org>.
remem9527 commented on code in PR #24451:
URL: https://github.com/apache/airflow/pull/24451#discussion_r900105935


##########
airflow/providers/sftp/sensors/sftp.py:
##########
@@ -71,6 +71,8 @@ def poke(self, context: 'Context') -> bool:
                 self.actual_file_to_check = file_from_pattern
             else:
                 return False
+        else:
+            self.actual_file_to_check = self.path

Review Comment:
   Submitted another commit accordingly. Closing this conversation



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] josh-fell commented on a diff in pull request #24451: Update `actual_file_to_check` with rendered `path`

Posted by GitBox <gi...@apache.org>.
josh-fell commented on code in PR #24451:
URL: https://github.com/apache/airflow/pull/24451#discussion_r897320891


##########
airflow/providers/sftp/sensors/sftp.py:
##########
@@ -71,6 +71,8 @@ def poke(self, context: 'Context') -> bool:
                 self.actual_file_to_check = file_from_pattern
             else:
                 return False
+        else:
+            self.actual_file_to_check = self.path

Review Comment:
   Good catch! IMO it's probably not worth having `actual_file_to_check` be an instance attr since it's not used outside of the `poke()` method. WDYT? If you agree, it should be deleted from `__init__()` as well.



##########
airflow/providers/sftp/sensors/sftp.py:
##########
@@ -71,6 +71,8 @@ def poke(self, context: 'Context') -> bool:
                 self.actual_file_to_check = file_from_pattern
             else:
                 return False
+        else:
+            self.actual_file_to_check = self.path

Review Comment:
   Good catch! IMO it's probably not worth having `actual_file_to_check` be an instance attr since it's not used outside of the `poke()` method. WDYT? If you agree, it should be deleted from `__init__()` as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] boring-cyborg[bot] commented on pull request #24451: Update `actual_file_to_check` with rendered `path`

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on PR #24451:
URL: https://github.com/apache/airflow/pull/24451#issuecomment-1159284545

   Awesome work, congrats on your first merged pull request!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] remem9527 commented on a diff in pull request #24451: Update `actual_file_to_check` with rendered `path`

Posted by GitBox <gi...@apache.org>.
remem9527 commented on code in PR #24451:
URL: https://github.com/apache/airflow/pull/24451#discussion_r897338441


##########
airflow/providers/sftp/sensors/sftp.py:
##########
@@ -71,6 +71,8 @@ def poke(self, context: 'Context') -> bool:
                 self.actual_file_to_check = file_from_pattern
             else:
                 return False
+        else:
+            self.actual_file_to_check = self.path

Review Comment:
   Maybe we want to access one of the filenames that matches `self.file_pattern`, when `self.file_pattern` is not empty? I am not sure about that. We probably need some input from the committer of file_pattern supporting.
   My true opinion is that the implementation about file_pattern is not very elegant. The variable `self.path` means a **file** path when `self.file_pattern` is empty. But when `self.file_pattern` is specified, the variable `self.path` is treated as a **folder** path. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] remem9527 commented on a diff in pull request #24451: Update `actual_file_to_check` with rendered `path`

Posted by GitBox <gi...@apache.org>.
remem9527 commented on code in PR #24451:
URL: https://github.com/apache/airflow/pull/24451#discussion_r900106292


##########
airflow/providers/sftp/sensors/sftp.py:
##########
@@ -71,6 +71,8 @@ def poke(self, context: 'Context') -> bool:
                 self.actual_file_to_check = file_from_pattern
             else:
                 return False
+        else:
+            self.actual_file_to_check = self.path

Review Comment:
   Duplicated. Closing it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] remem9527 commented on a diff in pull request #24451: Update `actual_file_to_check` with rendered `path`

Posted by GitBox <gi...@apache.org>.
remem9527 commented on code in PR #24451:
URL: https://github.com/apache/airflow/pull/24451#discussion_r897944996


##########
airflow/providers/sftp/sensors/sftp.py:
##########
@@ -71,6 +71,8 @@ def poke(self, context: 'Context') -> bool:
                 self.actual_file_to_check = file_from_pattern
             else:
                 return False
+        else:
+            self.actual_file_to_check = self.path

Review Comment:
   Personally, I think it would be very cool if I could use sensors like this:
   ```
   sensor_task = SFTPSensor(task_id='task_id', sftp_conn_id='sftp_conn_id', path='data/folder/datafile_{{ds}}_part_?.[ct]sv')
   ```
   Also, this supporting should be implemented on some base class, so that all the file sensors could be supported.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] github-actions[bot] commented on pull request #24451: Update `actual_file_to_check` with rendered `path`

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #24451:
URL: https://github.com/apache/airflow/pull/24451#issuecomment-1158934994

   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] josh-fell merged pull request #24451: Update `actual_file_to_check` with rendered `path`

Posted by GitBox <gi...@apache.org>.
josh-fell merged PR #24451:
URL: https://github.com/apache/airflow/pull/24451


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] josh-fell commented on pull request #24451: Update `actual_file_to_check` with rendered `path`

Posted by GitBox <gi...@apache.org>.
josh-fell commented on PR #24451:
URL: https://github.com/apache/airflow/pull/24451#issuecomment-1159287341

   Woohoo! First commit! Congrats! πŸŽ‰


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] boring-cyborg[bot] commented on pull request #24451: Update `actual_file_to_check` with rendered `path`

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on PR #24451:
URL: https://github.com/apache/airflow/pull/24451#issuecomment-1155640971

   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better πŸš€.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] alexkruc commented on a diff in pull request #24451: Update `actual_file_to_check` with rendered `path`

Posted by GitBox <gi...@apache.org>.
alexkruc commented on code in PR #24451:
URL: https://github.com/apache/airflow/pull/24451#discussion_r897624908


##########
airflow/providers/sftp/sensors/sftp.py:
##########
@@ -71,6 +71,8 @@ def poke(self, context: 'Context') -> bool:
                 self.actual_file_to_check = file_from_pattern
             else:
                 return False
+        else:
+            self.actual_file_to_check = self.path

Review Comment:
   Indeed a good catch! :) 
   I agree that `actual_file_to_check` can be deleted from the `__init__()` method, it's not really needed there, and it was my mistake to leave it there πŸ˜… 
   Perhaps the best approach is to initiate the variable inside the `poke()` method itself.
   Or, because it is a user-facing attribute, maybe it's ok to add this to the `template_fields`? WDYT? 
   
   About the API itself, when I've added this feature, I saw `self.path` variable to be a strict and complete file location, but when you ask for a pattern, it's "easier" to understand that you look for a pattern, in a specific path.
   So for example, if you want to check if csv reports are done and placed in the correct folder, you will give the folder with the `*.csv` pattern for poking.
   
   But if you think that this API is not good, I guess it's ok also to change it :)  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org