Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/11 14:40:59 UTC

[GitHub] [airflow] alexkruc opened a new pull request, #25675: Adding a parameter for exclusion of trashed files in GoogleDriveHook

alexkruc opened a new pull request, #25675:
URL: https://github.com/apache/airflow/pull/25675

   `GoogleDriveHook` currently returns all files with the specified name, regardless of whether they are in the trash folder.
   Taking inspiration from #24446, I saw that the [Google Drive API](https://developers.google.com/drive/api/v3/reference/files/list?apix_params=%7B%22q%22%3A%22name%20%3D%20%27abcd_test%27%20and%20trashed%3Dfalse%22%7D) has a term that we can add to the query (`trashed=false/true`) that specifies whether we want the API to return trashed files (the default is `True`, i.e. trashed files are shown).
   
   This PR adds this query parameter to the `GoogleDriveHook`.
   
   
   related: #24446
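
As a rough illustration of the query change described above, the sketch below composes a Drive `files.list` query string and appends `trashed=false` only when trashed files should be left out. The helper name `build_exists_query` and the exact layout of the string are assumptions made for this example, not the code in this PR; only the `trashed` term is taken from the linked API documentation.

```python
from typing import List


def build_exists_query(folder_id: str, file_name: str, include_trashed: bool = True) -> str:
    """Hypothetical helper: compose a Drive `q` expression for a name lookup in a folder.

    Quote escaping inside file_name is deliberately left out of this sketch.
    """
    conditions: List[str] = [
        f"name = '{file_name}'",
        f"'{folder_id}' in parents",
    ]
    if not include_trashed:
        # Ask the API to leave out files that are in the trash.
        conditions.append("trashed=false")
    return " and ".join(conditions)


print(build_exists_query("root", "abcd_test"))
print(build_exists_query("root", "abcd_test", include_trashed=False))
```

With the default `include_trashed=True` the query is unchanged, so existing callers would see the same results as before.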


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] alexkruc commented on a diff in pull request #25675: Adding a parameter for exclusion of trashed files in GoogleDriveHook

Posted by GitBox <gi...@apache.org>.
alexkruc commented on code in PR #25675:
URL: https://github.com/apache/airflow/pull/25675#discussion_r943812666


##########
airflow/providers/google/suite/hooks/drive.py:
##########
@@ -128,31 +128,45 @@ def get_media_request(self, file_id: str) -> HttpRequest:
         request = service.files().get_media(fileId=file_id)
         return request
 
-    def exists(self, folder_id: str, file_name: str, drive_id: Optional[str] = None):
+    def exists(
+        self, folder_id: str, file_name: str, include_trashed: bool = True, drive_id: Optional[str] = None

Review Comment:
   Just to be sure, I checked the official `typing` documentation on `Optional[]`:
   https://docs.python.org/3/library/typing.html#typing.Optional
   It also states the following:
   ```
   An optional argument with a default does not require the Optional qualifier ....
   ```
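
To make the quoted sentence concrete, here is a small sketch that inspects a stand-in function with the same two parameters. The function body is a placeholder invented for this example; the point is only that `Optional[...]` describes whether `None` is an accepted value, while the default is what lets a caller omit the argument.

```python
import inspect
from typing import Optional, Union


def exists(include_trashed: bool = True, drive_id: Optional[str] = None) -> None:
    """Stand-in signature used only to inspect the annotations."""


params = inspect.signature(exists).parameters

# Optional[str] is shorthand for Union[str, None]; it is about the type, not about omission.
assert Optional[str] == Union[str, None]

# Both parameters can be omitted because both have defaults.
assert params["include_trashed"].default is True
assert params["drive_id"].default is None

# Only drive_id needs Optional, since None is one of its accepted values.
assert params["include_trashed"].annotation is bool

exists()  # valid call: neither argument is required
```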





[GitHub] [airflow] alexkruc commented on a diff in pull request #25675: Adding a parameter for exclusion of trashed files in GoogleDriveHook

Posted by GitBox <gi...@apache.org>.
alexkruc commented on code in PR #25675:
URL: https://github.com/apache/airflow/pull/25675#discussion_r943793043


##########
airflow/providers/google/suite/hooks/drive.py:
##########
@@ -128,31 +128,45 @@ def get_media_request(self, file_id: str) -> HttpRequest:
         request = service.files().get_media(fileId=file_id)
         return request
 
-    def exists(self, folder_id: str, file_name: str, drive_id: Optional[str] = None):
+    def exists(
+        self, folder_id: str, file_name: str, include_trashed: bool = True, drive_id: Optional[str] = None

Review Comment:
   It is :) That is why we give it a default value.
   Annotating it with `Optional[]` in addition to a default value is redundant: if we don't pass it, it takes its default value.
   
   Am I missing something?





[GitHub] [airflow] eladkal commented on a diff in pull request #25675: Adding a parameter for exclusion of trashed files in GoogleDriveHook

Posted by GitBox <gi...@apache.org>.
eladkal commented on code in PR #25675:
URL: https://github.com/apache/airflow/pull/25675#discussion_r943776810


##########
airflow/providers/google/suite/hooks/drive.py:
##########
@@ -128,31 +128,45 @@ def get_media_request(self, file_id: str) -> HttpRequest:
         request = service.files().get_media(fileId=file_id)
         return request
 
-    def exists(self, folder_id: str, file_name: str, drive_id: Optional[str] = None):
+    def exists(
+        self, folder_id: str, file_name: str, include_trashed: bool = True, drive_id: Optional[str] = None

Review Comment:
   `include_trashed` is also an optional param, is it not?





[GitHub] [airflow] uranusjr commented on a diff in pull request #25675: Adding a parameter for exclusion of trashed files in GoogleDriveHook

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #25675:
URL: https://github.com/apache/airflow/pull/25675#discussion_r944095967


##########
airflow/providers/google/suite/hooks/drive.py:
##########
@@ -128,31 +128,45 @@ def get_media_request(self, file_id: str) -> HttpRequest:
         request = service.files().get_media(fileId=file_id)
         return request
 
-    def exists(self, folder_id: str, file_name: str, drive_id: Optional[str] = None):
+    def exists(
+        self, folder_id: str, file_name: str, include_trashed: bool = True, drive_id: Optional[str] = None

Review Comment:
   Can we change the ordering to put `include_trashed` at the back? Right now this (unnecessarily) breaks backward compatibility if the user passes `drive_id` as a positional argument.
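
The compatibility problem can be reproduced with two stand-in functions that mirror the old signature and the one under review. Their names and return values are hypothetical and exist only to show how the same positional call binds its third argument differently.

```python
from typing import Optional, Tuple


def exists_before(
    folder_id: str, file_name: str, drive_id: Optional[str] = None
) -> Tuple[bool, Optional[str]]:
    """Stand-in for the old signature; trashed files were always included."""
    return True, drive_id


def exists_proposed(
    folder_id: str, file_name: str, include_trashed: bool = True, drive_id: Optional[str] = None
) -> Tuple[object, Optional[str]]:
    """Stand-in for the signature under review; returns how the arguments were bound."""
    return include_trashed, drive_id


# A caller that passes drive_id positionally, which the old signature allowed.
args = ("folder", "report.csv", "shared-drive-id")

print(exists_before(*args))    # (True, 'shared-drive-id')
print(exists_proposed(*args))  # ('shared-drive-id', None)
```

In the second call the drive ID silently lands in `include_trashed` and `drive_id` falls back to `None`, with no error to alert the caller.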





[GitHub] [airflow] alexkruc commented on a diff in pull request #25675: Adding a parameter for exclusion of trashed files in GoogleDriveHook

Posted by GitBox <gi...@apache.org>.
alexkruc commented on code in PR #25675:
URL: https://github.com/apache/airflow/pull/25675#discussion_r944241876


##########
airflow/providers/google/suite/hooks/drive.py:
##########
@@ -128,31 +128,45 @@ def get_media_request(self, file_id: str) -> HttpRequest:
         request = service.files().get_media(fileId=file_id)
         return request
 
-    def exists(self, folder_id: str, file_name: str, drive_id: Optional[str] = None):
+    def exists(
+        self, folder_id: str, file_name: str, include_trashed: bool = True, drive_id: Optional[str] = None

Review Comment:
   Good point, I missed that one. Thanks for letting me know :)
   Also, a good point about making it keyword-only.
   
   I've implemented your suggestion :) 





[GitHub] [airflow] eladkal merged pull request #25675: Adding a parameter for exclusion of trashed files in GoogleDriveHook

Posted by GitBox <gi...@apache.org>.
eladkal merged PR #25675:
URL: https://github.com/apache/airflow/pull/25675




[GitHub] [airflow] uranusjr commented on a diff in pull request #25675: Adding a parameter for exclusion of trashed files in GoogleDriveHook

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #25675:
URL: https://github.com/apache/airflow/pull/25675#discussion_r944095967


##########
airflow/providers/google/suite/hooks/drive.py:
##########
@@ -128,31 +128,45 @@ def get_media_request(self, file_id: str) -> HttpRequest:
         request = service.files().get_media(fileId=file_id)
         return request
 
-    def exists(self, folder_id: str, file_name: str, drive_id: Optional[str] = None):
+    def exists(
+        self, folder_id: str, file_name: str, include_trashed: bool = True, drive_id: Optional[str] = None

Review Comment:
   Can we change the ordering to put `include_trashed` at the back? Right now this (unnecessarily) breaks backward compatibility if the user passes `drive_id` as a positional argument.
   
   Maybe it’s best to make the new argument keyword-only as well, so future additions don’t break compatibility easily:
   
   ```python
   def exists(
       self,
       folder_id: str,
       file_name: str,
       drive_id: Optional[str] = None,
       *,
       include_trashed: bool = True,
   ) -> bool:
   ```
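
A minimal sketch of how the suggested signature behaves at call sites follows. The function body is a placeholder written for this example (it just echoes the flag); only the parameter list comes from the suggestion above.

```python
from typing import Optional


def exists(
    folder_id: str,
    file_name: str,
    drive_id: Optional[str] = None,
    *,
    include_trashed: bool = True,
) -> bool:
    """Stand-in body: only reports whether trashed files would be included."""
    return include_trashed


# Existing positional callers keep working: the third argument is still drive_id.
assert exists("folder", "report.csv", "shared-drive-id") is True

# The new flag must be named explicitly.
assert exists("folder", "report.csv", include_trashed=False) is False

# Passing it positionally is rejected instead of being silently misbound.
try:
    exists("folder", "report.csv", "shared-drive-id", False)
except TypeError as err:
    print(f"rejected: {err}")
```

Because everything after the bare `*` is keyword-only, further flags can be added later in any order without changing what a positional call means.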


