Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/06/06 10:49:39 UTC

[GitHub] [airflow] potiuk commented on a diff in pull request #24039: Fix GCSToGCSOperator cannot copy a single file/folder without copying other files/folders with that prefix

potiuk commented on code in PR #24039:
URL: https://github.com/apache/airflow/pull/24039#discussion_r890034464


##########
airflow/providers/google/cloud/transfers/gcs_to_gcs.py:
##########
@@ -341,6 +345,8 @@ def _copy_source_without_wildcard(self, hook, prefix):
                 raise AirflowException(msg)
 
         for source_obj in objects:
+            if self.exact_match and (source_obj != prefix or not source_obj.endswith(prefix)):
+                continue

Review Comment:
   Yeah, I see the point. Object store semantics are tricky: they resemble filesystem semantics, but what looks like a path is in fact just an object name, so a "prefix" can easily match things you did not intend. For example:
   
   * test_file.zip - might be a stored object (file)
   * test_file.zip/another_file.zip - might be ANOTHER object inside something that looks like a "test_file.zip" folder
   
   A real filesystem does not allow this: you cannot have a "file" and a "folder" with the same name. That's why "exact-match" is sometimes the only choice, as there is no other way to skip the "nested" object.
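
To make the difference concrete, here is a minimal sketch in plain Python. It makes no GCS calls: the bucket contents and the helpers `list_by_prefix` and `select_objects` are hypothetical and only mimic how a prefix listing behaves. It is not the operator's actual implementation.

```python
from typing import List

# Hypothetical bucket contents. Object names are flat strings, not paths,
# so a "file" and a "folder" with the same name can coexist.
BUCKET_OBJECTS = [
    "test_file.zip",
    "test_file.zip/another_file.zip",
    "test_file.zip.bak",
    "other.txt",
]


def list_by_prefix(names: List[str], prefix: str) -> List[str]:
    """Mimic a prefix listing: every name that starts with the prefix matches."""
    return [name for name in names if name.startswith(prefix)]


def select_objects(names: List[str], prefix: str, exact_match: bool = False) -> List[str]:
    """Return the objects to copy, optionally keeping only the exact name."""
    candidates = list_by_prefix(names, prefix)
    if exact_match:
        # Skip anything that merely shares the prefix, such as "nested" objects.
        return [name for name in candidates if name == prefix]
    return candidates


if __name__ == "__main__":
    print(select_objects(BUCKET_OBJECTS, "test_file.zip"))
    print(select_objects(BUCKET_OBJECTS, "test_file.zip", exact_match=True))
```

Without the exact-match filter, asking for "test_file.zip" also picks up the nested object and the ".bak" sibling; with it, only the object whose full name equals the prefix is kept.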
   
   


