Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/08 07:00:17 UTC

[GitHub] [airflow] rsg17 opened a new pull request #22071: Bug-fix GCSToS3Operator

rsg17 opened a new pull request #22071:
URL: https://github.com/apache/airflow/pull/22071


   closes: #18267 
   
   Adds a `keep_directory_structure` option:
   - True: keep the destination directory structure.
   - False: create the source directory structure, as specified by `prefix`, under the destination.
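
To make the two settings concrete, here is a minimal sketch of the check this PR adds to `execute()`, pulled out into a standalone function. The helper name `resolve_dest_s3_key` and the sample bucket paths are hypothetical and not part of the operator; the sketch also uses `posixpath.join` where the PR uses `os.path.join`, so the result does not depend on the host OS.

```python
import posixpath
from typing import Optional


def resolve_dest_s3_key(
    dest_s3_key: str, prefix: Optional[str], keep_directory_structure: bool
) -> str:
    """Hypothetical helper mirroring the condition added in this PR."""
    if not keep_directory_structure and prefix:
        # Append the source prefix to the destination key.
        # posixpath.join always joins with "/", whatever the host OS is.
        return posixpath.join(dest_s3_key, prefix)
    # Otherwise the destination key is left untouched.
    return dest_s3_key


print(resolve_dest_s3_key("s3://my-bucket/backup/", "logs/2022", keep_directory_structure=False))
# s3://my-bucket/backup/logs/2022
print(resolve_dest_s3_key("s3://my-bucket/backup/", "logs/2022", keep_directory_structure=True))
# s3://my-bucket/backup/
```

With an empty or missing `prefix`, the destination key is returned unchanged for either setting.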
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org






[GitHub] [airflow] rafalh commented on a change in pull request #22071: Bug-fix GCSToS3Operator

Posted by GitBox <gi...@apache.org>.
rafalh commented on a change in pull request #22071:
URL: https://github.com/apache/airflow/pull/22071#discussion_r830650965



##########
File path: airflow/providers/amazon/aws/transfers/gcs_to_s3.py
##########
@@ -147,6 +152,9 @@ def execute(self, context: 'Context') -> List[str]:
             aws_conn_id=self.dest_aws_conn_id, verify=self.dest_verify, extra_args=self.dest_s3_extra_args
         )
 
+        if not self.keep_directory_structure and self.prefix:
+            self.dest_s3_key = os.path.join(self.dest_s3_key, self.prefix)

Review comment:
       Will it work properly on Windows, where `os.path.join` uses backslashes (`\`)? I think it should use `posixpath` instead.
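
The concern raised in this comment can be reproduced on any platform, because `os.path` is an alias for `ntpath` on Windows and for `posixpath` elsewhere. The sketch below calls both modules directly; the key fragments `data/exports` and `logs` are made-up sample values.

```python
import ntpath      # what os.path resolves to on Windows
import posixpath   # what os.path resolves to on POSIX systems

base, prefix = "data/exports", "logs"

# On Windows, the joining separator is a backslash, which S3 would treat
# as an ordinary character in the key rather than a path delimiter.
print(ntpath.join(base, prefix))     # data/exports\logs

# posixpath always joins with a forward slash.
print(posixpath.join(base, prefix))  # data/exports/logs
```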







[GitHub] [airflow] potiuk commented on a change in pull request #22071: Bug-fix GCSToS3Operator

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #22071:
URL: https://github.com/apache/airflow/pull/22071#discussion_r831385579



##########
File path: airflow/providers/amazon/aws/transfers/gcs_to_s3.py
##########
@@ -147,6 +152,9 @@ def execute(self, context: 'Context') -> List[str]:
             aws_conn_id=self.dest_aws_conn_id, verify=self.dest_verify, extra_args=self.dest_s3_extra_args
         )
 
+        if not self.keep_directory_structure and self.prefix:
+            self.dest_s3_key = os.path.join(self.dest_s3_key, self.prefix)

Review comment:
       Feel free to make a PR. Airflow only works on POSIX (for multiple reasons), but you are right: it would be better to even hardcode "/".
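
One way to read the suggestion of hardcoding "/" is a small join that never consults `os.path` at all. This is only a sketch under that assumption; the function name `join_key` and the sample fragments are hypothetical, and it is not the code that was merged.

```python
def join_key(base: str, prefix: str) -> str:
    """Join two object-key fragments with exactly one forward slash."""
    if not base:
        # Nothing to prepend, so avoid producing a leading "/".
        return prefix
    # Trim slashes at the seam so the result never contains "//" there.
    return f"{base.rstrip('/')}/{prefix.lstrip('/')}"


print(join_key("data/exports/", "logs/2022"))  # data/exports/logs/2022
print(join_key("data/exports", "/logs/2022"))  # data/exports/logs/2022
```

Unlike `posixpath.join`, this variant does not discard `base` when `prefix` starts with a slash.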







[GitHub] [airflow] potiuk merged pull request #22071: Bug-fix GCSToS3Operator

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #22071:
URL: https://github.com/apache/airflow/pull/22071


   





[GitHub] [airflow] potiuk closed pull request #22071: Bug-fix GCSToS3Operator

Posted by GitBox <gi...@apache.org>.
potiuk closed pull request #22071:
URL: https://github.com/apache/airflow/pull/22071


   





[GitHub] [airflow] rsg17 commented on a change in pull request #22071: Bug-fix GCSToS3Operator

Posted by GitBox <gi...@apache.org>.
rsg17 commented on a change in pull request #22071:
URL: https://github.com/apache/airflow/pull/22071#discussion_r832826066



##########
File path: airflow/providers/amazon/aws/transfers/gcs_to_s3.py
##########
@@ -147,6 +152,9 @@ def execute(self, context: 'Context') -> List[str]:
             aws_conn_id=self.dest_aws_conn_id, verify=self.dest_verify, extra_args=self.dest_s3_extra_args
         )
 
+        if not self.keep_directory_structure and self.prefix:
+            self.dest_s3_key = os.path.join(self.dest_s3_key, self.prefix)

Review comment:
       I will remember to use `posixpath` going forward. I had actually referred to some of the code [here](https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/transfers/gcs_to_sftp.py#L171) when I wrote my PR.
   
   Let me know if I should do the fix; I can get to it over the weekend.



