Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/05/05 03:04:07 UTC

[GitHub] [airflow] Isaacwhyuenac opened a new issue #15664: Airflow Implicit rule on removing S3 path trailing slash

Isaacwhyuenac opened a new issue #15664:
URL: https://github.com/apache/airflow/issues/15664


   Hi, I find the behaviour that removes [the trailing slash from an S3 path](https://github.com/apache/airflow/blob/a0eb747b8d73f71dcf471917e013669a660cd4dd/airflow/providers/amazon/aws/hooks/s3.py#L146) surprising and unnecessary.
   
   For example, suppose S3 contains the following objects:
   
   ```
   2021-04-30 10:38:27   11.7 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_31a98366-e1a0-428e-a57f-3c4b0fe2bb79
   2021-04-30 10:38:27   49.1 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_55438602-3787-4c09-9fad-562e7a6786cb
   2021-04-30 10:38:31   10.6 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_6773215f-1697-4c99-9e94-f7961e86af62
   2021-04-30 10:38:31   27.1 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_69c952f5-97b4-45e9-b790-fc7830fb2150
   2021-04-30 10:38:31  131.2 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_b4b995f5-211d-4d46-bd9a-86912b29d978
   2021-04-30 10:38:27  166.2 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_bbcebd80-c280-4e66-9431-9a626df8bc33
   2021-04-30 10:38:30  171.6 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_f4ef423f-cf70-4f71-960e-70f1bdddaf3d
   
   2021-04-30 10:38:27   11.7 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_31a98366-e1a0-428e-a57f-3c4b0fe2bb79
   2021-04-30 10:38:27   49.1 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_55438602-3787-4c09-9fad-562e7a6786cb
   2021-04-30 10:38:31   10.6 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_6773215f-1697-4c99-9e94-f7961e86af62
   2021-04-30 10:38:31   27.1 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_69c952f5-97b4-45e9-b790-fc7830fb2150
   2021-04-30 10:38:31  131.2 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_b4b995f5-211d-4d46-bd9a-86912b29d978
   2021-04-30 10:38:27  166.2 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_bbcebd80-c280-4e66-9431-9a626df8bc33
   2021-04-30 10:38:30  171.6 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_f4ef423f-cf70-4f71-960e-70f1bdddaf3d
   ```
   
   If we only want to match `agg/user_segmentation/tag_article_user_analysis/`, the keys under `agg/user_segmentation/tag_article_user_analysis_v2` will also be matched by the current S3 path handling, because the trailing slash is stripped from the prefix. Developers should be free to choose the pattern they want to match rather than having one forced on them.
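
To make the effect concrete, here is a minimal sketch that does not call Airflow or S3. It assumes that an S3 prefix filter behaves like a plain string-prefix test, and it models the slash removal with `str.rstrip`. The shortened keys and the helper `match_prefix` are hypothetical, introduced only for illustration.

```python
# Hypothetical keys, shortened from the listing in this issue.
KEYS = [
    "agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/part-a",
    "agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/part-a",
]


def match_prefix(keys, prefix):
    """Return the keys that start with `prefix` (plain string-prefix test)."""
    return [key for key in keys if key.startswith(prefix)]


# The prefix the user asked for, with its trailing slash.
with_slash = "agg/user_segmentation/tag_article_user_analysis/"
# The same prefix after the trailing slash has been stripped (modelled here
# with rstrip; this is an assumption, not Airflow's actual code).
stripped = with_slash.rstrip("/")

print(match_prefix(KEYS, with_slash))  # only the tag_article_user_analysis/ key
print(match_prefix(KEYS, stripped))    # both keys, including the _v2 one
```

With the slash kept, only the intended directory matches. Once it is stripped, the `_v2` sibling shares the same leading characters and is matched as well.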
   
   I have opened a PR for this issue: #15609
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Isaacwhyuenac closed issue #15664: Airflow Implicit rule on removing S3 path trailing slash

Posted by GitBox <gi...@apache.org>.
Isaacwhyuenac closed issue #15664:
URL: https://github.com/apache/airflow/issues/15664


   





[GitHub] [airflow] ferruzzi commented on issue #15664: Airflow Implicit rule on removing S3 path trailing slash

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on issue #15664:
URL: https://github.com/apache/airflow/issues/15664#issuecomment-896368156


   @potiuk  - Looks like this can be closed?





[GitHub] [airflow] boring-cyborg[bot] commented on issue #15664: Airflow Implicit rule on removing S3 path trailing slash

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #15664:
URL: https://github.com/apache/airflow/issues/15664#issuecomment-832383376


   Thanks for opening your first issue here! Be sure to follow the issue template!
   

