You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/02/19 06:56:25 UTC

[GitHub] [airflow] heeroyuy925 opened a new pull request #14313: Fix wasb task handler py

heeroyuy925 opened a new pull request #14313:
URL: https://github.com/apache/airflow/pull/14313


   related: https://github.com/apache/airflow/issues/14312
   
   To solve the display issue on UI when using wasb_task_handler.py to store remote log on Azure blob storage
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #14313: Fix remote log in azure storage blob displays in one line

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#issuecomment-785686119


   Awesome work, congrats on your first merged pull request!
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] heeroyuy925 commented on pull request #14313: Fix remote log in azure storage blob displays in one line

Posted by GitBox <gi...@apache.org>.
heeroyuy925 commented on pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#issuecomment-784937668


   Thank you for your help @ephraimbuddy I have updated the code as you suggested


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ephraimbuddy merged pull request #14313: Fix remote log in azure storage blob displays in one line

Posted by GitBox <gi...@apache.org>.
ephraimbuddy merged pull request #14313:
URL: https://github.com/apache/airflow/pull/14313


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kurtqq commented on pull request #14313: Fix wasb task handler py

Posted by GitBox <gi...@apache.org>.
kurtqq commented on pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#issuecomment-784442824


   probably a good idea to tag the creator of the PR that cause it
   @ephraimbuddy 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ephraimbuddy commented on pull request #14313: Fix wasb task handler py

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#issuecomment-784570714


   @heeroyuy925 I have confirmed that we should use `context_as_text` method instead of `readall()` can you update the PR with it? Your solution is Ok but I'm thinking this is less hacky 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ephraimbuddy commented on a change in pull request #14313: Fix remote log in azure storage blob displays in one line

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on a change in pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#discussion_r582489584



##########
File path: airflow/providers/microsoft/azure/hooks/wasb.py
##########
@@ -244,7 +244,7 @@ def read_file(self, container_name: str, blob_name: str, **kwargs):
         :param kwargs: Optional keyword arguments that `BlobClient.download_blob` takes.
         :type kwargs: object
         """
-        return self.download(container_name, blob_name, **kwargs).readall()
+        return self.download(container_name, blob_name, **kwargs).content_as_text()

Review comment:
       You have to rebase though




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] heeroyuy925 commented on a change in pull request #14313: Fix remote log in azure storage blob displays in one line

Posted by GitBox <gi...@apache.org>.
heeroyuy925 commented on a change in pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#discussion_r582515134



##########
File path: airflow/providers/microsoft/azure/hooks/wasb.py
##########
@@ -244,7 +244,7 @@ def read_file(self, container_name: str, blob_name: str, **kwargs):
         :param kwargs: Optional keyword arguments that `BlobClient.download_blob` takes.
         :type kwargs: object
         """
-        return self.download(container_name, blob_name, **kwargs).readall()
+        return self.download(container_name, blob_name, **kwargs).content_as_text()

Review comment:
       I have rabased follow the Pull Request Guidelines, please check it




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #14313: Fix remote log in azure storage blob displays in one line

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#issuecomment-785548787


   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest master or amend the last commit of the PR, and push it with --force-with-lease.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] flolas edited a comment on pull request #14313: Fix wasb task handler py

Posted by GitBox <gi...@apache.org>.
flolas edited a comment on pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#issuecomment-782871543


   Happened after upgrading Blob 2.1v to 12v in Azure Provider #12188. Hope we can have this PR merged and released soon.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #14313: Fix wasb task handler py

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#issuecomment-781873354


   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, pylint and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/master/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for testing locally, itโ€™s a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better ๐Ÿš€.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ephraimbuddy commented on a change in pull request #14313: Fix remote log in azure storage blob displays in one line

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on a change in pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#discussion_r581864345



##########
File path: airflow/providers/microsoft/azure/hooks/wasb.py
##########
@@ -244,7 +244,7 @@ def read_file(self, container_name: str, blob_name: str, **kwargs):
         :param kwargs: Optional keyword arguments that `BlobClient.download_blob` takes.
         :type kwargs: object
         """
-        return self.download(container_name, blob_name, **kwargs).readall()
+        return self.download(container_name, blob_name, **kwargs).content_as_text()

Review comment:
       ```suggestion
           return self.download(container_name, blob_name, **kwargs).content_as_text(max_concurrency=1, encoding='utf-8')
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ephraimbuddy commented on a change in pull request #14313: Fix remote log in azure storage blob displays in one line

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on a change in pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#discussion_r582487545



##########
File path: airflow/providers/microsoft/azure/hooks/wasb.py
##########
@@ -244,7 +244,7 @@ def read_file(self, container_name: str, blob_name: str, **kwargs):
         :param kwargs: Optional keyword arguments that `BlobClient.download_blob` takes.
         :type kwargs: object
         """
-        return self.download(container_name, blob_name, **kwargs).readall()
+        return self.download(container_name, blob_name, **kwargs).content_as_text()

Review comment:
       Hey, seen it. Please don't add




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ephraimbuddy commented on pull request #14313: Fix wasb task handler py

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#issuecomment-784496125


   @kurtqq Thanks for the mention. I think what we should change is replace `readall` in hook.read_file with `content_as_text(max_concurrency=1, encoding='UTF-8')`.  
   https://github.com/apache/airflow/blob/c281979982c36f16c4c346c996a0c8d6ca7c630d/airflow/providers/microsoft/azure/hooks/wasb.py#L247
   I'm taking a look.
   https://docs.microsoft.com/en-us/azure/developer/python/sdk/storage/azure-storage-blob/azure.storage.blob.storagestreamdownloader?view=storage-py-v12#methods
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] heeroyuy925 commented on a change in pull request #14313: Fix remote log in azure storage blob displays in one line

Posted by GitBox <gi...@apache.org>.
heeroyuy925 commented on a change in pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#discussion_r582484342



##########
File path: airflow/providers/microsoft/azure/hooks/wasb.py
##########
@@ -244,7 +244,7 @@ def read_file(self, container_name: str, blob_name: str, **kwargs):
         :param kwargs: Optional keyword arguments that `BlobClient.download_blob` takes.
         :type kwargs: object
         """
-        return self.download(container_name, blob_name, **kwargs).readall()
+        return self.download(container_name, blob_name, **kwargs).content_as_text()

Review comment:
       I check the Microsoft document you gave above and find these two parameters are default value.
   Ok, I can add this as you suggested




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] flolas commented on pull request #14313: Fix wasb task handler py

Posted by GitBox <gi...@apache.org>.
flolas commented on pull request #14313:
URL: https://github.com/apache/airflow/pull/14313#issuecomment-782871543


   Happened after upgrading Blob 2.1v to 12v in Azure Provider #12188. Hope we can have in a release soon.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org