Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/04/22 14:27:57 UTC

[GitHub] [airflow] bobpace opened a new issue #15489: Remote Logging: WasbTaskHandler wasb_read returns bytes instead of string

bobpace opened a new issue #15489:
URL: https://github.com/apache/airflow/issues/15489


   **Apache Airflow version**:
   2.0.1
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-14T05:15:04Z", GoVersion:"go1.15.6", Compiler:"gc", Platform:"darwin/amd64"}
   Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:41:49Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**:
   - **OS** (e.g. from /etc/os-release): macOS Big Sur 11.2.1
   - **Kernel** (e.g. `uname -a`): Darwin ftlmacwmd6t.local 20.3.0 Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64 x86_64
   - **Install tools**:
   - **Others**:
   
   **What happened**:
   After configuring remote logging with wasb, logging works, but the remote log output is returned as bytes instead of a string, which makes it difficult to read in the Airflow UI. For example, the log output looks like b"line1\nline2\nline3\netc" and the lines are not split on the \n.
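To illustrate the symptom described above, here is a minimal sketch in plain Python, with no Airflow involved. The variable names are made up for illustration. It assumes the bytes value ends up converted with `str()` somewhere on the way to the UI, which is a guess about the cause rather than something confirmed in this thread.

```python
# Hypothetical stand-in for what a remote read might return as bytes.
raw = b"line1\nline2\nline3\n"

# Converting bytes with str() yields its repr: the b'...' wrapper is kept
# and each newline becomes the two characters backslash + n.
as_repr = str(raw)

# Decoding yields real text with real newline characters.
as_text = raw.decode("utf-8")

print(as_repr)
print(as_text)
```

The first `print` shows everything on one line, while the second shows three separate lines.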
   
   I was able to work around this issue with a custom task handler that inherits from WasbTaskHandler and overrides the `wasb_read` function, passing encoding="utf-8" to the underlying WasbHook `read_file` call:
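   ```
   # Import paths are assumed to match provider version 1.1.0.
   from azure.common import AzureHttpError
   from airflow.providers.microsoft.azure.log.wasb_task_handler import WasbTaskHandler


   class Utf8WasbTaskHandler(WasbTaskHandler):  # hypothetical class name
       def wasb_read(self, remote_log_location: str, return_error: bool = False):
           """
           Default WasbTaskHandler returns contents as bytes which doesn't look good in log display
           ie, b"text\nsome more text\n" and doesn't split the newlines
           This overrides the encoding to utf-8 so we get a string back
           """
           try:
               # Passing an explicit encoding makes the hook return str instead of bytes.
               return self.hook.read_file(self.wasb_container, remote_log_location, encoding="utf-8")
           except AzureHttpError:
               msg = f'Could not read logs from {remote_log_location}'
               self.log.exception(msg)
               # return error if needed
               if return_error:
                   return msg
   ```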
   
   
   **What you expected to happen**:
   It would be convenient if the encoding were set to "utf-8" by default, or could be passed in, so that the remote logs don't display as one giant line that wraps.
   
   **How to reproduce it**:
   Use remote logging with WasbTaskHandler
   
   
   **Anything else we need to know**:
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bobpace commented on issue #15489: Remote Logging: WasbTaskHandler wasb_read returns bytes instead of string

Posted by GitBox <gi...@apache.org>.
bobpace commented on issue #15489:
URL: https://github.com/apache/airflow/issues/15489#issuecomment-826398091


   Was using:
   apache-airflow-providers-microsoft-azure==1.1.0
   
   Now using:
   apache-airflow-providers-microsoft-azure==1.3.0
   
   Confirmed that it works without issue after upgrading to the latest version. Thank you for your help!
   





[GitHub] [airflow] bobpace closed issue #15489: Remote Logging: WasbTaskHandler wasb_read returns bytes instead of string

Posted by GitBox <gi...@apache.org>.
bobpace closed issue #15489:
URL: https://github.com/apache/airflow/issues/15489


   








[GitHub] [airflow] boring-cyborg[bot] commented on issue #15489: Remote Logging: WasbTaskHandler wasb_read returns bytes instead of string

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #15489:
URL: https://github.com/apache/airflow/issues/15489#issuecomment-824890554


   Thanks for opening your first issue here! Be sure to follow the issue template!
   








[GitHub] [airflow] ephraimbuddy commented on issue #15489: Remote Logging: WasbTaskHandler wasb_read returns bytes instead of string

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on issue #15489:
URL: https://github.com/apache/airflow/issues/15489#issuecomment-826064754


   There was an issue about this that has been fixed. Can you share the version of apache-airflow-providers-microsoft-azure you're using? If you're on an older version, please upgrade and check whether you can still reproduce it.
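One way to look up the installed provider version from Python is sketched below, using the standard-library `importlib.metadata` (Python 3.8+). The helper name `installed_version` is hypothetical and not part of Airflow.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string, or None if the provider is not installed.
print(installed_version("apache-airflow-providers-microsoft-azure"))
```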








[GitHub] [airflow] bobpace commented on issue #15489: Remote Logging: WasbTaskHandler wasb_read returns bytes instead of string

Posted by GitBox <gi...@apache.org>.
bobpace commented on issue #15489:
URL: https://github.com/apache/airflow/issues/15489#issuecomment-825992186


   Ah ok I see what you mean, sure I'll take a shot at a PR this weekend.





[GitHub] [airflow] uranusjr commented on issue #15489: Remote Logging: WasbTaskHandler wasb_read returns bytes instead of string

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #15489:
URL: https://github.com/apache/airflow/issues/15489#issuecomment-825334339


   It might not be a good idea to change `wasb_read()` to return a str. IMO it would be better to change the log line (in `WasbTaskHandler._read()`) to do `remote_log.decode("utf-8", errors="backslashreplace")`.
   
   Would you be interested in submitting a pull request for that?
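As a rough sketch of the decoding suggested above, the snippet below shows how `errors="backslashreplace"` behaves on bytes that are not valid UTF-8. The helper `render_remote_log` is hypothetical and is not the actual `WasbTaskHandler._read()` code. It only demonstrates the `bytes.decode` call.

```python
def render_remote_log(remote_log):
    """Return log content as str, decoding bytes without raising on bad input."""
    if isinstance(remote_log, bytes):
        # Invalid bytes become escape sequences such as \xff instead of
        # raising UnicodeDecodeError.
        return remote_log.decode("utf-8", errors="backslashreplace")
    return remote_log


print(render_remote_log(b"line1\nline2\n"))
print(render_remote_log(b"bad byte: \xff"))
```

With this approach `wasb_read()` can keep returning whatever the hook gives it, and only the display path converts to text.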

