Posted to issues@livy.apache.org by "Chao Gao (Jira)" <ji...@apache.org> on 2020/06/15 21:56:00 UTC

[jira] [Updated] (LIVY-774) Logging does not print to stdout or stderr correctly on PySpark through Livy

     [ https://issues.apache.org/jira/browse/LIVY-774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao Gao updated LIVY-774:
--------------------------
    Description: 
h2. Summary

When using PySpark through Livy from Zeppelin, Jupyter Notebook, or a Linux curl command, the first statement that logs to stdout or stderr prints its output correctly. The second and every later statement instead fails with the error stack:  {color:#ff0000}{{ValueError: I/O operation on closed file}}{color}

If we use the PySpark CLI directly on the master node, logging works as expected; see the attachment +_Works_on_PySpark_CLI.png_+.
h2. Reproduction Steps

In Zeppelin, using Livy as the interpreter:
{code:python}
%pyspark

import sys
import logging

# OUTPUT
Spark Application Id: application_1591899500515_0002
{code}
The first time we log to stdout or stderr, it works as expected.
{code:python}
%pyspark

logger = logging.getLogger("log_example")
logger.setLevel(logging.ERROR)
ch = logging.StreamHandler(sys.stderr)
ch.setLevel(logging.ERROR)
logger.addHandler(ch)
logger.error("test error!")

# OUTPUT (as expected)
test error!{code}
From the second time onwards, logging to stdout or stderr shows the error stack below.
{code:python}
%pyspark

logger.error("test error again!")

# OUTPUT (error stack)
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib64/python3.7/logging/__init__.py", line 1028, in emit
    stream.write(msg + self.terminator)
  File "/tmp/1262710270598062870", line 534, in write
    super(UnicodeDecodingStringIO, self).write(s)
ValueError: I/O operation on closed file
Call stack:
  File "/tmp/1262710270598062870", line 714, in <module>
    sys.exit(main())
  File "/tmp/1262710270598062870", line 686, in main
    response = handler(content)
  File "/tmp/1262710270598062870", line 318, in execute_request
    result = node.execute()
  File "/tmp/1262710270598062870", line 229, in execute
    exec(code, global_dict)
  File "<stdin>", line 1, in <module>
Message: 'test error again!'{code}
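The traceback shows that the handler added in the first statement is still holding the {{UnicodeDecodingStringIO}} stream that the Livy-generated wrapper script ({{/tmp/1262710270598062870}}) installed for that statement; by the time the next statement runs, that stream has apparently been closed and replaced, so every later write fails with {{ValueError: I/O operation on closed file}}. Until this is addressed in Livy, a user-side workaround (only a sketch, not part of Livy) is a handler that looks up {{sys.stderr}} at emit time instead of capturing one stream object:
{code:python}
%pyspark

import logging
import sys

class CurrentStderrHandler(logging.StreamHandler):
    """StreamHandler that resolves sys.stderr on every emit instead of
    pinning the stream object that existed when the handler was created."""

    @property
    def stream(self):
        # Always write to whatever sys.stderr currently is.
        return sys.stderr

    @stream.setter
    def stream(self, value):
        # StreamHandler.__init__ and setStream() assign self.stream;
        # ignore the pinned object so the lookup stays dynamic.
        pass

logger = logging.getLogger("log_example")
logger.setLevel(logging.ERROR)
logger.addHandler(CurrentStderrHandler())
logger.error("still works on the second and later statements")
{code}
Re-creating the handler at the top of each paragraph (for example {{logger.handlers.clear()}} followed by {{logger.addHandler(logging.StreamHandler(sys.stderr))}}) also avoids the error, but does not fix the underlying stream handling.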
Jupyter Notebook and the Linux curl command hit the same error; see the attachments:

+_1. Zeppelin_use_Livy_bug.png_+

+_2. JupyterNotebook_use_Livy_bug.png_+

+_3. LinuxCurl_use_Livy_error.png_+
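The failure is also reproducible without a notebook front end by driving the Livy REST API directly, which is what the curl reproduction does. The following is a minimal sketch, assuming a Livy server at {{http://localhost:8998}}, a {{pyspark}} session kind, and the {{requests}} library; the endpoints used are the standard {{/sessions}} and {{/sessions/<id>/statements}} routes:
{code:python}
import time
import requests

LIVY = "http://localhost:8998"  # assumed Livy endpoint

def run(session_url, code):
    """Submit one statement and poll until Livy reports its output."""
    stmt = requests.post(session_url + "/statements", json={"code": code}).json()
    stmt_url = "{}/statements/{}".format(session_url, stmt["id"])
    while True:
        result = requests.get(stmt_url).json()
        if result["state"] == "available":
            return result["output"]
        time.sleep(1)

# Create a PySpark session and wait until it is idle.
session = requests.post(LIVY + "/sessions", json={"kind": "pyspark"}).json()
session_url = "{}/sessions/{}".format(LIVY, session["id"])
while requests.get(session_url).json()["state"] != "idle":
    time.sleep(1)

setup = (
    "import sys, logging\n"
    "logger = logging.getLogger('log_example')\n"
    "logger.setLevel(logging.ERROR)\n"
    "logger.addHandler(logging.StreamHandler(sys.stderr))\n"
)
run(session_url, setup)
print(run(session_url, "logger.error('test error!')"))        # first call prints normally
print(run(session_url, "logger.error('test error again!')"))  # second call shows '--- Logging error ---'
{code}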

 

 

> Logging does not print to stdout or stderr correctly on PySpark through Livy
> ----------------------------------------------------------------------------
>
>                 Key: LIVY-774
>                 URL: https://issues.apache.org/jira/browse/LIVY-774
>             Project: Livy
>          Issue Type: Bug
>          Components: API
>    Affects Versions: 0.7.0
>            Reporter: Chao Gao
>            Priority: Critical
>         Attachments: JupyterNotebook_use_Livy_bug.png, LinuxCurl_use_Livy_error.png, Works_on_PySpark_CLI.png, Zeppelin_use_Livy_bug.png
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)