Posted to user@spark.apache.org by Sebastian Nagel <wa...@googlemail.com> on 2017/04/27 16:03:14 UTC
[Pyspark, Python 2.7] Executor hangup caused by Unicode error while logging uncaught exception in worker
Hi,
I've seen a hangup of a job (or rather, of one of its executors) when the message of an uncaught
exception contains bytes which cannot be decoded as Unicode characters. The last lines in the
executor log were:
PySpark worker failed with exception:
Traceback (most recent call last):
File "/data/1/yarn/local/usercache/ubuntu/appcache/application_1492496523387_0009/container_1492496523387_0009_01_000006/pyspark.zip/pyspark/worker.py", line 178, in main
write_with_length(traceback.format_exc().encode("utf-8"), outfile)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 1386: ordinal not in range(128)
After that, nothing happened for hours and no CPU was used on the machine running the executor.
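The root cause looks like a classic Python 2 pitfall: traceback.format_exc() returns a byte str,
and calling .encode("utf-8") on a byte str makes Python 2 implicitly decode it with the ASCII
codec first, which fails on any non-ASCII byte. A quick demonstration in a plain Python 2.7 shell
(the decode/replace line is just a defensive pattern that avoids the crash, not necessarily how
Spark should fix it):

    # .encode() on a byte str implicitly decodes it as ASCII first:
    >>> "error near \x8b marker".encode("utf-8")
    Traceback (most recent call last):
      ...
    UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 11: ordinal not in range(128)

    # Forcing the str through a lossy decode sidesteps the second failure:
    >>> "error near \x8b marker".decode("utf-8", "replace").encode("utf-8")
    'error near \xef\xbf\xbd marker'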
First seen with Spark on YARN:
Spark 2.1.0, Scala 2.11.8
Python 2.7.6
Hadoop 2.6.0-cdh5.11.0
Reproduced with Spark 2.1.0 and Python 2.7.12 in local mode and traced down to this small script:
https://gist.github.com/sebastian-nagel/310a5a5f39cc668fb71b6ace208706f7
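In essence, it comes down to a task whose exception message carries a raw non-ASCII byte; while
logging this uncaught exception, the worker crashes a second time. A minimal sketch along the
lines of the gist (the gist is authoritative; names here are illustrative):

    # Minimal sketch, Python 2.7, local mode.
    from pyspark import SparkContext

    sc = SparkContext("local", "unicode-hangup-repro")

    def fail(x):
        # The exception message contains a raw 0x8b byte, as in the log above.
        raise ValueError("bad byte: \x8b")

    sc.parallelize([1]).map(fail).count()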
Is this a known problem?
Of course, one could argue that the job would have failed anyway, but a hang-up isn't nice:
on YARN it blocks resources (containers) until the application is killed.
Thanks,
Sebastian