You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Abdeali Kothari (JIRA)" <ji...@apache.org> on 2018/11/09 13:48:00 UTC

[jira] [Created] (SPARK-25992) Accumulators giving KeyError in pyspark

Abdeali Kothari created SPARK-25992:
---------------------------------------

             Summary: Accumulators giving KeyError in pyspark
                 Key: SPARK-25992
                 URL: https://issues.apache.org/jira/browse/SPARK-25992
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.3.1
            Reporter: Abdeali Kothari


I am using accumulators and when I run my code, I sometimes get some warn messages. When I checked, there was nothing accumulated - not sure if I lost info from the accumulator or it worked and I can ignore this error ?

The message:
{noformat}
Exception happened during processing of request from
('127.0.0.1', 62099)
Traceback (most recent call last):
File "/Users/abdealijk/anaconda3/lib/python3.6/socketserver.py", line 317, in _handle_request_noblock
    self.process_request(request, client_address)
File "/Users/abdealijk/anaconda3/lib/python3.6/socketserver.py", line 348, in process_request
    self.finish_request(request, client_address)
File "/Users/abdealijk/anaconda3/lib/python3.6/socketserver.py", line 361, in finish_request
    self.RequestHandlerClass(request, client_address, self)
File "/Users/abdealijk/anaconda3/lib/python3.6/socketserver.py", line 696, in __init__
    self.handle()
File "/usr/local/hadoop/spark2.3.1/python/pyspark/accumulators.py", line 238, in handle
    _accumulatorRegistry[aid] += update
KeyError: 0
----------------------------------------
2018-11-09 19:09:08 ERROR DAGScheduler:91 - Failed to update accumulators for task 0
org.apache.spark.SparkException: EOF reached before Python server acknowledged
	at org.apache.spark.api.python.PythonAccumulatorV2.merge(PythonRDD.scala:634)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$updateAccumulators$1.apply(DAGScheduler.scala:1131)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$updateAccumulators$1.apply(DAGScheduler.scala:1123)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.updateAccumulators(DAGScheduler.scala:1123)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1206)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1820)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1772)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1761)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org