Posted to issues@spark.apache.org by "Tathagata Das (JIRA)" <ji...@apache.org> on 2015/12/02 03:58:11 UTC

[jira] [Created] (SPARK-12087) DStream.saveAsHadoopFiles can throw ConcurrentModificationException

Tathagata Das created SPARK-12087:
-------------------------------------

             Summary: DStream.saveAsHadoopFiles can throw ConcurrentModificationException
                 Key: SPARK-12087
                 URL: https://issues.apache.org/jira/browse/SPARK-12087
             Project: Spark
          Issue Type: Bug
          Components: Streaming
    Affects Versions: 1.5.2, 1.4.1, 1.3.1
            Reporter: Tathagata Das
            Assignee: Tathagata Das


The JobConf object created in DStream.saveAsHadoopFiles is used concurrently in multiple places:
- The JobConf is updated by RDD.saveAsHadoopFile() before the job is launched
- The JobConf is serialized as part of the DStream checkpoints. 

These concurrent accesses (one thread updating the JobConf while another thread is serializing it) can lead to a ConcurrentModificationException in the underlying Java HashMap used in the internal Hadoop Configuration object.
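
To make the failure mode concrete, here is a minimal sketch in plain Java. It is not Spark or Hadoop code. A java.util.HashMap stands in for the Configuration internals, the class and method names (SharedConfSketch, serialize, and so on) are hypothetical, and the property keys are only illustrative. The update is triggered from inside the iteration on a single thread, so the interleaving that would normally depend on thread timing is reproduced deterministically. The second method shows one possible mitigation, giving the updater its own copy of the map. That is an assumption made for illustration, not necessarily the fix adopted for this issue.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class SharedConfSketch {

    // Hypothetical stand-in for a configuration holding a few properties.
    static Map<String, String> baseConf() {
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.defaultFS", "hdfs://example");
        conf.put("mapred.job.name", "sketch");
        return conf;
    }

    // Stand-in for serialization: walks every entry of the map.
    // midWalkUpdate runs after the first entry has been read, mimicking
    // another thread that updates a configuration at that moment.
    static int serialize(Map<String, String> conf, Runnable midWalkUpdate) {
        int entriesWritten = 0;
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            if (entriesWritten == 0) {
                midWalkUpdate.run();
            }
            entriesWritten++;
        }
        return entriesWritten;
    }

    // Updater and serializer share one map: adding a new key during the
    // walk is a structural modification, so the fail-fast iterator throws.
    static boolean sharedMapFails() {
        Map<String, String> conf = baseConf();
        try {
            serialize(conf, () -> conf.put("mapred.output.dir", "/tmp/out"));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Assumed mitigation: the updater works on a private copy, so the map
    // being serialized is never modified during the walk.
    static int privateCopySucceeds() {
        Map<String, String> shared = baseConf();
        Map<String, String> perUseCopy = new HashMap<>(shared);
        return serialize(shared, () -> perUseCopy.put("mapred.output.dir", "/tmp/out"));
    }

    public static void main(String[] args) {
        System.out.println("shared map throws: " + sharedMapFails());
        System.out.println("entries serialized with private copy: " + privateCopySucceeds());
    }
}
```

With real threads the exception shows up only intermittently, because it depends on the update landing while the other thread is partway through the iteration.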


