Posted to issues@spark.apache.org by "Michael N (JIRA)" <ji...@apache.org> on 2017/09/28 23:56:00 UTC

[jira] [Created] (SPARK-22163) Design Issue of Spark Streaming that Causes Random Run-time Exception

Michael N created SPARK-22163:
---------------------------------

             Summary: Design Issue of Spark Streaming that Causes Random Run-time Exception
                 Key: SPARK-22163
                 URL: https://issues.apache.org/jira/browse/SPARK-22163
             Project: Spark
          Issue Type: Bug
          Components: DStreams, Structured Streaming
    Affects Versions: 2.2.0
         Environment: Spark Streaming
Kafka
Linux
            Reporter: Michael N
            Priority: Critical


The application's objects can contain Lists, and those Lists can be modified dynamically while the application runs.  However, the Spark Streaming framework serializes the application's objects asynchronously as the application runs.  This causes a random run-time exception on a List whenever the framework happens to serialize the application's objects at the same time the application is modifying a List inside one of those objects.
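For illustration only, here is a minimal standalone sketch of that race (not the reporter's code, and no Spark involved; a plain java.io.ObjectOutputStream stands in for the framework's serializer, and the class names ConcurrentSerializationRepro and AppState are hypothetical):

import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// One thread keeps mutating an ArrayList while another thread serializes the
// object that holds it, which mirrors the situation described above where the
// framework serializes application state while the application updates it.
public class ConcurrentSerializationRepro {

    // Stand-in for an application object that the framework would serialize.
    static class AppState implements Serializable {
        final List<Integer> values = new ArrayList<>();
    }

    public static void main(String[] args) throws Exception {
        AppState state = new AppState();

        // Writer thread: mutates the list, as an application would per batch.
        Thread writer = new Thread(() -> {
            for (int i = 0; ; i++) {
                state.values.add(i);
                if (state.values.size() > 10_000) {
                    state.values.clear();
                }
            }
        });
        writer.setDaemon(true);
        writer.start();

        // The main thread plays the role of the framework, serializing the
        // object repeatedly while it is being modified.
        while (true) {
            try (ObjectOutputStream out =
                     new ObjectOutputStream(new ByteArrayOutputStream())) {
                out.writeObject(state);
            } catch (java.util.ConcurrentModificationException e) {
                // Same failure mode as the reported stack trace:
                //   Caused by: java.util.ConcurrentModificationException
                //     at java.util.ArrayList.writeObject
                e.printStackTrace();
                return;
            }
        }
    }
}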

In fact, multiple bugs have already been reported about

Caused by: java.util.ConcurrentModificationException
at java.util.ArrayList.writeObject

that are permutations of the same root cause. The design issue is that the Spark Streaming framework performs this serialization asynchronously.  Instead, it should either

1. Do this serialization synchronously. This is the preferred option because it eliminates the issue completely.  Or

2. Allow each application to configure whether this serialization is done synchronously or asynchronously, depending on the nature of the application.

Also, the Spark documentation should describe the conditions that trigger Spark to perform this type of serialization asynchronously, so that applications can work around them until a fix is provided.
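As a stop-gap, one workaround an application could try (this is an assumption about application-side code, not something the Spark documentation prescribes) is to keep the mutable state in a concurrency-safe list such as java.util.concurrent.CopyOnWriteArrayList, whose serialized form is taken from an immutable snapshot of its backing array. A minimal sketch, with the hypothetical class name SafeAppState:

import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical application state object. CopyOnWriteArrayList replaces its
// backing array atomically on every mutation, so serializing the object while
// another thread adds or clears elements does not throw
// ConcurrentModificationException.
public class SafeAppState implements Serializable {
    private final List<Integer> values = new CopyOnWriteArrayList<>();

    public void record(int v) {
        values.add(v);                     // safe while serialization is in flight
    }

    public List<Integer> currentValues() {
        return new ArrayList<>(values);    // detached copy for local processing
    }
}

The trade-off is that every write copies the backing array, so this fits lists that are serialized often but mutated comparatively rarely; an application with heavy write traffic may prefer to synchronize its updates against whatever object the framework serializes instead.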





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org