Posted to issues@spark.apache.org by "Abdulla Al-Qawasmeh (JIRA)" <ji...@apache.org> on 2016/08/19 00:07:20 UTC

[jira] [Created] (SPARK-17145) Object with many fields causes Seq Serialization Bug

Abdulla Al-Qawasmeh created SPARK-17145:
-------------------------------------------

             Summary: Object with many fields causes Seq Serialization Bug 
                 Key: SPARK-17145
                 URL: https://issues.apache.org/jira/browse/SPARK-17145
             Project: Spark
          Issue Type: Bug
    Affects Versions: 2.0.0
         Environment: OS: OSX El Capitan 10.11.6

            Reporter: Abdulla Al-Qawasmeh


The unit test in this gist (https://gist.github.com/abdulla16/433faf7df59fce11a5fff284bac0d945) demonstrates the problem.

It looks like Spark has problems serializing a Scala Seq when it is part of an object with many fields (I'm not 100% sure it's a serialization problem). The deserialized Seq has as many items as the original Seq, but every item is a copy of the last item in the original Seq.
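To make the symptom concrete, here is a small Spark-free sketch of a check for it. The object name SeqSymptomCheck and the method allCopiesOfLast are hypothetical and do not come from the gist. The check only detects the pattern described above (same length, every item equal to the original's last item) and says nothing about the cause.

```scala
object SeqSymptomCheck {
  // Hypothetical helper, not taken from the linked gist.
  // Returns true when `result` shows the reported symptom relative to `original`:
  //   - `original` is non-empty,
  //   - both sequences have the same length,
  //   - `result` differs from `original`, and
  //   - every item of `result` equals the last item of `original`.
  def allCopiesOfLast[A](original: Seq[A], result: Seq[A]): Boolean =
    original.nonEmpty &&
      result.length == original.length &&
      result != original &&
      result.forall(_ == original.last)
}
```

In a test, one might call SeqSymptomCheck.allCopiesOfLast(expectedSeq, actualSeq) on the Seq before and after the round trip through Spark, and fail with a specific message when it returns true.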

The object I used in my unit test is a Tuple5, but that is only an example. I've seen this behavior with other types of objects as well.

Reducing MyClass5 to only two fields (field34 and field35) causes the unit test to pass. 




