Posted to issues@spark.apache.org by "Martin Mauch (JIRA)" <ji...@apache.org> on 2017/05/15 14:54:04 UTC

[jira] [Created] (SPARK-20745) Data gets wrongly copied from one row to others, possibly related to named structs

Martin Mauch created SPARK-20745:
------------------------------------

             Summary: Data gets wrongly copied from one row to others, possibly related to named structs
                 Key: SPARK-20745
                 URL: https://issues.apache.org/jira/browse/SPARK-20745
             Project: Spark
          Issue Type: Bug
          Components: Input/Output
    Affects Versions: 2.1.1
            Reporter: Martin Mauch


We encountered a bug where Spark copies data from one row into other rows. It might be related to named structs; at least, the minimal repro we were able to produce involves them: https://github.com/crealytics/spark_bug/blob/master/src/test/scala/spark/DataFrameConversionsSpec.scala
Interestingly, Spark behaves correctly when the DataFrame is cached (see the 2nd example), and also when the failing example is run a second time (compare the 1st and 3rd examples).
You should be able to check out the above project and reproduce the problem with:
sbt test


