Posted to issues@spark.apache.org by "mcdull_zhang (Jira)" <ji...@apache.org> on 2022/03/13 12:19:00 UTC

[jira] [Created] (SPARK-38542) UnsafeHashedRelation should serialize numKeys out

mcdull_zhang created SPARK-38542:
------------------------------------

             Summary: UnsafeHashedRelation should serialize numKeys out
                 Key: SPARK-38542
                 URL: https://issues.apache.org/jira/browse/SPARK-38542
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.2.0
            Reporter: mcdull_zhang


Currently, UnsafeHashedRelation does not write numKeys when it is serialized, so a deserialized UnsafeHashedRelation has numKeys equal to 0. As a result, every UnsafeRow returned by UnsafeHashedRelation.keys() has numFields equal to 0, which can lead to missing or incorrect data.
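
The failure mode can be reproduced outside Spark with a minimal sketch. BuggyRelation, FixedRelation and SerializationSketch below are hypothetical stand-ins, not Spark classes, and the sketch assumes plain java.io.Externalizable serialization rather than Spark's actual serializer setup. It only illustrates how a field that writeExternal never writes comes back with its default value after a round trip.

{code:scala}
import java.io._

// Hypothetical stand-in for UnsafeHashedRelation (not the Spark class).
class BuggyRelation(var numKeys: Int, var payload: Array[Byte]) extends Externalizable {
  // Externalizable requires a public no-arg constructor.
  def this() = this(0, Array.emptyByteArray)

  override def writeExternal(out: ObjectOutput): Unit = {
    // numKeys is deliberately NOT written, mirroring the reported problem.
    out.writeInt(payload.length)
    out.write(payload)
  }

  override def readExternal(in: ObjectInput): Unit = {
    payload = new Array[Byte](in.readInt())
    in.readFully(payload)
  }
}

// One possible fix in this sketch: write and read numKeys alongside the payload.
class FixedRelation(n: Int, p: Array[Byte]) extends BuggyRelation(n, p) {
  def this() = this(0, Array.emptyByteArray)

  override def writeExternal(out: ObjectOutput): Unit = {
    out.writeInt(numKeys)
    super.writeExternal(out)
  }

  override def readExternal(in: ObjectInput): Unit = {
    numKeys = in.readInt()
    super.readExternal(in)
  }
}

object SerializationSketch {
  // Serializes obj to a byte array and reads it back.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val buffer = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(buffer)
    oos.writeObject(obj)
    oos.close()
    val ois = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try ois.readObject().asInstanceOf[T] finally ois.close()
  }

  def main(args: Array[String]): Unit = {
    val data = Array[Byte](1, 2, 3)
    println(roundTrip(new BuggyRelation(2, data)).numKeys) // prints 0: numKeys was lost
    println(roundTrip(new FixedRelation(2, data)).numKeys) // prints 2: numKeys survived
  }
}
{code}

In the sketch the payload survives the round trip in both cases; only the field that was never written is reset, which is why the problem would show up only on the deserialized side.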

 

For example, SubqueryBroadcastExec calls HashedRelation.keys() on the broadcast relation:
{code:scala}
val broadcastRelation = child.executeBroadcast[HashedRelation]().value
val (iter, expr) = if (broadcastRelation.isInstanceOf[LongHashedRelation]) {
  (broadcastRelation.keys(), HashJoin.extractKeyExprAt(buildKeys, index))
} else {
  (broadcastRelation.keys(),
    BoundReference(index, buildKeys(index).dataType, buildKeys(index).nullable))
}
{code}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)