Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/29 20:56:21 UTC

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6805: [HUDI-4949] optimize cdc read to avoid the problem of reusing buffer underlying the Row

alexeykudinkin commented on code in PR #6805:
URL: https://github.com/apache/hudi/pull/6805#discussion_r984015580


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/cdc/HoodieCDCRDD.scala:
##########
@@ -516,7 +515,7 @@ class HoodieCDCRDD(
         val iter = loadFileSlice(fileSlice)
         iter.foreach { row =>
           val key = getRecordKey(row)
-          beforeImageRecords.put(key, serialize(row))
+          beforeImageRecords.put(key, serialize(row, copy = true))

Review Comment:
   Let's add a code comment explaining why we're copying the row here (to avoid confusion).
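
   To illustrate the problem the PR title describes (reusing the buffer underlying the Row), here is a minimal, self-contained sketch. It is not Hudi or Spark code: `ReusableRow`, `reusingIterator`, and `collect` are hypothetical stand-ins, and the iterator is assumed to hand out the same mutable object for every element, which is the situation where storing the row without a copy goes wrong.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a row object that an iterator reuses between elements.
final class ReusableRow(var key: String, var value: String) {
  // Returns an independent snapshot of the current contents.
  def copy(): ReusableRow = new ReusableRow(key, value)
}

object BufferReuseDemo {
  // Assumption: the iterator mutates and returns one shared instance,
  // instead of allocating a fresh row per element.
  def reusingIterator(input: Seq[(String, String)]): Iterator[ReusableRow] = {
    val shared = new ReusableRow("", "")
    input.iterator.map { case (k, v) =>
      shared.key = k
      shared.value = v
      shared
    }
  }

  // Stores each row under its key, optionally copying it first,
  // then reads the stored values back after iteration has finished.
  def collect(input: Seq[(String, String)], copy: Boolean): Map[String, String] = {
    val stored = mutable.Map.empty[String, ReusableRow]
    reusingIterator(input).foreach { row =>
      // Without the copy, every map entry aliases the same shared object.
      stored.put(row.key, if (copy) row.copy() else row)
    }
    stored.map { case (k, r) => k -> r.value }.toMap
  }
}
```

   With the input `Seq("k1" -> "a", "k2" -> "b")`, `collect(..., copy = false)` reports `"b"` for both keys because both entries point at the shared object, while `collect(..., copy = true)` keeps `"a"` under `k1`. A short code comment next to `copy = true` stating this would address the confusion.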


