You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "justaparth (via GitHub)" <gi...@apache.org> on 2023/04/27 23:16:59 UTC

[GitHub] [spark] justaparth commented on a diff in pull request #40686: [SPARK-43051][CONNECTOR] Add option to materialize zero values for fields without presence information

justaparth commented on code in PR #40686:
URL: https://github.com/apache/spark/pull/40686#discussion_r1179790306


##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufDeserializer.scala:
##########
@@ -288,7 +289,21 @@ private[sql] class ProtobufDeserializer(
       var skipRow = false
       while (i < validFieldIndexes.length && !skipRow) {
         val field = validFieldIndexes(i)
-        val value = if (field.isRepeated || field.hasDefaultValue || record.hasField(field)) {
+
+        // If `materializeZeroValues` is true, the written field will contain the
+        // default zero value for its type (e.g. 0 for int, "" for string, etc:
+        // https://protobuf.dev/programming-guides/proto3/#default).
+        // Otherwise, the field will be null unless the serialized proto explicitly
+        // contains the field, or it has an explicit default (
+        // proto2 only https://protobuf.dev/programming-guides/proto2/#optional)
+        // Note that in proto3, the serialized proto will not contain the field
+        // if the value was explicitly set to its zero value. See:
+        // https://protobuf.dev/programming-guides/field_presence/#presence-in-proto3-apis

Review Comment:
   ok quick update, i took another pass at this based on some comments below and i think the code is much clearer in its intent now! 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org