Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/08/12 21:03:39 UTC

[GitHub] [beam] reuvenlax commented on a change in pull request #15327: [BEAM-12754] Only call getValue once per field per row

reuvenlax commented on a change in pull request #15327:
URL: https://github.com/apache/beam/pull/15327#discussion_r688083091



##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoderGenerator.java
##########
@@ -316,27 +318,43 @@ static void encodeDelegate(
 
       // Encode the field count. This allows us to handle compatible schema changes.
       VAR_INT_CODER.encode(value.getFieldCount(), outputStream);
-      // Encode a bitmap for the null fields to save having to encode a bunch of nulls.
-      NULL_LIST_CODER.encode(scanNullFields(value, hasNullableFields), outputStream);
-      for (int encodingPos = 0; encodingPos < value.getFieldCount(); ++encodingPos) {
-        @Nullable Object fieldValue = value.getValue(encodingPosToIndex[encodingPos]);
-        if (fieldValue != null) {
-          coders[encodingPos].encode(fieldValue, outputStream);
+
+      if (hasNullableFields) {
+        // If the row has null fields, extract the values out once so that both scanNullFields and
+        // the encoding can share it and avoid having to extract them twice.
+
+        List<Object> fieldValues = value.getValues();
+        // Encode a bitmap for the null fields to save having to encode a bunch of nulls.
+        NULL_LIST_CODER.encode(scanNullFields(fieldValues), outputStream);
+        for (int encodingPos = 0; encodingPos < fieldValues.size(); ++encodingPos) {
+          @Nullable Object fieldValue = fieldValues.get(encodingPosToIndex[encodingPos]);

Review comment:
   Unfortunately, I don't think this is the same. Row.getValue() currently does a deep conversion while Row.getValues() does not (e.g. compare RowWithGetters.getValue() with getValues()).
   
   It's possible that Row.getValues() should be equivalent to calling getValue() on each element. I don't remember whether there was a good reason for them to differ.
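
   To make the concern concrete, here is a minimal, self-contained sketch. It does not use Beam's actual classes: ToyRow, extractOnce, and nullBitmap are hypothetical names, and the StringBuilder-to-String step is only a stand-in for whatever deep conversion getValue() performs. It shows one way to still call getValue() exactly once per field (by caching the converted values in an array) without switching to getValues(), which would skip the conversion. For simplicity the toy null bitmap is indexed by encoding position, which may not match what scanNullFields does.

```java
import java.util.Arrays;
import java.util.List;

public class SingleExtractionSketch {
  /** Hypothetical stand-in for a getter-backed row; this is NOT Beam's Row API. */
  static final class ToyRow {
    private final List<Object> raw;
    int getValueCalls = 0;

    ToyRow(List<Object> raw) {
      this.raw = raw;
    }

    int getFieldCount() {
      return raw.size();
    }

    /** Converts on access (StringBuilder -> String) to mimic a deep conversion. */
    Object getValue(int index) {
      getValueCalls++;
      Object v = raw.get(index);
      return (v instanceof StringBuilder) ? v.toString() : v;
    }

    /** Returns the underlying values without any conversion. */
    List<Object> getValues() {
      return raw;
    }
  }

  /** Extracts every field exactly once through getValue, in encoding order. */
  static Object[] extractOnce(ToyRow row, int[] encodingPosToIndex) {
    Object[] values = new Object[row.getFieldCount()];
    for (int pos = 0; pos < values.length; pos++) {
      values[pos] = row.getValue(encodingPosToIndex[pos]);
    }
    return values;
  }

  /** Builds a null bitmap from the already-extracted values (no further row access). */
  static boolean[] nullBitmap(Object[] values) {
    boolean[] nulls = new boolean[values.length];
    for (int i = 0; i < values.length; i++) {
      nulls[i] = values[i] == null;
    }
    return nulls;
  }

  public static void main(String[] args) {
    ToyRow row = new ToyRow(Arrays.asList(new StringBuilder("a"), null, 42));
    Object[] values = extractOnce(row, new int[] {0, 1, 2});
    boolean[] nulls = nullBitmap(values);

    // The cached value is converted; the getValues() element is not.
    System.out.println("converted: " + (values[0] instanceof String));
    System.out.println("raw: " + (row.getValues().get(0) instanceof StringBuilder));
    System.out.println("null at 1: " + nulls[1]);
    System.out.println("getValue calls: " + row.getValueCalls);
  }
}
```

   In this toy model the null scan and the encoding loop would both read from the cached array, so each field is extracted once while keeping getValue() semantics. Whether that is preferable to making getValues() itself perform the conversion is a separate design question.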



