Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2020/12/09 00:45:11 UTC

[GitHub] [hive] rbalamohan commented on a change in pull request #1753: HIVE-24503 : Optimize vector row serde by avoiding type check at run time.

rbalamohan commented on a change in pull request #1753:
URL: https://github.com/apache/hive/pull/1753#discussion_r538916121



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSerializeRow.java
##########
@@ -61,27 +61,16 @@
   private Field root;
 
   private static class Field {
-    Field[] children;
-
-    boolean isPrimitive;
-    Category category;
-    PrimitiveCategory primitiveCategory;
-    TypeInfo typeInfo;
-
-    int count;
-
-    ObjectInspector objectInspector;
-    int outputColumnNum;
-
+    Field[] children = null;
+    boolean isPrimitive = false;
+    Category category = null;
+    PrimitiveCategory primitiveCategory = null;
+    TypeInfo typeInfo = null;
+    int count = 0;
+    ObjectInspector objectInspector = null;
+    int outputColumnNum = -1;
+    VectorSerializeWriter writer = null;
     Field() {

Review comment:
       Can be removed.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorDeserializeRow.java
##########
@@ -933,12 +1207,20 @@ private void storeUnionRowColumn(ColumnVector colVector,
     unionColVector.isNull[batchIndex] = false;
     unionColVector.tags[batchIndex] = tag;
 
-    storeComplexFieldRowColumn(
+    deserializer.storeComplexFieldRowColumn(
         colVectorFields[tag],
         unionHelper.getFields()[tag],
         batchIndex,
         canRetainByteRef);
-    deserializeRead.finishComplexVariableFieldsType();
+    deserializer.deserializeRead.finishComplexVariableFieldsType();
+  }
+
+  abstract static class VectorBatchDeserializer {
+    abstract void store(ColumnVector colVector, Field field, int batchIndex, boolean canRetainByteRef,
+                            VectorDeserializeRow deserializer) throws IOException;

Review comment:
       Why does VectorDeserializeRow need to be passed here again (and as "this" in other places)? If you remove the "static" modifier from the VectorBatchDeserializer subclasses, they become inner classes with access to the enclosing instance, so you may not need to pass it. The patch would probably also need far fewer changes.
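
       To illustrate the suggestion, here is a minimal, self-contained sketch of the difference between a static nested class and a non-static inner class. None of these names come from the patch: InnerClassSketch, StaticStore, InnerStore and the log field are hypothetical stand-ins for VectorDeserializeRow, its deserializer subclasses and per-instance state such as deserializeRead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the enclosing row deserializer.
class InnerClassSketch {
    // Hypothetical per-instance state that the store implementations need.
    private final List<String> log = new ArrayList<>();

    // Static nested class: it has no enclosing instance, so the owner
    // has to be passed explicitly on every call.
    abstract static class StaticStore {
        abstract void store(int batchIndex, InnerClassSketch owner);
    }

    static class StaticLongStore extends StaticStore {
        @Override
        void store(int batchIndex, InnerClassSketch owner) {
            owner.log.add("static:" + batchIndex);
        }
    }

    // Inner (non-static) class: each instance is bound to the enclosing
    // object, so its fields and methods are reachable without a parameter.
    abstract class InnerStore {
        abstract void store(int batchIndex);
    }

    class InnerLongStore extends InnerStore {
        @Override
        void store(int batchIndex) {
            log.add("inner:" + batchIndex);
        }
    }

    List<String> run() {
        new StaticLongStore().store(0, this); // owner passed by hand
        new InnerLongStore().store(1);        // owner is implicit
        return log;
    }

    public static void main(String[] args) {
        System.out.println(new InnerClassSketch().run());
    }
}
```

       The trade-off is that every inner-class instance holds a hidden reference to its enclosing object, which is usually fine when the instances are created and owned by that object anyway.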

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSerializeRow.java
##########
@@ -274,44 +315,25 @@ private void serializeWrite(
       return;
     }
     isAllNulls = false;
+    field.writer.serialize(colVector, field, adjustedBatchIndex, this);
+  }
 
-    if (field.isPrimitive) {
-      serializePrimitiveWrite(colVector, field, adjustedBatchIndex);
-      return;
-    }
-    final Category category = field.category;
-    switch (category) {
-    case LIST:
-      serializeListWrite(
-          (ListColumnVector) colVector,
-          field,
-          adjustedBatchIndex);
-      break;
-    case MAP:
-      serializeMapWrite(
-          (MapColumnVector) colVector,
-          field,
-          adjustedBatchIndex);
-      break;
-    case STRUCT:
-      serializeStructWrite(
-          (StructColumnVector) colVector,
-          field,
-          adjustedBatchIndex);
-      break;
-    case UNION:
-      serializeUnionWrite(
-          (UnionColumnVector) colVector,
-          field,
-          adjustedBatchIndex);
-      break;
-    default:
-      throw new RuntimeException("Unexpected category " + category);
+  abstract static class VectorSerializeWriter {
+    abstract void serialize(Object colVector, Field field, int adjustedBatchIndex,
+                            VectorSerializeRow serializeRow) throws IOException;

Review comment:
       Same as the earlier comment: VectorSerializeRow need not be passed here. The patch may need fewer changes if you remove the "static" modifier from the subclasses.
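
       For context, the hunk above replaces the per-row switch on category with a call through field.writer. A rough sketch of that idea, combined with non-static writers so the row serializer is not passed as an argument, could look like the following. All names here (WriterDispatchSketch, writerFor, the two-value Category enum, the out list) are assumptions made for illustration, and where the real patch assigns the writer is not shown in this hunk.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the enclosing row serializer.
class WriterDispatchSketch {
    // Hypothetical subset of type categories.
    enum Category { PRIMITIVE, LIST }

    // Hypothetical output sink owned by the enclosing instance.
    private final List<String> out = new ArrayList<>();

    // Non-static writers can reach the enclosing state directly.
    abstract class Writer {
        abstract void serialize(Object colVector, int batchIndex);
    }

    class PrimitiveWriter extends Writer {
        @Override
        void serialize(Object colVector, int batchIndex) {
            out.add("primitive[" + batchIndex + "]=" + colVector);
        }
    }

    class ListWriter extends Writer {
        @Override
        void serialize(Object colVector, int batchIndex) {
            out.add("list[" + batchIndex + "]=" + colVector);
        }
    }

    // Holds the writer chosen for one column.
    static class Field {
        Writer writer;
    }

    // The category check runs once per field at setup time, not once per row.
    Writer writerFor(Category category) {
        switch (category) {
            case PRIMITIVE:
                return new PrimitiveWriter();
            case LIST:
                return new ListWriter();
            default:
                throw new IllegalArgumentException("Unexpected category " + category);
        }
    }

    List<String> run() {
        Field primitiveField = new Field();
        primitiveField.writer = writerFor(Category.PRIMITIVE);
        Field listField = new Field();
        listField.writer = writerFor(Category.LIST);

        // Per-row path: a single virtual call, no switch and no owner argument.
        primitiveField.writer.serialize("a", 0);
        listField.writer.serialize("b", 1);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(new WriterDispatchSketch().run());
    }
}
```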



