Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/06/08 22:35:05 UTC

[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #8859: Proper null handling in all Pinot layers for raw and dictionary-encoded SV columns for all data types

Jackie-Jiang commented on code in PR #8859:
URL: https://github.com/apache/pinot/pull/8859#discussion_r892910832


##########
pinot-common/src/main/java/org/apache/pinot/common/request/context/FunctionContext.java:
##########
@@ -69,6 +70,25 @@ public void getColumns(Set<String> columns) {
     }
   }
 
+  /**
+   * Retrieve recursively all identifiers passed as arguments to the function.
+   */
+  public List<String> getAllIdentifiers() {

Review Comment:
   We should probably use `getColumns()` here, which also dedups the identifiers.
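   A minimal sketch of why a `Set`-based traversal dedups for free. `ExprSketch` is a hypothetical stand-in for the expression tree, not Pinot's actual `ExpressionContext`/`FunctionContext`; only the `getColumns(Set<String>)` signature is taken from the diff above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Hypothetical expression node: either an identifier leaf or a function with arguments.
final class ExprSketch {
  private final String identifier;          // non-null only for identifier leaves
  private final List<ExprSketch> arguments; // empty for identifier leaves

  private ExprSketch(String identifier, List<ExprSketch> arguments) {
    this.identifier = identifier;
    this.arguments = arguments;
  }

  static ExprSketch identifier(String name) {
    return new ExprSketch(name, new ArrayList<>());
  }

  static ExprSketch function(ExprSketch... args) {
    return new ExprSketch(null, Arrays.asList(args));
  }

  // Same shape as getColumns(Set<String>): the caller owns the set, so an
  // identifier that appears several times in the tree is stored only once.
  void getColumns(Set<String> columns) {
    if (identifier != null) {
      columns.add(identifier);
    }
    for (ExprSketch argument : arguments) {
      argument.getColumns(columns);
    }
  }
}
```

   With a `List`-returning method, the caller would have to dedup separately; passing a `Set` in keeps the recursion allocation-free and lets the caller choose the set implementation.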



##########
pinot-common/src/main/java/org/apache/pinot/common/utils/DataSchema.java:
##########
@@ -345,6 +345,9 @@ public DataType toDataType() {
      * compatible with the type.
      */
     public Serializable convert(Object value) {
+      if (value == null) {

Review Comment:
   Consider doing the null check on the caller side. The passed-in value should always be non-null.
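   Roughly what a caller-side check could look like. This is only a sketch: `ConvertSketch`, `convertNonNull`, and `convertRow` are hypothetical names, and the conversion itself is a placeholder rather than the real `DataSchema.ColumnDataType.convert()` logic.

```java
import java.io.Serializable;

final class ConvertSketch {
  // Placeholder converter that keeps the "value is never null" contract.
  // The real convert() switches on the column data type; this one just
  // widens any Number to a long for illustration.
  static Serializable convertNonNull(Object value) {
    return ((Number) value).longValue();
  }

  // The caller handles nulls, so the converter stays null-free.
  static Serializable[] convertRow(Object[] row) {
    Serializable[] converted = new Serializable[row.length];
    for (int i = 0; i < row.length; i++) {
      Object value = row[i];
      converted[i] = value == null ? null : convertNonNull(value);
    }
    return converted;
  }
}
```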



##########
pinot-core/src/main/java/org/apache/pinot/core/common/BlockValSet.java:
##########
@@ -30,6 +31,11 @@
  */
 public interface BlockValSet {
 
+  /**
+   * Returns the null value bitmap in the value set.
+   */
+  ImmutableRoaringBitmap getNullBitmap();

Review Comment:
   Annotate it as `@Nullable`.



##########
pinot-core/src/main/java/org/apache/pinot/core/common/datatable/DataTableImplV4.java:
##########
@@ -0,0 +1,384 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.pinot.core.common.datatable;
+
+import java.io.ByteArrayOutputStream;
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.pinot.common.response.ProcessingException;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.core.query.request.context.ThreadTimer;
+import org.roaringbitmap.buffer.MutableRoaringBitmap;
+
+
+/**
+ * Datatable V4 implementation.
+ * The layout of serialized V4 datatable looks like:
+ * +-----------------------------------------------+
+ * | 17 integers of header:                        |
+ * | VERSION                                       |
+ * | NUM_ROWS                                      |
+ * | NUM_COLUMNS                                   |
+ * | EXCEPTIONS SECTION START OFFSET               |
+ * | EXCEPTIONS SECTION LENGTH                     |
+ * | DICTIONARY_MAP SECTION START OFFSET           |
+ * | DICTIONARY_MAP SECTION LENGTH                 |
+ * | DATA_SCHEMA SECTION START OFFSET              |
+ * | DATA_SCHEMA SECTION LENGTH                    |
+ * | FIXED_SIZE_DATA SECTION START OFFSET          |
+ * | FIXED_SIZE_DATA SECTION LENGTH                |
+ * | VARIABLE_SIZE_DATA SECTION START OFFSET       |
+ * | VARIABLE_SIZE_DATA SECTION LENGTH             |
+ * | FIXED_SIZE_NULL_VECTOR SECTION START OFFSET   |
+ * | FIXED_SIZE_NULL_VECTOR SECTION LENGTH         |
+ * | VARIABLE_SIZE_NULL_VECTOR SECTION START OFFSET|
+ * | VARIABLE_SIZE_NULL_VECTOR SECTION LENGTH      |
+ * +-----------------------------------------------+
+ * | EXCEPTIONS SECTION                            |
+ * +-----------------------------------------------+
+ * | DICTIONARY_MAP SECTION                        |
+ * +-----------------------------------------------+
+ * | DATA_SCHEMA SECTION                           |
+ * +-----------------------------------------------+
+ * | FIXED_SIZE_DATA SECTION                       |
+ * +-----------------------------------------------+
+ * | VARIABLE_SIZE_DATA SECTION                    |
+ * +-----------------------------------------------+
+ * | FIXED_SIZE_NULL_VECTOR SECTION                |
+ * +-----------------------------------------------+
+ * | VARIABLE_SIZE_NULL_VECTOR SECTION             |
+ * +-----------------------------------------------+
+ * | METADATA LENGTH                               |
+ * | METADATA SECTION                              |
+ * +-----------------------------------------------+
+ */
+public class DataTableImplV4 extends DataTableImplV3 {

Review Comment:
   @walterddr Please take a look at this new format, and let's address the remaining inefficiencies as part of it.
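   For reference, a small sketch of how the 17-integer header in the Javadoc above could be laid out: 3 leading integers followed by an (offset, length) pair for each of the 7 sections. The class and method names are hypothetical, and two details are assumptions not stated in the diff: that the version value is 4, and that offsets are absolute with the first section starting right after the header.

```java
import java.nio.ByteBuffer;

final class V4HeaderSketch {
  static final int VERSION = 4;                         // assumption
  static final int NUM_SECTIONS = 7;
  static final int HEADER_INTS = 3 + 2 * NUM_SECTIONS;  // 17 integers
  static final int HEADER_BYTES = HEADER_INTS * Integer.BYTES;

  // sectionLengths order follows the Javadoc: exceptions, dictionary map,
  // data schema, fixed-size data, variable-size data, fixed-size null
  // vector, variable-size null vector.
  static ByteBuffer writeHeader(int numRows, int numColumns, int[] sectionLengths) {
    if (sectionLengths.length != NUM_SECTIONS) {
      throw new IllegalArgumentException("expected " + NUM_SECTIONS + " section lengths");
    }
    ByteBuffer header = ByteBuffer.allocate(HEADER_BYTES);
    header.putInt(VERSION).putInt(numRows).putInt(numColumns);
    int offset = HEADER_BYTES; // assumption: sections follow the header back to back
    for (int length : sectionLengths) {
      header.putInt(offset).putInt(length);
      offset += length;
    }
    header.flip(); // make the buffer readable from position 0
    return header;
  }
}
```

   The metadata length and metadata section come after the last section and are not part of the header, so they are left out here.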



##########
pinot-core/src/main/java/org/apache/pinot/core/common/RowBasedBlockValueFetcher.java:
##########
@@ -43,6 +44,10 @@ public Object[] getRow(int docId) {
     return row;
   }
 
+  public ImmutableRoaringBitmap getColumnNullBitmap(int colId) {

Review Comment:
   If null handling is enabled, we can set `null` directly in `getRow()` instead of relying on the caller to fill in the null values.
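   Something along these lines, as a sketch only. `RowFetcherSketch` is hypothetical and uses `java.util.BitSet` as a stand-in for the per-column `ImmutableRoaringBitmap`; the real `RowBasedBlockValueFetcher` reads from `BlockValSet`s rather than pre-materialized arrays.

```java
import java.util.BitSet;

final class RowFetcherSketch {
  private final Object[][] columnValues;   // indexed as [column][docId]
  private final BitSet[] nullBitmaps;      // per column; a null entry means "no nulls"
  private final boolean nullHandlingEnabled;

  RowFetcherSketch(Object[][] columnValues, BitSet[] nullBitmaps, boolean nullHandlingEnabled) {
    this.columnValues = columnValues;
    this.nullBitmaps = nullBitmaps;
    this.nullHandlingEnabled = nullHandlingEnabled;
  }

  // Fills nulls while building the row, so callers never see the bitmaps.
  Object[] getRow(int docId) {
    int numColumns = columnValues.length;
    Object[] row = new Object[numColumns];
    for (int col = 0; col < numColumns; col++) {
      BitSet nulls = nullBitmaps[col];
      boolean isNull = nullHandlingEnabled && nulls != null && nulls.get(docId);
      row[col] = isNull ? null : columnValues[col][docId];
    }
    return row;
  }
}
```

   With this shape, `getColumnNullBitmap()` would not need to be exposed to callers at all.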



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

