Posted to commits@pinot.apache.org by "Jackie-Jiang (via GitHub)" <gi...@apache.org> on 2023/05/23 23:51:53 UTC

[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #10715: Fix #10713 by considering tableConfig.indexingConfig.sortedColumns as…

Jackie-Jiang commented on code in PR #10715:
URL: https://github.com/apache/pinot/pull/10715#discussion_r1203169463


##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/IndexReaderFactory.java:
##########
@@ -45,6 +45,18 @@ R createIndexReader(SegmentDirectory.Reader segmentReader, FieldIndexConfigs fie
     protected abstract R createIndexReader(PinotDataBuffer dataBuffer, ColumnMetadata metadata, C indexConfig)
         throws IOException, IndexReaderConstraintException;
 
+    /**
+     * Sometimes the index configuration indicates that the index should be disabled but the reader actually contains
+     * a buffer for the index type.
+     *
+     * By default, the buffer has priority over the configuration, so in case we have a buffer we would create an index

Review Comment:
   The sorted column config is actually quite tricky. It is used when creating the segment (on the minion side, or when sealing the consuming segment), but once the segment is created it is no longer honored, and we rely on the metadata to determine whether a column is sorted. Multiple columns can be sorted in a segment even if they are not configured as sorted columns.
   When a column is sorted but configured as a no-dictionary column, we choose to ignore the no-dictionary config, because it is almost always more efficient to use dictionary encoding for a sorted column. The most common scenario is that a user adds a new column to an existing table and wants it to be no-dictionary. For the existing segments, Pinot backfills the default value for the new column, so the column contains only one value and is therefore sorted. In that case we don't want to use no-dictionary encoding, because it won't be efficient.
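To make the rule above concrete, here is a minimal sketch of the decision as described in this comment. It is not Pinot code: the class `SortedColumnEncodingSketch` and the methods `isSorted` and `useDictionary` are hypothetical names chosen for illustration, and plain `int[]` values stand in for a column's data in one segment.

```java
/** Hypothetical illustration only; none of these names are Pinot APIs. */
final class SortedColumnEncodingSketch {

  private SortedColumnEncodingSketch() {
  }

  /**
   * Returns true if the values are in non-decreasing order. A column that was
   * backfilled with a single default value trivially satisfies this.
   */
  static boolean isSorted(int[] values) {
    for (int i = 1; i < values.length; i++) {
      if (values[i - 1] > values[i]) {
        return false;
      }
    }
    return true;
  }

  /**
   * Decides whether to dictionary-encode a column in an existing segment.
   *
   * @param sortedInSegment what the segment itself says about the column
   *     (assumed here to come from metadata, not from the table config)
   * @param configuredNoDictionary whether the table config lists the column
   *     as no-dictionary
   */
  static boolean useDictionary(boolean sortedInSegment, boolean configuredNoDictionary) {
    if (sortedInSegment) {
      // Sorted column: the no-dictionary config is ignored.
      return true;
    }
    // Unsorted column: honor the config.
    return !configuredNoDictionary;
  }

  public static void main(String[] args) {
    // New column added to an existing table: every row gets the default value.
    int[] backfilled = {0, 0, 0, 0};
    boolean sorted = isSorted(backfilled);
    System.out.println("sorted=" + sorted + ", useDictionary=" + useDictionary(sorted, true));
  }
}
```

In this sketch the sortedness flag is derived from the data rather than from the table config, which mirrors the point that the sorted column config is not consulted once the segment exists.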



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: commits-help@pinot.apache.org