Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/06/09 23:31:53 UTC

[GitHub] [pinot] somandal opened a new issue, #8875: Add support for querying raw Multi-value columns

somandal opened a new issue, #8875:
URL: https://github.com/apache/pinot/issues/8875

   Although a multi-value column can currently be declared as a raw column (without a dictionary), enabling this causes the segment to fail to load because of the following precondition check in `BaseChunkSVForwardIndexReader` (observed with a multi-value INT column):
   
   ```
       if (valueType.isFixedWidth()) {
           Preconditions.checkState(_lengthOfLongestEntry == valueType.size());
       }
   ```
   
   In the above, `_lengthOfLongestEntry` is set to `size of the valueType * maxNumberOfMultiValueEntries` for the column, so it does not equal `valueType.size()` whenever a row can hold more than one value. The tests in `MultiValueFixedByteRawIndexCreatorTest` do not fail because they instantiate the reader with DataType BYTES (which is not fixed-width, so the check is skipped) rather than the actual DataType stored in the metadata (DataType.INT in this case). The above needs to be fixed.
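   
   The arithmetic behind the failing check can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical and are not Pinot code, and the value of 3 entries per row is made up for the example.
   
   ```java
   // Hypothetical illustration of the length mismatch; not Pinot code.
   final class RawMvEntryLengthSketch {
       /** Longest entry in bytes for a fixed-width multi-value column, as described above. */
       static int lengthOfLongestEntry(int valueSizeBytes, int maxNumberOfMultiValueEntries) {
           return valueSizeBytes * maxNumberOfMultiValueEntries;
       }
   
       /** Same shape as the single-value precondition: entry length must equal the value size. */
       static boolean passesSingleValueCheck(int lengthOfLongestEntry, int valueSizeBytes) {
           return lengthOfLongestEntry == valueSizeBytes;
       }
   
       public static void main(String[] args) {
           int intSize = Integer.BYTES;                     // an INT value is 4 bytes
           int longest = lengthOfLongestEntry(intSize, 3);  // assumed: at most 3 values per row
           System.out.println(longest);                                  // 12
           System.out.println(passesSingleValueCheck(longest, intSize)); // false
       }
   }
   ```
   
   With these assumed numbers the check only holds when a column never has more than one value per row, which matches the load failure reported for the multi-value INT column.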
   
   Purely for testing purposes, commenting out the above check lets the segment load successfully, but even `SELECT * from table` type queries then fail with the following error:
   
   ```
   14:06:37.347 BaseCombineOperator - Caught exception while processing query: QueryContext{_tableName='testTable', _subquery=null, _selectExpressions=[*], _aliasList=[null], _filter=(column1 > '100000000' AND column2 BETWEEN '20000000' AND '1000000000' AND column3 != 'w' AND (column6 < '500000' OR column7 NOT IN ('225','407')) AND daysSinceEpoch = '1756015683'), _groupByExpressions=null, _havingFilter=null, _orderByExpressions=null, _limit=10, _offset=0, _queryOptions={}, _expressionOverrideHints={}, _explain=false}
   java.lang.UnsupportedOperationException: null
   	at org.apache.pinot.segment.spi.index.reader.ForwardIndexReader.getDictIdMV(ForwardIndexReader.java:100) ~[classes/:?]
   	at org.apache.pinot.core.operator.dociditerators.MVScanDocIdIterator.next(MVScanDocIdIterator.java:60) ~[classes/:?]
   	at org.apache.pinot.core.operator.dociditerators.MVScanDocIdIterator.advance(MVScanDocIdIterator.java:72) ~[classes/:?]
   	at org.apache.pinot.core.operator.dociditerators.OrDocIdIterator.advance(OrDocIdIterator.java:91) ~[classes/:?]
   	at org.apache.pinot.core.operator.dociditerators.AndDocIdIterator.next(AndDocIdIterator.java:51) ~[classes/:?]
   	at org.apache.pinot.core.operator.DocIdSetOperator.getNextBlock(DocIdSetOperator.java:72) ~[classes/:?]
   	at org.apache.pinot.core.operator.DocIdSetOperator.getNextBlock(DocIdSetOperator.java:38) ~[classes/:?]
   	at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:39) ~[classes/:?]
   	at org.apache.pinot.core.operator.ProjectionOperator.getNextBlock(ProjectionOperator.java:62) ~[classes/:?]
   	at org.apache.pinot.core.operator.ProjectionOperator.getNextBlock(ProjectionOperator.java:34) ~[classes/:?]
   	at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:39) ~[classes/:?]
   	at org.apache.pinot.core.operator.transform.PassThroughTransformOperator.getNextBlock(PassThroughTransformOperator.java:48) ~[classes/:?]
   	at org.apache.pinot.core.operator.transform.PassThroughTransformOperator.getNextBlock(PassThroughTransformOperator.java:31) ~[classes/:?]
   	at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:39) ~[classes/:?]
   	at org.apache.pinot.core.operator.query.SelectionOnlyOperator.getNextBlock(SelectionOnlyOperator.java:92) ~[classes/:?]
   	at org.apache.pinot.core.operator.query.SelectionOnlyOperator.getNextBlock(SelectionOnlyOperator.java:40) ~[classes/:?]
   	at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:39) ~[classes/:?]
   	at org.apache.pinot.core.operator.combine.BaseCombineOperator.processSegments(BaseCombineOperator.java:158) ~[classes/:?]
   	at org.apache.pinot.core.operator.combine.BaseCombineOperator$1.runJob(BaseCombineOperator.java:101) ~[classes/:?]
   	at org.apache.pinot.core.util.trace.TraceRunnable.run(TraceRunnable.java:40) ~[classes/:?]
   ```
   
   The reason for the above is that `MVScanDocIdIterator` does not handle raw multi-value columns and always assumes that the multi-value column is dictionary encoded. Other query types also need to be tested to ensure that multi-value support for raw columns works correctly and does not assume that a dictionary is always available.
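   
   A minimal sketch of the kind of branching that would avoid this, assuming a reader that reports whether it is dictionary encoded. All types below are simplified, hypothetical stand-ins and are not the real Pinot interfaces or the eventual fix; only the default `getDictIdMV` throwing `UnsupportedOperationException` is taken from the stack trace above.
   
   ```java
   import java.util.function.IntPredicate;
   
   // Hypothetical, simplified stand-in for a multi-value forward index reader.
   interface MvReaderSketch {
       boolean isDictionaryEncoded();
   
       // Mirrors the failure mode in the stack trace: a raw reader does not override this.
       default int getDictIdMV(int docId, int[] buffer) {
           throw new UnsupportedOperationException();
       }
   
       // Reads raw INT values for a document into the buffer and returns how many were read.
       default int getIntMV(int docId, int[] buffer) {
           throw new UnsupportedOperationException();
       }
   }
   
   // Raw (no-dictionary) reader backed by an in-memory array, for illustration only.
   final class RawIntMvReaderSketch implements MvReaderSketch {
       private final int[][] values;
   
       RawIntMvReaderSketch(int[][] values) {
           this.values = values;
       }
   
       @Override
       public boolean isDictionaryEncoded() {
           return false;
       }
   
       @Override
       public int getIntMV(int docId, int[] buffer) {
           int[] row = values[docId];
           System.arraycopy(row, 0, buffer, 0, row.length);
           return row.length;
       }
   }
   
   final class MvScanSketch {
       // Returns true if any entry of the document satisfies the predicate.
       // For a dictionary-encoded reader the predicate is applied to dictionary ids,
       // for a raw reader it is applied to the values themselves.
       static boolean matches(MvReaderSketch reader, int docId, int[] buffer, IntPredicate predicate) {
           int length = reader.isDictionaryEncoded()
               ? reader.getDictIdMV(docId, buffer)
               : reader.getIntMV(docId, buffer);
           for (int i = 0; i < length; i++) {
               if (predicate.test(buffer[i])) {
                   return true;
               }
           }
           return false;
       }
   }
   ```
   
   An iterator that always calls `getDictIdMV` hits the default implementation for a raw reader, which is the `UnsupportedOperationException` seen in the trace; checking the encoding first sends raw columns down a value-based path instead.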
   
   cc @siddharthteotia 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia closed issue #8875: Add support for querying raw Multi-value columns

Posted by GitBox <gi...@apache.org>.
siddharthteotia closed issue #8875: Add support for querying raw Multi-value columns
URL: https://github.com/apache/pinot/issues/8875




[GitHub] [pinot] jasperpotts commented on issue #8875: Add support for querying raw Multi-value columns

Posted by GitBox <gi...@apache.org>.
jasperpotts commented on issue #8875:
URL: https://github.com/apache/pinot/issues/8875#issuecomment-1198574061

   Thanks




[GitHub] [pinot] somandal commented on issue #8875: Add support for querying raw Multi-value columns

Posted by GitBox <gi...@apache.org>.
somandal commented on issue #8875:
URL: https://github.com/apache/pinot/issues/8875#issuecomment-1151711497

   This needs to be fixed in order to work on https://github.com/apache/pinot/issues/7870




[GitHub] [pinot] siddharthteotia commented on issue #8875: Add support for querying raw Multi-value columns

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on issue #8875:
URL: https://github.com/apache/pinot/issues/8875#issuecomment-1151713438

   Thanks @somandal. 
   
   Just FYI for others: this was found while working on a fix for https://github.com/apache/pinot/issues/7870




[GitHub] [pinot] jasperpotts commented on issue #8875: Add support for querying raw Multi-value columns

Posted by GitBox <gi...@apache.org>.
jasperpotts commented on issue #8875:
URL: https://github.com/apache/pinot/issues/8875#issuecomment-1198485726

   We just hit the same bug today with a long[] column with no dictionary. Is this fixed for v11?




[GitHub] [pinot] siddharthteotia commented on issue #8875: Add support for querying raw Multi-value columns

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on issue #8875:
URL: https://github.com/apache/pinot/issues/8875#issuecomment-1178331258

   Support has been added in https://github.com/apache/pinot/pull/8953 and https://github.com/apache/pinot/pull/8993
   
   Closing as completed.




[GitHub] [pinot] somandal commented on issue #8875: Add support for querying raw Multi-value columns

Posted by GitBox <gi...@apache.org>.
somandal commented on issue #8875:
URL: https://github.com/apache/pinot/issues/8875#issuecomment-1198557850

   Hey @jasperpotts, this fix is available on master. The next Pinot release is still in progress, and the fix will be part of that release once it is out.

