Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/12/12 18:45:19 UTC
[GitHub] [pinot] somandal opened a new issue, #9972: MV duplicate handling for forwardIndexDisabled columns
somandal opened a new issue, #9972:
URL: https://github.com/apache/pinot/issues/9972
Support for disabling the forward index was added (details can be found in this issue: https://github.com/apache/pinot/issues/6473). As part of our analysis, we found that for MV (multi-value) columns with duplicate entries within a row, regenerating the forward index to include the duplicated entries is not possible today. More details about this issue can be found in [this document](https://docs.google.com/document/d/1MNLLhYCg5e-UFBQ6wTBODd41sDsbjevwRfwoGuNowWw/edit?usp=sharing). To correctly regenerate the forward index for an MV column with duplicates within a row, the frequency of duplicated keys per row needs to be tracked in an on-disk file. Opening this issue to track adding support for this.
Until this is fixed, MV columns with duplicates will need to be backfilled if the forward index is to be enabled at a later point in time. Alternatively, customers can decide that they do not need the duplicates per row, in which case the reload code path will create the forward index without duplicates per row.
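To illustrate the problem, here is a minimal Python sketch. It is not Pinot code: the inverted index is modeled as a plain mapping from value to the doc IDs that contain it, and `build_frequencies` is a hypothetical stand-in for the on-disk frequency file proposed above. It shows that regenerating rows from the inverted index alone drops within-row duplicates, while a per-row frequency map restores them.

```python
from collections import defaultdict

def build_inverted_index(rows):
    """Map each value to the sorted list of doc IDs (row numbers) containing it."""
    inverted = defaultdict(set)
    for doc_id, values in enumerate(rows):
        for value in values:
            inverted[value].add(doc_id)  # a set cannot record repeats within a row
    return {value: sorted(doc_ids) for value, doc_ids in inverted.items()}

def build_frequencies(rows):
    """Hypothetical side structure: (doc_id, value) -> count, kept only when count > 1."""
    counts = defaultdict(int)
    for doc_id, values in enumerate(rows):
        for value in values:
            counts[(doc_id, value)] += 1
    return {key: n for key, n in counts.items() if n > 1}

def regenerate_rows(inverted, num_docs, frequencies=None):
    """Rebuild per-row values from the inverted index, repeating a value
    only if a frequency entry says it occurred more than once."""
    frequencies = frequencies or {}
    rows = [[] for _ in range(num_docs)]
    for value in sorted(inverted):
        for doc_id in inverted[value]:
            rows[doc_id].extend([value] * frequencies.get((doc_id, value), 1))
    return rows

original = [["a", "b", "a"], ["c"], ["b", "b"]]
inverted = build_inverted_index(original)

without_counts = regenerate_rows(inverted, len(original))
with_counts = regenerate_rows(inverted, len(original), build_frequencies(original))

print(without_counts)  # [['a', 'b'], ['c'], ['b']]
print(with_counts)     # [['a', 'a', 'b'], ['c'], ['b', 'b']]
```

Storing only the entries with a count above 1 keeps the side structure small when duplicates are rare. Note that even with frequencies the original order within a row (`a, b, a`) is not recovered.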
cc @siddharthteotia
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] somandal commented on issue #9972: MV duplicate (within a row) handling for forwardIndexDisabled columns
Posted by GitBox <gi...@apache.org>.
somandal commented on issue #9972:
URL: https://github.com/apache/pinot/issues/9972#issuecomment-1347432372
Right, but we don't plan to solve the reordering issue for now. If ordering matters, users shouldn't disable the forward index: fixing the ordering issue would probably take as much space as the forward index itself, since we'd need to store the ordering information.
[GitHub] [pinot] Jackie-Jiang commented on issue #9972: MV duplicate (within a row) handling for forwardIndexDisabled columns
Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #9972:
URL: https://github.com/apache/pinot/issues/9972#issuecomment-1347429092
Also, the ordering of the values within the MV entry will be lost after regenerating from the inverted index. Currently, certain functions (e.g. scalar functions under `ArrayFunctions`) treat an MV as an array.
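As a rough illustration of why ordering matters, the Python sketch below compares a position-sensitive operation with an order-insensitive one on a regenerated row. Everything here is hypothetical: `regenerate_from_inverted` assumes the values come back in sorted order, and `element_at` merely stands in for any function that treats an MV entry as an array. Neither is a Pinot API.

```python
def regenerate_from_inverted(row):
    """Stand-in for regeneration: assume values come back sorted and deduplicated."""
    return sorted(set(row))

def element_at(values, index):
    """Hypothetical position-sensitive array function (0-based index)."""
    return values[index]

original_row = [30, 10, 20]
regenerated_row = regenerate_from_inverted(original_row)

# Position-sensitive result changes after regeneration.
first_before = element_at(original_row, 0)     # 30
first_after = element_at(regenerated_row, 0)   # 10

# Order-insensitive result is unaffected (this row has no duplicates).
sum_before = sum(original_row)
sum_after = sum(regenerated_row)
```

Aggregations such as the sum are unchanged for this row, but anything that depends on element position returns a different answer.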