Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/12/08 19:24:10 UTC

[GitHub] [incubator-pinot] mayankshriv opened a new issue #6334: SegmentPurger does not handle schema evolution gracefully

mayankshriv opened a new issue #6334:
URL: https://github.com/apache/incubator-pinot/issues/6334


   We ran into an issue where SegmentPurger failed due to schema evolution as follows:
   
   - New columns were added into the schema.
   - The table's index configuration was updated to add an inverted index on some of the newly added columns.
   - An explicit backfill was not performed.
   
   When the SegmentPurger tried to purge older segments, it failed with the following error:
   `java.lang.IllegalStateException: Cannot create inverted index for column: <xxx> because it is not in schema`
   
   This is likely because SegmentPurger used the schema in the segment as opposed to the schema in the controller.
   It would be desirable for SegmentPurger to gracefully handle this scenario.
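
   To make the mismatch concrete, here is a minimal sketch in Python (chosen for brevity; Pinot itself is Java). Every name in it is hypothetical and none of it is Pinot's actual API. It assumes the index columns come from the controller's table config while the column set comes from the schema stored in an older segment, and it shows one possible graceful fallback: skipping index columns the segment does not have.

```python
# Hypothetical model of the mismatch; none of these names are Pinot APIs.

def check_inverted_index_columns(index_columns, schema_columns):
    """Raise if a column configured for an inverted index is not in the schema."""
    for column in index_columns:
        if column not in schema_columns:
            raise RuntimeError(
                f"Cannot create inverted index for column: {column} "
                "because it is not in schema"
            )


def usable_index_columns(index_columns, schema_columns):
    """One graceful option: keep only the index columns the schema contains."""
    return [c for c in index_columns if c in schema_columns]


# Assumed example data: the table config already lists the new column "device".
inverted_index_columns = ["country", "device"]
# An older segment built before "device" was added, never reloaded or backfilled.
segment_schema_columns = {"country", "ts"}
# The controller's schema, which does include the new column.
controller_schema_columns = {"country", "ts", "device"}
```

   In this model, checking `inverted_index_columns` against `segment_schema_columns` raises, while checking against `controller_schema_columns` passes. Whether skipping the missing columns or reading the controller schema is the better fix is a design question this sketch does not settle.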


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mayankshriv commented on issue #6334: SegmentPurger does not handle schema evolution gracefully

Posted by GitBox <gi...@apache.org>.
mayankshriv commented on issue #6334:
URL: https://github.com/apache/incubator-pinot/issues/6334#issuecomment-754109980


   No, SegmentPurger uses the table config from the controller (to determine that it needs to build an inverted index for a column), but it uses the schema in the segment, which does not contain the newly added column (as neither a segment reload nor a backfill happened), hence the error.
   Hope this answers your question.




[GitHub] [incubator-pinot] mcvsubbu commented on issue #6334: SegmentPurger does not handle schema evolution gracefully

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on issue #6334:
URL: https://github.com/apache/incubator-pinot/issues/6334#issuecomment-740917091


   If the schema was updated with the new columns, then the schema in the controller would have the new columns, right? Perhaps you meant the other way around (i.e. "used the schema in controller as opposed to the schema in the segment")?
   
   Speaking of which, I think it would be very useful to retain the schema evolution history in ZooKeeper (i.e. versioned schemas with some metadata on when each update was made). It could be used to make decisions such as those made by the segment purger. In this case, for example, the purger could also have decided to backfill the new columns with default values.
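
   As a rough illustration of the versioned-schema idea, the Python sketch below keeps a list of schema versions with an update timestamp and works out which columns a segment would need backfilled with defaults. The `SchemaVersion` type, its fields, and both helper functions are assumptions made up for this example; nothing like them is claimed to exist in Pinot or ZooKeeper.

```python
from dataclasses import dataclass


@dataclass
class SchemaVersion:
    """Hypothetical record of one stored schema version."""
    version: int
    updated_at_ms: int  # when this version was stored
    columns: dict       # column name -> default value


def schema_at(history, timestamp_ms):
    """Return the latest version stored at or before timestamp_ms, else None."""
    candidates = [v for v in history if v.updated_at_ms <= timestamp_ms]
    return max(candidates, key=lambda v: v.version) if candidates else None


def columns_to_backfill(history, segment_created_ms):
    """Map each column added after the segment was built to its default value."""
    latest = max(history, key=lambda v: v.version)
    old = schema_at(history, segment_created_ms)
    old_columns = old.columns if old else {}
    return {
        name: default
        for name, default in latest.columns.items()
        if name not in old_columns
    }


# Assumed example history: "device" was added in version 2.
history = [
    SchemaVersion(1, 1000, {"country": "null"}),
    SchemaVersion(2, 2000, {"country": "null", "device": "unknown"}),
]
backfill = columns_to_backfill(history, segment_created_ms=1500)
```

   With this history, a segment created between the two versions would need only `device` backfilled, and a segment created after version 2 would need nothing.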

