Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2019/05/20 20:57:33 UTC

[GitHub] [incubator-pinot] icefury71 opened a new issue #4225: Make Pinot schema evolution easier

icefury71 opened a new issue #4225: Make Pinot schema evolution easier
URL: https://github.com/apache/incubator-pinot/issues/4225
 
 
   This has been referenced in a few issues already:
   https://github.com/apache/incubator-pinot/issues/74
   https://github.com/apache/incubator-pinot/issues/4029
   
   But I'm creating a new issue to highlight the end-to-end problem.
   
   Here are the current steps to perform schema evolution in Pinot:
   1) Create a new schema with default values for the new columns being added.
   2) Update the schema using the Controller API.
   3) Perform a rolling restart of the Pinot servers to "reload" the segments so that they reflect the default values.
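
As a rough illustration of step 1, the sketch below adds one column with a default value to a schema held as a Python dict. The field names (`dimensionFieldSpecs`, `defaultNullValue`) follow the Pinot schema JSON layout as I understand it; the `pageViews` schema, its columns, and the helper `add_dimension_with_default` are hypothetical and not taken from this issue.

```python
import copy
import json


def add_dimension_with_default(schema, name, data_type, default_value):
    """Return a copy of `schema` with one new dimension column that carries a default value.

    Hypothetical helper for illustration; it only edits the JSON document
    and does not talk to a Pinot controller.
    """
    existing = {
        spec["name"]
        for key in ("dimensionFieldSpecs", "metricFieldSpecs")
        for spec in schema.get(key, [])
    }
    if name in existing:
        raise ValueError(f"column {name!r} already exists in the schema")

    # Work on a deep copy so the old schema stays available for comparison.
    evolved = copy.deepcopy(schema)
    evolved.setdefault("dimensionFieldSpecs", []).append(
        {"name": name, "dataType": data_type, "defaultNullValue": default_value}
    )
    return evolved


# Made-up example schema (assumption, not from the issue).
old_schema = {
    "schemaName": "pageViews",
    "dimensionFieldSpecs": [{"name": "country", "dataType": "STRING"}],
    "metricFieldSpecs": [{"name": "views", "dataType": "LONG"}],
}

new_schema = add_dimension_with_default(old_schema, "browser", "STRING", "unknown")
print(json.dumps(new_schema, indent=2))
```

The resulting JSON is what would be submitted in step 2. The restart in step 3 is still needed today, because already-loaded segments do not pick up the new column on their own.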
   
   In general, restarting a Pinot server is very expensive (it can take anywhere from 5 to 30 minutes, depending on the number of segments). If we need to evolve the schema frequently, this becomes a huge operational overhead.
   
   An alternative way to resolve this issue is to backfill the old segments, but this is an expensive process as well.
   
   **A better approach:**
   a) Segments which have been committed / ONLINE
   We can try to look up the new schema "on the fly" during query processing using some technique (e.g., @mayankshriv suggested using virtual columns, populated with the default value, for old segments). That way we don't depend on any server restart.
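
A minimal sketch of the virtual-column idea, assuming a segment can report which columns it physically stores. The `Segment` class, `read_column`, and the sample data are hypothetical and do not correspond to Pinot's actual classes; the point is only that a column missing from an old segment is served from the schema default at query time rather than by reloading the segment.

```python
class Segment:
    """Hypothetical immutable segment: maps column name to a list of values."""

    def __init__(self, columns, num_docs):
        self.columns = columns
        self.num_docs = num_docs


def read_column(segment, column, schema_defaults):
    """Return the values of `column` for every document in `segment`.

    If the segment predates the column, act as a virtual column and
    return the schema default once per document.
    """
    if column in segment.columns:
        return segment.columns[column]
    if column in schema_defaults:
        return [schema_defaults[column]] * segment.num_docs
    raise KeyError(f"column {column!r} is in neither the segment nor the schema")


# Default values taken from the latest schema (made-up values).
schema_defaults = {"browser": "unknown"}

# An old segment written before "browser" existed, and a newer one that has it.
old_segment = Segment({"country": ["US", "IN"]}, num_docs=2)
new_segment = Segment({"country": ["DE"], "browser": ["firefox"]}, num_docs=1)

old_values = read_column(old_segment, "browser", schema_defaults)
new_values = read_column(new_segment, "browser", schema_defaults)
```

Because the fallback is resolved per query, a schema update would become visible on committed segments without restarting the server.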
   
   b) Segments which are currently open / CONSUMING
   We need to solve this issue: https://github.com/apache/incubator-pinot/issues/151 (how to reflect a new schema in an open segment).

