Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/10/20 18:20:08 UTC

[GitHub] [pinot] ksnijjer opened a new issue #7607: Allow changing existing Pinot schema for certain scenarios

ksnijjer opened a new issue #7607:
URL: https://github.com/apache/pinot/issues/7607


   Currently, Pinot seems very rigid about **only** allowing schema changes that add a new column. If you need to change an existing column in a way that doesn't affect existing data, e.g. changing a column from int to long or changing a timestamp format, you are still forced to delete and recreate the table/schema. That is very limiting and, depending on the amount of data that would need to be reprocessed, perhaps not even feasible.
   
   Here is a sample scenario: I created a schema where one of the datetimeSpec columns has an ingestion transform
   `{
         "name": "CreationTimeMillis",
         "dataType": "LONG",
         "defaultNullValue": 1592703025,
         "transformFunction": "fromDateTime(CreateTime, \"yyyy-MM-dd'T'HH:mm:ssZ\")",
         "format": "1:MILLISECONDS:EPOCH",
         "granularity": "1:MILLISECONDS"
       }`
   Apparently the time format specification here is incorrect, and I need to change it to
   
   `{
         "name": "CreationTimeInMillis",
         "dataType": "LONG",
         "defaultNullValue": 1592703058,
         "transformFunction": "fromDateTime(CreateTime, 'yyyy-MM-dd''T''HH:mm:ssZ')",
         "format": "1:MILLISECONDS:EPOCH",
         "granularity": "1:MILLISECONDS"
       }`
   but this is now considered a backward-incompatible change.
   
   Can we add some intelligence to the logic so that changes which don't affect the underlying data are allowed to go through?
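
To make the request concrete, here is a minimal sketch that diffs the two field specs quoted above and lists the attributes that differ. The helper name `changed_attributes` is made up for illustration and is not part of Pinot.

```python
from typing import Any, Dict, List

# The two field specs quoted above, copied verbatim from the issue.
OLD_SPEC: Dict[str, Any] = {
    "name": "CreationTimeMillis",
    "dataType": "LONG",
    "defaultNullValue": 1592703025,
    "transformFunction": "fromDateTime(CreateTime, \"yyyy-MM-dd'T'HH:mm:ssZ\")",
    "format": "1:MILLISECONDS:EPOCH",
    "granularity": "1:MILLISECONDS",
}
NEW_SPEC: Dict[str, Any] = {
    "name": "CreationTimeInMillis",
    "dataType": "LONG",
    "defaultNullValue": 1592703058,
    "transformFunction": "fromDateTime(CreateTime, 'yyyy-MM-dd''T''HH:mm:ssZ')",
    "format": "1:MILLISECONDS:EPOCH",
    "granularity": "1:MILLISECONDS",
}


def changed_attributes(old: Dict[str, Any], new: Dict[str, Any]) -> List[str]:
    """Return the sorted attribute names whose values differ between two specs.

    Hypothetical helper for illustration; Pinot has no function by this name.
    """
    all_keys = old.keys() | new.keys()
    return sorted(key for key in all_keys if old.get(key) != new.get(key))


print(changed_attributes(OLD_SPEC, NEW_SPEC))
# ['defaultNullValue', 'name', 'transformFunction']
```

`dataType`, `format` and `granularity` are identical in both snippets; only `name`, `defaultNullValue` and `transformFunction` differ.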
   
   cc @mayankshriv 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] klsince commented on issue #7607: Allow changing existing Pinot schema for certain scenarios

Posted by GitBox <gi...@apache.org>.
klsince commented on issue #7607:
URL: https://github.com/apache/pinot/issues/7607#issuecomment-948897213


   Yeah, changing `defaultNullValue` should regenerate the column when the segments are reloaded. But we need to loosen the sanity check in the updateSchema REST API, which today only allows adding new columns. Basically, I think we may skip the check on `defaultNullValue` [here](https://github.com/apache/pinot/blob/master/pinot-spi/src/main/java/org/apache/pinot/spi/data/Schema.java#L689)
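
A rough sketch of what a loosened check could look like, assuming schemas are modelled as plain dicts from column name to field spec. This is not the logic in `Schema.java`; `is_backward_compatible` and its `ignored` parameter are hypothetical names.

```python
from typing import Any, Dict, FrozenSet

FieldSpec = Dict[str, Any]


def is_backward_compatible(
    old_columns: Dict[str, FieldSpec],
    new_columns: Dict[str, FieldSpec],
    ignored: FrozenSet[str] = frozenset(),
) -> bool:
    """Allow added columns; require existing columns to be unchanged,
    except for attributes listed in `ignored` (hypothetical parameter)."""
    for name, old_spec in old_columns.items():
        new_spec = new_columns.get(name)
        if new_spec is None:
            return False  # an existing column was removed or renamed
        compared_keys = (old_spec.keys() | new_spec.keys()) - ignored
        if any(old_spec.get(k) != new_spec.get(k) for k in compared_keys):
            return False  # an attribute that is still checked has changed
    return True


old = {"col": {"dataType": "LONG", "defaultNullValue": 0}}
new = {"col": {"dataType": "LONG", "defaultNullValue": 1}}

# Strict check (today's behaviour as described above): rejected.
print(is_backward_compatible(old, new))
# Loosened check that skips defaultNullValue: accepted.
print(is_backward_compatible(old, new, ignored=frozenset({"defaultNullValue"})))
```

Keeping the skipped attributes in an explicit set makes it easy to see exactly which checks have been relaxed.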
   
   Currently, changes to `transformFunction` don't trigger the regeneration. We may support that by extending the logic around [here](https://github.com/apache/pinot/blob/master/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/defaultcolumn/BaseDefaultColumnHandler.java#L234) and keeping track of the transformFunc in the segment metadata (as defined [here](https://github.com/apache/pinot/blob/master/pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/V1Constants.java#L94)) to check for discrepancies.
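
The discrepancy check could be sketched roughly as below, assuming the transform function string is recorded per column in the segment metadata. The metadata key and the dict layout are assumptions made for illustration, not existing Pinot constants or structures.

```python
from typing import Any, Dict, List

# Hypothetical metadata key; not an existing Pinot constant.
TRANSFORM_FUNCTION_KEY = "transformFunction"


def columns_with_stale_transform(
    segment_column_metadata: Dict[str, Dict[str, Any]],
    schema_columns: Dict[str, Dict[str, Any]],
) -> List[str]:
    """Return columns whose recorded transform differs from the schema's."""
    stale = []
    for name, spec in schema_columns.items():
        recorded = segment_column_metadata.get(name)
        if recorded is None:
            continue  # column is absent from this segment; out of scope here
        if recorded.get(TRANSFORM_FUNCTION_KEY) != spec.get(TRANSFORM_FUNCTION_KEY):
            stale.append(name)
    return sorted(stale)


metadata = {
    "CreationTimeMillis": {TRANSFORM_FUNCTION_KEY: "oldTransform(CreateTime)"},
    "Other": {},
}
schema = {
    "CreationTimeMillis": {TRANSFORM_FUNCTION_KEY: "newTransform(CreateTime)"},
    "Other": {"dataType": "INT"},
}
print(columns_with_stale_transform(metadata, schema))
# ['CreationTimeMillis']
```

A segment reload could then regenerate only the columns this returns, leaving the rest untouched.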
   
   Pls lemme know your thoughts, and whether you'd be interested in taking a stab at them. Thx!




[GitHub] [pinot] mayankshriv commented on issue #7607: Allow changing existing Pinot schema for certain scenarios

Posted by GitBox <gi...@apache.org>.
mayankshriv commented on issue #7607:
URL: https://github.com/apache/pinot/issues/7607#issuecomment-947927349


   Thanks @ksnijjer for filing the issue. I agree that, in this case, the user should be allowed to fix the issue. I can see it would be hard for the tool to distinguish such cases from genuinely backward-incompatible ones. One possibility is to allow an override. However, that would put the onus on the user to ensure that the change is in fact backward compatible.
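
As a sketch of the override idea only: the update is rejected by default and goes through when the caller explicitly takes responsibility. `update_schema`, `force` and `IncompatibleSchemaError` are hypothetical names, not the existing REST API or its parameters.

```python
from typing import Any, Dict

Schema = Dict[str, Dict[str, Any]]


class IncompatibleSchemaError(Exception):
    """Raised when an update changes existing columns without an override."""


def existing_columns_unchanged(old: Schema, new: Schema) -> bool:
    # Every existing column must still be present with an identical spec.
    return all(new.get(name) == spec for name, spec in old.items())


def update_schema(old: Schema, new: Schema, force: bool = False) -> Schema:
    """Return the schema to store; `force` is a hypothetical override flag."""
    if not force and not existing_columns_unchanged(old, new):
        raise IncompatibleSchemaError(
            "existing columns changed; pass force=True to accept the risk"
        )
    return new


current = {"col": {"dataType": "LONG", "defaultNullValue": 0}}
fixed = {"col": {"dataType": "LONG", "defaultNullValue": 1}}

# With the override, the caller is responsible for the change being safe.
stored = update_schema(current, fixed, force=True)
```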




[GitHub] [pinot] Jackie-Jiang commented on issue #7607: Allow changing existing Pinot schema for certain scenarios

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #7607:
URL: https://github.com/apache/pinot/issues/7607#issuecomment-948111714


   With index removal support, we should be able to support changing the default null value. @klsince Can you please take a look?




[GitHub] [pinot] ksnijjer commented on issue #7607: Allow changing existing Pinot schema for certain scenarios

Posted by GitBox <gi...@apache.org>.
ksnijjer commented on issue #7607:
URL: https://github.com/apache/pinot/issues/7607#issuecomment-947935543


   Yeah, I think a mix of a user-level override (with the implicit risk) and code-level automation, e.g. allowing data type expansion for a column, should make it more usable.
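
A minimal sketch of the automated part, assuming that "expansion" means a widening in which every old value is exactly representable in the new type. The table below is an assumption about which widenings are lossless, not a list of what Pinot permits, and `is_type_expansion` is a hypothetical name.

```python
from typing import Dict, Set

# Assumed lossless widenings between numeric types (illustrative only).
# INT -> FLOAT and LONG -> DOUBLE are left out because they can lose precision.
LOSSLESS_WIDENINGS: Dict[str, Set[str]] = {
    "INT": {"LONG", "DOUBLE"},
    "FLOAT": {"DOUBLE"},
}


def is_type_expansion(old_type: str, new_type: str) -> bool:
    """True if the type is unchanged or widened without losing information."""
    if old_type == new_type:
        return True
    return new_type in LOSSLESS_WIDENINGS.get(old_type, set())


print(is_type_expansion("INT", "LONG"))   # True
print(is_type_expansion("LONG", "INT"))   # False
```

Anything not covered by the table would still need the user-level override.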

