Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/08/06 19:09:45 UTC

[GitHub] [incubator-druid] jihoonson commented on issue #8249: ability to let user configure segment version in indexing task

jihoonson commented on issue #8249: ability to let user configure segment version in indexing task
URL: https://github.com/apache/incubator-druid/issues/8249#issuecomment-518803775
 
 
   I think it could depend on the lock granularity (segment lock vs time chunk lock) and the rollup mode (perfect rollup vs best-effort rollup). 
   
   I understand your use case could need this kind of feature. But before we talk about implementation details, I'm wondering whether this is really a good idea. Even though the Hadoop task already supports a custom segment version, I feel like it's a hacky way to bypass Druid's segment versioning system, and it could be hard to use and even dangerous if something goes wrong (for example, users might unexpectedly see stale data). Also, it seems very weird to me for indexing tasks to generate segments that are already overshadowed by existing segments. That could just be a waste of time and resources, I guess.
   
   > We have an use case (for Parallel index task and Local Index task) where the overshadowing should happen based on when the data was generated by the ETL pipelines and not when Druid indexing is running for those which could many times run in different order for many reasons e.g. Druid tasks may fail and are resubmitted.
   
   I guess you're using some kind of workflow scheduler and, ideally, this issue should be addressed in that tool. Do we need this because it's too hard or complex to guarantee the proper job execution order in the tool?
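
   To make the ordering concern concrete, here is a minimal sketch of version-based overshadowing. It is a toy model, not Druid code: the `Segment` class, the `visible` helper, the `etl_run` label, and all timestamps are hypothetical, and it assumes versions are timestamp-like strings where the higher string wins within the same interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    interval: str      # time chunk the segment covers
    version: str       # compared as a plain string in this sketch (assumption)
    etl_run: str       # hypothetical label for the ETL run that produced the data


def visible(segments):
    """For each interval, keep only the segment with the highest version."""
    winners = {}
    for seg in segments:
        best = winners.get(seg.interval)
        if best is None or seg.version > best.version:
            winners[seg.interval] = seg
    return winners


INTERVAL = "2019-08-01/2019-08-02"

# ETL run "new" produced its data after run "old", but the indexing task for
# "old" failed and was resubmitted, so it finished indexing last.
default_versions = [
    Segment(INTERVAL, "2019-08-06T10:00:00.000Z", "new"),  # indexed first
    Segment(INTERVAL, "2019-08-06T12:00:00.000Z", "old"),  # resubmitted, indexed later
]

# Same two tasks, but the version is taken from the ETL generation time instead.
custom_versions = [
    Segment(INTERVAL, "2019-08-05T02:00:00.000Z", "new"),
    Segment(INTERVAL, "2019-08-04T02:00:00.000Z", "old"),  # overshadowed on arrival
]

stale_winner = visible(default_versions)[INTERVAL].etl_run
custom_winner = visible(custom_versions)[INTERVAL].etl_run
print(stale_winner, custom_winner)
```

   In this toy model, indexing-time versions let the resubmitted "old" run win, which is the stale-data case from the quoted use case. With the custom version the "new" run wins, but the resubmitted task still does all the work of building a segment that is overshadowed the moment it is published.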
