Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/04/02 22:58:11 UTC


jihoonson commented on issue #6319: [Proposal] Minimizing the lock granularity for indexing
URL: https://github.com/apache/incubator-druid/issues/6319#issuecomment-479243572
 
 
   @jon-wei thank you for your review!
   
   > When using the segment-level locks, can you add more description to this part that points out that the new segments with the same segmentGranularity would have the same version string as the segments being overshadowed?
   
   Added.
   
   > In the Changes to TaskLockbox section, can you clarify the difference between shared and mixed locks?
   
   I've changed the statements to clarify this. The shared lock is a lock type that was added in https://github.com/apache/incubator-druid/pull/4550.
   
   > For private final Set<Integer> overshadowingSegments;, I think it's worth mentioning that an alternative design would be to store the complete overshadowing graph here
   
   Added.
   
   > In this proposal, what happens in the following case?
   > * Suppose we have two segments as partitions of a DAY granularity time chunk, partitions 0,1
   > * Two compaction tasks are issued on those segments simultaneously, which would compact 0,1 to a new single DAY granularity segment
   > * The two tasks both attempt to get a lock on (0,1)
   > * One task gets a lock on (0,1) first and creates segment with partition 2, its overshadowedSegments are (0,1)
   > * Now that the locks are free, what does the second task do? Would it try to compact (0,1) again?
   
   Good question! In this case, the compaction task checks that its input segments are still the latest ones before it starts reindexing. So, the second task is supposed to fail because its input segments will no longer be the latest ones.
   
   The Hadoop task also does a similar check, so maybe it's fine. But I think it makes sense to add a check that fails the second overwriting task immediately, because there's no point in waiting to get a lock. A rough sketch of both checks follows.
