Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/03/26 22:53:44 UTC

[GitHub] [druid] jihoonson opened a new issue #9571: Race in SegmentAllocateAction with segment lock

jihoonson opened a new issue #9571: Race in SegmentAllocateAction with segment lock
URL: https://github.com/apache/druid/issues/9571
 
 
   ### Affected Version
   
   0.16, 0.17
   
   ### Description
   
   Before 0.16, everything was simple: there was only one type of task lock, the time chunk lock. With a time chunk lock, only one task can work on a given datasource and time period at any time. As a result, `SegmentAllocateAction` could safely assume that only one task at a time requests a new segment allocation for the same datasource and the same time period.
   
   However, with segment lock, multiple tasks can work on the same datasource and the same time period simultaneously. Multiple tasks can request new segment allocations at the same time, which introduces a race condition in segment allocation. 
   
   `SegmentAllocateAction` first looks for segments whose intervals overlap the request interval, which is computed from the given segment granularity. This is to avoid creating segments with overlapping intervals. To do so, [it first retrieves all _used segments_ that overlap the request interval from the metadata store](https://github.com/apache/druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/common/actions/SegmentAllocateAction.java#L195-L196). However, this is not enough with segment lock, because other running tasks may have created _pending segments_ with overlapping intervals. In that case, the tasks could end up creating overlapping segments, which breaks the timeline on brokers and the coordinator.
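
To make the gap concrete, here is a minimal, self-contained sketch. It is not Druid code: `Interval`, `anyOverlap`, and the two lists are hypothetical stand-ins for the real interval type and the metadata store contents, and the day/hour intervals are made up for illustration. It only shows that an overlap check restricted to used segments cannot see a pending segment allocated by another task.

```java
import java.util.ArrayList;
import java.util.List;

public class OverlapCheckSketch {

    /** Half-open interval [start, end) in milliseconds; a stand-in for a real interval type. */
    static final class Interval {
        final long start;
        final long end;

        Interval(long start, long end) {
            if (start >= end) {
                throw new IllegalArgumentException("start must be before end");
            }
            this.start = start;
            this.end = end;
        }

        boolean overlaps(Interval other) {
            return start < other.end && other.start < end;
        }
    }

    /** Returns true if any interval in existing overlaps request. */
    static boolean anyOverlap(List<Interval> existing, Interval request) {
        for (Interval interval : existing) {
            if (interval.overlaps(request)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long hour = 3_600_000L;
        List<Interval> usedSegments = new ArrayList<>();    // nothing published yet
        List<Interval> pendingSegments = new ArrayList<>();

        // Assumption: task A has already allocated a pending segment covering a whole day.
        pendingSegments.add(new Interval(0L, 24 * hour));

        // Task B now asks for a one-hour segment inside that day.
        Interval request = new Interval(hour, 2 * hour);

        List<Interval> usedAndPending = new ArrayList<>(usedSegments);
        usedAndPending.addAll(pendingSegments);

        System.out.println("used only:        overlap = " + anyOverlap(usedSegments, request));
        System.out.println("used and pending: overlap = " + anyOverlap(usedAndPending, request));
    }
}
```

With only used segments consulted, the check reports no overlap and task B would go on to allocate a segment that overlaps task A's pending one. Including pending segments makes the overlap visible.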
   
   I think this check should probably include pending segments as well, and be done in `TaskLockbox` with proper synchronization.
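
A rough sketch of what "check and allocate under one lock" could look like is below. This is an illustration under assumptions, not the actual `TaskLockbox` API: `AllocationRegistry`, `tryAllocate`, and `raceAllocations` are hypothetical names, the state is kept in memory rather than in the metadata store, and an overlapping request is simply rejected, whereas a real fix would need to decide how to handle requests for an interval that already exists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class SynchronizedAllocationSketch {

    /** Half-open interval [start, end); a stand-in for a real interval type. */
    static final class Interval {
        final long start;
        final long end;

        Interval(long start, long end) {
            this.start = start;
            this.end = end;
        }

        boolean overlaps(Interval other) {
            return start < other.end && other.start < end;
        }
    }

    /** Hypothetical owner of the overlap check; plays the role suggested for TaskLockbox. */
    static final class AllocationRegistry {
        private final List<Interval> used = new ArrayList<>();
        private final List<Interval> pending = new ArrayList<>();

        /** Checks used AND pending segments, then records the request, all under one lock. */
        synchronized boolean tryAllocate(Interval request) {
            for (Interval interval : used) {
                if (interval.overlaps(request)) {
                    return false;
                }
            }
            for (Interval interval : pending) {
                if (interval.overlaps(request)) {
                    return false;
                }
            }
            pending.add(request);
            return true;
        }

        synchronized int pendingCount() {
            return pending.size();
        }
    }

    /** Starts taskCount threads that request mutually overlapping intervals at the same time. */
    static int raceAllocations(AllocationRegistry registry, int taskCount) {
        CountDownLatch start = new CountDownLatch(1);
        AtomicInteger successes = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            // Every request starts at 0, so all requests overlap each other.
            final Interval request = new Interval(0L, 10L + i);
            Thread thread = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (registry.tryAllocate(request)) {
                    successes.incrementAndGet();
                }
            });
            threads.add(thread);
            thread.start();
        }
        start.countDown();
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
        return successes.get();
    }

    public static void main(String[] args) {
        AllocationRegistry registry = new AllocationRegistry();
        int successes = raceAllocations(registry, 8);
        System.out.println("successful allocations: " + successes);
    }
}
```

Because the overlap check and the insertion of the new pending segment happen inside the same `synchronized` method, no second task can pass the check between another task's check and its insert, so only one of the mutually overlapping requests is accepted.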
