Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2019/06/20 01:24:39 UTC

[GitHub] [incubator-pinot] npawar opened a new issue #4345: Config to make realtime non-winner servers download the segment instead of build

npawar opened a new issue #4345: Config to make realtime non-winner servers download the segment instead of build
URL: https://github.com/apache/incubator-pinot/issues/4345
 
 
   During realtime consumption, when the rows/time threshold is reached, one winner is chosen among all replicas. The winner builds the segment and uploads it to the controller. The segment metadata is then updated, followed by the ideal state. The ideal state update sends a CONSUMING -> ONLINE state transition to all replicas. Based on the code in LLRealtimeSegmentDataManager::goOnlineFromConsuming, each replica is given 10 minutes for its consuming thread to die. During this time, each consumer is asked either to catch up and build the segment, or to discard its data and download the segment. Once the build or download completes, the replica can mark itself ONLINE.
   The time it takes for a replica to be marked ONLINE varies depending on which of these paths it was asked to take. The winner server takes the least time: it has already built the segment, so it comes up ONLINE the fastest. Servers asked to download take slightly longer than the winner, mostly depending on network speed. For servers asked to build their own segment, the time depends on segment build time. Segment builds can be time-consuming, depending on data size, indexing, sorting, consumption time, etc. As a result, these servers can take much longer to come up ONLINE.
   We might therefore see a significant period during which only 1 replica is ONLINE to serve traffic, and that replica has to carry all the load by itself.
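
To make the timing argument above concrete, here is a minimal sketch of the three paths and the resulting window in which the winner is the only ONLINE replica. It is a toy model, not Pinot code: the class `OnlineDelayModel`, the `Path` enum, and the `downloadMs` / `buildMs` inputs are all hypothetical, and treating the winner's extra delay as zero is a simplifying assumption.

```java
import java.util.List;

/** Hypothetical model of the paths described in the issue; not Pinot code. */
final class OnlineDelayModel {
    /** The path a replica takes to reach ONLINE. */
    enum Path { WINNER, DOWNLOAD, BUILD }

    /** Extra time (ms) a replica needs after the CONSUMING -> ONLINE transition arrives. */
    static long delayMs(Path path, long downloadMs, long buildMs) {
        switch (path) {
            case WINNER:   return 0L;          // assumption: segment is already built
            case DOWNLOAD: return downloadMs;  // dominated by network speed
            case BUILD:    return buildMs;     // dominated by segment build time
            default: throw new IllegalArgumentException("unknown path: " + path);
        }
    }

    /**
     * Time (ms) during which the winner is the only ONLINE replica.
     * The window ends as soon as the fastest non-winner comes up.
     */
    static long soloWindowMs(List<Path> nonWinners, long downloadMs, long buildMs) {
        if (nonWinners.isEmpty()) {
            throw new IllegalArgumentException("need at least one non-winner replica");
        }
        long fastest = Long.MAX_VALUE;
        for (Path p : nonWinners) {
            fastest = Math.min(fastest, delayMs(p, downloadMs, buildMs));
        }
        return fastest;
    }
}
```

In this model, if every non-winner is asked to build, the window equals the full build time. If at least one non-winner downloads, the window shrinks to the download time.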
   
   One way to solve this problem is to add a config that makes non-winner servers always download the segment instead of rebuilding it. This has a further advantage: those servers avoid the heap penalty of rebuilding, sorting, and creating inverted indexes, and simply download the ready-made segment.
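
A rough sketch of how such a config could gate the decision is shown here. Everything in it is an assumption made for illustration: the issue does not name a config key, so `realtime.nonWinner.alwaysDownload` is invented, and `CompletionModeSketch`, `Action`, and `resolve` are hypothetical names rather than Pinot's API.

```java
import java.util.Properties;

/** Hypothetical sketch of the proposed config; not Pinot's actual implementation. */
final class CompletionModeSketch {
    /** What a replica does to obtain the committed segment. */
    enum Action { KEEP, BUILD, DOWNLOAD }

    /** Invented config key for illustration; the issue does not specify one. */
    static final String ALWAYS_DOWNLOAD_KEY = "realtime.nonWinner.alwaysDownload";

    /**
     * Resolves the action a replica takes on the CONSUMING -> ONLINE transition.
     *
     * @param isWinner  whether this replica built and uploaded the segment
     * @param requested the action the replica was asked to take (BUILD or DOWNLOAD)
     * @param config    server config; the flag defaults to false (current behavior)
     */
    static Action resolve(boolean isWinner, Action requested, Properties config) {
        if (isWinner) {
            return Action.KEEP; // the winner already has the built segment
        }
        boolean alwaysDownload =
            Boolean.parseBoolean(config.getProperty(ALWAYS_DOWNLOAD_KEY, "false"));
        if (alwaysDownload && requested == Action.BUILD) {
            return Action.DOWNLOAD; // skip the local build and its heap cost
        }
        return requested;
    }
}
```

Defaulting the flag to false keeps the existing behavior unless the new path is explicitly enabled.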
   
   
   @mcvsubbu @Jackie-Jiang 
