Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2019/10/05 00:30:47 UTC

[GitHub] [incubator-pinot] mcvsubbu opened a new pull request #4679: [#4667] Fix auto-tuning algorithm to update when parameters are changed

mcvsubbu opened a new pull request #4679: [#4667] Fix auto-tuning algorithm to update when parameters are changed
URL: https://github.com/apache/incubator-pinot/pull/4679
 
 
   When any of the segment auto-tuning parameters are changed, we currently
   fail to pick them up, because the FlushThresholdUpdater is cached in
   memory in the controller. A controller restart picks up the
   new parameters, but we can do better.
   
   Changed the auto-tuning mechanism to take the parameters on every call, so
   that we can recognize changes and act accordingly.
   
   Extra logic added:
   We used to treat a segment as having hit the time limit whenever the
   number of rows in the committing segment was lower than the target
   we had set for it. Now we also require that the size of the committing
   segment is lower than the desired size. If the segment size is higher
   (most likely because the operator changed the desired size), we fall
   through to the ratio-based computation.
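
   A minimal sketch of this check, assuming the size being compared is
   that of the committing segment. The class, method, and parameter names
   are hypothetical and are not taken from the Pinot code base.

```java
// Hypothetical helper; names are illustrative assumptions, not Pinot's API.
final class TimeLimitCheck {
    // Returns true when the segment is treated as having hit the time limit:
    // it consumed fewer rows than its target AND it is smaller than the
    // desired segment size. Otherwise the caller falls through to the
    // ratio-based computation.
    static boolean hitTimeLimit(long rowsConsumed, long rowTarget,
                                long segmentSizeBytes, long desiredSizeBytes) {
        boolean fewerRowsThanTarget = rowsConsumed < rowTarget;
        boolean smallerThanDesired = segmentSizeBytes < desiredSizeBytes;
        return fewerRowsThanTarget && smallerThanDesired;
    }
}
```

   With the old behavior only the first condition was checked, so a
   segment that was already larger than the desired size could still have
   its row target increased.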
   
   Further, when we hit the time limit in the committing segment, the new
   time limit may be even lower than the time we spent consuming the
   segment being committed. In that case, we should first reduce the number
   of rows consumed by the committing segment (in proportion to the average
   consumption rate) and then apply the standard multiplier for a
   time-limit hit.
   
   Some examples are useful, since the logic is a bit involved:
   
   Assume the segment size was set to 200M and the time limit to 3h, and
   that we set the number of rows to 4M and let a segment start consuming.
   
   Case 1:
   While it is consuming, the operator changes the optimal segment
   size to 180M.
   The segment comes back with a size of 190M after hitting the time limit,
   having consumed 3.8M rows.
   
   Previously, we would have increased the number of rows to 3.8M * 1.1.
   
   After this change, we will fall through to computing the rows using the
   ratio (and effectively reduce the number of rows).
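
   The ratio computation is not spelled out here. As a rough sketch,
   assuming a plain proportional scaling of rows by desired size over
   actual size (the real computation may differ), and with hypothetical
   names throughout:

```java
// Hypothetical sketch; the proportional formula is an assumption.
final class RatioBasedRows {
    // Scales the rows consumed by (desired size / actual segment size).
    static long rowsByRatio(long rowsConsumed, long segmentSizeBytes, long desiredSizeBytes) {
        double scaled = (double) rowsConsumed * desiredSizeBytes / segmentSizeBytes;
        return Math.round(scaled);
    }
}
```

   Under that assumption, 3.8M rows at 190M with a desired size of 180M
   gives a target of 3.6M rows, which is lower than the 3.8M consumed.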
   
   Case 2:
   The operator changes the time limit to 1 hr.
   The segment comes back the same way as before, 190M size, 3.8M rows
   consumed in 3 hrs.
   
   Previously, we would have treated this as a time limit hit, and
   increased the number of rows to 3.8M * 1.1
   
   After this change, we will assume that the segment actually consumed
   (3.8M * 1 hr) / 3 hrs (i.e. approx 1.3M rows), and then apply the multiplier,
   getting a target of about 1.4M rows.
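
   The arithmetic for this case can be sketched as follows. The names are
   hypothetical, and the 1.1 multiplier is taken from the "3.8M * 1.1"
   in the examples above.

```java
// Hypothetical sketch; names are illustrative assumptions, not Pinot's API.
final class TimeLimitScaling {
    // Multiplier applied when the time limit is hit (from the examples).
    static final double MULTIPLIER = 1.1;

    static long nextRowTarget(long rowsConsumed, long consumedMillis, long timeLimitMillis) {
        double rows = rowsConsumed;
        // If the new time limit is shorter than the time actually spent
        // consuming, scale the rows down at the average consumption rate.
        if (timeLimitMillis < consumedMillis) {
            rows = rows * timeLimitMillis / consumedMillis;
        }
        return Math.round(rows * MULTIPLIER);
    }
}
```

   For 3.8M rows consumed in 3 hrs with a new 1 hr limit, this yields a
   target just under 1.4M rows. When the time limit is unchanged, it
   reduces to the previous 3.8M * 1.1.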
