
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/05/19 17:15:07 UTC

[GitHub] [beam] lukecwik edited a comment on pull request #11715: [BEAM-9977] Implement GrowableOffsetRangeTracker

lukecwik edited a comment on pull request #11715:
URL: https://github.com/apache/beam/pull/11715#issuecomment-630955102


   > > Should we be using the RangeEndEstimator when providing progress/splitting for ranges not ending at `Long.MAX_VALUE`?
   > > Let's say the range estimate is bad and is `MAX_VALUE - 3` but the real end is `5000`, then after a split we end up with `[0, (MAX_VALUE - 3) * 0.5)` and `[(MAX_VALUE - 3) * 0.5, MAX_VALUE)`. We may quickly learn that the residual is empty and then lose all effective progress on the primary.
   > 
   > I can see the benefit of using `RangeEndEstimator` for the finite range here. But as long as we don't modify the range end to the estimated end or use the estimated end in `tryClaim`, we still cannot say the residual is empty.
   
   That is true, but I was thinking it would make better splitting decisions instead of creating a bunch of empty splits while trimming the range down. The advantage of not using the estimator is that we don't have to invoke it, since it could be expensive for the user and in many situations will produce a value greater than `to`.
   
   We can leave it out for now unless some compelling use case comes up.
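
To make the trade-off concrete, here is a rough, self-contained Java sketch. It is not Beam code and not what this PR implements: `SplitPointSketch`, `splitPosition`, `splitUsingRangeEnd`, and `splitUsingEstimate` are hypothetical names, and the estimate is passed in as a plain `long` rather than obtained from a `RangeEndEstimator`. It only compares where the split point lands when the fraction is applied to `to` versus to the smaller of `to` and an estimated end.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical illustration only; none of these names exist in Beam.
final class SplitPointSketch {

  // Position that divides [from, end) at the given fraction of the remainder.
  // BigDecimal avoids losing precision for ends close to Long.MAX_VALUE.
  static long splitPosition(long from, long end, double fraction) {
    BigDecimal remaining = BigDecimal.valueOf(end).subtract(BigDecimal.valueOf(from));
    BigDecimal offset =
        remaining.multiply(BigDecimal.valueOf(fraction)).setScale(0, RoundingMode.FLOOR);
    return BigDecimal.valueOf(from).add(offset).longValueExact();
  }

  // Option A: ignore any estimate and split against the declared range end `to`.
  static long splitUsingRangeEnd(long from, long to, double fraction) {
    return splitPosition(from, to, fraction);
  }

  // Option B: split against min(to, estimatedEnd), never going below `from`.
  // Assumption: the caller has already paid the cost of computing estimatedEnd.
  static long splitUsingEstimate(long from, long to, long estimatedEnd, double fraction) {
    long effectiveEnd = Math.max(from, Math.min(to, estimatedEnd));
    return splitPosition(from, effectiveEnd, fraction);
  }

  public static void main(String[] args) {
    long to = Long.MAX_VALUE - 3; // declared (finite) range end
    long estimatedEnd = 5000L;    // assumed estimate of the real end
    System.out.println(splitUsingRangeEnd(0L, to, 0.5));
    System.out.println(splitUsingEstimate(0L, to, estimatedEnd, 0.5));
  }
}
```

With option A the residual starts far beyond the real end and is likely empty; with option B the split lands inside the data, at the cost of invoking the estimator. When the estimate exceeds `to`, both options give the same split point, which is the case where the extra call buys nothing.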

